# 64k Demo Project

Goal:
- Produce a <=64k native demo binary
- Same C++ codebase for Windows, macOS, Linux

Graphics:
- WebGPU via wgpu-native
- WGSL shaders
- Hybrid rendering: Rasterized proxy geometry + SDF raymarching

Audio:
- 32 kHz, 16-bit stereo
- Procedurally generated samples
- Real-time additive synthesis from spectrograms (IDCT)
- Variable tempo system with music time abstraction
- Event-based pattern triggering for dynamic tempo scaling
- Modifiable loops and patterns, with a script to generate them (tracker-style)
- Unified AudioEngine for lifecycle management (eliminates initialization fragility)

Constraints:
- Size-sensitive
- Minimal dependencies
- Explicit control over all allocations

Style:
- Demoscene
- No engine abstractions

---
## Project Roadmap

### Recently Completed

#### Milestone: Audio Playback Stability & Debug Infrastructure (February 4, 2026)
- **Core Audio Backend Optimization**: Resolved critical audio playback issues (stop-and-go, glitches, eventual silence) caused by timing mismatches in miniaudio's Core Audio backend. Root cause: Core Audio is optimized for 44.1kHz with 10ms periods, but our 32kHz system expected uniform ~13.78ms callbacks, causing resampling jitter. Fix: Set `allowNominalSampleRateChange = TRUE` so the OS runs the device natively at 32kHz, and `performanceProfile = conservative` for larger buffers (4096 frames = 128ms). Result: Stable ~128ms callbacks with <1ms jitter and zero underruns during variable tempo (1.0x → 2.0x).

- **Ring Buffer Capacity Tuning**: Increased ring buffer from 200ms to 400ms (25,600 samples) to provide headroom for tempo scaling. Added comprehensive bounds checking with abort() on violations to catch buffer corruption early. Fixed critical bug: the buffer pre-fill did not scale dt by the tempo; the call is now `audio_render_ahead(g_music_time, dt * g_tempo_scale)`. Buffer now stays full (400ms) throughout playback, including 2.0x tempo acceleration.

- **NOTE_ Parsing Bug Fix & Sample Caching**: Fixed critical `is_note_name()` bug that only checked the first letter (A-G), causing ASSET_KICK_1 to be misidentified as A0 (27.5 Hz). Solution: Require a "NOTE_" prefix (NOTE_E2, NOTE_G4) to distinguish notes from ASSET_* samples. Discovered resource exhaustion: every note event created a new spectrogram (14 unique samples → 228 registrations). Implemented comprehensive caching in `tracker_init()`: pre-register all asset samples (loaded once from AssetManager) and generated notes (created once, stored in persistent pool). Result: MAX_SPECTROGRAMS reduced from 256 to 32 (about 88% less pool memory), zero spectrogram_id=-1 errors, clean audio playback.

- **Debug Logging Infrastructure**: Created systematic debug logging system (`src/util/debug.h`) with 7 category macros (DEBUG_LOG_AUDIO, DEBUG_LOG_RING_BUFFER, DEBUG_LOG_TRACKER, DEBUG_LOG_SYNTH, DEBUG_LOG_3D, DEBUG_LOG_ASSETS, DEBUG_LOG_GPU). Added `DEMO_ENABLE_DEBUG_LOGS` CMake option that defines `DEBUG_LOG_ALL` to enable all categories. Converted all diagnostic code in audio subsystem (miniaudio callbacks, ring buffer tracking, tracker validation, synth parameter checks) to use category macros. Default build: macros compile to `((void)0)` for zero runtime cost. Debug build: comprehensive logging (timing, buffer state, caching, validation). Updated `CONTRIBUTING.md` with pre-commit policy requiring debug build verification to ensure diagnostic code remains maintainable.

- **Resource Analysis Tool**: Enhanced `tracker_compiler` to report required vs recommended pool sizes, cache potential, and memory usage. Analysis showed 152 required / 228 recommended spectrograms without caching, only 14 unique samples with caching. Tool now generates actionable optimization recommendations during music data compilation.
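
The `NOTE_` prefix rule from the parsing fix above can be sketched as follows. This is a minimal illustration, not the project's actual `is_note_name()`: the accepted syntax (letter, optional `#`, one octave digit), the `note_to_hz()` helper, and the equal-temperament mapping with A4 = 440 Hz are all assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstring>

// Hypothetical sketch: a name is a note only if it carries the "NOTE_" prefix
// followed by a letter A-G, an optional '#', and a single octave digit.
static bool is_note_name(const char* name) {
  if (std::strncmp(name, "NOTE_", 5) != 0) return false;
  const char* p = name + 5;
  if (*p < 'A' || *p > 'G') return false;
  ++p;
  if (*p == '#') ++p;
  if (*p < '0' || *p > '9') return false;
  return p[1] == '\0';
}

// Assumed mapping: 12-tone equal temperament with A4 = 440 Hz.
// Returns 0.0f for anything that is not a valid note name.
static float note_to_hz(const char* name) {
  if (!is_note_name(name)) return 0.0f;
  // Semitone offsets from C for the letters A..G.
  static const int kSemitone[7] = {9, 11, 0, 2, 4, 5, 7};
  const char* p = name + 5;
  int semitone = kSemitone[*p - 'A'];
  ++p;
  if (*p == '#') {
    ++semitone;
    ++p;
  }
  const int octave = *p - '0';
  // MIDI-style numbering: C4 = 60, A4 = 69.
  const int midi = 12 * (octave + 1) + semitone;
  return 440.0f * std::pow(2.0f, (midi - 69) / 12.0f);
}
```

With this rule, `ASSET_KICK_1` is rejected even though it starts with `A`, while `NOTE_A0` maps to 27.5 Hz.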

#### Milestone: Audio System Robustness & Variable Tempo (February 4, 2026)
- **Event-Based Tracker for Tempo Scaling**: Refactored tracker system from pattern compositing to individual event triggering. Previously, pattern events were baked into single spectrograms at fixed positions, so tempo scaling only affected pattern trigger timing. Now each TrackerEvent triggers as a separate voice with timing calculated dynamically based on music_time. Result: Notes within patterns correctly accelerate/decelerate with tempo changes. At 2.0x tempo, both pattern triggering AND note spacing play 2x faster. Verified with WAV dump showing 61.24s music time in 60s physical time during tempo transitions. All 17 tests pass (100%).

- **WAV Dump Backend for Debugging**: Added `WavDumpBackend` implementing `AudioBackend` interface to render audio offline to .wav files instead of playing on device. Enabled via `--dump_wav output.wav` flag. Critical bug fix: Synth outputs STEREO (interleaved L/R) but initial implementation wrote MONO, causing severe distortion. Fixed by allocating `frames * 2` samples and writing stereo format (num_channels = 2). Added regression test (`test_wav_dump.cc`) with critical assertion `assert(header.num_channels == 2)` to prevent future mono/stereo mismatches. WAV format now matches live audio exactly: 16-bit PCM, stereo, 32kHz.

- **Variable Tempo System**: Implemented unified music time abstraction in `main.cc` that decouples tracker timing from physical time. Music time advances at configurable `tempo_scale` rate (default 1.0), enabling dynamic tempo changes without pitch shifting. Created comprehensive test suite (`test_variable_tempo.cc`) verifying 2x speed-up and 2x slow-down "reset tricks" work perfectly. All 6 test scenarios pass with mathematical precision. System ready for expressive tempo control in demo with zero size impact.

- **Task #51: Tracker Timing Verification System**: Created robust audio testing infrastructure with mock backend abstraction. Implemented `AudioBackend` interface separating synthesis from output, `MiniaudioBackend` for production, and `MockAudioBackend` for testing. Added event recording with precise timestamp tracking (32kHz sample rate). Created comprehensive test suite (`test_tracker_timing.cc`) that **verified simultaneous pattern triggers have 0.000ms delta** (perfect synchronization). All test infrastructure under `#if !defined(STRIP_ALL)` for zero production impact. 17/17 tests passing.
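
The music time abstraction and per-event triggering described in this milestone can be sketched as below. This is a simplified model under stated assumptions: `MusicClock`, `Event`, `trigger_due_events()`, and the fixed `bpm` parameter are hypothetical names, not the project's API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch of a music-time clock: physical time is scaled by
// tempo_scale before being accumulated, so the tracker only sees music time.
struct MusicClock {
  float music_time = 0.0f;   // seconds of music elapsed
  float tempo_scale = 1.0f;  // 1.0 = nominal tempo, 2.0 = twice as fast

  void advance(float dt_physical) { music_time += dt_physical * tempo_scale; }
};

// Assumed event layout: only the beat position matters for this sketch.
struct Event {
  float beat;
};

// Triggers every event whose time has been reached. `next` is the index of the
// first event not yet triggered; events must be sorted by beat.
// Returns the number of events triggered by this call.
static int trigger_due_events(const Event* events, std::size_t count,
                              std::size_t* next, float pattern_start,
                              float bpm, float music_time) {
  const float seconds_per_beat = 60.0f / bpm;
  int triggered = 0;
  while (*next < count &&
         pattern_start + events[*next].beat * seconds_per_beat <= music_time) {
    ++*next;  // a real tracker would start a synth voice here
    ++triggered;
  }
  return triggered;
}
```

At `tempo_scale = 2.0`, 0.5 s of physical time advances music time by 1.0 s, so events inside a pattern fire twice as fast while the synthesized pitch is untouched.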

#### Milestone: Audio Lifecycle Refactor (February 5, 2026)
- **Task #56: AudioEngine Implementation**: Completed 4-phase refactor to eliminate fragile initialization order dependency between synth and tracker. **Phase 1 (Design & Prototype)**: Created `AudioEngine` class and `SpectrogramResourceManager` for unified lifecycle management with lazy loading. **Phase 2 (Test Migration)**: Migrated all tracker tests to use AudioEngine (test_tracker.cc, test_tracker_timing.cc, test_variable_tempo.cc, test_wav_dump.cc). **Phase 3 (Production Integration)**: Updated main.cc to use AudioEngine, fixed pre-existing procedural texture crash (flash_cube_effect.cc, hybrid_3d_effect.cc). **Phase 4 (Cleanup & Documentation)**: Removed backwards compatibility (synth_init from audio_init), updated HOWTO.md and CONTRIBUTING.md with usage patterns. Result: All 20 tests pass, binary size impact <500 bytes, initialization fragility eliminated.

#### Milestone: Build System Optimization (February 6, 2026)
- **Task C: CMake Dependency Graph Optimization**: Resolved critical build correctness bugs and improved developer iteration speed. **Header Split**: Refactored monolithic `asset_manager.h` (61 lines) into three focused headers: `asset_manager_dcl.h` (forward declarations for AssetId), `asset_manager.h` (core GetAsset/DropAsset API), and `asset_manager_utils.h` (typed helpers for TextureAsset/MeshAsset). Updated 17 source files to use appropriate headers. **Asset Dependency Tracking**: Implemented file-level dependency tracking for all 42 demo assets and 17 test assets. CMake now correctly tracks individual `.wgsl` shaders, `.spec` audio files, and `.obj` mesh files. **Critical Bug Fixed**: Changing shader files was NOT triggering asset regeneration, resulting in stale code in binaries. Developers had to manually `touch demo_assets.txt` as workaround. Now works correctly. **Performance**: Editing TextureAsset/MeshAsset helpers improved from 4.82s to 2.01s (58% faster). Shader edits now trigger correct 3.5s rebuild (was 0.28s with no rebuild - incorrect). **Implementation**: Added `parse_asset_list()` CMake function that parses `demo_assets.txt` format (`ASSET_NAME, COMPRESSION, FILENAME, DESC`) and extracts individual file paths for dependency tracking. All 20 tests pass. Zero functionality regressions.

#### Milestone: Critical Shader Stability & Test Infrastructure (February 6, 2026) ✅
- **Shader Crash Resolution**: Fixed three critical WGSL validation errors causing demo64k and test_3d_render to crash on startup. **Bug 1 (renderer_3d.wgsl)**: Removed dead code using non-existent `inverse()` function - WGSL doesn't provide matrix inverse, and validator checks all code paths even unreachable ones. Also removed reference to undefined `in.normal` vertex input. **Bug 2 (sdf_utils.wgsl, lighting.wgsl)**: Fixed `get_normal_basic()` function signature mismatch - changed from `obj_type: f32` to `obj_params: vec4<f32>` to match `get_dist()` calls. **Bug 3 (scene_query_linear.wgsl - ROOT CAUSE)**: Fixed linear scene query mode incorrectly declaring binding 2 (BVH storage buffer). Linear version was identical to BVH version due to copy-paste error. Replaced BVH traversal with proper linear object iteration loop. **Impact**: When `use_bvh=false`, pipeline created without binding 2, but shader expected it → validation error → crash.

- **Comprehensive Shader Compilation Tests**: Created `test_shader_compilation.cc` to prevent regression. Test compiles all production shaders through WebGPU (`wgpuDeviceCreateShaderModule`), validates both BVH and Linear composition modes using `ShaderComposer`, catches WGSL syntax errors, binding mismatches, type errors, and function signature issues. **Test Gap Analysis**: Existing `test_shader_assets.cc` only checked for keywords (`@vertex`, `fn`, etc.) without actual compilation - would NOT have caught any of the three bugs. New test fills this gap with real GPU validation. **Graceful Degradation**: Test handles platforms where WebGPU device unavailable by skipping GPU tests but still validating shader composition.

- **Results**: demo64k runs cleanly without WebGPU errors, test_3d_render no longer crashes, 22/23 tests pass (FftTest unrelated to this work), comprehensive regression prevention infrastructure in place. All shader composition modes (BVH/Linear) now validated in CI. Files modified: renderer_3d.wgsl, sdf_utils.wgsl, lighting.wgsl, scene_query_linear.wgsl, test_shader_compilation.cc (new), CMakeLists.txt.

#### Milestone: FFT-based DCT/IDCT Complete (February 6, 2026) ✅
- **Core FFT Implementation**: Replaced failing double-and-mirror method with Numerical Recipes reordering method for DCT via FFT. Fixed reference IDCT to use DCT-III, which is the inverse of DCT-II. All transforms now use orthonormal normalization: sqrt(1/N) for DC term, sqrt(2/N) for AC terms. Result: Perfect round-trip accuracy for impulse signals, <5e-3 error for sinusoids (acceptable for the FFT-based path).

- **Audio Pipeline Integration**: Integrated FFT-based DCT/IDCT into audio engine (`src/audio/idct.cc`, `fdct.cc`) and both web editors (`tools/spectral_editor/dct.js`, `tools/editor/dct.js`). All synthesis paths now use fast O(N log N) FFT instead of O(N²) naive DCT. Regenerated all 14 spectrogram assets with orthonormal DCT to match new synthesis engine.

- **Critical Fixes**: **(1) Normalization Mismatch**: Old non-orthonormal DCT produced 16× larger values. Solution: Regenerated all `.spec` assets with orthonormal DCT. **(2) Procedural Notes**: NOTE_* generation inaudible after normalization change. Solution: Added sqrt(DCT_SIZE/2) = 16× scaling compensation in `gen.cc`. **(3) Windowing Error**: Hamming window incorrectly applied to spectrum before IDCT (should only be used for analysis, not synthesis). Solution: Removed windowing from `synth.cc` and both editors. Result: Correct volume, no distortion, clean frequency spectrum.

- **Testing & Verification**: All 23 tests pass (100% success rate). Round-trip accuracy verified (impulse at index 0: perfect). Audio playback: correct volume, no distortion. Procedural notes: audible at correct levels. Web editors: clean spectrum, no comb artifacts. WAV dumps match expected output.

- **Key Technical Insights**: (1) The inverse of DCT-II is DCT-III, so the IDCT must use the DCT-III formula. (2) Hamming window is ONLY for analysis (before DCT), NOT synthesis (before IDCT). (3) Orthonormal DCT produces values sqrt(N/2) times smaller than the non-orthonormal DCT. (4) Reordering method is more accurate than double-and-mirror for DCT via FFT. (5) Round-trip accuracy is more important than absolute DCT accuracy.
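
The orthonormal DCT-II/DCT-III pair used in this milestone can be checked with a naive O(N²) reference, sketched below. The function names `dct2_ortho()` and `dct3_ortho()` are hypothetical; this is not the FFT-based code in `src/audio/`, only a slow version of the same math that is convenient for round-trip tests.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

static const double kPi = 3.14159265358979323846;

// Forward DCT-II with orthonormal scaling:
//   X[k] = s(k) * sum_n x[n] * cos(pi / N * (n + 0.5) * k)
// where s(0) = sqrt(1/N) and s(k > 0) = sqrt(2/N).
static std::vector<double> dct2_ortho(const std::vector<double>& x) {
  const std::size_t size = x.size();
  std::vector<double> out(size, 0.0);
  for (std::size_t k = 0; k < size; ++k) {
    double sum = 0.0;
    for (std::size_t n = 0; n < size; ++n) {
      sum += x[n] * std::cos(kPi / size * (n + 0.5) * k);
    }
    out[k] = std::sqrt((k == 0 ? 1.0 : 2.0) / size) * sum;
  }
  return out;
}

// Inverse transform (DCT-III) with the same scaling:
//   x[n] = sum_k s(k) * X[k] * cos(pi / N * (n + 0.5) * k)
static std::vector<double> dct3_ortho(const std::vector<double>& coeffs) {
  const std::size_t size = coeffs.size();
  std::vector<double> out(size, 0.0);
  for (std::size_t n = 0; n < size; ++n) {
    double sum = 0.0;
    for (std::size_t k = 0; k < size; ++k) {
      const double scale = std::sqrt((k == 0 ? 1.0 : 2.0) / size);
      sum += scale * coeffs[k] * std::cos(kPi / size * (n + 0.5) * k);
    }
    out[n] = sum;
  }
  return out;
}
```

Because both directions use the same scale factors, `dct3_ortho(dct2_ortho(x))` reproduces `x` up to rounding. Dropping the sqrt(2/N) factor from the forward transform inflates AC coefficients by sqrt(N/2), which matches the 16× mismatch noted above if the transform size is 512.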

#### Milestone: Interactive Timeline Editor (February 5, 2026) 🎉
- **Task #57 Phase 1: Production-Ready Timeline Editor**: Created fully functional web-based editor for `demo.seq` timeline files. **Core Features**: Load/save demo.seq with BPM parsing, Gantt-style visual timeline, drag & drop sequences/effects with snap-to-beat, resize handles (left/right) allowing negative relative times, stack-order based priority system (Up/Down buttons + "Same as Above" toggle), floating auto-apply properties panel, diagonal mouse wheel scroll with 10% viewport slack, dynamic sequence bounds calculation, delete/add sequences, re-order by time. **Audio Visualization**: WAV waveform display above timeline using Web Audio API, scales with zoom (pixels per second), documented integration with `--dump_wav` flag for aligning sequences with actual demo audio output. **UI Polish**: Hoverable sequence names (large centered, fades on hover), hidden scrollbars, crosshair cursor on waveform, flash animation on active sequence change, clean minimal interface. **Bug Fixes**: Resolved critical e.target vs e.currentTarget drag offset bug, fixed sequence overlap with cumulative Y positioning, corrected effect relative time calculations. **Files**: `tools/timeline_editor/index.html` (~1200 lines pure HTML/CSS/JS, no dependencies), `README.md` (usage guide, wav_dump integration docs), `ROADMAP.md` (3 phases, 117-161 hour estimate). **Next**: Phase 1.2 (Add Effect button), Phase 2.5 (music.track visualization overlay). **Impact**: Visual timeline editing now production-ready, eliminates manual text editing for sequence placement and timing adjustments.

- **Task #50: WGSL Modularization**: Updated `ShaderComposer` to support recursive `#include` directives, refactored the entire shader library into granular snippets (shapes, utils, lighting), and updated the 3D renderer to use this modular system. This resolved macOS shader compilation issues and significantly improved shader maintainability.
- **Task #48: Improve Audio Coverage**: Achieved 93% coverage for `src/audio/` by adding dedicated tests for DCT transforms, procedural generation, and synthesis rendering.
- **Task #47: Improve Asset Manager Coverage**: Increased `asset_manager.cc` coverage to 88% by testing runtime error paths (unknown functions, generation failure).
- **Task #46: Enhance Coverage Script**: Updated coverage report script to support directory filtering (e.g., `./scripts/gen_coverage_report.sh src/procedural`).
- **Task #45: Improve Procedural Generation Coverage**: Achieved 96% coverage for `src/procedural/` by implementing comprehensive tests for Perlin noise, periodic blending, and parameter handling.
- **Task #44: Developer Tooling (Coverage)**: Added `DEMO_ENABLE_COVERAGE` CMake option and created `scripts/gen_coverage_report.sh` to generate HTML coverage reports using `lcov` on macOS.
- **Skybox & Two-pass Rendering Stability**: Resolved "black screen" and validation errors by implementing a robust two-pass rendering architecture (Pass 1: Skybox/Clear, Pass 2: Scene Objects). Implemented a rotating skybox using world-space ray unprojection (`inv_view_proj`) and a multi-octave procedural noise generator.
- **Task #20: Platform & Code Hygiene**: Consolidated platform-specific shims and WebGPU headers into `platform.h`. Refactored `platform_init` and `platform_poll` for better abstraction. Removed STL containers from initial hot paths (`AssetManager`, `procedural`). Full STL removal for CRT replacement is deferred to the final optimization phase.
- **Task #26: Shader Asset Testing & Validation**: Developed comprehensive tests for `ShaderComposer` and WGSL asset loading/composition. Added a shader validation test to ensure production assets are valid.
- **Asset Pipeline Improvement**: Created a robust `gen_spectrograms.sh` script to automate the conversion of `.wav` and `.aif` files to `.spec` format, replacing the old, fragile script. Added 13 new drum and bass samples to the project.
- **Build System Consolidation (Task #25)**: Modularized the build by creating subsystem libraries (audio, gpu, 3d, util, procedural) and implemented helper macros to reduce boilerplate in `CMakeLists.txt`. This improves build maintenance and prepares for future CRT replacement.
- **Asset System Robustness**: Resolved "static initialization order fiasco" by wrapping the asset table in a "Construct On First Use" getter (`GetAssetRecordTable()`), ensuring assets are available during dynamic global initialization (e.g., shader strings).
- **Shader Asset Integration (Task #24)**: Extracted all hardcoded WGSL strings into `.wgsl` assets, registered them in `demo_assets.txt`, and updated `Renderer3D`, `VisualDebug`, and `Effects` to use `GetAsset` and `ShaderComposer`.
- **WebGPU Stabilization**: Resolved `WGPUSurface` creation failures on macOS by adding platform-specific `GLFW_EXPOSE_NATIVE_COCOA` definitions and fixed validation errors in the render pass configuration.
- **Final Build Stripping (Task #8)**: Implemented the `STRIP_ALL` macro to remove non-essential code (CLI parsing, debug labels, iostream) and refined size optimization flags (`-dead_strip`) for macOS.
- **Minimal Audio Tracker (Task 21.3)**: Finalized a pattern-based audio tracker supporting both procedural notes and asset-based spectrograms with a unified "one-voice-per-pattern" pasting strategy.
- **WGSL Library (Task 21.1)**: Implemented `ShaderComposer` for modular WGSL snippet management.
- **Tight Ray Bounds (Task 21.2)**: Implemented local-space ray-box intersection to optimize SDF raymarching.
- **High-DPI Fix**: Resolved viewport "squishing" via dynamic resolution uniforms and explicit viewports.
- **Unified 3D Shadows**: Implemented robust SDF shadows across all objects using `inv_model` transforms.
- **test_mesh tool**: Implemented a standalone `test_mesh` tool for visualizing OBJ files with debug normal display.
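
The "Construct On First Use" getter mentioned under Asset System Robustness above can be sketched as follows. Only the name `GetAssetRecordTable()` comes from this document; the `AssetRecord` layout and the dummy shader entry are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical record layout; the real table's fields are not shown here.
struct AssetRecord {
  const char* name;
  const unsigned char* data;
  std::size_t size;
};

static const unsigned char kDummyShader[] = "fn main() {}";

// Construct On First Use: the table is a function-local static, so it is
// initialized no later than the first call to the getter, even when that call
// happens while globals in other translation units are still being set up.
const AssetRecord* GetAssetRecordTable() {
  static const AssetRecord table[] = {
      {"SHADER_DUMMY", kDummyShader, sizeof(kDummyShader) - 1},
  };
  return table;
}

// A global that is initialized dynamically from the table. With a plain global
// table in another translation unit, the order of initialization would be
// unspecified; going through the getter removes that dependency.
static const std::size_t g_shader_size = GetAssetRecordTable()[0].size;
```

The design choice is to pay one function call per lookup in exchange for a well-defined initialization order.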

---
## Next Up

- **Task #5: Spectral Brush Editor** [IN PROGRESS - February 6, 2026]
    - Create web-based tool for procedurally tracing audio spectrograms
    - Replace large .spec assets with tiny C++ code (50-100× compression)
    - Phase 1: C++ runtime (`spectral_brush.h/cc` - Bezier curves + Gaussian profiles)
    - Phase 2: Editor UI (HTML/JS canvas, dual-layer visualization, keyboard shortcuts)
    - Phase 3: File I/O (load .wav/.spec, export procedural_params.txt + C++ code)
    - See `doc/SPECTRAL_BRUSH_EDITOR.md` for complete design

- **Task #18: 3D System Enhancements**
    - [ ] **Task #18.0: Basic OBJ Asset Pipeline**: Implement `ASSET_MESH` type, `asset_packer` OBJ support, and `Renderer3D` mesh rendering.
    - [ ] **Task #37: Asset Ingestion**: Update `asset_packer` to handle the new 3D binary format.
    - [ ] **Task #38: Runtime Loader**: Implement a minimal C++ parser to load the scene data into the ECS/Renderer.

- **Visuals & Content**
    - [ ] **Task #52: Procedural SDF Font**: Minimal bezier/spline set for [A-Z, 0-9] and SDF rendering.
    - [ ] **Task #53: Particles Shader Polish**: Improve visual quality of particles.
    - [ ] **Task #55: SDF Random Planes Intersection**: Implement `sdPolyhedron` (crystal/gem shapes) via plane intersection.

- **Tooling & Optimization**
    - [ ] **Task #54: Tracy Integration**: Integrate the Tracy profiler for performance profiling.

---
## Future Goals
- **Task #36: Blender Exporter**: Create script to export scenes to internal binary format. (Deprioritized)
- **Task #5: Implement Spectrogram Editor**
    - [ ] Develop a web-based tool (`tools/editor`) for creating and editing `.spec` files visually.
- **Task #21: Shader Optimization**
    - [ ] Use macros or code generation to factorize common WGSL code (normals, bump, lighting).
    - [ ] Implement Tri-planar mapping for better procedural textures.
- [ ] **Task #18-B: GPU BVH & Shadows**: Optimize scene queries with a GPU-based BVH.
- **Phase 2: Advanced Size Optimization**
    - [ ] **Task #22: Windows Native Platform**: Replace GLFW with minimal native Windows API.
    - [ ] **Task #28: Spectrogram Quantization**: Quantize spectrograms to logarithmic frequency and uint16_t.
    - [ ] **Task #35: CRT Replacement**: Investigation and implementation of CRT-free entry point.

---
*For a detailed list of all completed tasks, see the git history.*

## Architectural Overview

### Hybrid 3D Renderer
- **Core Idea**: Uses standard rasterization to draw proxy hulls (boxes), then raymarches inside the fragment shader to find the exact SDF surface.
- **Transforms**: Uses `inv_model` matrices to perform all raymarching in local object space, handling rotation and non-uniform scaling correctly.
- **Shadows**: Instance-based shadow casting with self-shadowing prevention (`skip_idx`).
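
The proxy-hull plus raymarch idea can be illustrated on the CPU with the sketch below. It assumes the ray has already been transformed into local object space (the role `inv_model` plays in the shader), that the proxy hull is the box [-1, 1]³, and that the SDF is a sphere of radius 0.5. All names here are hypothetical and the real implementation lives in WGSL.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <utility>

struct Vec3 {
  float x, y, z;
};

// Slab test against the local-space box [-1, 1]^3. On a hit, writes the entry
// and exit distances along the ray and returns true.
static bool ray_box(Vec3 ro, Vec3 rd, float* t_near, float* t_far) {
  const float o[3] = {ro.x, ro.y, ro.z};
  const float d[3] = {rd.x, rd.y, rd.z};
  float t0 = 0.0f;
  float t1 = std::numeric_limits<float>::max();
  for (int i = 0; i < 3; ++i) {
    if (std::fabs(d[i]) < 1e-8f) {
      // Ray is parallel to this slab: it must start inside it.
      if (o[i] < -1.0f || o[i] > 1.0f) return false;
      continue;
    }
    float a = (-1.0f - o[i]) / d[i];
    float b = (1.0f - o[i]) / d[i];
    if (a > b) std::swap(a, b);
    t0 = std::max(t0, a);
    t1 = std::min(t1, b);
  }
  if (t0 > t1) return false;
  *t_near = t0;
  *t_far = t1;
  return true;
}

static float sd_sphere(Vec3 p, float radius) {
  return std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z) - radius;
}

// Sphere-traces the SDF only between the box entry and exit points.
// Returns the hit distance, or -1.0f when nothing is hit.
static float raymarch_local(Vec3 ro, Vec3 rd) {
  float t = 0.0f;
  float t_far = 0.0f;
  if (!ray_box(ro, rd, &t, &t_far)) return -1.0f;
  for (int i = 0; i < 64 && t <= t_far; ++i) {
    const Vec3 p = {ro.x + rd.x * t, ro.y + rd.y * t, ro.z + rd.z * t};
    const float dist = sd_sphere(p, 0.5f);
    if (dist < 1e-4f) return t;
    t += dist;
  }
  return -1.0f;
}
```

Restricting the march to the interval between entry and exit is what keeps the step count low: rays that miss the hull never enter the loop.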

### Sequence & Effect System
- **Effect**: Abstract base for visual elements. Supports `compute` and `render` phases.
- **Sequence**: Timeline of effects with start/end times.
- **MainSequence**: Top-level coordinator and framebuffer manager.
- **seq_compiler**: Transpiles `assets/demo.seq` into C++ `timeline.cc`.

### Asset & Build System
- **asset_packer**: Embeds binary assets (like `.spec` files) into C++ arrays.
- **Runtime Manager**: O(1) retrieval with lazy procedural generation support.
- **Automation**: `gen_assets.sh`, `build_win.sh`, and `check_all.sh` for multi-platform validation.

### Audio Engine
- **Synthesis**: Real-time additive synthesis from spectrograms via FFT-based IDCT (O(N log N)). Stereo output (32kHz, 16-bit, interleaved L/R). Uses orthonormal DCT-II/DCT-III transforms with Numerical Recipes reordering method.
- **Variable Tempo**: Music time abstraction with configurable tempo_scale. Tempo changes don't affect pitch.
- **Event-Based Tracker**: Individual TrackerEvents trigger as separate voices with dynamic beat calculation. Notes within patterns respect tempo scaling.
- **Backend Abstraction**: `AudioBackend` interface with `MiniaudioBackend` (production), `MockAudioBackend` (testing), and `WavDumpBackend` (offline rendering).
- **Dynamic Updates**: Double-buffered spectrograms for live thread-safe updates.
- **Procedural Library**: Melodies and spectral filters (noise, comb) generated at runtime.
- **Pattern System**: TrackerPatterns contain lists of TrackerEvents (beat, sample_id, volume, pan). Events trigger individually based on elapsed music time.
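
The backend abstraction can be pictured with the sketch below. Only the `AudioBackend` name comes from this document; the `write()` signature, `RecordingBackend` (standing in for a mock), and `render_constant()` are assumptions. The sketch also shows the stereo sizing rule behind the WAV dump fix: `frames` stereo frames occupy `frames * 2` interleaved samples.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical interface: the real AudioBackend methods are not shown in this
// document, so this signature is an assumption.
class AudioBackend {
 public:
  virtual ~AudioBackend() = default;
  // Receives `frames` stereo frames, i.e. frames * 2 interleaved L/R samples.
  virtual void write(const std::int16_t* interleaved, std::size_t frames) = 0;
};

// Test double that records everything it is given instead of playing it.
class RecordingBackend : public AudioBackend {
 public:
  void write(const std::int16_t* interleaved, std::size_t frames) override {
    samples.insert(samples.end(), interleaved, interleaved + frames * 2);
  }
  std::vector<std::int16_t> samples;
};

// Stand-in for the synth: fills a stereo buffer with constant L/R values and
// hands it to whichever backend is active.
static void render_constant(AudioBackend& backend, std::size_t frames,
                            std::int16_t left, std::int16_t right) {
  std::vector<std::int16_t> buffer(frames * 2);  // 2 samples per frame
  for (std::size_t i = 0; i < frames; ++i) {
    buffer[2 * i] = left;
    buffer[2 * i + 1] = right;
  }
  backend.write(buffer.data(), frames);
}
```

Because the synthesis side only talks to the interface, the same render path can feed a device, a test recorder, or an offline file writer.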