1 file changed, 52 insertions, 202 deletions
diff --git a/TODO.md b/TODO.md
index c942fb5..10f0661 100644
--- a/TODO.md
+++ b/TODO.md
@@ -8,44 +8,33 @@ This file tracks prioritized tasks with detailed attack plans.
 
 ## Priority 1: Spectral Brush Editor (Task #5) [IN PROGRESS]
 
-**Goal:** Create a web-based tool for procedurally tracing audio spectrograms. Replaces large `.spec` binary assets with tiny procedural C++ code (50-100× compression).
+**Goal:** Web-based tool for procedurally tracing audio spectrograms. Replaces large `.spec` binary assets with tiny procedural C++ code (50-100× compression).
 
 **Design Document:** See `doc/SPECTRAL_BRUSH_EDITOR.md` for complete architecture.
 
-**Core Concept: "Spectral Brush"**
-- **Central Curve** (Bezier): Traces time-frequency path through spectrogram
-- **Vertical Profile**: Shapes "brush stroke" around curve (Gaussian, Decaying Sinusoid, Noise)
+**Core Concept:** Bezier curves trace time-frequency paths. Gaussian profiles shape "brush strokes" around curves.
 
-**Workflow:**
-```
-.wav → Load in editor → Trace with Bezier curves → Export procedural_params.txt + C++ code
-```
+**Workflow:** `.wav` → Load in editor → Trace with Bezier curves → Export `procedural_params.txt` + C++ code
 
 ### Phase 1: C++ Runtime (Foundation)
-- [ ] **Files:** `src/audio/spectral_brush.h`, `src/audio/spectral_brush.cc`
+- [ ] Files: `src/audio/spectral_brush.h`, `src/audio/spectral_brush.cc`
 - [ ] Define API (`ProfileType`, `draw_bezier_curve()`, `evaluate_profile()`)
 - [ ] Implement linear Bezier interpolation
 - [ ] Implement Gaussian profile evaluation
-- [ ] Implement home-brew deterministic RNG (for future noise support)
+- [ ] Implement home-brew deterministic RNG
 - [ ] Add unit tests (`src/tests/test_spectral_brush.cc`)
 - [ ] **Deliverable:** Compiles, tests pass
 
 ### Phase 2: Editor Core
-- [ ] **Files:** `tools/spectral_editor/index.html`, `script.js`, `style.css`, `dct.js` (reuse from old editor)
+- [ ] Files: `tools/spectral_editor/index.html`, `script.js`, `style.css`, `dct.js`
 - [ ] HTML structure (canvas, controls, file input)
 - [ ] Canvas rendering (dual-layer: reference + procedural)
-- [ ] Bezier curve editor (click to place, drag to adjust, delete control points)
+- [ ] Bezier curve editor (click, drag, delete control points)
 - [ ] Profile controls (Gaussian sigma slider)
 - [ ] Real-time spectrogram rendering
 - [ ] Audio playback (IDCT → Web Audio API)
-- [ ] Undo/Redo system (action history with snapshots)
-- [ ] **Keyboard shortcuts:**
-  - Key '1': Play procedural sound
-  - Key '2': Play original .wav
-  - Space: Play/pause
-  - Ctrl+Z: Undo
-  - Ctrl+Shift+Z: Redo
-  - Delete: Remove control point
+- [ ] Undo/Redo system
+- [ ] Keyboard shortcuts (1=play procedural, 2=play original, Space, Ctrl+Z, Delete)
 - [ ] **Deliverable:** Interactive editor, can trace .wav files
 
 ### Phase 3: File I/O
@@ -61,213 +50,74 @@ This file tracks prioritized tasks with detailed attack plans.
 - [ ] Decaying sinusoid profile (metallic sounds)
 - [ ] Noise profile (textured sounds)
 - [ ] Composite profiles (add/subtract/multiply)
-- [ ] Multi-dimensional Bezier ({freq, amplitude, decay, ...})
-- [ ] Frequency snapping (snap to musical notes)
-- [ ] Generic `gen_from_params()` code generation
 
-**Design Decisions:**
-- Linear Bezier interpolation (Phase 1), cubic later
-- Soft parameter limits in UI (not enforced)
-- Home-brew RNG (small, deterministic)
-- Single function per sound (generic loader later)
-- Start with Bezier + Gaussian only
+**Design Decisions:** Linear Bezier (Phase 1), cubic later. Soft parameter limits. Home-brew RNG. Single function per sound initially.
 
 **Size Impact:** 50-100× compression (5 KB .spec → ~100 bytes C++ code)
 
 ---
 
 ## Priority 2: 3D System Enhancements (Task #18)
-**Goal:** Establish a pipeline for importing complex 3D scenes to replace hardcoded geometry. **Progress:** C++ pipeline for loading and processing object-specific data (like plane_distance) is now in place. Shader integration for SDFs is pending.
+
+**Goal:** Establish pipeline for importing complex 3D scenes to replace hardcoded geometry.
+
+**Progress:** C++ pipeline for loading object-specific data (plane_distance) is in place. Shader integration for SDFs pending.
+
+---
 
 ## Priority 3: WGSL Modularization (Task #50) [RECURRENT]
 
-**Goal**: Refactor `ShaderComposer` and WGSL assets to support granular, reusable snippets and `#include` directives. This is an ongoing task to maintain shader code hygiene as new features are added.
+**Goal:** Refactor `ShaderComposer` and WGSL assets to support granular, reusable snippets. Ongoing task for shader code hygiene.
 
 ### Sub-task: Split common_uniforms.wgsl (Low Priority)
-- **Current**: `common_uniforms.wgsl` contains 4 structs (CommonUniforms, GlobalUniforms, ObjectData, ObjectsBuffer)
-- **Goal**: Split into separate files for granular #include
-  - `common_uniforms/common.wgsl` - CommonUniforms only
-  - `common_uniforms/global.wgsl` - GlobalUniforms only
-  - `common_uniforms/object.wgsl` - ObjectData + ObjectsBuffer
-- **Benefit**: Shaders only include what they need, reducing compiled size
-- **Impact**: Minimal (most shaders only use CommonUniforms anyway)
-- **Priority**: Low (nice-to-have for code organization)
+**Current:** `common_uniforms.wgsl` contains 4 structs (CommonUniforms, GlobalUniforms, ObjectData, ObjectsBuffer)
 
-### Sub-task: Type-safe shader composition (Low Priority)
-- **Problem**: Recurrent error of forgetting `ShaderComposer::Get().Compose({}, code)` and using raw `code` directly
-  - Happened in: `create_post_process_pipeline`, `gpu_create_render_pass`, `gpu_create_compute_pass`, `CircleMaskEffect`
-  - Runtime error only (crashes demo, but tests may pass)
-- **Solution**: Use strong typing to make it a compile-time error
-  ```cpp
-  class ComposedShader {
-   private:
-    std::string code_;
-    friend class ShaderComposer;
-    explicit ComposedShader(std::string code) : code_(std::move(code)) {}
-   public:
-    const char* c_str() const { return code_.c_str(); }
-  };
-  ```
-- **Changes needed**:
-  - `ShaderComposer::Compose()` returns `ComposedShader` instead of `std::string`
-  - All shader creation functions take `const ComposedShader&` instead of `const char*`
-  - Cannot pass raw string to shader functions (compile error)
-- **Benefits**:
-  - Impossible to forget composition (type mismatch)
-  - Self-documenting API (signature shows shader must be composed)
-  - Catches errors at compile-time instead of runtime
-- **Trade-offs**:
-  - More verbose code (`auto shader = ShaderComposer::Get().Compose(...)`)
-  - Small overhead (extra std::string copy, but negligible)
-- **Priority**: Low (recurrent but rare, easy to catch in testing)
+**Goal:** Split into separate files:
+- `common_uniforms/common.wgsl` - CommonUniforms only
+- `common_uniforms/global.wgsl` - GlobalUniforms only
+- `common_uniforms/object.wgsl` - ObjectData + ObjectsBuffer
 
-## Phase 2: Size Optimization (Final Goal)
-
-- [ ] **Task #34: Full STL Removal**: Replace all remaining `std::vector`, `std::map`, and `std::string` usage with custom minimal containers or C-style arrays to allow for CRT replacement. (Minimal Priority - deferred to end).
+**Benefit:** Shaders only include what they need, reducing compiled size
 
-- [ ] **Task #22: Windows Native Platform**: Replace GLFW with direct Win32 API calls for the final 64k push.
+**Impact:** Minimal (most shaders only use CommonUniforms)
 
-- [ ] **Task #28: Spectrogram Quantization**: Research optimal frequency bin distribution and implement quantization.
+**Priority:** Low (nice-to-have)
 
-- [ ] **Task #35: CRT Replacement**: investigation and implementation of CRT-free entry point.
+### Sub-task: Type-safe shader composition (Low Priority)
+**Problem:** Recurrent error of forgetting `ShaderComposer::Get().Compose({}, code)` and using raw `code` directly. Runtime error only (crashes demo, tests may pass).
 
-## Future Goals & Ideas (Untriaged)
+**Solution:** Use strong typing to make it a compile-time error:
+```cpp
+class ComposedShader {
+ private:
+  std::string code_;
+  friend class ShaderComposer;
+  explicit ComposedShader(std::string code) : code_(std::move(code)) {}
+ public:
+  const char* c_str() const { return code_.c_str(); }
+};
+```
 
-### Audio Tools
-- [ ] **Task #64: specplay Enhancements**: Extend audio analysis tool with new features
-  - **Priority 1**: Spectral visualization (ASCII art), waveform display, frequency analysis, dynamic range
-  - **Priority 2**: Diff mode (compare .wav vs .spec), batch mode (CSV report, find clipping)
-  - **Priority 3**: WAV export (.spec → .wav), normalization
-  - **Priority 4**: Spectral envelope, harmonic analysis, onset detection
-  - **Priority 5**: Interactive mode (seek, loop, volume control)
-  - See `tools/specplay_README.md` for detailed feature list
+**Changes:**
+- `ShaderComposer::Compose()` returns `ComposedShader` instead of `std::string`
+- All shader creation functions take `const ComposedShader&` instead of `const char*`
+- Cannot pass raw string to shader functions (compile error)
 
-- [ ] **Task #65: Data-Driven Tempo Control**: Move tempo variation from code to data files
-  - **Current**: `g_tempo_scale` is hardcoded in `main.cc` with manual animation curves
-  - **Goal**: Define tempo curves in `.seq` or `.track` files for data-driven tempo control
-  - **Approach A**: Add TEMPO directive to `.seq` format
-    - Example: `TEMPO 0.0 1.0`, `TEMPO 10.0 2.0`, `TEMPO 20.0 1.0` (time, scale pairs)
-    - seq_compiler generates tempo curve array in timeline.cc
-  - **Approach B**: Add tempo column to music.track
-    - Each pattern trigger can specify tempo_scale override
-    - tracker_compiler generates tempo events in music_data.cc
-  - **Benefits**: Non-programmers can edit tempo, easier iteration, version control friendly
-  - **Priority**: Low (current hardcoded approach works, but less flexible)
+**Benefits:** Impossible to forget composition (type mismatch). Self-documenting API. Compile-time error.
 
-- [ ] **Task #67: DCT/FFT Performance Benchmarking**: Add timing measurements to audio tests
-  - **Goal**: Compare performance of different DCT/IDCT implementations
-  - **Location**: Add timing code to `test_dct.cc` or `test_fft.cc`
-  - **Measurements**:
-    - Reference IDCT/FDCT (naive O(N²) implementation)
-    - FFT-based DCT/IDCT (current O(N log N) implementation)
-    - Future x86_64 SIMD-optimized versions (when implemented)
-  - **Output Format**:
-    - Average time per transform (microseconds)
-    - Throughput (transforms per second)
-    - Speedup factor vs reference implementation
-  - **Test Sizes**: DCT_SIZE=512 (production), plus 128, 256, 1024 for scaling analysis
-  - **Implementation**:
-    - Use `std::chrono::high_resolution_clock` for timing
-    - Run each test 1000+ iterations to reduce noise
-    - Report min/avg/max times
-    - Guard with `#if !defined(STRIP_ALL)` to avoid production overhead
-  - **Benefits**: Quantify FFT speedup, validate SIMD optimizations, identify regressions
-  - **Priority**: Very Low (nice-to-have for future optimization work)
+**Trade-offs:** More verbose code. Small overhead (extra std::string copy, negligible).
 
-- [ ] **Task #69: Convert Audio Pipeline to Clipped Int16**: Use clipped int16 for all audio processing
-  - **Current**: Audio pipeline uses float32 throughout (generation, mixing, synthesis, output)
-  - **Goal**: Convert to clipped int16 for faster/easier processing and reduced memory footprint
-  - **Rationale**:
-    - Simpler arithmetic (no float operations)
-    - Smaller memory footprint (2 bytes vs 4 bytes per sample)
-    - Hardware-native format (most audio devices use int16)
-    - Eliminates float→int16 conversion at output stage
-    - Natural clipping behavior (overflow wraps/clips automatically)
-  - **Scope**:
-    - Output path: Definitely convert (backends, WAV dump)
-    - Synthesis: Consider keeping float32 for quality (IDCT produces float)
-    - Mixing: Could use int16 with proper overflow handling
-    - Asset storage: Already int16 in .spec files
-  - **Implementation Phases**:
-    1. **Phase 1: Output Only** (Minimal change, ~50 lines)
-       - Convert `synth_render()` output from float to int16
-       - Update `MiniaudioBackend` and `WavDumpBackend` to accept int16
-       - Keep all internal processing as float
-       - **Benefit**: Eliminates final conversion step
-    2. **Phase 2: Mixing Stage** (Moderate change, ~200 lines)
-       - Convert voice mixing to int16 arithmetic
-       - Add saturation/clipping logic
-       - Keep IDCT output as float, convert after synthesis
-       - **Benefit**: Faster mixing, reduced memory bandwidth
-    3. **Phase 3: Full Pipeline** (Large change, ~500+ lines)
-       - Convert spectrograms from float to int16 storage
-       - Modify IDCT to output int16 directly
-       - All synthesis in int16
-       - **Benefit**: Maximum size reduction and performance
-  - **Trade-offs**:
-    - Quality loss: 16-bit resolution vs 32-bit float precision
-    - Dynamic range: Limited to [-32768, 32767]
-    - Clipping: Must handle overflow carefully in mixing stage
-    - Code complexity: Saturation arithmetic more complex than float
-  - **Testing Requirements**:
-    - Verify no audible quality degradation
-    - Ensure clipping behavior matches float version
-    - Check mixing overflow doesn't cause artifacts
-    - Validate WAV dumps bit-identical to hardware output
-  - **Size Impact**:
-    - Phase 1: Negligible (~50 bytes)
-    - Phase 2: Small reduction (~100-200 bytes, faster code)
-    - Phase 3: Large reduction (50% memory, ~1-2KB code savings)
-  - **Priority**: Low (final optimization, after size budget is tight)
-  - **Notes**:
-    - This is a FINAL optimization task, only if 64k budget requires it
-    - Quality must be validated - may not be worth the trade-off
-    - Consider keeping float for procedural generation quality
+**Priority:** Low (recurrent but rare, easy to catch in testing)
 
-### Developer Tools
-- [ ] **Task #66: External Asset Loading for Debugging**: mmap() asset files instead of embedded data
-  - **Current**: All assets embedded in `assets_data.cc` (regenerate on every asset change)
-  - **Goal**: Load assets from external files in debug builds for faster iteration
-  - **Scope**: macOS only, non-STRIP_ALL builds only
-  - **Implementation**:
-    - Add `DEMO_ENABLE_EXTERNAL_ASSETS` CMake option
-    - Modify `GetAsset()` to check for external file first (e.g., `assets/final/<name>`)
-    - Use `mmap()` to map file into memory (replaces `uint8_t asset[]` array)
-    - Fallback to embedded data if file not found
-  - **Benefits**: Edit shaders/assets without regenerating assets_data.cc (~10s rebuild)
-  - **Trade-offs**: Adds runtime file I/O, only useful during development
-  - **Priority**: Low (current workflow acceptable, but nice-to-have for rapid iteration)
+---
 
-### Visual Effects
-- [ ] **Task #73: Extend Shader Parametrization** [IN PROGRESS - 2/4 complete]
-  - **Goal**: Extend uniform parameter system to ChromaAberrationEffect, GaussianBlurEffect, DistortEffect, SolarizeEffect
-  - **Pattern**: Follow FlashEffect implementation (UniformHelper, params struct, .seq syntax)
-  - **Completed**: ChromaAberrationEffect (offset_scale, angle), GaussianBlurEffect (strength)
-  - **Remaining**: DistortEffect, SolarizeEffect
-  - **Priority**: Medium (quality-of-life improvement for artists)
-  - **Estimated Impact**: ~200-300 bytes per effect
-- [ ] **Task #52: Procedural SDF Font**: Minimal bezier/spline set for [A-Z, 0-9] and SDF rendering.
-- [ ] **Task #55: SDF Random Planes Intersection**: Implement `sdPolyhedron` (crystal/gem shapes) via plane intersection.
-- [ ] **Task #54: Tracy Integration**: Integrate Tracy debugger for performance profiling.
-- [ ] **Task #58: Advanced Shader Factorization**: Further factorize WGSL code into smaller, reusable snippets.
-- [ ] **Task #59: Comprehensive RNG Library**: Add WGSL snippets for float/vec2/vec3 noise (Perlin, Gyroid, etc.) and random number generators.
-- [ ] **Task #60: OOP Refactoring**: Investigate if more C++ code can be made object-oriented without size penalty (vs functional style).
-- [ ] **Task #61: GPU Procedural Generation**: Implement system to generate procedural data (textures, geometry) on GPU and read back to CPU.
-- [ ] **Task #62: Physics Engine Enhancements (PBD & Rotation)**:
-    - [ ] **Task #62.1: Quaternion Rotation**: Implement quaternion-based rotation for `Object3D` and incorporate angular momentum into physics.
-    - [ ] **Task #62.2: Position Based Dynamics (PBD)**: Refactor solver to re-evaluate velocity after resolving all collisions and constraints.
-- [ ] **Task #63: Refactor large files**: Split `src/gpu/gpu.cc`, `src/3d/visual_debug.cc` and `src/gpu/effect.cc` into sub-functionalities. (`src/3d/renderer.cc` was also over 500 lines and was taken care of in the past)
+## Phase 2: Size Optimization (Final Goal)
 
-### Performance Optimization
-- [ ] **Task #70: SIMD x86_64 Implementation**: Implement critical functions using intrinsics for x86_64 platforms.
-  - **Goal**: Optimize hot paths for audio and procedural generation.
-  - **Scope**:
-    - IDCT/FDCT transforms
-    - Audio mixing and voice synthesis
-    - CPU-side procedural texture/geometry generation
-  - **Constraint**: Non-critical; fallback to generic C++ must be maintained.
-  - **Priority**: Very Low
+- [ ] **Task #34: Full STL Removal** - Replace remaining `std::vector`, `std::map`, `std::string` with custom containers
+- [ ] **Task #22: Windows Native Platform** - Replace GLFW with Win32 API
+- [ ] **Task #28: Spectrogram Quantization** - Research optimal frequency distribution
+- [ ] **Task #35: CRT Replacement** - Investigation and implementation of CRT-free entry
 
 ---
 
-## Future Goals
+For untriaged future goals and ideas, see `doc/BACKLOG.md`.