# To-Do List
This file tracks prioritized tasks with detailed attack plans.
**Note:** For a history of recently completed tasks, see `COMPLETED.md`.
## Priority 1: Spectral Brush Editor (Task #5) [IN PROGRESS]
**Goal:** Create a web-based tool for procedurally tracing audio spectrograms. Replaces large `.spec` binary assets with tiny procedural C++ code (50-100× compression).
**Design Document:** See `doc/SPECTRAL_BRUSH_EDITOR.md` for complete architecture.
**Core Concept: "Spectral Brush"**
- **Central Curve** (Bezier): Traces time-frequency path through spectrogram
- **Vertical Profile**: Shapes "brush stroke" around curve (Gaussian, Decaying Sinusoid, Noise)
**Workflow:**
```
.wav → Load in editor → Trace with Bezier curves → Export procedural_params.txt + C++ code
```
### Phase 1: C++ Runtime (Foundation)
- [ ] **Files:** `src/audio/spectral_brush.h`, `src/audio/spectral_brush.cc`
- [ ] Define API (`ProfileType`, `draw_bezier_curve()`, `evaluate_profile()`)
- [ ] Implement linear Bezier interpolation
- [ ] Implement Gaussian profile evaluation
- [ ] Implement home-brew deterministic RNG (for future noise support)
- [ ] Add unit tests (`src/tests/test_spectral_brush.cc`)
- [ ] **Deliverable:** Compiles, tests pass
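
A minimal sketch of how the Phase 1 pieces could fit together, assuming a row-major spectrogram of `num_frames × dct_size` floats. Only `ProfileType`, `draw_bezier_curve()` and `evaluate_profile()` are named in the plan; the `ControlPoint` struct, the `sample_curve()` helper and every signature below are assumptions made for illustration, not the planned API.

```cpp
#include <cmath>

// Hypothetical types and signatures (see the note above).
enum class ProfileType { Gaussian };

struct ControlPoint {
  float frame;      // time position, in spectrogram frames
  float freq_bin;   // center of the stroke, in frequency bins
  float amplitude;  // peak amplitude at the center of the stroke
};

// Vertical profile: weight of a bin located `distance` bins from the curve.
float evaluate_profile(ProfileType type, float distance, float sigma) {
  switch (type) {
    case ProfileType::Gaussian:
      return std::exp(-(distance * distance) / (2.0f * sigma * sigma));
  }
  return 0.0f;
}

// Piecewise-linear interpolation between control points sorted by frame.
// Returns false when `frame` lies outside the curve.
bool sample_curve(const ControlPoint* pts, int num_pts, float frame,
                  float* freq_bin, float* amplitude) {
  if (num_pts < 2) return false;
  if (frame < pts[0].frame || frame > pts[num_pts - 1].frame) return false;
  for (int i = 0; i + 1 < num_pts; ++i) {
    const ControlPoint& a = pts[i];
    const ControlPoint& b = pts[i + 1];
    if (frame > b.frame) continue;
    const float span = b.frame - a.frame;
    const float t = span > 0.0f ? (frame - a.frame) / span : 0.0f;
    *freq_bin = a.freq_bin + t * (b.freq_bin - a.freq_bin);
    *amplitude = a.amplitude + t * (b.amplitude - a.amplitude);
    return true;
  }
  return false;
}

// Adds one brush stroke to the spectrogram (accumulates, does not overwrite).
void draw_bezier_curve(float* spec, int num_frames, int dct_size,
                       const ControlPoint* pts, int num_pts,
                       ProfileType type, float sigma) {
  for (int f = 0; f < num_frames; ++f) {
    float center = 0.0f;
    float amp = 0.0f;
    if (!sample_curve(pts, num_pts, (float)f, &center, &amp)) continue;
    for (int bin = 0; bin < dct_size; ++bin) {
      spec[f * dct_size + bin] +=
          amp * evaluate_profile(type, (float)bin - center, sigma);
    }
  }
}
```

Passing control points as a pointer plus count keeps the sketch free of `std::vector`, in line with the later STL-removal goal.
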
### Phase 2: Editor Core
- [ ] **Files:** `tools/spectral_editor/index.html`, `script.js`, `style.css`, `dct.js` (reuse from old editor)
- [ ] HTML structure (canvas, controls, file input)
- [ ] Canvas rendering (dual-layer: reference + procedural)
- [ ] Bezier curve editor (click to place, drag to adjust, delete control points)
- [ ] Profile controls (Gaussian sigma slider)
- [ ] Real-time spectrogram rendering
- [ ] Audio playback (IDCT → Web Audio API)
- [ ] Undo/Redo system (action history with snapshots)
- [ ] **Keyboard shortcuts:**
  - Key '1': Play procedural sound
  - Key '2': Play original .wav
  - Space: Play/pause
  - Ctrl+Z: Undo
  - Ctrl+Shift+Z: Redo
  - Delete: Remove control point
- [ ] **Deliverable:** Interactive editor, can trace .wav files
### Phase 3: File I/O
- [ ] Load .wav (decode, FFT/STFT → spectrogram)
- [ ] Load .spec (binary format parser)
- [ ] Save procedural_params.txt (human-readable, re-editable)
- [ ] Generate C++ code (ready to compile)
- [ ] Load procedural_params.txt (re-editing workflow)
- [ ] **Deliverable:** Full save/load cycle works
### Phase 4: Future Extensions (Post-MVP)
- [ ] Cubic Bezier interpolation (smoother curves)
- [ ] Decaying sinusoid profile (metallic sounds)
- [ ] Noise profile (textured sounds)
- [ ] Composite profiles (add/subtract/multiply)
- [ ] Multi-dimensional Bezier ({freq, amplitude, decay, ...})
- [ ] Frequency snapping (snap to musical notes)
- [ ] Generic `gen_from_params()` code generation
**Design Decisions:**
- Linear Bezier interpolation (Phase 1), cubic later
- Soft parameter limits in UI (not enforced)
- Home-brew RNG (small, deterministic)
- Single function per sound (generic loader later)
- Start with Bezier + Gaussian only
**Size Impact:** 50-100× compression (5 KB .spec → ~100 bytes C++ code)
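
The plan does not say which generator the home-brew RNG will use. One small, deterministic candidate is Marsaglia's xorshift32, sketched below; the `Rng` struct and the `rng_*` function names are hypothetical. The same seed always yields the same sequence, which is what a reproducible noise profile would need.

```cpp
#include <cstdint>

// Hypothetical RNG state; xorshift32 requires a nonzero state.
struct Rng {
  uint32_t state;
};

inline void rng_seed(Rng* r, uint32_t seed) {
  // A zero state would stay zero forever, so substitute a fixed constant.
  r->state = seed != 0u ? seed : 0x9E3779B9u;
}

// Marsaglia's xorshift32 step (shift triple 13, 17, 5).
inline uint32_t rng_next_u32(Rng* r) {
  uint32_t x = r->state;
  x ^= x << 13;
  x ^= x >> 17;
  x ^= x << 5;
  r->state = x;
  return x;
}

// Uniform float in [0, 1), built from the top 24 bits so the
// integer-to-float conversion is exact.
inline float rng_next_float(Rng* r) {
  return (float)(rng_next_u32(r) >> 8) * (1.0f / 16777216.0f);
}
```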
---
## Priority 2: 3D System Enhancements (Task #18)
**Goal:** Establish a pipeline for importing complex 3D scenes to replace hardcoded geometry.
## Priority 3: WGSL Modularization (Task #50) [RECURRENT]
**Goal**: Refactor `ShaderComposer` and WGSL assets to support granular, reusable snippets and `#include` directives. This is an ongoing task to maintain shader code hygiene as new features are added.
## Phase 2: Size Optimization (Final Goal)
- [ ] **Task #34: Full STL Removal**: Replace all remaining `std::vector`, `std::map`, and `std::string` usage with custom minimal containers or C-style arrays to allow for CRT replacement (lowest priority; deferred to the end).
- [ ] **Task #22: Windows Native Platform**: Replace GLFW with direct Win32 API calls for the final 64k push.
- [ ] **Task #28: Spectrogram Quantization**: Research optimal frequency bin distribution and implement quantization.
- [ ] **Task #35: CRT Replacement**: Investigate and implement a CRT-free entry point.
## Future Goals & Ideas (Untriaged)
### Audio Tools
- [ ] **Task #64: specplay Enhancements**: Extend audio analysis tool with new features
  - **Priority 1**: Spectral visualization (ASCII art), waveform display, frequency analysis, dynamic range
  - **Priority 2**: Diff mode (compare .wav vs .spec), batch mode (CSV report, find clipping)
  - **Priority 3**: WAV export (.spec → .wav), normalization
  - **Priority 4**: Spectral envelope, harmonic analysis, onset detection
  - **Priority 5**: Interactive mode (seek, loop, volume control)
  - See `tools/specplay_README.md` for detailed feature list
- [ ] **Task #65: Data-Driven Tempo Control**: Move tempo variation from code to data files
  - **Current**: `g_tempo_scale` is hardcoded in `main.cc` with manual animation curves
  - **Goal**: Define tempo curves in `.seq` or `.track` files for data-driven tempo control
  - **Approach A**: Add TEMPO directive to `.seq` format
    - Example: `TEMPO 0.0 1.0`, `TEMPO 10.0 2.0`, `TEMPO 20.0 1.0` (time, scale pairs)
    - seq_compiler generates tempo curve array in timeline.cc
  - **Approach B**: Add tempo column to music.track
    - Each pattern trigger can specify tempo_scale override
    - tracker_compiler generates tempo events in music_data.cc
  - **Benefits**: Non-programmers can edit tempo, easier iteration, version control friendly
  - **Priority**: Low (current hardcoded approach works, but less flexible)
- [ ] **Task #67: DCT/FFT Performance Benchmarking**: Add timing measurements to audio tests
  - **Goal**: Compare performance of different DCT/IDCT implementations
  - **Location**: Add timing code to `test_dct.cc` or `test_fft.cc`
  - **Measurements**:
    - Reference IDCT/FDCT (naive O(N²) implementation)
    - FFT-based DCT/IDCT (current O(N log N) implementation)
    - Future x86_64 SIMD-optimized versions (when implemented)
  - **Output Format**:
    - Average time per transform (microseconds)
    - Throughput (transforms per second)
    - Speedup factor vs reference implementation
  - **Test Sizes**: DCT_SIZE=512 (production), plus 128, 256, 1024 for scaling analysis
  - **Implementation**:
    - Use `std::chrono::high_resolution_clock` for timing
    - Run each test 1000+ iterations to reduce noise
    - Report min/avg/max times
    - Guard with `#if !defined(STRIP_ALL)` to avoid production overhead
  - **Benefits**: Quantify FFT speedup, validate SIMD optimizations, identify regressions
  - **Priority**: Very Low (nice-to-have for future optimization work)
- [ ] **Task #69: Convert Audio Pipeline to Clipped Int16**: Use clipped int16 for all audio processing
  - **Current**: Audio pipeline uses float32 throughout (generation, mixing, synthesis, output)
  - **Goal**: Convert to clipped int16 for faster/easier processing and reduced memory footprint
  - **Rationale**:
    - Simpler arithmetic (no float operations)
    - Smaller memory footprint (2 bytes vs 4 bytes per sample)
    - Hardware-native format (most audio devices use int16)
    - Eliminates float→int16 conversion at output stage
    - Explicit clipping behavior (saturating arithmetic clamps to the int16 range; plain int16 overflow would wrap, not clip)
  - **Scope**:
    - Output path: Definitely convert (backends, WAV dump)
    - Synthesis: Consider keeping float32 for quality (IDCT produces float)
    - Mixing: Could use int16 with proper overflow handling
    - Asset storage: Already int16 in .spec files
  - **Implementation Phases**:
    1. **Phase 1: Output Only** (Minimal change, ~50 lines)
       - Convert `synth_render()` output from float to int16
       - Update `MiniaudioBackend` and `WavDumpBackend` to accept int16
       - Keep all internal processing as float
       - **Benefit**: Eliminates final conversion step
    2. **Phase 2: Mixing Stage** (Moderate change, ~200 lines)
       - Convert voice mixing to int16 arithmetic
       - Add saturation/clipping logic
       - Keep IDCT output as float, convert after synthesis
       - **Benefit**: Faster mixing, reduced memory bandwidth
    3. **Phase 3: Full Pipeline** (Large change, ~500+ lines)
       - Convert spectrograms from float to int16 storage
       - Modify IDCT to output int16 directly
       - All synthesis in int16
       - **Benefit**: Maximum size reduction and performance
  - **Trade-offs**:
    - Quality loss: 16-bit resolution vs 32-bit float precision
    - Dynamic range: Limited to [-32768, 32767]
    - Clipping: Must handle overflow carefully in mixing stage
    - Code complexity: Saturation arithmetic more complex than float
  - **Testing Requirements**:
    - Verify no audible quality degradation
    - Ensure clipping behavior matches float version
    - Check mixing overflow doesn't cause artifacts
    - Validate that WAV dumps are bit-identical to hardware output
  - **Size Impact**:
    - Phase 1: Negligible (~50 bytes)
    - Phase 2: Small reduction (~100-200 bytes, faster code)
    - Phase 3: Large reduction (50% memory, ~1-2KB code savings)
  - **Priority**: Low (final optimization, after size budget is tight)
  - **Notes**:
    - This is a FINAL optimization task, only if 64k budget requires it
    - Quality must be validated - may not be worth the trade-off
    - Consider keeping float for procedural generation quality
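
The saturation logic that Task #69 calls for in Phases 1 and 2 could look roughly like the sketch below. All function names are hypothetical, and the float input is assumed to be nominally in [-1, 1]. Voices are summed in a 32-bit accumulator and clamped once at the end, because a plain int16 sum would wrap around instead of clipping.

```cpp
#include <cmath>
#include <cstdint>

// Clamp a 32-bit intermediate value to the int16 range.
inline int16_t saturate_i16(int32_t v) {
  if (v > 32767) return 32767;
  if (v < -32768) return -32768;
  return (int16_t)v;
}

// Phase 1 style output conversion: float sample to clipped int16.
// NaN inputs are not handled in this sketch.
inline int16_t float_to_i16(float s) {
  const float scaled = s * 32767.0f;
  if (scaled >= 32767.0f) return 32767;
  if (scaled <= -32768.0f) return -32768;
  return (int16_t)std::lround(scaled);
}

// Phase 2 style mixing: accumulate in 32 bits, saturate once per sample.
void mix_voices_i16(const int16_t* const* voices, int num_voices,
                    int num_samples, int16_t* out) {
  for (int i = 0; i < num_samples; ++i) {
    int32_t acc = 0;
    for (int v = 0; v < num_voices; ++v) {
      acc += voices[v][i];
    }
    out[i] = saturate_i16(acc);
  }
}
```

Clamping only after the full sum means that voices which cancel each other out do not clip prematurely, which per-voice saturation would not guarantee.
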
### Developer Tools
- [ ] **Task #66: External Asset Loading for Debugging**: mmap() asset files instead of embedded data
  - **Current**: All assets embedded in `assets_data.cc` (regenerated on every asset change)
  - **Goal**: Load assets from external files in debug builds for faster iteration
  - **Scope**: macOS only, non-STRIP_ALL builds only
  - **Implementation**:
    - Add `DEMO_ENABLE_EXTERNAL_ASSETS` CMake option
    - Modify `GetAsset()` to check for external file first (e.g., `assets/final/<name>`)
    - Use `mmap()` to map file into memory (replaces `uint8_t asset[]` array)
    - Fallback to embedded data if file not found
  - **Benefits**: Edit shaders/assets without regenerating assets_data.cc (~10s rebuild)
  - **Trade-offs**: Adds runtime file I/O, only useful during development
  - **Priority**: Low (current workflow acceptable, but nice-to-have for rapid iteration)
### Visual Effects
- [ ] **Task #52: Procedural SDF Font**: Minimal bezier/spline set for [A-Z, 0-9] and SDF rendering.
- [ ] **Task #55: SDF Random Planes Intersection**: Implement `sdPolyhedron` (crystal/gem shapes) via plane intersection.
- [ ] **Task #54: Tracy Integration**: Integrate the Tracy profiler for performance profiling.
- [ ] **Task #58: Advanced Shader Factorization**: Further factorize WGSL code into smaller, reusable snippets.
- [ ] **Task #59: Comprehensive RNG Library**: Add WGSL snippets for float/vec2/vec3 noise (Perlin, Gyroid, etc.) and random number generators.
- [ ] **Task #60: OOP Refactoring**: Investigate if more C++ code can be made object-oriented without size penalty (vs functional style).
- [ ] **Task #61: GPU Procedural Generation**: Implement system to generate procedural data (textures, geometry) on GPU and read back to CPU.
- [ ] **Task #62: Physics Engine Enhancements (PBD & Rotation)**:
  - [ ] **Task #62.1: Quaternion Rotation**: Implement quaternion-based rotation for `Object3D` and incorporate angular momentum into physics.
  - [ ] **Task #62.2: Position Based Dynamics (PBD)**: Refactor solver to re-evaluate velocity after resolving all collisions and constraints.
- [ ] **Task #63: Refactor large files**: Split `src/3d/renderer.cc` (currently > 500 lines) into sub-functionalities.
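
As a CPU-side reference for the plane-intersection idea in Task #55, the distance estimate for a convex polyhedron can be taken as the maximum of the signed distances to its bounding planes. The sketch below is plain C++ rather than WGSL, and the `Vec3` and `Plane` types and the function signature are assumptions. The result is exact inside the shape and directly in front of a face; outside near edges and corners it underestimates the true distance, which is normally still safe for sphere tracing.

```cpp
// Hypothetical minimal vector and plane types for the sketch.
struct Vec3 {
  float x, y, z;
};

// Half-space dot(n, p) <= d is "inside"; n is assumed to be unit length.
struct Plane {
  Vec3 n;
  float d;
};

inline float dot3(Vec3 a, Vec3 b) {
  return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Signed distance bound for the intersection of `count` half-spaces.
// Returns a large positive value when no planes are given (empty shape).
float sdPolyhedron(Vec3 p, const Plane* planes, int count) {
  if (count <= 0) return 1e30f;
  float dist = dot3(planes[0].n, p) - planes[0].d;
  for (int i = 1; i < count; ++i) {
    const float di = dot3(planes[i].n, p) - planes[i].d;
    if (di > dist) dist = di;
  }
  return dist;
}
```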
---
## Future Goals