| field | value | date |
|---|---|---|
| author | skal <pascal.massimino@gmail.com> | 2026-03-02 09:40:18 +0100 |
| committer | skal <pascal.massimino@gmail.com> | 2026-03-02 09:40:18 +0100 |
| commit | 11486d634c865aca944f9691e3bdb2be83adc79a | |
| tree | aaed68c6a25d54942b093c607238535e61840cce | |
| parent | bb8197075161f9c9ded51beab913150b43954e2c | |
docs: update PROJECT_CONTEXT, TODO, COMPLETED for OLA-IDCT
- PROJECT_CONTEXT: audio section reflects OLA-IDCT (Hann, 50% overlap);
test count 35->34; Next Up notes .spec regen needed
- TODO: remove stale MP3 sub-task (done), trim test TODOs, add .spec
regen as Priority 3, update test count to 34/34
- COMPLETED: archive OLA-IDCT task with implementation summary
| mode | file | lines changed |
|---|---|---|
| -rw-r--r-- | PROJECT_CONTEXT.md | 10 |
| -rw-r--r-- | TODO.md | 39 |
| -rw-r--r-- | doc/COMPLETED.md | 9 |
3 files changed, 27 insertions, 31 deletions
diff --git a/PROJECT_CONTEXT.md b/PROJECT_CONTEXT.md
index bce4d67..29395e7 100644
--- a/PROJECT_CONTEXT.md
+++ b/PROJECT_CONTEXT.md
@@ -12,7 +12,7 @@
 ## Audio
 - 32 kHz, 16-bit stereo
 - Procedurally generated samples
-- Real-time additive synthesis from spectrograms (IDCT)
+- Real-time additive synthesis from spectrograms (OLA-IDCT, Hann window, 50% overlap)
 - Variable tempo system with music time abstraction
 - Event-based pattern triggering for dynamic tempo scaling
 - Modifiable Loops and Patterns, w/ script to generate them (like a Tracker)
@@ -33,21 +33,21 @@
 - **Timing System:** **Beat-based timelines** for musical synchronization. Sequences defined in beats, converted to seconds at runtime. Effects receive both physical time (constant) and beat time (musical). Variable tempo affects audio only. See `doc/BEAT_TIMING.md`.
 - **Workspace system:** Multi-workspace support. Easy switching with `-DDEMO_WORKSPACE=<name>`. Organized structure: `music/`, `weights/`, `obj/`, `shaders/`. Shared common shaders in `src/shaders/`. See `doc/WORKSPACE_SYSTEM.md`.
-- **Audio:** Sample-accurate sync. Zero heap allocations per frame. Variable tempo. Comprehensive tests.
+- **Audio:** Sample-accurate sync. Zero heap allocations per frame. Variable tempo. OLA-IDCT synthesis (v2 .spec): Hann window, 50% overlap, click-free. V1 (raw DCT-512) preserved for compatibility. MP3→spec encoder updated to match. Existing .spec files need regen to activate v2.
 - **Shaders:** Parameterized effects (UniformHelper, .seq syntax). Beat-synchronized animation support (`beat_time`, `beat_phase`). Modular WGSL composition with ShaderComposer. 24 shared common shaders (math, render, compute).
 - **3D:** Hybrid SDF/rasterization with BVH. Binary scene loader. Blender pipeline.
 - **Effects:** CNN post-processing: CNNEffect (v1) and CNNv2Effect operational. CNN v2: sigmoid activation, storage buffer weights (~3.2 KB), 7D static features, dynamic layers. Training stable, convergence validated.
 - **Tools:** CNN test tool operational. Texture readback utility functional. Timeline editor (web-based, beat-aligned, audio playback).
 - **Build:** Asset dependency tracking. Size measurement. Hot-reload (debug-only). WSL (Windows 10) supported: native Linux build and cross-compile to `.exe` via `mingw-w64`.
 - **Sequence:** DAG-based effect routing with explicit node system. Python compiler with topological sort and ping-pong optimization. 10 effects operational (Passthrough, Placeholder, GaussianBlur, Heptagon, Particles, RotatingCube, Hybrid3D, Flash, PeakMeter, Scene1). Effect times are absolute (seq_compiler adds sequence start offset). See `doc/SEQUENCE.md`.
-- **Testing:** **35/35 passing**. Fixed intermittent SIGTRAP in effect lifecycle tests.
+- **Testing:** **34/34 passing**.

 ---

 ## Next Up

-**Active:** Spectral Brush Editor (procedural compression), CNN v2 quantization
-**Ongoing:** Test infrastructure maintenance (35/35 passing)
+**Active:** Spectral Brush Editor (procedural compression), CNN v2 quantization, .spec v2 regen (OLA)
+**Ongoing:** Test infrastructure maintenance (34/34 passing)
 **Future:** Size optimization (64k target), 3D enhancements

 See `TODO.md` for details.
@@ -21,44 +21,31 @@ Reduce weights from f16 (~3.2 KB) to i8 (~1.6 KB).

 ---

-## Priority 3: Test Infrastructure Maintenance [ONGOING]
+## Priority 3: Regenerate .spec files as v2 [REQUIRED]

-**Status:** 35/35 tests passing
+Existing `.spec` files in `workspaces/main/music/` were encoded with v1 (no overlap).
+Rebuild with the MP3 assets to produce v2 (OLA, Hann, hop=256) — click-free output.

-**Outstanding TODOs:**
-
-1. **test_effect_base.cc:250** - [FIXED] Fix SIGTRAP in `test_sequence_render()`
-   - Added `wgpuDeviceTick()` to wait for GPU to finish, resolving the intermittent crash.
-   - All other tests validate the same functionality
-   - Issue: Hangs/crashes during render with external sink view
-
-2. **test_sequence.cc** - Port legacy sequence tests (currently disabled)
-   - Uses legacy Effect/MainSequence system
-   - Lines 168, 173, 182: Re-enable lifecycle and simulation tests after port
+---

-3. **test_audio_engine.cc:152** - Re-enable commented test after debugging
+## Priority 4: Test Infrastructure Maintenance [ONGOING]

-4. **test_fft.cc:87** - Investigate FFT-DCT algorithm discrepancy
-   - May need different algorithm or fix existing one
+**Status:** 34/34 tests passing

+**Outstanding TODOs:**

-## Priority 4: Audio System Enhancements [LOW PRIORITY]
+1. **test_sequence.cc** - Port legacy sequence tests (currently disabled)
+   - Lines 168, 173, 182: Re-enable lifecycle and simulation tests after port

-Extended audio capabilities for sample assets and procedural synthesis.
+2. **test_audio_engine.cc:152** - Re-enable commented test after debugging

-**Sub-tasks:**
+3. **test_fft.cc:87** - Investigate FFT-DCT algorithm discrepancy

-1. **MP3 Sample Assets:**
-   - Integrate miniaudio for MP3 decoding
-   - Add ASSET_*.mp3 support to asset_packer
-   - Convert to PCM in AssetManager or synth on load
-   - Use case: Compressed sample libraries
+## Priority 4: Audio System Enhancements [LOW PRIORITY]

-2. **GPU-Accelerated PCM Synthesis:**
+1. **GPU-Accelerated PCM Synthesis:**
    - Compute shader for direct PCM generation (bypass spectrogram)
    - Write to compute buffer, readback to synth
-   - Use case: Real-time modulation, complex waveforms
-   - Lower latency than spectrogram path

 ---

diff --git a/doc/COMPLETED.md b/doc/COMPLETED.md
index 0dba307..e37ddbf 100644
--- a/doc/COMPLETED.md
+++ b/doc/COMPLETED.md
@@ -29,6 +29,15 @@ Detailed historical documents have been moved to `doc/archive/` for reference:

 Use `read @doc/archive/FILENAME.md` to access archived documents.

+## Recently Completed (March 2, 2026)
+
+- [x] **OLA-IDCT Synthesis — click-free .spec decoding**
+  - **Goal**: Eliminate frame-boundary clicks in spectrogram→PCM synthesis.
+  - **Implementation**: Added v2 spectrogram format (`SPEC_VERSION_V2_OLA`). Synthesis uses Hann-windowed IDCT with 50% overlap-add (hop=256, overlap=256). Per-voice `overlap_buf[256]` accumulates the tail from the previous IDCT frame. V1 path (raw DCT-512) preserved for generated notes and old `.spec` files. Hann window precomputed at `synth_init()`. MP3 encoder switched from Hamming to Hann, now slides a 512-sample analysis window by 256 samples per frame (OLA analysis), emitting ~2× as many frames. `SpecHeader.version` field propagated through `Spectrogram.version` to the voice's `ola_mode` flag at trigger time.
+  - **Files**: `src/audio/dct.h`, `src/audio/synth.h`, `src/audio/synth.cc`, `src/audio/window.h`, `src/audio/window.cc`, `src/audio/tracker.cc`
+  - **Tests**: 34/34 passing
+  - **Pending**: Regenerate `.spec` files from MP3 assets to activate v2 encoding.
+
 ## Recently Completed (February 21, 2026)

 - [x] **WGSL Refactor: getScreenCoord Helper**
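
The overlap-add step described in the `doc/COMPLETED.md` entry (Hann window, 512-sample frames, hop=256, a 256-sample overlap buffer per voice) can be illustrated with a minimal C++ sketch. This is not the code from `src/audio/synth.cc`: `OlaState`, `ola_init`, `ola_process`, `kFrameSize`, and `kHop` are hypothetical names, the sketch assumes each frame has already been inverse-transformed to 512 time-domain samples, and it assumes a periodic Hann window applied only at synthesis.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical constants mirroring the sizes named in the commit message.
constexpr std::size_t kFrameSize = 512;  // samples per IDCT frame
constexpr std::size_t kHop = 256;        // 50% overlap

// Hypothetical per-voice state: precomputed window plus the saved tail.
struct OlaState {
  std::array<float, kFrameSize> window{};
  std::array<float, kHop> overlap_buf{};  // second half of the previous frame
};

// Precompute a periodic Hann window and clear the overlap buffer.
void ola_init(OlaState& s) {
  const double two_pi = 6.283185307179586;
  for (std::size_t n = 0; n < kFrameSize; ++n) {
    s.window[n] =
        static_cast<float>(0.5 - 0.5 * std::cos(two_pi * n / kFrameSize));
  }
  s.overlap_buf.fill(0.0f);
}

// Consume one 512-sample time-domain frame and emit kHop output samples.
// The windowed first half is added to the saved tail; the windowed second
// half becomes the tail for the next call.
void ola_process(OlaState& s, const float* frame, float* out) {
  for (std::size_t n = 0; n < kHop; ++n) {
    out[n] = s.overlap_buf[n] + s.window[n] * frame[n];
    s.overlap_buf[n] = s.window[n + kHop] * frame[n + kHop];
  }
}
```

With this window, the two halves that overlap at any sample sum to one, so feeding identical constant frames yields a fade-in on the first hop and a flat output from the second hop onward, with no step at the frame boundary.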
