Diffstat (limited to 'TODO.md')
| -rw-r--r-- | TODO.md | 107 |
1 files changed, 37 insertions, 70 deletions
@@ -12,32 +12,41 @@ Procedural spectrogram tool: 50-100× compression (5 KB .spec → ~100 bytes C++
 
 ---
 
-## Priority 2: Test Infrastructure Maintenance [ONGOING]
+## Priority 2: CNN v3 Training [IN PROGRESS]
 
-**Status:** 38/38 tests passing
+**Design:** `cnn_v3/docs/CNN_V3.md` | Phases 1–9 complete. Runtime pipeline operational.
+
+**Pipelines:**
+- `cnn_v3_test`: `GBufferEffect` → `GBufDeferredEffect`
+- `cnn_v3_debug`: `GBufferEffect` → `CNNv3Effect` → `GBufViewEffect`
+
+**Active:**
+- [ ] Restore full scene in `GBufferEffect::set_scene()` (20 cubes + 4 spheres, 2 lights)
+- [ ] Collect ≥50 training samples (currently 11) — see `cnn_v3/docs/HOWTO.md` §2
+- [ ] Retrain from scratch — see `cnn_v3/docs/HOWTO.md` §3
 
-### ✅ Fix FFT twiddle factor accumulation bug (`src/audio/fft.cc`) — DONE
+**Pending (lower priority):**
+- [ ] GBufferEffect: Pass 3 transparency (transp=0 placeholder)
+- [ ] GBufferEffect: `resize()` support
+- [ ] Web tool (`cnn_v3/tools/shaders.js`): `prev_cnn` hardcoded to 0 in both JS pack shaders
+  (lines ~313 / ~39). Fix: add `prev` texture binding and wire in `tester.js`.
 
-`fft_radix2` now computes `wr = cosf(angle*k); wi = sinf(angle*k);` directly per k.
-Tests A–E added to `test_fft.cc`. `arrays_match` default tolerance reverted to 5e-3.
+---
 
-## ✅ Audio Timing Drift — DONE
+## Priority 3: Test Infrastructure [ONGOING]
 
-Events triggered ~180ms early over 63 beats @ BPM=90. Root causes fixed:
-1. `chunk_frames` truncation accumulation replaced by accurate double-precision integration.
-2. `tracker` updated to double-precision time representations for exact sample-accurate scheduling.
+**Status:** 38/38 tests passing
 
-## ✅ Audio System Enhancements — DONE
+---
 
-1. **`synth.cc`: use `ola_decode()` from `src/audio/ola.h`** — `ola_decode_frame` extracted and used for per-frame OLA-IDCT synthesis, deduplicating the IDCT + overlap handling logic.
+## Priority 4: GPU-Accelerated PCM Synthesis
 
-2. **GPU-Accelerated PCM Synthesis:**
-   - Compute shader for direct PCM generation (bypass spectrogram)
-   - Write to compute buffer, readback to synth
+Compute shader for direct PCM generation (bypasses spectrogram decode).
+Write to compute buffer, readback to synth. No design doc yet.
 
 ---
 
-## Priority 4: 3D System Enhancements (Task #18)
+## Priority 5: 3D System Enhancements (Task #18)
 
 Pipeline for importing complex 3D scenes to replace hardcoded geometry.
 
@@ -45,76 +54,34 @@ Pipeline for importing complex 3D scenes to replace hardcoded geometry.
 
 ---
 
-## Priority 4: WGSL Modularization (Task #50) [RECURRENT]
+## Priority 5: WGSL Modularization (Task #50) [RECURRENT]
 
 Ongoing shader code hygiene for granular, reusable snippets.
 
 ---
 
-## Priority 4: Wine/Windows Black Screen
+## Priority 5: Wine/Windows Black Screen
 
-`demo64k.exe` runs under Wine (wgpu-native v27, Vulkan/MoltenVK) but shows a black window — no visuals rendered. Audio and timeline progress correctly. GPU device/adapter init succeeds.
+`demo64k.exe` opens under Wine but shows a black window. Audio runs correctly.
 
-**Likely causes to investigate:**
+**Likely causes:**
 - Swapchain format mismatch (Wine Vulkan may prefer BGRA8 over RGBA8)
-- Surface present failing silently (check `WGPUSurfaceGetCurrentTexture` status)
-- Render pass output not reaching the surface (missing present call or wrong texture view)
+- Surface present failing silently (`WGPUSurfaceGetCurrentTexture` status)
+- Render pass output not reaching the surface
 
-**To reproduce:** `./scripts/run_win.sh` — window opens, stays black.
+**To reproduce:** `./scripts/run_win.sh`
 
 ---
 
-## CNN v3 — U-Net + FiLM [IN PROGRESS]
+## Future
 
-**Design:** `cnn_v3/docs/CNN_V3.md` | All phases 1–9 complete. Runtime pipeline operational.
-
-**Current pipeline:** `GBufferEffect` → `GBufDeferredEffect` → `GBufViewEffect` → sink
-
-**Training bugs fixed (2026-03-27):**
-- ✅ dec0 ReLU removed: output now spans full [0,1] range (was stuck ≥0.5)
-- ✅ FiLM MLP loaded from `cnn_v3_film_mlp.bin` at runtime (was hardcoded heuristics)
-
-**Active work:**
-- [ ] Restore full scene in `GBufferEffect::set_scene()` (20 cubes + 4 spheres, 2 lights)
-- [ ] Collect ≥50 training samples (currently 11) — see `cnn_v3/docs/HOWTO.md` §2
-- [ ] Retrain from scratch — see `cnn_v3/docs/HOWTO.md` §3
-
-**Pending (lower priority):**
-- [ ] GBufferEffect: Pass 3 transparency (transp=0 placeholder)
-- [ ] GBufferEffect: `resize()` support
-- [ ] Web tool (`cnn_v3/tools/shaders.js`): `prev_cnn` always zero in both pack shaders
-  (`FULL_PACK_SHADER` line ~313 and simple pack line ~39 hardcode `prev=0`).
-  C++ `gbuf_pack.wgsl` reads a real `prev_cnn` texture (binding 6).
-  Fix: add a `prev` texture binding to both JS pack shaders and wire it up in `tester.js`.
-
-## Future: CNN v3 "2D Mode" (G-buffer-free)
-
-Allow `CNNv3Effect` to run on a plain screen buffer / photo without a real G-buffer.
-Fake the missing feature vectors (normals, depth, material IDs, shadow, transp) from
-the RGB input alone:
-- normals: approximate from local luminance gradient (Sobel)
-- depth: constant (e.g. 0.5) or estimated from a simple heuristic
-- material IDs / shadow / transp: neutral defaults (e.g. 0)
-
-This would let the effect be applied to any rendered frame (post-NTSC, post-Scratch, etc.)
-without requiring a 3D G-buffer pass upstream, and enable training/inference on photos.
-
-Implementation sketch:
-- New `CNNv3Effect2D` subclass (or a mode flag) that synthesizes `feat_tex0`/`feat_tex1`
-  internally from a single `rgba8unorm` input, then runs the same 5-pass U-Net.
-- Separate `gbuf_pack_2d.wgsl` compute shader that fills feat0/feat1 from a photo buffer.
-
-## Future: CNN v2 8-bit Quantization
-
-Reduce weights from f16 (~3.2 KB) to i8 (~1.6 KB).
-
-**Requirements:** Quantization-aware training (QAT)
-**Design:** `cnn_v2/docs/CNN_V2.md`
-
----
+### CNN v3 "2D Mode" (G-buffer-free)
+Run `CNNv3Effect` on a plain screen buffer / photo — fake normals via Sobel, constant depth, neutral material defaults. New `gbuf_pack_2d.wgsl` + `CNNv3Effect2D` subclass or mode flag.
 
-## Future: Size Optimization (64k Target)
+### CNN v2 8-bit Quantization
+Reduce weights f16 (~3.2 KB) → i8 (~1.6 KB). Requires QAT. See `cnn_v2/docs/CNN_V2.md`.
 
+### Size Optimization (64k Target)
 - Task #22: Windows Native Platform (Win32)
 - Task #28: Spectrogram Quantization
 - Task #34: Full STL Removal
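
Note on the "2D Mode" item: the "fake normals via Sobel" idea can be illustrated with a small CPU-side sketch. Nothing here is taken from the repository. The function names `lum_at` and `fake_normal`, the `strength` parameter, and the row-major float luminance layout are all assumptions for illustration; the real version would presumably live in the proposed `gbuf_pack_2d.wgsl` shader.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: clamp-to-edge luminance fetch from a row-major w*h image.
static float lum_at(const std::vector<float>& lum, int w, int h, int x, int y) {
  x = std::max(0, std::min(x, w - 1));
  y = std::max(0, std::min(y, h - 1));
  return lum[static_cast<std::size_t>(y) * w + x];
}

// Hypothetical sketch: approximate a unit normal at pixel (x, y) from the local
// luminance gradient, using the 3x3 Sobel operator. `strength` scales how far
// the normal tilts away from +Z for a given gradient (an assumed tuning knob).
std::array<float, 3> fake_normal(const std::vector<float>& lum, int w, int h,
                                 int x, int y, float strength) {
  auto L = [&](int dx, int dy) { return lum_at(lum, w, h, x + dx, y + dy); };

  // Sobel X: right column minus left column, centre row weighted by 2.
  float gx = (L(1, -1) + 2.0f * L(1, 0) + L(1, 1)) -
             (L(-1, -1) + 2.0f * L(-1, 0) + L(-1, 1));
  // Sobel Y: bottom row minus top row, centre column weighted by 2.
  float gy = (L(-1, 1) + 2.0f * L(0, 1) + L(1, 1)) -
             (L(-1, -1) + 2.0f * L(0, -1) + L(1, -1));

  // Treat luminance as a height field: the normal leans against the gradient.
  float nx = -gx * strength;
  float ny = -gy * strength;
  float nz = 1.0f;
  float inv_len = 1.0f / std::sqrt(nx * nx + ny * ny + nz * nz);
  return {nx * inv_len, ny * inv_len, nz * inv_len};
}
```

On a flat image the gradient is zero and the normal points straight along +Z. A constant depth (for example 0.5) and zeroed material channels, as the TODO suggests, would fill the remaining features.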

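
Note on the "8-bit Quantization" item: the size halving follows from storage width alone (f16 is 2 bytes per weight, i8 is 1 byte). The sketch below shows symmetric per-tensor rounding, one common scheme and not necessarily the one the project will adopt. `QuantizedWeights`, `quantize_i8`, and `dequantize_i8` are hypothetical names. The TODO calls for quantization-aware training, which would simulate this rounding during training rather than apply it afterwards.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container: one byte per weight plus a single shared scale.
struct QuantizedWeights {
  std::vector<std::int8_t> q;  // quantized weights in [-127, 127]
  float scale;                 // dequantized value = q[i] * scale
};

// Symmetric per-tensor quantization (assumed scheme): map the largest
// magnitude to 127 and round every weight to the nearest step.
QuantizedWeights quantize_i8(const std::vector<float>& w) {
  float max_abs = 0.0f;
  for (float v : w) max_abs = std::fmax(max_abs, std::fabs(v));

  QuantizedWeights out;
  out.scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;  // avoid divide-by-zero
  out.q.reserve(w.size());
  for (float v : w) {
    long r = std::lround(v / out.scale);
    if (r > 127) r = 127;
    if (r < -127) r = -127;
    out.q.push_back(static_cast<std::int8_t>(r));
  }
  return out;
}

// Recover an approximate float weight; the error is at most half a step.
float dequantize_i8(const QuantizedWeights& qw, std::size_t i) {
  return static_cast<float>(qw.q[i]) * qw.scale;
}
```

A per-channel scale would cost a few extra bytes but usually reduces error. Whether that trade is worthwhile under the 64k budget is a decision for the design doc.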