# To-Do List

**High-level task tracker.** See design docs for details. Completed: `doc/COMPLETED.md`

---

## Priority 1: Spectral Brush Editor (Task #5) [IN PROGRESS]

Procedural spectrogram tool: 50-100× compression (5 KB .spec → ~100 bytes C++).

**Design:** `doc/SPECTRAL_BRUSH_2.md` (MQ-based v2)

---

## Priority 2: Test Infrastructure Maintenance [ONGOING]

**Status:** 38/38 tests passing

### ✅ Fix FFT twiddle factor accumulation bug (`src/audio/fft.cc`) — DONE

`fft_radix2` now computes `wr = cosf(angle*k); wi = sinf(angle*k);` directly per k.
Tests A–E added to `test_fft.cc`. `arrays_match` default tolerance reverted to 5e-3.
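
For reference, a minimal sketch of a radix-2 FFT that evaluates each twiddle factor directly per `k`, as described above, instead of accumulating it by repeated multiplication. The function name `fft_direct_twiddle` and the split real/imaginary layout are assumptions for illustration; this is not the code in `src/audio/fft.cc`.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical in-place radix-2 FFT (illustration only, not fft_radix2).
// re.size() == im.size() must be a power of two.
void fft_direct_twiddle(std::vector<float>& re, std::vector<float>& im) {
  const std::size_t n = re.size();
  // Bit-reversal permutation so butterflies can run in place.
  for (std::size_t i = 1, j = 0; i < n; ++i) {
    std::size_t bit = n >> 1;
    for (; j & bit; bit >>= 1) j ^= bit;
    j ^= bit;
    if (i < j) {
      std::swap(re[i], re[j]);
      std::swap(im[i], im[j]);
    }
  }
  const float kPi = 3.14159265358979323846f;
  for (std::size_t len = 2; len <= n; len <<= 1) {
    const float angle = -2.0f * kPi / static_cast<float>(len);
    const std::size_t half = len / 2;
    for (std::size_t start = 0; start < n; start += len) {
      for (std::size_t k = 0; k < half; ++k) {
        // Direct evaluation per k: rounding error does not build up the way
        // it does when w is updated as w *= w_step inside this loop.
        const float wr = std::cos(angle * static_cast<float>(k));
        const float wi = std::sin(angle * static_cast<float>(k));
        const std::size_t a = start + k;
        const std::size_t b = a + half;
        const float tr = re[b] * wr - im[b] * wi;
        const float ti = re[b] * wi + im[b] * wr;
        re[b] = re[a] - tr;
        im[b] = im[a] - ti;
        re[a] += tr;
        im[a] += ti;
      }
    }
  }
}
```

The trade-off is two trigonometric calls per butterfly; a precomputed twiddle table would give the same accuracy at lower cost if this path ever becomes hot.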

## Priority 4: Audio Timing Drift [LOW PRIORITY]

Events trigger ~180 ms early over 63 beats at 90 BPM. Observed: the beat 63 snare lands at
41.82 s in the WAV but should be at 42.00 s (63 × 60/90). Root cause unknown. Suspects:
1. `chunk_frames = (int)(dt * sample_rate)` truncation (accounts for only ~27 ms cumulative, not the full 180 ms)
2. Systematic bias in `unit_duration_sec` BPM calculation
3. Mismatch between tracker time and actual sample rendering
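
To help size suspect 1, here is a small sketch that compares per-chunk truncation against a variant that carries the fractional remainder forward. All function names are hypothetical, and the values used in the note below (60 updates per second, 32 kHz) are assumptions for illustration, not figures taken from the engine.

```cpp
#include <cstdint>

// Expected event time in seconds for a beat index at a given tempo.
double beat_time_sec(int beat, double bpm) { return beat * 60.0 / bpm; }

// Total frames rendered when every chunk is truncated independently,
// mirroring `chunk_frames = (int)(dt * sample_rate)`.
std::int64_t frames_truncated(int chunks, double dt, double sample_rate) {
  std::int64_t total = 0;
  for (int i = 0; i < chunks; ++i) {
    total += static_cast<int>(dt * sample_rate);
  }
  return total;
}

// Same loop, but the fractional frame lost to truncation is carried into
// the next chunk, so the error stays below one frame overall.
std::int64_t frames_with_carry(int chunks, double dt, double sample_rate) {
  std::int64_t total = 0;
  double carry = 0.0;
  for (int i = 0; i < chunks; ++i) {
    const double exact = dt * sample_rate + carry;
    const int frames = static_cast<int>(exact);
    carry = exact - frames;
    total += frames;
  }
  return total;
}
```

With the assumed values, 42 s is 2520 chunks of 533.33 frames each. Truncation drops a third of a frame per chunk, 840 frames in total, or about 26 ms. Truncation on this scale cannot explain a 180 ms drift by itself, so suspects 2 and 3 still need checking.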

## Priority 4: Audio System Enhancements [LOW PRIORITY]

1. **`synth.cc`: use `ola_decode()` from `src/audio/ola.h`** — the OLA decode logic in
   `synth_render()` is currently inlined for frame-by-frame lazy decoding. Refactor to
   call `ola_decode()` for consistency with `spectool` and the test (requires decoupling
   the per-frame lazy path, e.g. decode a full block on demand then serve samples).

2. **GPU-Accelerated PCM Synthesis:**
   - Compute shader for direct PCM generation (bypass spectrogram)
   - Write to compute buffer, readback to synth
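
For item 1, one possible shape of "decode a full block on demand then serve samples" is sketched below. `BlockSampleServer` and its `DecodeFn` callback are hypothetical; the callback stands in for a call to `ola_decode()`, whose real signature is not reproduced here.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical adapter: decodes one block at a time through a
// caller-supplied function and serves samples from the cached block.
class BlockSampleServer {
 public:
  // decode(block_index, out) appends that block's PCM samples to `out`,
  // or leaves `out` empty when there is no such block.
  using DecodeFn = std::function<void(std::size_t, std::vector<float>&)>;

  explicit BlockSampleServer(DecodeFn decode) : decode_(std::move(decode)) {}

  // Copies up to `count` samples into dst; returns how many were written.
  std::size_t read(float* dst, std::size_t count) {
    std::size_t written = 0;
    while (written < count) {
      if (pos_ == block_.size()) {  // cached block exhausted: decode the next
        block_.clear();
        decode_(next_block_++, block_);
        pos_ = 0;
        if (block_.empty()) break;  // end of stream
      }
      dst[written++] = block_[pos_++];
    }
    return written;
  }

 private:
  DecodeFn decode_;
  std::vector<float> block_;
  std::size_t pos_ = 0;
  std::size_t next_block_ = 0;
};
```

The renderer would then pull samples through `read()` and never touch frame boundaries itself, at the cost of one block of buffered PCM per voice.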

---

## Priority 4: 3D System Enhancements (Task #18)

Pipeline for importing complex 3D scenes to replace hardcoded geometry.

**Status:** C++ object data loading complete. Shader SDF integration pending.

---

## Priority 4: WGSL Modularization (Task #50) [RECURRENT]

Ongoing shader code hygiene for granular, reusable snippets.

---

## Priority 4: Wine/Windows Black Screen

`demo64k.exe` runs under Wine (wgpu-native v27, Vulkan/MoltenVK) but shows a black window — no visuals rendered. Audio and timeline progress correctly. GPU device/adapter init succeeds.

**Likely causes to investigate:**
- Swapchain format mismatch (Wine Vulkan may prefer BGRA8 over RGBA8)
- Surface present failing silently (check `WGPUSurfaceTexture.status` after `wgpuSurfaceGetCurrentTexture`)
- Render pass output not reaching the surface (missing present call or wrong texture view)

**To reproduce:** `./scripts/run_win.sh` — window opens, stays black.

---

## CNN v3 — U-Net + FiLM [IN PROGRESS]

**Design:** `cnn_v3/docs/CNN_V3.md` | All phases 1–9 complete. Runtime pipeline operational.

**Current pipeline:** `GBufferEffect` → `GBufDeferredEffect` → `GBufViewEffect` → sink

**Training bugs fixed (2026-03-27):**
- ✅ dec0 ReLU removed: output now spans full [0,1] range (was stuck ≥0.5)
- ✅ FiLM MLP loaded from `cnn_v3_film_mlp.bin` at runtime (was hardcoded heuristics)

**Active work:**
- [ ] Restore full scene in `GBufferEffect::set_scene()` (20 cubes + 4 spheres, 2 lights)
- [ ] Collect ≥50 training samples (currently 11) — see `cnn_v3/docs/HOWTO.md` §2
- [ ] Retrain from scratch — see `cnn_v3/docs/HOWTO.md` §3

**Pending (lower priority):**
- [ ] GBufferEffect: Pass 3 transparency (transp=0 placeholder)
- [ ] GBufferEffect: `resize()` support
- [ ] Web tool (`cnn_v3/tools/shaders.js`): `prev_cnn` always zero in both pack shaders
  (`FULL_PACK_SHADER` line ~313 and simple pack line ~39 hardcode `prev=0`).
  C++ `gbuf_pack.wgsl` reads a real `prev_cnn` texture (binding 6).
  Fix: add a `prev` texture binding to both JS pack shaders and wire it up in `tester.js`.

## Future: CNN v3 "2D Mode" (G-buffer-free)

Allow `CNNv3Effect` to run on a plain screen buffer / photo without a real G-buffer.
Fake the missing feature vectors (normals, depth, material IDs, shadow, transp) from
the RGB input alone:
- normals: approximate from local luminance gradient (Sobel)
- depth: constant (e.g. 0.5) or estimated from a simple heuristic
- material IDs / shadow / transp: neutral defaults (e.g. 0)
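
A CPU-side sketch of the Sobel-based normal approximation from the list above, written in C++ for clarity even though the real version would live in a WGSL compute shader. The names, the `strength` parameter, and the "brighter means raised" convention are assumptions, and the output is a plain unit vector rather than whatever encoding the G-buffer actually uses.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Normal { float x, y, z; };

// Rec. 709 luma weights applied to RGB in [0,1].
float luminance(float r, float g, float b) {
  return 0.2126f * r + 0.7152f * g + 0.0722f * b;
}

// Approximates per-pixel normals from a row-major w*h luminance image using
// 3x3 Sobel gradients with clamped borders. Luminance is treated as a height
// field; `strength` (hypothetical knob) scales how far gradients tilt the normal.
std::vector<Normal> normals_from_luminance(const std::vector<float>& lum,
                                           int w, int h, float strength) {
  auto at = [&](int x, int y) {
    x = std::max(0, std::min(x, w - 1));
    y = std::max(0, std::min(y, h - 1));
    return lum[static_cast<std::size_t>(y) * w + x];
  };
  std::vector<Normal> out(static_cast<std::size_t>(w) * h);
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < w; ++x) {
      const float gx = (at(x + 1, y - 1) + 2.0f * at(x + 1, y) + at(x + 1, y + 1)) -
                       (at(x - 1, y - 1) + 2.0f * at(x - 1, y) + at(x - 1, y + 1));
      const float gy = (at(x - 1, y + 1) + 2.0f * at(x, y + 1) + at(x + 1, y + 1)) -
                       (at(x - 1, y - 1) + 2.0f * at(x, y - 1) + at(x + 1, y - 1));
      const float nx = -gx * strength;
      const float ny = -gy * strength;
      const float inv = 1.0f / std::sqrt(nx * nx + ny * ny + 1.0f);
      out[static_cast<std::size_t>(y) * w + x] = {nx * inv, ny * inv, inv};
    }
  }
  return out;
}
```

A flat image yields (0, 0, 1) everywhere, which matches the neutral defaults used for the other faked channels.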

This would let the effect be applied to any rendered frame (post-NTSC, post-Scratch, etc.)
without requiring a 3D G-buffer pass upstream, and enable training/inference on photos.

Implementation sketch:
- New `CNNv3Effect2D` subclass (or a mode flag) that synthesizes `feat_tex0`/`feat_tex1`
  internally from a single `rgba8unorm` input, then runs the same 5-pass U-Net.
- Separate `gbuf_pack_2d.wgsl` compute shader that fills feat0/feat1 from a photo buffer.

## Future: CNN v2 8-bit Quantization

Reduce weights from f16 (~3.2 KB) to i8 (~1.6 KB).

**Requirements:** Quantization-aware training (QAT)
**Design:** `cnn_v2/docs/CNN_V2.md`
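
To illustrate the storage side only, here is a sketch of symmetric per-tensor int8 quantization: one byte per weight plus a single float scale, which is where the halving relative to f16 comes from. The struct and function names are hypothetical, and the per-tensor scheme is an assumption. This is plain post-training rounding; QAT would simulate the same rounding during training so the network learns to tolerate it.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct QuantizedWeights {
  float scale;                  // dequantized value = q * scale
  std::vector<std::int8_t> q;   // one byte per weight
};

// Symmetric quantization: maps [-max_abs, max_abs] onto [-127, 127].
QuantizedWeights quantize_i8(const std::vector<float>& w) {
  float max_abs = 0.0f;
  for (float v : w) max_abs = std::max(max_abs, std::fabs(v));
  QuantizedWeights out;
  out.scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;
  out.q.reserve(w.size());
  for (float v : w) {
    const long r = std::lround(v / out.scale);
    out.q.push_back(static_cast<std::int8_t>(std::max(-127L, std::min(127L, r))));
  }
  return out;
}

// Reconstructs weight i; the error is at most about half of `scale`.
float dequantize(const QuantizedWeights& qw, std::size_t i) {
  return static_cast<float>(qw.q[i]) * qw.scale;
}
```

A per-layer or per-channel scale would reduce the rounding error further for a few extra bytes; which granularity is worth it depends on the measured quality loss.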

---

## Future: Size Optimization (64k Target)

- Task #22: Windows Native Platform (Win32)
- Task #28: Spectrogram Quantization
- Task #34: Full STL Removal
- Task #35: CRT Replacement

**Measure:** `./scripts/measure_size.sh`

---

**Backlog:** `doc/BACKLOG.md` for untriaged ideas