# To-Do List

**High-level task tracker.** See design docs for details. Completed: `doc/COMPLETED.md`

---

## Priority 1: Spectral Brush Editor (Task #5) [IN PROGRESS]

Procedural spectrogram tool: 50-100× compression (5 KB .spec → ~100 bytes C++).

**Design:** `doc/SPECTRAL_BRUSH_2.md` (MQ-based v2)

---

## Priority 2: Test Infrastructure Maintenance [ONGOING]

**Status:** 38/38 tests passing

### Fix FFT twiddle factor accumulation bug (`src/audio/fft.cc`)

**Root cause:** `fft_radix2` updates twiddle factors iteratively over 256 iterations
in the final stage (N=512). Accumulated floating-point drift causes sign flips on
specific DCT coefficients. The round trip (DCT+IDCT) still passes because the forward
and inverse transforms accumulate the same drift, so the errors cancel and mask the bug.

**Fix:** Replace iterative twiddle update with direct `cosf/sinf` per k:
```cpp
// fft_radix2, inner loop. Replace the iterative recurrence:
//   wr = wr_old * wr_delta - wi * wi_delta;
//   wi = wr_old * wi_delta + wi * wr_delta;
// with a direct evaluation for each butterfly index k:
wr = cosf(angle * (float)k);
wi = sinf(angle * (float)k);
```
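
To see how large the drift is before touching `fft_radix2`, the two twiddle strategies can be compared against a double-precision reference. This is a minimal sketch, not the project's code: `TwiddleError` and `measure_twiddle_error` are hypothetical names, and the negative-angle (forward transform) convention is an assumption.

```cpp
#include <math.h>

#include <algorithm>

// Worst-case twiddle error for one FFT stage, measured against a
// double-precision reference. All names here are illustrative.
struct TwiddleError {
  float iterative;  // float recurrence w[k+1] = w[k] * w_delta
  float direct;     // cosf/sinf evaluated independently for each k
};

inline TwiddleError measure_twiddle_error(int stage_size) {
  const double kPi = 3.14159265358979323846;
  const int half = stage_size / 2;  // number of twiddles used by the stage
  // Forward-transform sign convention (negative angle) is an assumption.
  const float angle = (float)(-2.0 * kPi / stage_size);
  const float wr_delta = cosf(angle);
  const float wi_delta = sinf(angle);

  float wr = 1.0f, wi = 0.0f;  // iterative twiddle, starts at w^0 = 1
  TwiddleError err = {0.0f, 0.0f};
  for (int k = 0; k < half; ++k) {
    const double ref_r = cos(-2.0 * kPi * k / stage_size);
    const double ref_i = sin(-2.0 * kPi * k / stage_size);

    // Error of the accumulated (iterative) twiddle at this k.
    const double it_err = std::max(fabs(wr - ref_r), fabs(wi - ref_i));
    err.iterative = std::max(err.iterative, (float)it_err);

    // Error of the directly evaluated twiddle at this k.
    const float dr = cosf(angle * (float)k);
    const float di = sinf(angle * (float)k);
    const double dir_err = std::max(fabs(dr - ref_r), fabs(di - ref_i));
    err.direct = std::max(err.direct, (float)dir_err);

    // Advance the recurrence the same way the iterative update does.
    const float wr_old = wr;
    wr = wr_old * wr_delta - wi * wi_delta;
    wi = wr_old * wi_delta + wi * wr_delta;
  }
  return err;
}
```

Calling `measure_twiddle_error(512)` and printing both fields shows how far each strategy drifts from the reference. The sketch scans every k in the stage, so a test limited to the upper half of the range would restrict the loop accordingly.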

**Test plan (`src/tests/audio/test_fft.cc`):** add one test per component, in order:

- [ ] **A: `bit_reverse_permute`** — for N=8 verify exact index mapping:
      0→0, 1→4, 2→2, 3→6, 4→1, 5→5, 6→3, 7→7 (must be exact)
- [ ] **B: `fft_radix2` small N** — DFT of `[1,0,0,0]` (N=4) via FFT vs direct sum;
      all 4 unit impulses. Expect machine epsilon.
- [ ] **C: twiddle accumulation** — compare iterative vs `cosf/sinf` twiddle factors
      at k=128..255, stage size=512. **This test must fail before the fix.**
- [ ] **D: `dct_fft` small N** — all 8 unit impulses for N=8 vs reference.
      Expect machine epsilon (exact for small N).
- [ ] **E: `dct_fft` large N** — all original test cases (N=512): impulse[0],
      impulse[N/2], sinusoidal, complex. Expect < 1e-5 after fix.
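
For test A, the expected index table can be generated instead of typed by hand. The sketch below assumes `bit_reverse_permute` implements the standard bit-reversal permutation for power-of-two N; `bit_reverse_index` and `bit_reverse_table` are hypothetical helpers for the test, not existing project functions.

```cpp
#include <cstddef>
#include <vector>

// Reverse the lowest `bits` bits of `i` (hypothetical test helper).
inline std::size_t bit_reverse_index(std::size_t i, unsigned bits) {
  std::size_t r = 0;
  for (unsigned b = 0; b < bits; ++b) {
    r = (r << 1) | (i & 1u);  // move the lowest bit of i onto the end of r
    i >>= 1;
  }
  return r;
}

// Expected permutation for a power-of-two n: table[i] is the index that
// element i is mapped to.
inline std::vector<std::size_t> bit_reverse_table(std::size_t n) {
  unsigned bits = 0;
  while ((std::size_t(1) << bits) < n) ++bits;  // bits = log2(n)
  std::vector<std::size_t> table(n);
  for (std::size_t i = 0; i < n; ++i) table[i] = bit_reverse_index(i, bits);
  return table;
}
```

Bit reversal is its own inverse, so `table[table[i]] == i` for every i. That gives the test a second exact property to assert besides the literal table.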

Revert threshold in `arrays_match` back to `5e-3` (or tighter) once fixed.
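
Test B compares `fft_radix2` against a direct sum. A reference DFT for that comparison could look like the sketch below. It is computed in double precision so the reference does not share the float drift under test. The names `dft_reference` and `max_abs_diff` are hypothetical, and the forward sign convention (negative exponent, no scaling) is an assumption that must match what `fft_radix2` actually computes.

```cpp
#include <algorithm>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// O(N^2) reference: X[k] = sum_j x[j] * exp(-2*pi*i*j*k / N).
inline std::vector<cplx> dft_reference(const std::vector<cplx>& x) {
  const double kPi = 3.14159265358979323846;
  const std::size_t n = x.size();
  std::vector<cplx> out(n, cplx(0.0, 0.0));
  for (std::size_t k = 0; k < n; ++k) {
    for (std::size_t j = 0; j < n; ++j) {
      // Reduce j*k modulo n first so the angle stays in one period.
      const double theta = -2.0 * kPi * (double)((j * k) % n) / (double)n;
      out[k] += x[j] * std::polar(1.0, theta);
    }
  }
  return out;
}

// Largest element-wise magnitude difference between two equal-length arrays.
inline double max_abs_diff(const std::vector<cplx>& a,
                           const std::vector<cplx>& b) {
  double m = 0.0;
  for (std::size_t i = 0; i < a.size() && i < b.size(); ++i)
    m = std::max(m, std::abs(a[i] - b[i]));
  return m;
}
```

For the unit impulse `[1,0,0,0]`, every output bin equals 1, which makes the first case easy to check by eye before looping over the remaining impulses.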

## Priority 4: Audio System Enhancements [LOW PRIORITY]

1. **`synth.cc`: use `ola_decode()` from `src/audio/ola.h`** — the OLA decode logic in
   `synth_render()` is currently inlined for frame-by-frame lazy decoding. Refactor to
   call `ola_decode()` for consistency with `spectool` and the test (requires decoupling
   the per-frame lazy path, e.g. decode a full block on demand then serve samples).

2. **GPU-Accelerated PCM Synthesis:**
   - Compute shader for direct PCM generation (bypass spectrogram)
   - Write to compute buffer, readback to synth
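
For item 1, the signature of `ola_decode()` is not shown here, so the sketch below illustrates only the generic overlap-add step that a decode-a-block-on-demand path would perform. `overlap_add` is a hypothetical name. It assumes the frames are already inverse-transformed and windowed, and that all frames have the same length.

```cpp
#include <cstddef>
#include <vector>

// Generic overlap-add: frame f is summed into the output starting at
// sample f * hop. Assumes equal-length, already-windowed time-domain frames.
inline std::vector<float> overlap_add(
    const std::vector<std::vector<float>>& frames, std::size_t hop) {
  if (frames.empty()) return {};
  const std::size_t frame_len = frames[0].size();
  std::vector<float> out((frames.size() - 1) * hop + frame_len, 0.0f);
  for (std::size_t f = 0; f < frames.size(); ++f) {
    const std::size_t start = f * hop;
    for (std::size_t i = 0; i < frame_len && i < frames[f].size(); ++i)
      out[start + i] += frames[f][i];
  }
  return out;
}
```

A renderer built on this would decode a whole block into a buffer once, then copy samples out of it until the buffer is exhausted. That is the decoupling from the per-frame lazy path that the item describes.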

---

## Priority 4: 3D System Enhancements (Task #18)

Pipeline for importing complex 3D scenes to replace hardcoded geometry.

**Status:** C++ object data loading complete. Shader SDF integration pending.

---

## Priority 4: WGSL Modularization (Task #50) [RECURRENT]

Ongoing shader code hygiene for granular, reusable snippets.

---

## Priority 4: Wine/Windows Black Screen

`demo64k.exe` runs under Wine (wgpu-native v27, Vulkan/MoltenVK) but shows a black window — no visuals rendered. Audio and timeline progress correctly. GPU device/adapter init succeeds.

**Likely causes to investigate:**
- Swapchain format mismatch (Wine Vulkan may prefer BGRA8 over RGBA8)
- Surface present failing silently (check the status returned by `wgpuSurfaceGetCurrentTexture`)
- Render pass output not reaching the surface (missing present call or wrong texture view)

**To reproduce:** `./scripts/run_win.sh` — window opens, stays black.

---

## CNN v3 — U-Net + FiLM [IN PROGRESS]

**Design:** `cnn_v3/docs/CNN_V3.md` | All phases 1–7 complete. Runtime pipeline operational.

**Current pipeline:** `GBufferEffect` → `GBufDeferredEffect` → `GBufViewEffect` → sink

**Shadow pass status:** ✅ Fixed and re-enabled. Cube + sphere shadows correct. Pulsating sphere scale confirmed correct end-to-end. Scene is currently simplified (1 cube + 1 sphere, 1 light) for debugging.

**Active work:**
- [ ] Restore full scene in `GBufferEffect::set_scene()` (20 cubes + 4 spheres, 2 lights)
- [ ] Run first real training pass — see `cnn_v3/docs/HOWTO.md` §3

**Pending (lower priority):**
- [ ] GBufferEffect: Pass 3 transparency (transp=0 placeholder)
- [ ] GBufferEffect: `resize()` support

## Future: CNN v3 "2D Mode" (G-buffer-free)

Allow `CNNv3Effect` to run on a plain screen buffer / photo without a real G-buffer.
Fake the missing feature vectors (normals, depth, material IDs, shadow, transp) from
the RGB input alone:
- normals: approximate from local luminance gradient (Sobel)
- depth: constant (e.g. 0.5) or estimated from a simple heuristic
- material IDs / shadow / transp: neutral defaults (e.g. 0)

This would let the effect be applied to any rendered frame (post-NTSC, post-Scratch, etc.)
without requiring a 3D G-buffer pass upstream, and enable training/inference on photos.

Implementation sketch:
- New `CNNv3Effect2D` subclass (or a mode flag) that synthesizes `feat_tex0`/`feat_tex1`
  internally from a single `rgba8unorm` input, then runs the same 5-pass U-Net.
- Separate `gbuf_pack_2d.wgsl` compute shader that fills feat0/feat1 from a photo buffer.
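
The Sobel-based normal approximation listed above could be prototyped on the CPU before `gbuf_pack_2d.wgsl` is written. This is a minimal sketch under stated assumptions: `luminance`, `normals_from_luma`, and the `strength` parameter are hypothetical, Rec. 709 luma weights are one possible choice, and borders are handled by clamping coordinates.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Normal { float x, y, z; };

// Rec. 709 luma weights (an assumption; any luminance measure would do).
inline float luminance(float r, float g, float b) {
  return 0.2126f * r + 0.7152f * g + 0.0722f * b;
}

// luma: w*h row-major luminance. Returns one unit-length normal per pixel,
// tilted away from the direction of increasing brightness.
inline std::vector<Normal> normals_from_luma(const std::vector<float>& luma,
                                             int w, int h, float strength) {
  auto at = [&](int x, int y) -> float {
    x = std::min(std::max(x, 0), w - 1);  // clamp to the image border
    y = std::min(std::max(y, 0), h - 1);
    return luma[(std::size_t)y * (std::size_t)w + (std::size_t)x];
  };
  std::vector<Normal> out((std::size_t)w * (std::size_t)h);
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < w; ++x) {
      // 3x3 Sobel gradients in x and y.
      const float gx = (at(x + 1, y - 1) + 2.0f * at(x + 1, y) + at(x + 1, y + 1)) -
                       (at(x - 1, y - 1) + 2.0f * at(x - 1, y) + at(x - 1, y + 1));
      const float gy = (at(x - 1, y + 1) + 2.0f * at(x, y + 1) + at(x + 1, y + 1)) -
                       (at(x - 1, y - 1) + 2.0f * at(x, y - 1) + at(x + 1, y - 1));
      const float nx = -strength * gx, ny = -strength * gy, nz = 1.0f;
      const float inv = 1.0f / std::sqrt(nx * nx + ny * ny + nz * nz);
      out[(std::size_t)y * (std::size_t)w + (std::size_t)x] = {nx * inv, ny * inv, nz * inv};
    }
  }
  return out;
}
```

A flat image yields normals pointing straight at the viewer, matching the neutral defaults proposed for the other channels. `strength` controls how strongly brightness edges tilt the normal and would need tuning against real G-buffer output.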

## Future: CNN v2 8-bit Quantization

Reduce weights from f16 (~3.2 KB) to i8 (~1.6 KB).

**Requirements:** Quantization-aware training (QAT)
**Design:** `cnn_v2/docs/CNN_V2.md`
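
The size estimate follows from storage width alone: one byte per weight instead of two halves ~3.2 KB to ~1.6 KB, plus a small amount for scale factors. The sketch below shows one possible storage format, symmetric per-tensor quantization. This scheme is an assumption rather than the documented design, `QuantizedWeights` and `quantize_symmetric_i8` are hypothetical names, and the sketch does not replace the QAT requirement.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

struct QuantizedWeights {
  std::vector<std::int8_t> q;  // one byte per weight
  float scale;                 // w is approximately q * scale
};

// Symmetric per-tensor quantization to [-127, 127] (illustrative scheme).
inline QuantizedWeights quantize_symmetric_i8(const std::vector<float>& w) {
  float max_abs = 0.0f;
  for (float v : w) max_abs = std::max(max_abs, std::fabs(v));

  QuantizedWeights out;
  out.scale = max_abs > 0.0f ? max_abs / 127.0f : 1.0f;  // avoid divide by zero
  out.q.reserve(w.size());
  for (float v : w) {
    long r = std::lround(v / out.scale);       // round to nearest step
    r = std::min(127L, std::max(-127L, r));    // clamp to the i8 range used
    out.q.push_back((std::int8_t)r);
  }
  return out;
}

inline float dequantize(std::int8_t q, float scale) {
  return (float)q * scale;
}
```

The round-trip error per weight is bounded by about half a quantization step (`scale / 2`). Quantization-aware training exists to make the network tolerate that error.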

---

## Future: Size Optimization (64k Target)

- Task #22: Windows Native Platform (Win32)
- Task #28: Spectrogram Quantization
- Task #34: Full STL Removal
- Task #35: CRT Replacement

**Measure:** `./scripts/measure_size.sh`

---

**Backlog:** `doc/BACKLOG.md` for untriaged ideas