| author | skal <pascal.massimino@gmail.com> | 2026-03-21 09:54:16 +0100 |
|---|---|---|
| committer | skal <pascal.massimino@gmail.com> | 2026-03-21 09:54:16 +0100 |
| commit | 5e740fc8f5f48fdd8ec4b84ae0c9a3c74e387d4f | |
| tree | c330c8402e771d4b02316331d734802337d413c4 /cnn_v3/docs/HOWTO.md | |
| parent | 673a24215b2670007317060325256059d1448f3b | |
docs(cnn_v3): update CNN_V3.md + HOWTO.md to reflect Phases 1-5 complete
- CNN_V3.md: status line, architecture channel counts (8/16→4/8), FiLM MLP
output count (96→40 params), size budget table (real implemented values)
- HOWTO.md: Phase status table (5→done, add phase 6 training TODO), sections
3-5 rewritten to reflect what exists vs what is still planned
Diffstat (limited to 'cnn_v3/docs/HOWTO.md')
| -rw-r--r-- | cnn_v3/docs/HOWTO.md | 50 |
1 files changed, 25 insertions, 25 deletions
diff --git a/cnn_v3/docs/HOWTO.md b/cnn_v3/docs/HOWTO.md
index 22266d3..425a33b 100644
--- a/cnn_v3/docs/HOWTO.md
+++ b/cnn_v3/docs/HOWTO.md
@@ -135,7 +135,7 @@ Mix freely; the dataloader treats all sample directories uniformly.
 
 ## 3. Training
 
-*(Network not yet implemented — this section will be filled as Phase 3+ lands.)*
+*(Script not yet written — see TODO.md. Architecture spec in `CNN_V3.md` §Training.)*
 
 **Planned command:**
 ```bash
@@ -146,21 +146,15 @@ python3 cnn_v3/training/train_cnn_v3.py \
 ```
 
 **FiLM conditioning** during training:
-- Beat/audio inputs are randomized per sample
-- Network learns to produce varied styles from same geometry
-
-**Validation:**
-```bash
-python3 cnn_v3/training/train_cnn_v3.py --validate \
-    --checkpoint cnn_v3/weights/cnn_v3_weights.bin \
-    --input test_frame.png
-```
+- Beat/audio inputs randomized per sample
+- MLP: `Linear(5→16) → ReLU → Linear(16→40)` trained jointly with U-Net
+- Output: γ/β for enc0(4ch) + enc1(8ch) + dec1(4ch) + dec0(4ch) = 40 floats
 
 ---
 
-## 4. Running the CNN v3 Effect (Future)
+## 4. Running the CNN v3 Effect
 
-Once the C++ CNNv3Effect exists:
+`CNNv3Effect` is implemented. Wire into a sequence:
 
 ```seq
 # BPM 120
@@ -169,27 +163,32 @@ SEQUENCE 0 0 "Scene with CNN v3"
   EFFECT + CNNv3Effect gbuf_feat0 gbuf_feat1 -> sink 0 60
 ```
 
-FiLM parameters are uploaded via uniform each frame:
+FiLM parameters uploaded each frame:
 ```cpp
 cnn_v3_effect->set_film_params(
     params.beat_phase, params.beat_time / 8.0f, params.audio_intensity,
     style_p0, style_p1);
 ```
 
+FiLM γ/β default to identity (γ=1, β=0) until `train_cnn_v3.py` produces a trained MLP.
+
 ---
 
 ## 5. Per-Pixel Validation
 
-The CNN v3 design requires exact parity between PyTorch, WGSL (HTML), and C++.
+C++ parity test passes: `src/tests/gpu/test_cnn_v3_parity.cc` (2 tests).
+
+```bash
+cmake -B build -DDEMO_BUILD_TESTS=ON && cmake --build build -j4
+cd build && ./test_cnn_v3_parity
+```
 
-*(Validation tooling not yet implemented.)*
+Results (8×8 test tensors, random weights):
+- enc0 max_err = 1.95e-3 ✓
+- dec1 max_err = 1.95e-3 ✓
+- final max_err = 4.88e-4 ✓ (all ≤ 1/255 = 3.92e-3)
 
-**Planned workflow:**
-1. Export test input + weights as JSON
-2. Run Python reference → save per-pixel output
-3. Run HTML WebGPU tool → compare against Python
-4. Run C++ `cnn_v3_test` tool → compare against Python
-5. All comparisons must pass at ≤ 1/255 per pixel
+Test vectors generated by `cnn_v3/training/gen_test_vectors.py` (PyTorch reference).
 
 ---
 
@@ -197,12 +196,13 @@ The CNN v3 design requires exact parity between PyTorch, WGSL (HTML), and C++.
 
 | Phase | Status | Notes |
 |-------|--------|-------|
-| 1 — G-buffer (raster + pack) | ✅ Done | Integrated, 35/35 tests pass |
-| 1 — G-buffer (SDF + shadow passes) | TODO | Placeholder in place |
+| 1 — G-buffer (raster + pack) | ✅ Done | Integrated, 36/36 tests pass |
+| 1 — G-buffer (SDF + shadow passes) | TODO | Placeholder: shadow=1, transp=0 |
 | 2 — Training infrastructure | ✅ Done | blender_export.py, pack_*_sample.py |
 | 3 — WGSL U-Net shaders | ✅ Done | 5 compute shaders + cnn_v3/common snippet |
-| 4 — C++ CNNv3Effect | ✅ Done | FiLM uniform upload, 35/35 tests pass |
-| 5 — Parity validation | TODO | Test vectors, ≤1/255 |
+| 4 — C++ CNNv3Effect | ✅ Done | FiLM uniform upload, 36/36 tests pass |
+| 5 — Parity validation | ✅ Done | test_cnn_v3_parity.cc, max_err=4.88e-4 |
+| 6 — FiLM MLP training | TODO | train_cnn_v3.py not yet written |
 
 ---
 
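
Reviewer note on the FiLM parameter count: the updated HOWTO.md states that the MLP emits γ/β for enc0(4ch) + enc1(8ch) + dec1(4ch) + dec0(4ch) = 40 floats, and that γ/β default to identity (γ=1, β=0). The sketch below only checks that arithmetic and illustrates what identity conditioning means. The names `FILM_LAYERS`, `split_film_params`, `identity_film_params` and `apply_film` are hypothetical, and the packing order (all γ of a layer, then all β) is an assumption; the real layout used by `CNNv3Effect` is not shown in this commit.

```python
# Channel counts per FiLM-conditioned layer, as listed in the HOWTO.md diff.
FILM_LAYERS = [("enc0", 4), ("enc1", 8), ("dec1", 4), ("dec0", 4)]


def film_param_count(layers=FILM_LAYERS):
    """Each channel needs one gamma (scale) and one beta (shift)."""
    return sum(2 * channels for _, channels in layers)


def identity_film_params(layers=FILM_LAYERS):
    """Flat vector with gamma=1 and beta=0 for every channel.

    ASSUMPTION: per layer, gammas are stored first, then betas.
    """
    flat = []
    for _, channels in layers:
        flat.extend([1.0] * channels)  # gammas
        flat.extend([0.0] * channels)  # betas
    return flat


def split_film_params(flat, layers=FILM_LAYERS):
    """Split a flat vector into {layer_name: (gamma_list, beta_list)}."""
    expected = film_param_count(layers)
    if len(flat) != expected:
        raise ValueError(f"expected {expected} floats, got {len(flat)}")
    out = {}
    offset = 0
    for name, channels in layers:
        gamma = list(flat[offset:offset + channels])
        beta = list(flat[offset + channels:offset + 2 * channels])
        out[name] = (gamma, beta)
        offset += 2 * channels
    return out


def apply_film(features, gamma, beta):
    """FiLM on one pixel's channel vector: y[c] = gamma[c] * x[c] + beta[c]."""
    return [g * x + b for x, g, b in zip(features, gamma, beta)]
```

With identity parameters, `apply_film` returns its input unchanged, which matches the stated behaviour of the effect until a trained MLP exists.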

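Reviewer note on the parity threshold: the new section 5 reports per-layer `max_err` values and accepts them when they are ≤ 1/255 = 3.92e-3. A minimal sketch of that acceptance rule follows, assuming the outputs are compared as flat lists of floats. `max_abs_error` and `within_parity` are hypothetical helpers and are not taken from `test_cnn_v3_parity.cc` or `gen_test_vectors.py`.

```python
# One 8-bit quantization step; the HOWTO.md rounds this to 3.92e-3.
TOLERANCE = 1.0 / 255.0


def max_abs_error(reference, candidate):
    """Largest absolute per-element difference between two flat tensors."""
    if len(reference) != len(candidate):
        raise ValueError("tensors must have the same number of elements")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def within_parity(reference, candidate, tolerance=TOLERANCE):
    """True when every element differs by at most `tolerance`."""
    return max_abs_error(reference, candidate) <= tolerance


# The three errors reported in the diff all sit below the threshold.
REPORTED_MAX_ERR = {"enc0": 1.95e-3, "dec1": 1.95e-3, "final": 4.88e-4}
ALL_REPORTED_PASS = all(err <= TOLERANCE for err in REPORTED_MAX_ERR.values())
```

Comparing against a fixed absolute tolerance rather than a relative one fits outputs that end up quantized to 8 bits per channel, which is presumably why the 1/255 bound was chosen.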