Diffstat (limited to 'cnn_v3/docs/HOWTO.md')
-rw-r--r--  cnn_v3/docs/HOWTO.md  |  50
1 file changed, 25 insertions(+), 25 deletions(-)
diff --git a/cnn_v3/docs/HOWTO.md b/cnn_v3/docs/HOWTO.md
index 22266d3..425a33b 100644
--- a/cnn_v3/docs/HOWTO.md
+++ b/cnn_v3/docs/HOWTO.md
@@ -135,7 +135,7 @@ Mix freely; the dataloader treats all sample directories uniformly.
## 3. Training
-*(Network not yet implemented — this section will be filled as Phase 3+ lands.)*
+*(Script not yet written — see TODO.md. Architecture spec in `CNN_V3.md` §Training.)*
**Planned command:**
```bash
@@ -146,21 +146,15 @@ python3 cnn_v3/training/train_cnn_v3.py \
```
**FiLM conditioning** during training:
-- Beat/audio inputs are randomized per sample
-- Network learns to produce varied styles from same geometry
-
-**Validation:**
-```bash
-python3 cnn_v3/training/train_cnn_v3.py --validate \
- --checkpoint cnn_v3/weights/cnn_v3_weights.bin \
- --input test_frame.png
-```
+- Beat/audio inputs randomized per sample
+- MLP: `Linear(5→16) → ReLU → Linear(16→40)` trained jointly with U-Net
+- Output: γ/β for enc0(4ch) + enc1(8ch) + dec1(4ch) + dec0(4ch) = 40 floats
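A minimal shape check of the conditioning MLP (pure-Python sketch with random weights; the real MLP is trained in PyTorch per the spec above, and the γ-then-β packing order shown here is an assumption, not confirmed by the source):

```python
import random

def linear(x, w, b):
    # w: out_dim x in_dim rows, b: out_dim biases
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def relu(v):
    return [max(0.0, a) for a in v]

random.seed(0)
IN, HID, OUT = 5, 16, 40  # 5 conditioning inputs -> 40 FiLM floats
w1 = [[random.uniform(-1, 1) for _ in range(IN)] for _ in range(HID)]
b1 = [0.0] * HID
w2 = [[random.uniform(-1, 1) for _ in range(HID)] for _ in range(OUT)]
b2 = [0.0] * OUT

# beat_phase, beat_time/8, audio_intensity, style_p0, style_p1
cond = [0.25, 0.5, 0.8, 0.1, 0.9]
film = linear(relu(linear(cond, w1, b1)), w2, b2)

# γ/β for enc0(4) + enc1(8) + dec1(4) + dec0(4) = 20 channels -> 40 floats
assert len(film) == 2 * (4 + 8 + 4 + 4)
gammas, betas = film[:20], film[20:]  # packing order assumed for illustration
```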
---
-## 4. Running the CNN v3 Effect (Future)
+## 4. Running the CNN v3 Effect

-Once the C++ CNNv3Effect exists:
+`CNNv3Effect` is implemented. Wire it into a sequence:
```seq
# BPM 120
@@ -169,27 +163,32 @@ SEQUENCE 0 0 "Scene with CNN v3"
EFFECT + CNNv3Effect gbuf_feat0 gbuf_feat1 -> sink 0 60
```
-FiLM parameters are uploaded via uniform each frame:
+FiLM parameters uploaded each frame:
```cpp
cnn_v3_effect->set_film_params(
params.beat_phase, params.beat_time / 8.0f, params.audio_intensity,
style_p0, style_p1);
```
+FiLM γ/β default to identity (γ=1, β=0) until `train_cnn_v3.py` produces a trained MLP.
+
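FiLM applies a per-channel affine `y = γ·x + β`, so the identity default leaves features untouched until trained parameters arrive. A tiny sketch:

```python
def film(x, gamma, beta):
    # Per-channel affine modulation: y[c] = gamma[c] * x[c] + beta[c]
    return [g * xc + b for xc, g, b in zip(x, gamma, beta)]

features = [0.2, -0.7, 1.3, 0.0]           # one pixel, 4 channels (e.g. enc0)
identity = film(features, [1.0] * 4, [0.0] * 4)
assert identity == features                # gamma=1, beta=0 -> no-op
```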
---
## 5. Per-Pixel Validation
-The CNN v3 design requires exact parity between PyTorch, WGSL (HTML), and C++.
+C++ parity test passes: `src/tests/gpu/test_cnn_v3_parity.cc` (2 tests).
+
+```bash
+cmake -B build -DDEMO_BUILD_TESTS=ON && cmake --build build -j4
+cd build && ./test_cnn_v3_parity
+```
-*(Validation tooling not yet implemented.)*
+Results (8×8 test tensors, random weights):
+- enc0 max_err = 1.95e-3 ✓
+- dec1 max_err = 1.95e-3 ✓
+- final max_err = 4.88e-4 ✓ (all ≤ 1/255 = 3.92e-3)
-**Planned workflow:**
-1. Export test input + weights as JSON
-2. Run Python reference → save per-pixel output
-3. Run HTML WebGPU tool → compare against Python
-4. Run C++ `cnn_v3_test` tool → compare against Python
-5. All comparisons must pass at ≤ 1/255 per pixel
+Test vectors generated by `cnn_v3/training/gen_test_vectors.py` (PyTorch reference).
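The ≤ 1/255 pass criterion can be reproduced with a simple max-abs-error check (sketch only; `ref`/`got` stand in for the PyTorch reference and C++ outputs, and the sample values are illustrative):

```python
TOL = 1.0 / 255.0  # ~3.92e-3, one 8-bit quantization step

def max_abs_err(ref, got):
    # Worst-case per-element deviation between reference and implementation
    return max(abs(r - g) for r, g in zip(ref, got))

ref = [0.10, 0.52, 0.93, 0.40]
got = [0.10, 0.52 + 4.88e-4, 0.93, 0.40]   # inject the worst observed final-stage error
err = max_abs_err(ref, got)
assert err <= TOL
```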
---
@@ -197,12 +196,13 @@ The CNN v3 design requires exact parity between PyTorch, WGSL (HTML), and C++.
| Phase | Status | Notes |
|-------|--------|-------|
-| 1 — G-buffer (raster + pack) | ✅ Done | Integrated, 35/35 tests pass |
-| 1 — G-buffer (SDF + shadow passes) | TODO | Placeholder in place |
+| 1 — G-buffer (raster + pack) | ✅ Done | Integrated, 36/36 tests pass |
+| 1 — G-buffer (SDF + shadow passes) | TODO | Placeholder: shadow=1, transp=0 |
| 2 — Training infrastructure | ✅ Done | blender_export.py, pack_*_sample.py |
| 3 — WGSL U-Net shaders | ✅ Done | 5 compute shaders + cnn_v3/common snippet |
-| 4 — C++ CNNv3Effect | ✅ Done | FiLM uniform upload, 35/35 tests pass |
-| 5 — Parity validation | TODO | Test vectors, ≤1/255 |
+| 4 — C++ CNNv3Effect | ✅ Done | FiLM uniform upload, 36/36 tests pass |
+| 5 — Parity validation | ✅ Done | test_cnn_v3_parity.cc, max_err=4.88e-4 |
+| 6 — FiLM MLP training | TODO | train_cnn_v3.py not yet written |
---