author: skal <pascal.massimino@gmail.com> 2026-03-21 10:07:02 +0100
committer: skal <pascal.massimino@gmail.com> 2026-03-21 10:07:02 +0100
commit: 1e8ccfc67c264ce054c59257ee7c17ec4a584a9e
tree: 765c3e4392af87c86e9052c321c48a43fda0fac7 /TODO.md
parent: 5e740fc8f5f48fdd8ec4b84ae0c9a3c74e387d4f
feat(cnn_v3): Phase 6 — training script (train_cnn_v3.py + cnn_v3_utils.py)

- train_cnn_v3.py: CNNv3 U-Net+FiLM model, training loop, CLI
- cnn_v3_utils.py: image I/O, pyrdown, depth_gradient, assemble_features, apply_channel_dropout, detect_salient_points, CNNv3Dataset
- Patch-based training (default 64×64) with salient-point extraction (harris/shi-tomasi/fast/gradient/random detectors, pre-cached at init)
- Channel dropout for geometric/context/temporal channels
- Random FiLM conditioning per sample for joint MLP+U-Net training
- docs: HOWTO.md §3 updated with commands and flag reference
- TODO.md: Phase 6 marked done, export script noted as next step

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
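
Note on the export step named as next: the commit does not define the `.bin` layout, so the following is only a minimal sketch of the f16 packing part, assuming a flat run of little-endian half-precision values with no header. The helper names `pack_f16` and `unpack_f16` are hypothetical and are not part of `train_cnn_v3.py` or `cnn_v3_utils.py`. Reading the trained `.pth` is left out.

```python
import struct


def pack_f16(values):
    """Pack floats as consecutive little-endian IEEE 754 half-precision values.

    Assumption: a flat, header-less layout. The real .bin format is not
    specified in this commit.
    """
    # "e" is the struct format code for a 2-byte half-precision float.
    return struct.pack("<%de" % len(values), *values)


def unpack_f16(blob):
    """Inverse of pack_f16, useful for a round-trip check after export."""
    count = len(blob) // 2
    return list(struct.unpack("<%de" % count, blob))


# Hypothetical weights, chosen to be exactly representable in f16.
weights = [0.0, 1.0, -2.0, 0.5]
blob = pack_f16(weights)
restored = unpack_f16(blob)
```

Values outside the f16 range make `struct.pack` raise `OverflowError`, so a real export script would probably need to clamp or validate weights first.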
Diffstat (limited to 'TODO.md')
-rw-r--r-- TODO.md 11
1 files changed, 4 insertions, 7 deletions
diff --git a/TODO.md b/TODO.md
index 559f8b3..59511d3 100644
--- a/TODO.md
+++ b/TODO.md
@@ -79,13 +79,10 @@ PyTorch / HTML WebGPU / C++ WebGPU.
5. ✅ Parity validation: test vectors + `test_cnn_v3_parity.cc`. max_err=4.88e-4 (≤1/255).
- Key fix: intermediate nodes at fractional resolutions (W/2, W/4) via `NodeRegistry::default_width()/default_height()`
-**FiLM MLP training** (blocks meaningful Phase 4 output):
-- Needs `cnn_v3/training/train_cnn_v3.py` — not yet written
-- MLP: `Linear(5→16) → ReLU → Linear(16→48)` trained jointly with U-Net
-- Input: `[beat_phase, beat_time/8, audio_intensity, style_p0, style_p1]`
-- Output: γ/β for enc0(4ch) + enc1(8ch) + dec1(4ch) + dec0(4ch) = 40 floats
-- Trained weights (~3 KB f16) stored in `.bin` after conv weights; loaded at runtime
-- See `cnn_v3/docs/CNN_V3.md` §5 for full MLP spec and §11 for training pipeline plan
+**Next: export + real training run**
+- `train_cnn_v3.py` + `cnn_v3_utils.py` written (Phase 6 training script done)
+- Still needed: `export_cnn_v3_weights.py` — convert trained `.pth` → `.bin` (f16)
+- See `cnn_v3/docs/HOWTO.md` §3 for training commands
## Future: CNN v2 8-bit Quantization