summaryrefslogtreecommitdiff
path: root/cnn_v3/src/cnn_v3_effect.cc
AgeCommit message (Collapse)Author
34 hoursrefactor(cnn_v3): code review — comments, simplifications, test fixskal
C++: - cnn_v3_effect.cc: fix declare_nodes comment (output node declared by caller) - cnn_v3_effect.cc: add TODO(phase-7) marker for FiLM MLP replacement WGSL: - cnn_v3_bottleneck.wgsl: consolidate _pad fields onto one line, explain why array<u32,3> is invalid in uniform address space - cnn_v3_enc0.wgsl: fix "12xu8" → "12ch u8norm" in header comment - cnn_v3_dec0.wgsl: clarify parity note (sigmoid after FiLM+ReLU, not raw conv) - cnn_v3_common.wgsl: clarify unpack_8ch pack layout (low/high 16 bits) Python: - cnn_v3_utils.py: replace PIL-based _upsample_nearest (uint8 round-trip) with pure numpy index arithmetic; rename _resize_rgb → _resize_img (handles any channel count); add comment on normal zero-pad workaround - export_cnn_v3_weights.py: add cross-ref to cnn_v3_effect.cc constants; clarify weight count comments with Conv notation Test: - test_cnn_v3_parity.cc: enc0/dec1 layer failures now return 0 (were print-only) handoff(Gemini): CNN v3 review complete, 36/36 tests passing.
38 hoursfeat(cnn_v3): Phase 5 complete — parity validation passing (36/36 tests)skal
- Add test_cnn_v3_parity.cc: zero_weights + random_weights tests - Add gen_test_vectors.py: PyTorch reference implementation for enc0/enc1/bn/dec1/dec0 - Add test_vectors.h: generated C header with enc0, dec1, output expected values - Fix declare_nodes(): intermediate textures at fractional resolutions (W/2, W/4) using new NodeRegistry::default_width()/default_height() getters - Add layer-by-layer readback (enc0, dec1) for regression coverage - Final parity: enc0 max_err=1.95e-3, dec1 max_err=1.95e-3, out max_err=4.88e-4 handoff(Claude): CNN v3 parity done. Next: train_cnn_v3.py (FiLM MLP training).
39 hoursfeat(cnn_v3): Phase 4 complete — CNNv3Effect C++ + FiLM uniform uploadskal
- cnn_v3/src/cnn_v3_effect.{h,cc}: full Effect subclass with 5 compute passes (enc0→enc1→bottleneck→dec1→dec0), shared weights storage buffer, per-pass uniform buffers, set_film_params() API - Fixed WGSL/C++ struct alignment: vec3u has align=16, so CnnV3Params4ch is 64 bytes and CnnV3ParamsEnc1 is 96 bytes (not 48/80) - Weight offsets computed as explicit formulas (e.g. 20*4*9+4) for clarity - Registered in CMake, shaders.h/cc, demo_effects.h, test_demo_effects.cc - 35/35 tests pass handoff(Gemini): CNN v3 Phase 5 next — parity validation (Python ref vs WGSL)