|
- Replace 1×1 pointwise bottleneck with Conv(8→8, 3×3, dilation=2):
  effective RF grows from ~13 px to ~29 px at ¼ res (~+1 KB of weights)
- Add Sobel edge loss in training (--edge-loss-weight, default 0.1)
- Add FiLM 2-phase training: freeze MLP for warmup epochs then
unfreeze at lr×0.1 (--film-warmup-epochs, default 50)
- Update weight layout: BN 72→584 f16, total 1964→2476 f16 (4952 B)
- Cascade offsets in C++ effect, JS tool, export/gen_test_vectors scripts
- Regenerate test_vectors.h (1238 u32); parity max_err=9.77e-04
- Generate dark-theme U-Net+FiLM architecture PNG (gen_architecture_png.py)
- Replace ASCII art in CNN_V3.md and HOW_TO_CNN.md with PNG embed
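The bottleneck swap above can be sketched as follows; the channel count (8) and the 72→584 parameter figures come from this log, everything else (layer names, tensor sizes) is assumed for illustration:

```python
import torch
import torch.nn as nn

# Old 1x1 pointwise bottleneck vs. the new 3x3 dilation-2 conv. With
# dilation=2 the 3x3 kernel spans a 5x5 window, so padding=2 preserves
# spatial size while adding 4 px of receptive field at 1/4 resolution
# (16 px at full resolution: ~13 px -> ~29 px).
pointwise = nn.Conv2d(8, 8, kernel_size=1)                       # 8*8*1 + 8 = 72 params
dilated = nn.Conv2d(8, 8, kernel_size=3, dilation=2, padding=2)  # 8*8*9 + 8 = 584 params

x = torch.randn(1, 8, 16, 16)
assert dilated(x).shape == x.shape  # padding=2 keeps the 1/4-res feature map size

# Parameter delta: 584 - 72 = 512 extra f16 values = 1024 B, consistent
# with the "~+1 KB weights" and BN 72->584 figures above.
extra = (sum(p.numel() for p in dilated.parameters())
         - sum(p.numel() for p in pointwise.parameters()))
print(extra)  # 512
```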
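A minimal sketch of a Sobel edge loss of the kind added here, assuming single-channel NCHW tensors and an L1 penalty on gradient magnitudes; the kernels are the standard Sobel operators, but the function signature and loss form are my assumptions, not the project's exact implementation:

```python
import torch
import torch.nn.functional as F

def sobel_edge_loss(pred, target, weight=0.1):
    """L1 distance between Sobel gradient magnitudes of pred and target.

    Hypothetical sketch: single-channel NCHW inputs assumed; `weight`
    mirrors --edge-loss-weight (default 0.1 per the log)."""
    kx = torch.tensor([[-1., 0., 1.],
                       [-2., 0., 2.],
                       [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)  # Sobel y kernel is the transpose of x
    def grad_mag(img):
        gx = F.conv2d(img, kx, padding=1)
        gy = F.conv2d(img, ky, padding=1)
        return torch.sqrt(gx * gx + gy * gy + 1e-8)  # eps keeps sqrt differentiable
    return weight * F.l1_loss(grad_mag(pred), grad_mag(target))
```

In training this term would be added to the base reconstruction loss, scaled by `--edge-loss-weight`.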
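The 2-phase FiLM schedule can be sketched as a freeze-then-unfreeze loop with a separate optimizer param group; the layer sizes and base_lr below are hypothetical, and only the warmup default (50) and the ×0.1 learning rate come from the log:

```python
import torch
import torch.nn as nn

# Stand-ins for the real model parts (shapes are illustrative only).
film_mlp = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 32))
backbone = nn.Conv2d(8, 8, 3, padding=1)

base_lr, warmup_epochs = 1e-3, 50  # --film-warmup-epochs default per the log
optimizer = torch.optim.Adam([
    {"params": backbone.parameters(), "lr": base_lr},
    {"params": film_mlp.parameters(), "lr": base_lr * 0.1},  # lr x 0.1 once unfrozen
])

for epoch in range(100):
    frozen = epoch < warmup_epochs
    for p in film_mlp.parameters():
        # Phase 1: MLP frozen (no grads); phase 2: trained at 0.1x base lr.
        p.requires_grad_(not frozen)
    # ... forward pass / loss / optimizer.step() elided ...
```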
handoff(Gemini): bottleneck dilation + Sobel loss + FiLM warmup landed.
Next: run first real training pass (see cnn_v3/docs/HOWTO.md §3).
|
|
- Add test_cnn_v3_parity.cc: zero_weights + random_weights tests
- Add gen_test_vectors.py: PyTorch reference implementation for enc0/enc1/bn/dec1/dec0
- Add test_vectors.h: generated C header with enc0, dec1, output expected values
- Fix declare_nodes(): allocate intermediate textures at fractional resolutions
  (W/2, W/4) using the new NodeRegistry::default_width()/default_height() getters
- Add layer-by-layer readback (enc0, dec1) for regression coverage
- Final parity: enc0 max_err=1.95e-3, dec1 max_err=1.95e-3, out max_err=4.88e-4
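The parity figures above are exact powers of two (1.95e-3 = 2⁻⁹, 4.88e-4 = 2⁻¹¹), which is the signature of half-precision storage rather than a logic bug. A sketch of the max-error metric plus an f16 round-trip plausibility check (the helper name is mine, not from the test suite):

```python
import numpy as np

def max_err(reference, actual):
    """Max absolute elementwise error, as quoted in the parity results."""
    ref = np.asarray(reference, dtype=np.float32)
    act = np.asarray(actual, dtype=np.float32)
    return float(np.max(np.abs(ref - act)))

# Round-tripping float32 values in [0, 1) through f16 bounds the error by
# half an ulp (<= 2**-12 here), so observed errors of ~2**-11..2**-9 are
# consistent with f16 weights/activations propagating through a few layers.
ref = np.random.rand(1024).astype(np.float32)
act = ref.astype(np.float16).astype(np.float32)  # simulate f16 storage
assert max_err(ref, act) <= 2**-11
```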
handoff(Claude): CNN v3 parity done. Next: train_cnn_v3.py (FiLM MLP training).
|