author: skal <pascal.massimino@gmail.com> 2026-03-26 07:03:01 +0100
committer: skal <pascal.massimino@gmail.com> 2026-03-26 07:03:01 +0100
commit: 8f14bdd66cb002b2f89265b2a578ad93249089c9
tree: 2ccdb3939b673ebc3a5df429160631240239cee2 /cnn_v3/docs/HOW_TO_CNN.md
parent: 4ca498277b033ae10134045dae9c8c249a8d2b2b
feat(cnn_v3): upgrade architecture to enc_channels=[8,16]
Double encoder capacity: enc0 4→8ch, enc1 8→16ch, bottleneck 16→16ch, dec1 32→8ch, dec0 16→4ch.
Total weights 2476→7828 f16 (~15.3 KB).
FiLM MLP output 40→72 params (L1: 16×40→16×72).
16-ch textures split into _lo/_hi rgba32uint pairs (enc1, bottleneck).
enc0 and dec1 textures changed from rgba16float to rgba32uint (8ch).
GBUF_RGBA32UINT node gains CopySrc for parity test readback.

- WGSL shaders: all 5 passes rewritten for new channel counts
- C++ CNNv3Effect: new weight offsets/sizes, 8ch uniform structs
- Web tool (shaders.js + tester.js): matching texture formats and bindings
- Parity test: readback_rgba32uint_8ch helper, updated vector counts
- Training scripts: default enc_channels=[8,16], updated docstrings
- Docs + architecture PNG regenerated

handoff(Gemini): CNN v3 [8,16] upgrade complete. All code, tests, web tool, training scripts, and docs updated. Next: run training pass.
Diffstat (limited to 'cnn_v3/docs/HOW_TO_CNN.md')
-rw-r--r-- cnn_v3/docs/HOW_TO_CNN.md | 32
1 files changed, 16 insertions, 16 deletions
diff --git a/cnn_v3/docs/HOW_TO_CNN.md b/cnn_v3/docs/HOW_TO_CNN.md
index 09db97c..11ed260 100644
--- a/cnn_v3/docs/HOW_TO_CNN.md
+++ b/cnn_v3/docs/HOW_TO_CNN.md
@@ -358,7 +358,7 @@ uv run train_cnn_v3.py \
The model prints its parameter count:
```
-Model: enc=[4, 8] film_cond_dim=5 params=3252 (~6.4 KB f16)
+Model: enc=[8, 16] film_cond_dim=5 params=9148 (~17.9 KB f16)
```
If `params` is much higher, `--enc-channels` was changed; update C++ constants accordingly.
@@ -492,12 +492,12 @@ WEIGHTS_CNN_V3_FILM_MLP, BINARY, weights/cnn_v3_film_mlp.bin, "CNN v3 FiLM MLP w
| Layer | f16 count | Bytes |
|-------|-----------|-------|
-| enc0 Conv(20→4,3×3)+bias | 724 | — |
-| enc1 Conv(4→8,3×3)+bias | 296 | — |
-| bottleneck Conv(8→8,3×3,dil=2)+bias | 584 | — |
-| dec1 Conv(16→4,3×3)+bias | 580 | — |
-| dec0 Conv(8→4,3×3)+bias | 292 | — |
-| **Total** | **2476 f16** | **4952 bytes** |
+| enc0 Conv(20→8,3×3)+bias | 1448 | — |
+| enc1 Conv(8→16,3×3)+bias | 1168 | — |
+| bottleneck Conv(16→16,3×3,dil=2)+bias | 2320 | — |
+| dec1 Conv(32→8,3×3)+bias | 2312 | — |
+| dec0 Conv(16→4,3×3)+bias | 580 | — |
+| **Total** | **7828 f16** | **15656 bytes** |
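The f16 counts in the updated table can be re-derived from the layer shapes alone. The sketch below is an illustrative check, not code from the repository: `conv_param_count` and `LAYERS` are hypothetical names, and it assumes each layer stores `in × out × 3 × 3` weights plus one bias per output channel, which is what the `Conv(in→out,3×3)+bias` labels suggest.

```python
def conv_param_count(in_ch, out_ch, kernel=3):
    """Kernel weights (in_ch * out_ch * kernel * kernel) plus one bias per output channel."""
    return in_ch * out_ch * kernel * kernel + out_ch

# (name, in_channels, out_channels) copied from the "+" rows of the table.
LAYERS = [
    ("enc0", 20, 8),
    ("enc1", 8, 16),
    ("bottleneck", 16, 16),  # dilation changes the sampling pattern, not the weight count
    ("dec1", 32, 8),
    ("dec0", 16, 4),
]

counts = {name: conv_param_count(i, o) for name, i, o in LAYERS}
total_f16 = sum(counts.values())
total_bytes = total_f16 * 2  # one f16 value occupies 2 bytes

for name, n in counts.items():
    print(f"{name:<10} {n:>5} f16")
print(f"total      {total_f16:>5} f16 = {total_bytes} bytes")
```

Running the same function on the old `[4, 8]` shapes gives the removed rows, which makes it a quick way to recompute the table if `--enc-channels` changes again.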
**`cnn_v3_film_mlp.bin`** — FiLM MLP weights as raw f32, row-major:
@@ -505,9 +505,9 @@ WEIGHTS_CNN_V3_FILM_MLP, BINARY, weights/cnn_v3_film_mlp.bin, "CNN v3 FiLM MLP w
|-------|-------|-----------|
| L0 weight | (16, 5) | 80 |
| L0 bias | (16,) | 16 |
-| L1 weight | (40, 16) | 640 |
-| L1 bias | (40,) | 40 |
-| **Total** | | **776 f32 = 3104 bytes** |
+| L1 weight | (72, 16) | 1152 |
+| L1 bias | (72,) | 72 |
+| **Total** | | **1320 f32 = 5280 bytes** |
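The FiLM MLP totals follow the same pattern for two dense layers. This is a minimal sketch under the shapes listed in the table; `linear_param_count` and the constant names are hypothetical and do not come from the training script.

```python
def linear_param_count(in_dim, out_dim):
    """Weight matrix of shape (out_dim, in_dim) plus one bias per output."""
    return out_dim * in_dim + out_dim

FILM_COND_DIM = 5  # film_cond_dim, as printed by the training script
HIDDEN = 16        # L0 output width, from the table
FILM_OUT = 72      # L1 output width, from the table

l0 = linear_param_count(FILM_COND_DIM, HIDDEN)
l1 = linear_param_count(HIDDEN, FILM_OUT)
total_f32 = l0 + l1
total_bytes = total_f32 * 4  # one f32 value occupies 4 bytes

UNET_F16 = 7828  # U-Net total from the cnn_v3_weights.bin table
print(f"L0={l0} L1={l1} total={total_f32} f32 = {total_bytes} bytes")
print(f"U-Net + FiLM MLP = {UNET_F16 + total_f32} params")
```

The combined figure of 9148 matches the `params=9148` line that the training script prints for `enc=[8, 16]`.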
The FiLM MLP is for CPU-side inference (future — see §4d). The U-Net weights in
`cnn_v3_weights.bin` are what you need immediately.
@@ -524,16 +524,16 @@ The export script produces this layout: `u32 = u16[0::2] | (u16[1::2] << 16)`.
```
Checkpoint: epoch=200 loss=0.012345
- enc_channels=[4, 8] film_cond_dim=5
+ enc_channels=[8, 16] film_cond_dim=5
cnn_v3_weights.bin
- 2476 f16 values → 1238 u32 → 4952 bytes
- Upload via CNNv3Effect::upload_weights(queue, data, 4952)
+ 7828 f16 values → 3914 u32 → 15656 bytes
+ Upload via CNNv3Effect::upload_weights(queue, data, 15656)
cnn_v3_film_mlp.bin
L0: weight (16, 5) + bias (16,)
- L1: weight (40, 16) + bias (40,)
- 776 f32 values → 3104 bytes
+ L1: weight (72, 16) + bias (72,)
+ 1320 f32 values → 5280 bytes
```
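The `u32 = u16[0::2] | (u16[1::2] << 16)` layout and the `7828 f16 → 3914 u32 → 15656 bytes` line can be illustrated with the standard library alone. This is a sketch of the packing rule, not the export script itself: `pack_f16_pairs` is a hypothetical helper, little-endian byte order is assumed, and zero-padding an odd-length input is an assumption the source does not confirm.

```python
import struct

def pack_f16_pairs(values):
    """Pack floats as f16 pairs into u32 words: even index in the low 16 bits, odd in the high."""
    vals = list(values)
    if len(vals) % 2:
        vals.append(0.0)  # assumption: pad odd-length input with a zero
    n = len(vals)
    # Encode as IEEE 754 half floats, then reinterpret the same bytes as u16.
    u16 = struct.unpack(f"<{n}H", struct.pack(f"<{n}e", *vals))
    return [lo | (hi << 16) for lo, hi in zip(u16[0::2], u16[1::2])]

# 1.0 is 0x3C00 and 2.0 is 0x4000 in f16, so the pair packs to 0x40003C00.
example = pack_f16_pairs([1.0, 2.0])
print(hex(example[0]))

# Size check against the export output above: 7828 f16 values.
words = pack_f16_pairs([0.0] * 7828)
blob = struct.pack(f"<{len(words)}I", *words)
print(len(words), "u32 ->", len(blob), "bytes")
```

The byte length of `blob` is the value the export output shows being passed to `CNNv3Effect::upload_weights`.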
### Pitfalls
@@ -542,7 +542,7 @@ cnn_v3_film_mlp.bin
assertion in the export script fires. The C++ weight-offset constants (`kEnc0Weights` etc.)
in `cnn_v3_effect.cc` must also be updated to match.
- **Old checkpoint missing `config`:** if `config` key is absent (checkpoint from a very early
- version), the script defaults to `enc_channels=[4,8], film_cond_dim=5`.
+ version), the script defaults to `enc_channels=[8,16], film_cond_dim=5`.
- **`weights_only=True`:** requires PyTorch ≥ 2.0. If you get a warning, upgrade torch.
---