Diffstat (limited to 'cnn_v3/docs/CNN_V3.md')
| -rw-r--r-- | cnn_v3/docs/CNN_V3.md | 8 |
1 files changed, 6 insertions, 2 deletions
diff --git a/cnn_v3/docs/CNN_V3.md b/cnn_v3/docs/CNN_V3.md
index a197a1d..081adf8 100644
--- a/cnn_v3/docs/CNN_V3.md
+++ b/cnn_v3/docs/CNN_V3.md
@@ -19,7 +19,7 @@ CNN v3 is a next-generation post-processing effect using:
 - Training from both Blender renders and real photos
 - Strict test framework: per-pixel bit-exact validation across all implementations
 
-**Status:** Phases 1–7 complete. Architecture upgraded to enc_channels=[8,16] for improved capacity. Parity test and runtime updated. Next: training pass.
+**Status:** Phases 1–9 complete. Architecture upgraded to enc_channels=[8,16]. Two training bugs fixed (dec0 ReLU removed; FiLM MLP loaded at runtime). Parity validated. Next: retrain from scratch with more data.
 
 ---
 
@@ -34,9 +34,13 @@ FiLM is applied **inside each encoder/decoder block**, after each convolution.
 
 ### U-Net Block (per level)
 ```
-input → Conv 3×3 → BN (or none) → FiLM(γ,β) → ReLU → output
+enc0/enc1/dec1: input → Conv 3×3 → FiLM(γ,β) → ReLU → output
+dec0 (final): input → Conv 3×3 → FiLM(γ,β) → Sigmoid → output
 ```
 
+The final decoder layer uses sigmoid directly — **no ReLU** — so the network
+can output the full [0,1] range. ReLU before sigmoid would clamp to [0.5,1.0].
+
 FiLM at level `l`:
 ```
 FiLM(x, γ_l, β_l) = γ_l ⊙ x + β_l (per-channel affine)
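
Reviewer note: the dec0 change in this diff rests on the claim that ReLU followed by sigmoid cannot output values below 0.5. The sketch below checks that numerically with plain Python. It is an illustration only and is not taken from the cnn_v3 code: the helper names (`film`, `relu`, `sigmoid`) and the sample values are assumptions, and a single pixel with three channels stands in for a full feature map.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, maps any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x: float) -> float:
    """Rectifier, maps negative inputs to 0."""
    return max(0.0, x)


def film(x, gamma, beta):
    """Per-channel affine modulation: gamma * x + beta, one pair per channel."""
    return [g * v + b for v, g, b in zip(x, gamma, beta)]


# Hypothetical conv outputs for one pixel (3 channels), identity FiLM parameters.
conv_out = [-4.0, 0.0, 4.0]
gamma = [1.0, 1.0, 1.0]
beta = [0.0, 0.0, 0.0]

modulated = film(conv_out, gamma, beta)

# Old dec0 path: FiLM -> ReLU -> Sigmoid. Negative inputs become 0, and sigmoid(0) = 0.5.
old_dec0 = [sigmoid(relu(v)) for v in modulated]

# New dec0 path: FiLM -> Sigmoid. Negative inputs map below 0.5.
new_dec0 = [sigmoid(v) for v in modulated]

print("old:", [round(v, 3) for v in old_dec0])
print("new:", [round(v, 3) for v in new_dec0])
```

With the ReLU in place, every output is at least 0.5, so dark pixels are unreachable. Without it, the channel with a negative pre-activation falls well below 0.5, which matches the rationale given in the added documentation paragraph.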
