Diffstat (limited to 'cnn_v3/docs/HOWTO.md')
| -rw-r--r-- | cnn_v3/docs/HOWTO.md | 138 |
1 files changed, 137 insertions, 1 deletions
diff --git a/cnn_v3/docs/HOWTO.md b/cnn_v3/docs/HOWTO.md
index 5cfc371..58f09ed 100644
--- a/cnn_v3/docs/HOWTO.md
+++ b/cnn_v3/docs/HOWTO.md
@@ -587,9 +587,145 @@ Visualization panel still works.
 
 ---
 
-## 10. See Also
+## 10. Python / WGSL Parity Check (infer_cnn_v3 + cnn_test)
+
+Two complementary tools for comparing PyTorch inference against the live WGSL
+compute shaders on the same input image.
+
+### 10a. infer_cnn_v3.py — PyTorch reference inference
+
+**Location:** `cnn_v3/training/infer_cnn_v3.py`
+
+Runs the trained `CNNv3` model in Python and saves the RGBA output as PNG.
+
+**Simple mode** (single PNG, geometry zeroed):
+```bash
+cd cnn_v3/training
+python3 infer_cnn_v3.py photo.png out_python.png \
+    --checkpoint checkpoints/checkpoint_epoch_200.pth
+```
+
+**Full mode** (sample directory with all G-buffer files):
+```bash
+python3 infer_cnn_v3.py dataset/simple/sample_000/ out_python.png \
+    --checkpoint checkpoints/checkpoint_epoch_200.pth
+```
+
+**Identity FiLM** — bypass MLP, use γ=1 β=0 (matches C++ `cnn_test` default):
+```bash
+python3 infer_cnn_v3.py photo.png out_python.png \
+    --checkpoint checkpoints/checkpoint_epoch_200.pth \
+    --identity-film
+```
+
+**Options:**
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--checkpoint CKPT` | auto-find latest | Path to `.pth` checkpoint |
+| `--enc-channels C` | from checkpoint | `4,8` — must match training config |
+| `--cond F F F F F` | `0 0 0 0 0` | FiLM conditioning (beat_phase, beat_norm, audio, style0, style1) |
+| `--identity-film` | off | Bypass FiLM MLP, use γ=1 β=0 |
+| `--blend F` | `1.0` | Blend with albedo: 0=input, 1=CNN |
+| `--debug-hex` | off | Print first 8 output pixels as hex |
+
+In **simple mode**, geometry channels are zeroed: `normal=(0.5,0.5)` (oct-encodes
+to ≈(0,0,1)), `depth=0`, `matid=0`, `shadow=1`, `transp=0`.
+
+The checkpoint `config` dict (saved by `train_cnn_v3.py`) sets `enc_channels`
+and `film_cond_dim` automatically; `--enc-channels` is only needed if the
+checkpoint lacks a config key.
+
+---
+
+### 10b. cnn_test — WGSL / GPU reference inference
+
+**Location:** `tools/cnn_test.cc` **Binary:** `build/cnn_test`
+
+Packs the same 20-channel feature tensor as `infer_cnn_v3.py`, uploads it to
+GPU, runs the five `CNNv3Effect` compute passes, and saves the RGBA16Float
+output as PNG.
+
+**Build** (requires `DEMO_BUILD_TESTS=ON` or `DEMO_WORKSPACE=main`):
+```bash
+cmake -B build -DDEMO_BUILD_TESTS=ON && cmake --build build -j4 --target cnn_test
+```
+
+**Simple mode:**
+```bash
+./build/cnn_test photo.png out_gpu.png --weights workspaces/main/weights/cnn_v3_weights.bin
+```
+
+**Full mode** (sample directory):
+```bash
+./build/cnn_test dataset/simple/sample_000/albedo.png out_gpu.png \
+    --sample-dir dataset/simple/sample_000/ \
+    --weights workspaces/main/weights/cnn_v3_weights.bin
+```
+
+**Options:**
+
+| Flag | Description |
+|------|-------------|
+| `--sample-dir DIR` | Load all G-buffer files (albedo/normal/depth/matid/shadow/transp) |
+| `--weights FILE` | `cnn_v3_weights.bin` (uses asset-embedded weights if omitted) |
+| `--debug-hex` | Print first 8 output pixels as hex |
+| `--help` | Show usage |
+
+FiLM is always **identity** (γ=1, β=0) — matching the C++ `CNNv3Effect` default
+until GPU-side FiLM MLP evaluation is added.
+
+---
+
+### 10c. Side-by-side comparison
+
+For a pixel-accurate comparison, use `--identity-film` in Python and `--debug-hex`
+in both tools:
+
+```bash
+cd cnn_v3/training
+
+# 1. Python inference (identity FiLM)
+python3 infer_cnn_v3.py photo.png out_python.png \
+    --checkpoint checkpoints/checkpoint_epoch_200.pth \
+    --identity-film --debug-hex
+
+# 2. GPU inference (always identity FiLM)
+./build/cnn_test photo.png out_gpu.png \
+    --weights workspaces/main/weights/cnn_v3_weights.bin \
+    --debug-hex
+```
+
+Both tools print first 8 pixels in the same format:
+```
+ [0] 0x7F804000 (0.4980 0.5020 0.2510 0.0000)
+```
+
+**Expected delta:** ≤ 1/255 (≈ 4e-3) per channel, matching the parity test
+(`test_cnn_v3_parity`). Larger deltas indicate a weight mismatch — re-export
+with `export_cnn_v3_weights.py` and verify the `.bin` size is 3928 bytes.
+
+---
+
+### 10d. Feature format note
+
+Both tools pack features in **training format** ([0,1] oct-encoded normals),
+not the runtime `gbuf_pack.wgsl` format (which remaps normals to [-1,1]).
+This makes `infer_cnn_v3.py` ↔ `cnn_test` directly comparable.
+
+The live pipeline (`GBufferEffect → gbuf_pack.wgsl → CNNv3Effect`) uses [-1,1]
+normals — that is the intended inference distribution after a full training run
+with `--input-mode full` (Blender renders). For training on photos
+(`--input-mode simple`), [0,1] normals are correct since channel dropout
+teaches the network to handle absent geometry.
+
+---
+
+## 11. See Also
 
 - `cnn_v3/docs/CNN_V3.md` — Full architecture design (U-Net, FiLM, feature layout)
 - `doc/EFFECT_WORKFLOW.md` — General effect integration guide
 - `cnn_v2/docs/CNN_V2.md` — Reference implementation (simpler, operational)
 - `src/tests/gpu/test_demo_effects.cc` — GBufferEffect + GBufViewEffect tests
+- `src/tests/gpu/test_cnn_v3_parity.cc` — Zero/random weight parity tests
+- `cnn_v3/training/export_cnn_v3_weights.py` — Export trained checkpoint → `.bin`
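
Section 10c of the added text states that both tools print the first 8 pixels in the same `--debug-hex` format and that the expected delta is at most 1/255 per channel. The following is a minimal sketch of how that check could be automated on captured output. The helper names `parse_debug_hex` and `max_channel_delta` are hypothetical and not part of the repository. The byte order (R in the top byte, A in the bottom byte) is an assumption inferred only from the sample line `0x7F804000 (0.4980 0.5020 0.2510 0.0000)`.

```python
import re

# Matches lines shaped like the sample in section 10c:
#   [0] 0x7F804000 (0.4980 0.5020 0.2510 0.0000)
HEX_LINE = re.compile(r"\[(\d+)\]\s+0x([0-9A-Fa-f]{8})")


def parse_debug_hex(text):
    """Return {pixel_index: (r, g, b, a)} with 8-bit channel values."""
    pixels = {}
    for match in HEX_LINE.finditer(text):
        index = int(match.group(1))
        packed = int(match.group(2), 16)
        # Assumed layout: R in the top byte, A in the bottom byte.
        pixels[index] = tuple((packed >> shift) & 0xFF for shift in (24, 16, 8, 0))
    return pixels


def max_channel_delta(python_log, gpu_log):
    """Largest absolute 8-bit difference over pixels present in both logs."""
    a = parse_debug_hex(python_log)
    b = parse_debug_hex(gpu_log)
    shared = sorted(a.keys() & b.keys())
    if not shared:
        raise ValueError("no pixel indices in common")
    return max(abs(x - y) for i in shared for x, y in zip(a[i], b[i]))
```

A result of 0 or 1 would correspond to the stated tolerance of 1/255; anything larger would point at the weight mismatch the text describes. To produce the two logs, redirect each tool's stdout to a file. Since the 10c snippet changes into `cnn_v3/training` first while 10b builds into `build/`, the `cnn_test` and weights paths may need adjusting to match the directory the command is run from.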

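
Section 10a says that the simple-mode default `normal=(0.5,0.5)` oct-encodes to approximately (0,0,1), and section 10d distinguishes [0,1] training-format normals from the [-1,1] runtime format. The sketch below illustrates both points using the commonly used octahedral decoding scheme. Whether `gbuf_pack.wgsl` and `infer_cnn_v3.py` use exactly this variant is an assumption, and `oct_decode` is a hypothetical name used only for illustration.

```python
import math


def oct_decode(u, v):
    """Decode a [0,1]-range octahedral pair into a unit normal (x, y, z)."""
    # Remap [0,1] -> [-1,1]; this is the same remap 10d attributes to the runtime format.
    fx, fy = u * 2.0 - 1.0, v * 2.0 - 1.0
    z = 1.0 - abs(fx) - abs(fy)
    if z < 0.0:
        # Lower hemisphere: unfold the outer triangles of the octahedron.
        fx, fy = ((1.0 - abs(fy)) * math.copysign(1.0, fx),
                  (1.0 - abs(fx)) * math.copysign(1.0, fy))
    length = math.sqrt(fx * fx + fy * fy + z * z)
    return (fx / length, fy / length, z / length)
```

Under this scheme `oct_decode(0.5, 0.5)` gives a normal pointing straight along +Z, which is consistent with the neutral geometry default described for simple mode.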