Diffstat (limited to 'cnn_v3/docs/HOWTO.md')

 -rw-r--r--  cnn_v3/docs/HOWTO.md | 61
 1 file changed, 46 insertions(+), 15 deletions(-)
diff --git a/cnn_v3/docs/HOWTO.md b/cnn_v3/docs/HOWTO.md
index 5c5cc2a..5cfc371 100644
--- a/cnn_v3/docs/HOWTO.md
+++ b/cnn_v3/docs/HOWTO.md
@@ -79,7 +79,7 @@ Each frame, `GBufferEffect::render()` executes:
 3. **Pass 3 — Transparency** — TODO (deferred; transp=0 for opaque scenes)
 4. **Pass 4 — Pack compute** (`gbuf_pack.wgsl`) ✅
-   - Reads all G-buffer textures + `prev_cnn` input
+   - Reads all G-buffer textures + persistent `prev_cnn` texture
    - Writes `feat_tex0` + `feat_tex1` (rgba32uint, 20 channels, 32 bytes/pixel)
 - Shadow / transp nodes cleared to 1.0 / 0.0 via zero-draw render passes until Pass 2/3 are implemented.
@@ -90,9 +90,38 @@ Outputs are named from the `outputs` vector passed to the constructor:
 ```
 outputs[0] → feat_tex0 (rgba32uint: albedo.rgb, normal.xy, depth, depth_grad.xy)
-outputs[1] → feat_tex1 (rgba32uint: mat_id, prev.rgb, mip1.rgb, mip2.rgb, shadow, transp)
+outputs[1] → feat_tex1 (rgba32uint: mat_id, prev.rgb, mip1.rgb, mip2.rgb, dif, transp)
 ```
+
+### Temporal feedback (prev.rgb)
+
+`GBufferEffect` owns a persistent internal node `<prefix>_prev` (`F16X8` = Rgba16Float,
+`CopySrc|CopyDst`). Each frame it is GPU-copied from the CNN effect's output after all
+effects render (`post_render`), then bound as `prev_cnn` in the pack shader (binding 6).
+
+**Wiring is automatic** via `wire_dag()`, called by `Sequence::init_effect_nodes()`.
+`GBufferEffect` scans the DAG for the first downstream consumer of its output nodes and
+uses that effect's output as `cnn_output_node_`. No manual call needed.
+
+**Requirement**: the sequence must include `CNNv3Effect` downstream of `GBufferEffect`.
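
> Editor's note: the downstream scan that `wire_dag()` performs can be illustrated with a small sketch. This is Python for brevity (the real implementation is C++), and the `effects` list-of-dicts shape and the `find_cnn_output` helper are purely illustrative, not part of the codebase:

```python
def find_cnn_output(gbuf_outputs, effects):
    """Sketch of the wire_dag() scan: walk effects in sequence order and
    return the output node of the first effect that consumes one of
    GBufferEffect's output nodes (this becomes cnn_output_node_)."""
    for effect in effects:
        if any(inp in gbuf_outputs for inp in effect["inputs"]):
            return effect["outputs"][0]
    return None  # no downstream consumer: post_render stays a no-op


effects = [
    {"name": "CNNv3Effect",
     "inputs": ["gbuf_feat0", "gbuf_feat1"],
     "outputs": ["cnn_out"]},
]
print(find_cnn_output({"gbuf_feat0", "gbuf_feat1"}, effects))  # cnn_out
```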
+In `timeline.seq`, declare a `gbuf_albedo` output node and add the effect:
+
+```seq
+NODE cnn_out gbuf_albedo
+EFFECT + GBufferEffect source -> gbuf_feat0 gbuf_feat1 0 60
+EFFECT + CNNv3Effect gbuf_feat0 gbuf_feat1 -> cnn_out 0 60
+```
+
+If no CNN effect follows, `cnn_output_node_` stays empty and `post_render` is a no-op
+(prev.rgb will be zero — correct for static/debug-only sequences).
+
+Frame 0 behaviour: `_prev` is zeroed on allocation → `prev.rgb = 0`, matching the training
+convention (static frames use zero history).
+
+The copy uses `wgpuCommandEncoderCopyTextureToTexture` (no extra render pass overhead).
+`node_prev_tex_` is `F16X8` (Rgba16Float) to match the `GBUF_ALBEDO` format of CNNv3Effect's
+output — `CopyTextureToTexture` requires identical formats.
+
 ---
 
 ## 1b. GBufferEffect — Implementation Plan (Pass 2: SDF Shadow)
@@ -285,7 +314,7 @@ python3 train_cnn_v3.py \
 Applied per-sample in `cnn_v3_utils.apply_channel_dropout()`:
 
 - Geometric channels (normal, depth, depth_grad) zeroed with `p=channel_dropout_p`
-- Context channels (mat_id, shadow, transp) with `p≈0.2`
+- Context channels (mat_id, dif, transp) with `p≈0.2`
 - Temporal channels (prev.rgb) with `p=0.5`
 
 This ensures the network works for both full G-buffer and photo-only inputs.
@@ -299,10 +328,12 @@ This ensures the network works for both full G-buffer and photo-only inputs.
 ```seq
 # BPM 120
 SEQUENCE 0 0 "Scene with CNN v3"
-  EFFECT + GBufferEffect prev_cnn -> gbuf_feat0 gbuf_feat1 0 60
-  EFFECT + CNNv3Effect gbuf_feat0 gbuf_feat1 -> sink 0 60
+  EFFECT + GBufferEffect source -> gbuf_feat0 gbuf_feat1 0 60
+  EFFECT + CNNv3Effect gbuf_feat0 gbuf_feat1 -> sink 0 60
 ```
+
+Temporal feedback is wired automatically by `wire_dag()` — no manual call needed.
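
> Editor's note: the per-sample channel dropout described above can be sketched as follows. This is an illustrative Python sketch: the real `apply_channel_dropout()` operates on training tensors, and the channel-name dict used here is only for clarity:

```python
import random

# Channel groups of the 20-channel feature layout (names illustrative)
GEOMETRIC = ["normal.x", "normal.y", "depth", "dzdx", "dzdy"]
CONTEXT = ["mat_id", "dif", "transp"]
TEMPORAL = ["prev.r", "prev.g", "prev.b"]


def apply_channel_dropout(sample, channel_dropout_p, rng=random):
    """Sketch of per-sample channel dropout: each group is zeroed as a
    whole with its own probability, so the network learns to cope with
    missing G-buffer data (photo-only inputs)."""
    groups = [
        (GEOMETRIC, channel_dropout_p),
        (CONTEXT, 0.2),
        (TEMPORAL, 0.5),
    ]
    for channels, p in groups:
        if rng.random() < p:
            for ch in channels:
                sample[ch] = 0.0
    return sample
```

Dropping whole groups rather than individual channels matches the inference-time failure modes: a photo-only input is missing *all* geometric channels at once, never just `depth`.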
+
 FiLM parameters uploaded each frame:
 ```cpp
 cnn_v3_effect->set_film_params(
@@ -455,15 +486,15 @@ GBufViewEffect(const GpuContext& ctx, float start_time, float end_time)
 ```
 
-**Wiring example** (alongside GBufferEffect):
+**Wiring example** — use `timeline.seq`; temporal feedback wires automatically:
 
-```cpp
-auto gbuf = std::make_shared<GBufferEffect>(ctx,
-    std::vector<std::string>{"prev_cnn"},
-    std::vector<std::string>{"gbuf_feat0", "gbuf_feat1"}, 0.0f, 60.0f);
-auto gview = std::make_shared<GBufViewEffect>(ctx,
-    std::vector<std::string>{"gbuf_feat0", "gbuf_feat1"},
-    std::vector<std::string>{"gbuf_view_out"}, 0.0f, 60.0f);
+```seq
+NODE gbuf_feat0 gbuf_rgba32uint
+NODE gbuf_feat1 gbuf_rgba32uint
+NODE cnn_out gbuf_albedo
+EFFECT + GBufferEffect source -> gbuf_feat0 gbuf_feat1 0 60
+EFFECT + CNNv3Effect gbuf_feat0 gbuf_feat1 -> cnn_out 0 60
+EFFECT + GBufViewEffect gbuf_feat0 gbuf_feat1 -> sink 0 60
 ```
 
 **Grid layout** (output resolution = input resolution, channel cells each 1/4 W × 1/5 H):
@@ -474,7 +505,7 @@ auto gview = std::make_shared<GBufViewEffect>(ctx,
 | 1 | `nrm.y` remap→[0,1] | `depth` (inverted) | `dzdx` ×20+0.5 | `dzdy` ×20+0.5 |
 | 2 | `mat_id` | `prev.r` | `prev.g` | `prev.b` |
 | 3 | `mip1.r` | `mip1.g` | `mip1.b` | `mip2.r` |
-| 4 | `mip2.g` | `mip2.b` | `shadow` | `transp` |
+| 4 | `mip2.g` | `mip2.b` | `dif` | `transp` |
 
 All channels displayed as grayscale. 1-pixel gray grid lines separate cells. Dark
 background for out-of-range cells.
@@ -535,7 +566,7 @@ No sampler — all reads use `textureLoad()` (integer texel coordinates).
 Packs channels identically to `gbuf_pack.wgsl`:
 
 - `feat_tex0`: `pack2x16float(alb.rg)`, `pack2x16float(alb.b, nrm.x)`, `pack2x16float(nrm.y, depth)`, `pack2x16float(dzdx, dzdy)`
-- `feat_tex1`: `pack4x8unorm(matid,0,0,0)`, `pack4x8unorm(mip1.rgb, mip2.r)`, `pack4x8unorm(mip2.gb, shadow, transp)`
+- `feat_tex1`: `pack4x8unorm(matid,0,0,0)`, `pack4x8unorm(mip1.rgb, mip2.r)`, `pack4x8unorm(mip2.gb, dif, transp)`
 - Depth gradients: central differences on depth R channel
 - Mip1 / Mip2: box2 (2×2) / box4 (4×4) average filter on albedo
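
> Editor's note: as a sanity check on the 32-bytes/pixel layout, the WGSL packing builtins can be emulated on the CPU. A sketch, assuming the feat_tex0 channel order quoted above; `pack_feat0` is a hypothetical helper, not part of the codebase:

```python
import struct


def pack2x16float(a, b):
    # WGSL pack2x16float: two f32 converted to f16 and packed into one
    # u32, with `a` in the low 16 bits ('<2e' = two little-endian halfs)
    return int.from_bytes(struct.pack('<2e', a, b), 'little')


def pack4x8unorm(x, y, z, w):
    # WGSL pack4x8unorm: clamp each component to [0,1], scale to 8 bits,
    # pack with `x` in the low byte
    to_u8 = lambda v: round(max(0.0, min(1.0, v)) * 255.0)
    return int.from_bytes(bytes(to_u8(v) for v in (x, y, z, w)), 'little')


def pack_feat0(alb, nrm, depth, dzdx, dzdy):
    # One feat_tex0 texel: 4 x u32 (rgba32uint) = 16 bytes; feat_tex1
    # supplies the other 16 bytes of the 32-byte, 20-channel layout
    return (pack2x16float(alb[0], alb[1]),
            pack2x16float(alb[2], nrm[0]),
            pack2x16float(nrm[1], depth),
            pack2x16float(dzdx, dzdy))


# f16(1.0) = 0x3C00 (low half), f16(0.5) = 0x3800 (high half)
print(hex(pack2x16float(1.0, 0.5)))  # 0x38003c00
```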
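
> Editor's note: the 4-column × 5-row debug grid of `GBufViewEffect` maps each output pixel to one channel cell with simple integer arithmetic. A sketch; `cell_for_pixel` is a hypothetical helper and the 1920×1080 resolution is only an example:

```python
def cell_for_pixel(x, y, width, height):
    # Map an output pixel to its (row, col) cell in the 4-column x 5-row
    # debug grid; each cell covers 1/4 W x 1/5 H of the output
    col = min(x * 4 // width, 3)
    row = min(y * 5 // height, 4)
    return row, col


print(cell_for_pixel(0, 0, 1920, 1080))        # (0, 0)
print(cell_for_pixel(1919, 1079, 1920, 1080))  # (4, 3)
```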
