Diffstat (limited to 'cnn_v3/docs/HOWTO.md')

 cnn_v3/docs/HOWTO.md | 61 +++++++++++++++++++++++++++++++++++---------------
 1 file changed, 46 insertions(+), 15 deletions(-)
diff --git a/cnn_v3/docs/HOWTO.md b/cnn_v3/docs/HOWTO.md
index 5c5cc2a..5cfc371 100644
--- a/cnn_v3/docs/HOWTO.md
+++ b/cnn_v3/docs/HOWTO.md
@@ -79,7 +79,7 @@ Each frame, `GBufferEffect::render()` executes:
3. **Pass 3 — Transparency** — TODO (deferred; transp=0 for opaque scenes)
4. **Pass 4 — Pack compute** (`gbuf_pack.wgsl`) ✅
- - Reads all G-buffer textures + `prev_cnn` input
+ - Reads all G-buffer textures + persistent `prev_cnn` texture
- Writes `feat_tex0` + `feat_tex1` (rgba32uint, 20 channels, 32 bytes/pixel)
- Shadow / transp nodes cleared to 1.0 / 0.0 via zero-draw render passes
until Pass 2/3 are implemented.
@@ -90,9 +90,38 @@ Outputs are named from the `outputs` vector passed to the constructor:
```
outputs[0] → feat_tex0 (rgba32uint: albedo.rgb, normal.xy, depth, depth_grad.xy)
-outputs[1] → feat_tex1 (rgba32uint: mat_id, prev.rgb, mip1.rgb, mip2.rgb, shadow, transp)
+outputs[1] → feat_tex1 (rgba32uint: mat_id, prev.rgb, mip1.rgb, mip2.rgb, dif, transp)
```
+### Temporal feedback (prev.rgb)
+
+`GBufferEffect` owns a persistent internal node `<prefix>_prev` (`F16X8` = Rgba16Float,
+`CopySrc|CopyDst`). Each frame it is GPU-copied from the CNN effect's output after all
+effects render (`post_render`), then bound as `prev_cnn` in the pack shader (binding 6).
+
+**Wiring is automatic** via `wire_dag()`, called by `Sequence::init_effect_nodes()`.
+`GBufferEffect` scans the DAG for the first downstream consumer of its output nodes and
+uses that effect's output as `cnn_output_node_`. No manual call needed.
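The scan can be sketched as below. `EffectDecl` and `find_cnn_output` are illustrative stand-ins for the real DAG types in `Sequence`, not the actual API (the real code also skips the G-buffer effect itself):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Illustrative stand-in for a sequence DAG entry (not the real type).
struct EffectDecl {
    std::string name;
    std::vector<std::string> inputs;
    std::vector<std::string> outputs;
};

// Walk effects in declaration order; the first one consuming any of the
// G-buffer output nodes is treated as the downstream CNN, and its first
// output node becomes the source for the prev_cnn copy.
std::string find_cnn_output(const std::vector<EffectDecl>& dag,
                            const std::vector<std::string>& gbuf_outputs) {
    for (const auto& fx : dag)
        for (const auto& in : fx.inputs)
            if (std::find(gbuf_outputs.begin(), gbuf_outputs.end(), in)
                    != gbuf_outputs.end())
                return fx.outputs.empty() ? std::string{} : fx.outputs[0];
    return {};  // no consumer: cnn_output_node_ stays empty, post_render no-ops
}
```

For the timeline below, scanning the two declared effects for consumers of `gbuf_feat0`/`gbuf_feat1` yields `cnn_out`.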
+
+**Requirement**: the sequence must include a `CNNv3Effect` downstream of `GBufferEffect`.
+In `timeline.seq`, declare an output node with the `gbuf_albedo` format and add both effects:
+
+```seq
+NODE cnn_out gbuf_albedo
+EFFECT + GBufferEffect source -> gbuf_feat0 gbuf_feat1 0 60
+EFFECT + CNNv3Effect gbuf_feat0 gbuf_feat1 -> cnn_out 0 60
+```
+
+If no CNN effect follows, `cnn_output_node_` stays empty and `post_render` is a no-op
+(prev.rgb will be zero — correct for static/debug-only sequences).
+
+Frame 0 behaviour: `_prev` is zeroed on allocation → `prev.rgb = 0`, matching the training
+convention (static frames use zero history).
+
+The copy uses `wgpuCommandEncoderCopyTextureToTexture` (no extra render pass overhead).
+`node_prev_tex_` is `F16X8` (Rgba16Float) to match the `GBUF_ALBEDO` format of CNNv3Effect's
+output — `CopyTextureToTexture` requires identical formats.
+
---
## 1b. GBufferEffect — Implementation Plan (Pass 2: SDF Shadow)
@@ -285,7 +314,7 @@ python3 train_cnn_v3.py \
Applied per-sample in `cnn_v3_utils.apply_channel_dropout()`:
- Geometric channels (normal, depth, depth_grad) zeroed with `p=channel_dropout_p`
-- Context channels (mat_id, shadow, transp) with `p≈0.2`
+- Context channels (mat_id, dif, transp) with `p≈0.2`
- Temporal channels (prev.rgb) with `p=0.5`
This ensures the network works for both full G-buffer and photo-only inputs.
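The same idea in a C++ sketch (the real implementation is the Python `apply_channel_dropout`). The channel indices are assumptions based on the 20-channel layout above, not the training code's exact grouping, and the sketch uses one coin flip per group per sample:

```cpp
#include <initializer_list>
#include <random>
#include <vector>

// Per-sample channel dropout sketch. Index groups are illustrative,
// mapped onto the assumed 20-channel feature layout.
void apply_channel_dropout(std::vector<float>& feat, std::mt19937& rng,
                           float p_geom, float p_ctx, float p_temporal) {
    std::uniform_real_distribution<float> u(0.0f, 1.0f);
    auto drop_group = [&](std::initializer_list<int> idx, float p) {
        if (u(rng) < p)                       // one flip per group, per sample
            for (int i : idx) feat[i] = 0.0f;
    };
    drop_group({3, 4, 5, 6, 7},  p_geom);      // normal.xy, depth, depth_grad.xy
    drop_group({8, 18, 19},      p_ctx);       // mat_id, dif, transp
    drop_group({9, 10, 11},      p_temporal);  // prev.rgb
}
```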
@@ -299,10 +328,12 @@ This ensures the network works for both full G-buffer and photo-only inputs.
```seq
# BPM 120
SEQUENCE 0 0 "Scene with CNN v3"
- EFFECT + GBufferEffect prev_cnn -> gbuf_feat0 gbuf_feat1 0 60
- EFFECT + CNNv3Effect gbuf_feat0 gbuf_feat1 -> sink 0 60
+ EFFECT + GBufferEffect source -> gbuf_feat0 gbuf_feat1 0 60
+ EFFECT + CNNv3Effect gbuf_feat0 gbuf_feat1 -> sink 0 60
```
+Temporal feedback is wired automatically by `wire_dag()` — no manual call needed.
+
FiLM parameters uploaded each frame:
```cpp
cnn_v3_effect->set_film_params(
@@ -455,15 +486,15 @@ GBufViewEffect(const GpuContext& ctx,
float start_time, float end_time)
```
-**Wiring example** (alongside GBufferEffect):
+**Wiring example** — declare everything in `timeline.seq`; temporal feedback is wired automatically:
-```cpp
-auto gbuf = std::make_shared<GBufferEffect>(ctx,
- std::vector<std::string>{"prev_cnn"},
- std::vector<std::string>{"gbuf_feat0", "gbuf_feat1"}, 0.0f, 60.0f);
-auto gview = std::make_shared<GBufViewEffect>(ctx,
- std::vector<std::string>{"gbuf_feat0", "gbuf_feat1"},
- std::vector<std::string>{"gbuf_view_out"}, 0.0f, 60.0f);
+```seq
+NODE gbuf_feat0 gbuf_rgba32uint
+NODE gbuf_feat1 gbuf_rgba32uint
+NODE cnn_out gbuf_albedo
+EFFECT + GBufferEffect source -> gbuf_feat0 gbuf_feat1 0 60
+EFFECT + CNNv3Effect gbuf_feat0 gbuf_feat1 -> cnn_out 0 60
+EFFECT + GBufViewEffect gbuf_feat0 gbuf_feat1 -> sink 0 60
```
**Grid layout** (output resolution = input resolution, channel cells each 1/4 W × 1/5 H):
@@ -474,7 +505,7 @@ auto gview = std::make_shared<GBufViewEffect>(ctx,
| 1 | `nrm.y` remap→[0,1] | `depth` (inverted) | `dzdx` ×20+0.5 | `dzdy` ×20+0.5 |
| 2 | `mat_id` | `prev.r` | `prev.g` | `prev.b` |
| 3 | `mip1.r` | `mip1.g` | `mip1.b` | `mip2.r` |
-| 4 | `mip2.g` | `mip2.b` | `shadow` | `transp` |
+| 4 | `mip2.g` | `mip2.b` | `dif` | `transp` |
All channels displayed as grayscale. 1-pixel gray grid lines separate cells. Dark background for out-of-range cells.
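The per-cell remaps from the table reduce to a few scalar helpers; a minimal sketch, assuming the formulas as listed (cell selection is plain integer division):

```cpp
// Which grid cell a pixel falls in: 4 columns, 5 rows.
struct Cell { int row, col; };
Cell cell_at(int x, int y, int width, int height) {
    return { y / (height / 5), x / (width / 4) };
}

// Scalar remaps applied before grayscale display (from the table above).
float remap_normal(float n)   { return n * 0.5f + 0.5f; }   // [-1,1] -> [0,1]
float remap_depth(float d)    { return 1.0f - d; }           // inverted
float remap_gradient(float g) { return g * 20.0f + 0.5f; }   // x20 + 0.5
```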
@@ -535,7 +566,7 @@ No sampler — all reads use `textureLoad()` (integer texel coordinates).
Packs channels identically to `gbuf_pack.wgsl`:
- `feat_tex0`: `pack2x16float(alb.rg)`, `pack2x16float(alb.b, nrm.x)`, `pack2x16float(nrm.y, depth)`, `pack2x16float(dzdx, dzdy)`
-- `feat_tex1`: `pack4x8unorm(matid,0,0,0)`, `pack4x8unorm(mip1.rgb, mip2.r)`, `pack4x8unorm(mip2.gb, shadow, transp)`
+- `feat_tex1`: `pack4x8unorm(matid,0,0,0)`, `pack4x8unorm(mip1.rgb, mip2.r)`, `pack4x8unorm(mip2.gb, dif, transp)`
- Depth gradients: central differences on depth R channel
- Mip1 / Mip2: box2 (2×2) / box4 (4×4) average filter on albedo
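For spot-checking packed textures on the CPU, the two WGSL intrinsics can be mirrored as below. This is a debugging aid, not project code; the half-float conversion truncates the mantissa rather than rounding to nearest, which is close enough for inspection:

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// CPU reference for WGSL pack4x8unorm: clamp each component to [0,1],
// quantize to 8 bits; component i lands in byte i of the result.
uint32_t pack4x8unorm(float a, float b, float c, float d) {
    auto q = [](float v) -> uint32_t {
        v = std::fmin(std::fmax(v, 0.0f), 1.0f);
        return static_cast<uint32_t>(std::lround(v * 255.0f));
    };
    return q(a) | (q(b) << 8) | (q(c) << 16) | (q(d) << 24);
}

// float -> IEEE half (truncating; tiny values flush to zero).
uint32_t f32_to_f16(float f) {
    uint32_t x; std::memcpy(&x, &f, 4);
    uint32_t sign = (x >> 16) & 0x8000u;
    int32_t  exp  = static_cast<int32_t>((x >> 23) & 0xFF) - 127 + 15;
    uint32_t mant = (x >> 13) & 0x3FFu;
    if (exp <= 0)  return sign;            // underflow -> signed zero
    if (exp >= 31) return sign | 0x7C00u;  // overflow -> infinity
    return sign | (static_cast<uint32_t>(exp) << 10) | mant;
}

// CPU reference for WGSL pack2x16float: first component in the low 16 bits.
uint32_t pack2x16float(float a, float b) {
    return f32_to_f16(a) | (f32_to_f16(b) << 16);
}
```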