Diffstat (limited to 'doc')
-rw-r--r--  doc/COMPLETED.md     8
-rw-r--r--  doc/SEQUENCE.md    132
2 files changed, 134 insertions, 6 deletions
diff --git a/doc/COMPLETED.md b/doc/COMPLETED.md
index 072c92f..a3a988c 100644
--- a/doc/COMPLETED.md
+++ b/doc/COMPLETED.md
@@ -36,6 +36,14 @@ Completed task archive. See `doc/archive/` for detailed historical documents.
 
 ## March 2026
 
+- [x] **CNN v3 shadow pass debugging** — Fixed 5 independent bugs in `gbuf_shadow.wgsl` + `gbuffer_effect.cc`:
+  1. **Camera Y-inversion**: `mat4::perspective` negates Y for post-process chain; fixed with `proj.m[5] = -proj.m[5]` in `upload_scene_data` + `WGPUFrontFace_CCW` on raster pipeline.
+  2. **Shadow formula**: replaced `shadowWithStoredDistance` (20 steps, bounded) with 64-step IQ soft shadow (`res = min(res, 8.0*d/t)`, unbounded march).
+  3. **Local→world SDF scale**: `sdBox/sdSphere` return local-space distance; fixed with `d *= length(obj.model[0].xyz)`.
+  4. **Shadow bias**: replaced light-direction bias (fails at terminator) with rasterized surface normal from `normal_mat_tex` (binding 4); `bias_pos = world + nor * 0.05`.
+  5. **ShaderComposer**: `GBufViewEffect` needed `ShaderComposer::Get().Compose()` to resolve `#include "debug/debug_print"`.
+  - Added per-tile labels to `gbuf_view.wgsl` via `debug_str`. Scale propagation for pulsating sphere confirmed correct end-to-end. 36/36 tests.
+
 - [x] **CNN v3 Phase 7: Validation tools** — `GBufViewEffect` (C++ 4×5 channel grid, `cnn_v3/shaders/gbuf_view.wgsl`, `cnn_v3/src/gbuf_view_effect.{h,cc}`): renders all 20 G-buffer feature channels tiled on screen; custom BGL with `WGPUTextureSampleType_Uint`, bind group rebuilt per frame via `wgpuRenderPipelineGetBindGroupLayout`. Web tool "Load sample directory" (`cnn_v3/tools/tester.js` + `shaders.js`): `webkitdirectory` picker, `FULL_PACK_SHADER` compute (matches `gbuf_pack.wgsl`), `runFromFeat()` inference, PSNR vs `target.png`. 36/36 tests.
 
 - [x] **CNN v3 Phase 5: Parity validation** — `test_cnn_v3_parity.cc` (2 tests: zero_weights, random_weights). Root cause: intermediate nodes declared at full res instead of W/2, W/4. Fix: `NodeRegistry::default_width()/default_height()` getters + fractional resolution in `declare_nodes()`. Final max_err=4.88e-4 ✓. 36/36 tests.
diff --git a/doc/SEQUENCE.md b/doc/SEQUENCE.md
index 202bf09..3d7a6ce 100644
--- a/doc/SEQUENCE.md
+++ b/doc/SEQUENCE.md
@@ -91,21 +91,141 @@ class Effect {
   std::vector<std::string> input_nodes_;
   std::vector<std::string> output_nodes_;
 
-  virtual void declare_nodes(NodeRegistry& registry) {}  // Optional temp nodes
+  // Optional: declare internal nodes (depth buffers, intermediate textures).
+  virtual void declare_nodes(NodeRegistry& registry) {}
+
+  // Required: render this effect for the current frame.
   virtual void render(WGPUCommandEncoder encoder,
                       const UniformsSequenceParams& params,
                       NodeRegistry& nodes) = 0;
+
+  // Optional: called after ALL effects in the sequence have rendered.
+  // Use for end-of-frame bookkeeping, e.g. copying temporal feedback buffers.
+  // Default implementation is a no-op.
+  virtual void post_render(WGPUCommandEncoder encoder, NodeRegistry& nodes) {}
 };
 ```
 
+### Frame execution order
+
+Each frame, `Sequence::render_effects()` runs two passes over the DAG:
+
+1. **Render pass** — `dispatch_render()` on every effect in topological order
+2. **Post-render pass** — `post_render()` on every effect in the same order
+
+This ordering guarantees that by the time any `post_render()` runs, all output
+textures for the frame are fully written. It is safe to read any node's texture
+in `post_render()`.
+
+### Temporal feedback pattern
+
+DAG-based sequences cannot express read-after-write cycles within a single frame.
+Use `post_render()` + a persistent internal node to implement temporal feedback
+(e.g. CNN prev-frame input):
+
+```cpp
+class MyEffect : public Effect {
+  std::string node_prev_;    // internal persistent texture
+  std::string source_node_;  // node to capture at end of frame
+
+ public:
+  void set_source_node(const std::string& n) { source_node_ = n; }
+
+  void declare_nodes(NodeRegistry& reg) override {
+    // Use a NodeType whose format matches source_node_ and has CopyDst.
+    reg.declare_node(node_prev_, NodeType::F16X8, -1, -1);
+  }
+
+  void render(...) override {
+    // Read node_prev_ — contains source_node_ output from the *previous* frame.
+    WGPUTextureView prev = nodes.get_view(node_prev_);
+    // ... use prev
+  }
+
+  void post_render(WGPUCommandEncoder enc, NodeRegistry& nodes) override {
+    if (source_node_.empty() || !nodes.has_node(source_node_)) return;
+    // Copy this frame's output into node_prev_ for next frame.
+    WGPUTexelCopyTextureInfo src = {.texture = nodes.get_texture(source_node_)};
+    WGPUTexelCopyTextureInfo dst = {.texture = nodes.get_texture(node_prev_)};
+    WGPUExtent3D ext = {(uint32_t)width_, (uint32_t)height_, 1};
+    wgpuCommandEncoderCopyTextureToTexture(enc, &src, &dst, &ext);
+  }
+};
+```
+
+**Why not `input_nodes_[0]` / ping-pong as prev?** The ping-pong alias makes
+`source` equal to last frame's `sink` only when the effect is the first in the
+sequence and no post-CNN effects overwrite `sink`. `post_render()` is
+unconditionally correct regardless of sequence structure.
+
+**Current user**: `GBufferEffect` uses this pattern for `prev.rgb` (CNN temporal
+feedback). `cnn_output_node_` is wired automatically via `wire_dag()` — no
+manual `set_cnn_output_node()` call needed.
+
+### DAG wiring (`wire_dag`)
+
+```cpp
+// Effect base class
+virtual void wire_dag(const std::vector<EffectDAGNode>& dag) {}
+```
+
+Called once from `Sequence::init_effect_nodes()` after all `declare_nodes()`
+calls, so the full DAG is visible. Override to resolve inter-effect
+dependencies that cannot be expressed through node names alone.
+
+`GBufferEffect::wire_dag()` delegates to the base-class helper
+`find_downstream_output(dag)`, then guards against wiring to `"sink"`:
+
+```cpp
+void GBufferEffect::wire_dag(const std::vector<EffectDAGNode>& dag) {
+  const std::string out = find_downstream_output(dag);
+  if (out != "sink") cnn_output_node_ = out;
+}
+```
+
+`"sink"` is registered as an external view (`texture == nullptr`); copying
+from it in `post_render` would crash. When no CNN follows the G-buffer stage
+(e.g. debug/deferred sequences), `cnn_output_node_` stays empty and
+`post_render` is a no-op.
+
+#### `Effect::find_downstream_output`
+
+```cpp
+// protected helper — call from wire_dag()
+std::string find_downstream_output(const std::vector<EffectDAGNode>& dag) const;
+```
+
+Returns `output_nodes[0]` of the first direct downstream consumer in the DAG,
+or `""` if none exists. The helper is agnostic about node semantics — it is
+the **caller's responsibility** to reject unsuitable results (e.g. `"sink"` or
+any other external/terminal node whose texture is not owned by the registry).
+
+`post_render` also null-checks the source texture as a belt-and-suspenders
+guard:
+
+```cpp
+WGPUTexture src_tex = nodes.get_texture(cnn_output_node_);
+if (!src_tex) return;  // external view — no owned texture to copy
+```
+
 ### Node System
 
 **Types**: Match WGSL texture formats
-- `U8X4_NORM`: RGBA8Unorm (default for source/sink/intermediate)
-- `F32X4`: RGBA32Float (HDR, compute outputs)
-- `F16X8`: 8-channel float16 (G-buffer normals/vectors)
-- `DEPTH24`: Depth24Plus (3D rendering)
-- `COMPUTE_F32`: Storage buffer (non-texture compute data)
+- `U8X4_NORM`: RGBA8Unorm — default for source/sink/intermediate; `COPY_SRC|COPY_DST`
+- `F32X4`: RGBA32Float — HDR, compute outputs
+- `F16X8`: 8-channel float16 — G-buffer normals/vectors
+- `DEPTH24`: Depth24Plus — 3D rendering
+- `COMPUTE_F32`: Storage buffer — non-texture compute data
+- `GBUF_ALBEDO`: RGBA16Float — G-buffer albedo/normal MRT; `RENDER_ATTACHMENT|TEXTURE_BINDING|STORAGE_BINDING|COPY_SRC`
+- `GBUF_DEPTH32`: Depth32Float — G-buffer depth; `RENDER_ATTACHMENT|TEXTURE_BINDING|COPY_SRC`
+- `GBUF_R8`: RGBA8Unorm — G-buffer single-channel (shadow, transp); `STORAGE_BINDING|TEXTURE_BINDING|RENDER_ATTACHMENT`
+- `GBUF_RGBA32UINT`: RGBA32Uint — packed feature textures (CNN v3 feat_tex0/1); `STORAGE_BINDING|TEXTURE_BINDING`
+
+**`COPY_SRC|COPY_DST`** is required on any node used with `wgpuCommandEncoderCopyTextureToTexture`.
+The `node_prev_` format **must match** the source texture format exactly —
+`CopyTextureToTexture` requires identical formats. `F16X8` (Rgba16Float,
+`CopySrc|CopyDst`) matches `GBUF_ALBEDO` (CNNv3Effect output). Use `U8X4_NORM`
+only when the source is also Rgba8Unorm.
 
 **Aliasing**: Compiler detects ping-pong patterns (Effect i writes A reads B, Effect i+1 writes B reads A) and aliases nodes to same backing texture.
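
The `wire_dag` section of the diff describes `find_downstream_output` only by its contract: return `output_nodes[0]` of the first direct downstream consumer, or `""` if there is none, and leave the `"sink"` check to the caller. The sketch below is one possible reading of that contract, not the project's implementation. The `EffectDAGNode` layout (two string vectors), the extra `my_outputs` parameter, and the `pick_feedback_source` wrapper are all assumptions made so the example compiles without the engine.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the project's EffectDAGNode. The real struct is
// not shown in the diff and may carry more fields.
struct EffectDAGNode {
  std::vector<std::string> input_nodes;
  std::vector<std::string> output_nodes;
};

// Assumed free-function form of the helper: returns output_nodes[0] of the
// first DAG entry that reads one of `my_outputs`, or "" if no entry does.
std::string find_downstream_output(const std::vector<EffectDAGNode>& dag,
                                   const std::vector<std::string>& my_outputs) {
  for (const EffectDAGNode& node : dag) {
    for (const std::string& in : node.input_nodes) {
      const bool consumes = std::find(my_outputs.begin(), my_outputs.end(),
                                      in) != my_outputs.end();
      if (consumes && !node.output_nodes.empty()) return node.output_nodes[0];
    }
  }
  return "";
}

// Caller-side guard (hypothetical wrapper): "sink" is an external view with
// no registry-owned texture, so it is rejected as a copy source.
std::string pick_feedback_source(const std::vector<EffectDAGNode>& dag,
                                 const std::vector<std::string>& my_outputs) {
  const std::string out = find_downstream_output(dag, my_outputs);
  return out == "sink" ? std::string() : out;
}
```

With a G-buffer stage followed by a CNN stage, the wrapper yields the CNN output node. With a G-buffer stage followed directly by a stage that writes `"sink"`, it yields an empty string, which corresponds to the no-op `post_render` case described in the diff.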

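
The "Frame execution order" section added by this diff states that every effect renders before any `post_render()` runs. The GPU-free sketch below illustrates that two-pass ordering under stated assumptions: `TraceEffect` and this `render_effects` signature are hypothetical stand-ins that only record call order, and the input vector is assumed to be already sorted topologically.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for Effect: records calls instead of encoding GPU work.
struct TraceEffect {
  std::string name;
  void render(std::vector<std::string>& log) const {
    log.push_back("render:" + name);
  }
  void post_render(std::vector<std::string>& log) const {
    log.push_back("post_render:" + name);
  }
};

// Two passes over effects that are assumed to be in topological order.
std::vector<std::string> render_effects(const std::vector<TraceEffect>& sorted) {
  std::vector<std::string> log;
  for (const TraceEffect& e : sorted) e.render(log);       // pass 1: render
  for (const TraceEffect& e : sorted) e.post_render(log);  // pass 2: bookkeeping
  return log;
}
```

Because the second loop starts only after the first has finished, any copy issued from `post_render()` is recorded after every render call for the frame, which is the property the temporal feedback pattern relies on.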