 PROJECT_CONTEXT.md                 |   2
 TODO.md                            |  11
 cmake/DemoSourceLists.cmake        |   1
 cnn_v3/README.md                   |  15
 cnn_v3/docs/HOWTO.md               | 138
 cnn_v3/shaders/gbuf_view.wgsl      | 102
 cnn_v3/src/gbuf_view_effect.cc     | 144
 cnn_v3/src/gbuf_view_effect.h      |  25
 cnn_v3/tools/index.html            |  17
 cnn_v3/tools/shaders.js            |  48
 cnn_v3/tools/tester.js             | 277
 doc/COMPLETED.md                   |   2
 src/effects/shaders.cc             |   1
 src/effects/shaders.h              |   1
 src/gpu/demo_effects.h             |   3
 src/tests/gpu/test_demo_effects.cc |   5
 workspaces/main/assets.txt         |   1
 17 files changed, 782 insertions(+), 11 deletions(-)
diff --git a/PROJECT_CONTEXT.md b/PROJECT_CONTEXT.md
index 6219275..3ed265a 100644
--- a/PROJECT_CONTEXT.md
+++ b/PROJECT_CONTEXT.md
@@ -36,7 +36,7 @@
 - **Audio:** Sample-accurate sync. Zero heap allocations per frame. Variable tempo. OLA-IDCT synthesis (v2 .spec): Hann analysis window, rectangular synthesis, 50% overlap, click-free. V1 (raw DCT-512) preserved for generated notes. .spec files regenerated as v2.
 - **Shaders:** Parameterized effects (UniformHelper, .seq syntax). Beat-synchronized animation support (`beat_time`, `beat_phase`). Modular WGSL composition with ShaderComposer. 27 shared common shaders (math, render, compute). Reusable snippets: `render/scratch_lines`, `render/ntsc_common` (NTSC signal processing, RGB and YIQ input variants via `sample_ntsc_signal` hook), `math/color` (YIQ/NTSC), `math/color_c64` (C64 palette, Bayer dither, border animation).
 - **3D:** Hybrid SDF/rasterization with BVH. Binary scene loader. Blender pipeline.
-- **Effects:** CNN post-processing: CNNEffect (v1) and CNNv2Effect operational. CNN v2: sigmoid activation, storage buffer weights (~3.2 KB), 7D static features, dynamic layers. Training stable, convergence validated. **CNN v3 Phases 1–5 complete:** `CNNv3Effect` C++ class (5 compute passes, FiLM uniform upload, identity γ/β defaults). Parity validated: enc0 max_err=1.95e-3, dec1 max_err=1.95e-3, final max_err=4.88e-4 (all ≤1/255). Key fix: intermediate nodes declared at fractional resolutions (W/2, W/4) via `NodeRegistry::default_width()/default_height()`. See `cnn_v3/docs/HOWTO.md`.
+- **Effects:** CNN post-processing: CNNEffect (v1) and CNNv2Effect operational. CNN v2: sigmoid activation, storage buffer weights (~3.2 KB), 7D static features, dynamic layers. Training stable, convergence validated. **CNN v3 Phases 1–7 complete:** `CNNv3Effect` C++ class (5 compute passes, FiLM uniform upload, identity γ/β defaults). Parity validated: max_err=4.88e-4 (≤1/255). Validation tools: `GBufViewEffect` (C++ 4×5 channel grid) + web "Load sample directory" (G-buffer pack → CNN inference → PSNR vs target.png). See `cnn_v3/docs/HOWTO.md` §9.
 - **Tools:** CNN test tool operational. Texture readback utility functional. Timeline editor (web-based, beat-aligned, audio playback).
 - **Build:** Asset dependency tracking. Size measurement. Hot-reload (debug-only). WSL (Windows 10) supported: native Linux build and cross-compile to `.exe` via `mingw-w64`.
 - **Sequence:** DAG-based effect routing with explicit node system. Python compiler with topological sort and ping-pong optimization. 12 effects operational (Passthrough, Placeholder, GaussianBlur, Heptagon, Particles, RotatingCube, Hybrid3D, Flash, PeakMeter, Scene1, Scene2, Scratch). Effect times are absolute (seq_compiler adds sequence start offset). See `doc/SEQUENCE.md`.
@@ -79,9 +79,14 @@ PyTorch / HTML WebGPU / C++ WebGPU.
 5. ✅ Parity validation: test vectors + `test_cnn_v3_parity.cc`. max_err=4.88e-4 (≤1/255).
    - Key fix: intermediate nodes at fractional resolutions (W/2, W/4) via `NodeRegistry::default_width()/default_height()`
-**Next: export + real training run**
-- `train_cnn_v3.py` + `cnn_v3_utils.py` written (Phase 6 training script done)
-- ✅ `export_cnn_v3_weights.py` — convert trained `.pth` → `.bin` (f16)
+6. ✅ Training script: `train_cnn_v3.py` + `cnn_v3_utils.py` written
+   - ✅ `export_cnn_v3_weights.py` — convert trained `.pth` → `.bin` (f16)
+7. ✅ Validation tools:
+   - `GBufViewEffect` — C++ 4×5 channel grid (all 20 G-buffer channels)
+   - Web tool "Load sample directory" — G-buffer pack → CNN inference → PSNR
+   - See `cnn_v3/docs/HOWTO.md` §9
+
+**Next: run a real training pass**
 - See `cnn_v3/docs/HOWTO.md` §3 for training commands
 
 ## Future: CNN v3 "2D Mode" (G-buffer-free)
diff --git a/cmake/DemoSourceLists.cmake b/cmake/DemoSourceLists.cmake
index 742057a..e5c7339 100644
--- a/cmake/DemoSourceLists.cmake
+++ b/cmake/DemoSourceLists.cmake
@@ -42,6 +42,7 @@ set(COMMON_GPU_EFFECTS
   src/effects/scene2_effect.cc
   cnn_v3/src/gbuffer_effect.cc
   cnn_v3/src/cnn_v3_effect.cc
+  cnn_v3/src/gbuf_view_effect.cc
   # TODO: Port CNN effects to v2 (complex v1 dependencies)
   # cnn_v1/src/cnn_v1_effect.cc
   # cnn_v2/src/cnn_v2_effect.cc
diff --git a/cnn_v3/README.md b/cnn_v3/README.md
index f161bf4..a844b1b 100644
--- a/cnn_v3/README.md
+++ b/cnn_v3/README.md
@@ -31,9 +31,18 @@ Add images directly to these directories and commit them.
 
 ## Status
 
-**Phase 1 complete.** G-buffer integrated (raster + pack), 35/35 tests pass.
-Training infrastructure ready. U-Net WGSL shaders are next.
+**Phases 1–7 complete.** 36/36 tests pass.
 
-See `cnn_v3/docs/HOWTO.md` for the practical playbook.
+| Phase | Status |
+|-------|--------|
+| 1 — G-buffer (raster + pack) | ✅ |
+| 2 — Training infrastructure | ✅ |
+| 3 — WGSL U-Net shaders | ✅ |
+| 4 — C++ CNNv3Effect + FiLM | ✅ |
+| 5 — Parity validation | ✅ max_err=4.88e-4 |
+| 6 — Training script | ✅ train_cnn_v3.py |
+| 7 — Validation tools | ✅ GBufViewEffect + web sample loader |
+
+See `cnn_v3/docs/HOWTO.md` for the practical playbook (§9 covers validation tools).
 See `cnn_v3/docs/CNN_V3.md` for full design.
 See `cnn_v2/` for reference implementation.
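A note on the PSNR check referenced above (web tool: G-buffer pack → CNN inference → PSNR vs target.png): the math is a plain MSE-to-PSNR conversion over normalized RGB values, preceded by an f16→float32 decode of the GPU output. Below is a minimal standalone sketch of that math, assuming both images are already plain arrays of [0,1] RGB floats; `halfToFloat` and `computePsnr` are illustrative names for this sketch, not the tool's exact code (the real tool also reads the rgba16float output back through a staging buffer first).

```javascript
// Decode one IEEE 754 binary16 value from its 16-bit integer representation.
function halfToFloat(h) {
  const s = (h & 0x8000) ? -1 : 1;
  const e = (h >> 10) & 0x1f;  // 5-bit exponent
  const m = h & 0x3ff;         // 10-bit mantissa
  if (e === 0) return s * Math.pow(2, -14) * (m / 1024);  // subnormal
  if (e === 31) return m ? NaN : s * Infinity;            // inf / NaN
  return s * Math.pow(2, e - 15) * (1 + m / 1024);
}

// MSE + PSNR over two equal-length arrays of normalized [0,1] RGB floats
// (alpha excluded by the caller). With values in [0,1], MAX^2 = 1, so
// PSNR = 10 * log10(1 / MSE).
function computePsnr(cnnRgb, targetRgb) {
  let se = 0;
  for (let i = 0; i < cnnRgb.length; i++) {
    const d = cnnRgb[i] - targetRgb[i];
    se += d * d;
  }
  const mse = se / cnnRgb.length;
  return { mse, psnr: mse > 0 ? 10 * Math.log10(1 / mse) : Infinity };
}
```

A uniform per-channel error of 0.1 gives MSE = 0.01 and PSNR = 20 dB, a useful sanity value when checking the tool's readout.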
diff --git a/cnn_v3/docs/HOWTO.md b/cnn_v3/docs/HOWTO.md
index 983e8b7..c6f306b 100644
--- a/cnn_v3/docs/HOWTO.md
+++ b/cnn_v3/docs/HOWTO.md
@@ -259,6 +259,8 @@ Test vectors generated by `cnn_v3/training/gen_test_vectors.py` (PyTorch reference)
 | 4 — C++ CNNv3Effect | ✅ Done | FiLM uniform upload, 36/36 tests pass |
 | 5 — Parity validation | ✅ Done | test_cnn_v3_parity.cc, max_err=4.88e-4 |
 | 6 — FiLM MLP training | ✅ Done | train_cnn_v3.py + cnn_v3_utils.py written |
+| 7 — G-buffer visualizer (C++) | ✅ Done | GBufViewEffect, 36/36 tests pass |
+| 7 — Sample loader (web tool) | ✅ Done | "Load sample directory" in cnn_v3/tools/ |
 
 ---
 
@@ -337,9 +339,141 @@ auto src = ShaderComposer::Get().Compose({"cnn_v3/common"}, raw_wgsl);
 
 ---
 
-## 9. See Also
+## 9. Validation Workflow
+
+Two complementary tools let you verify each stage of the pipeline before training
+or integrating into the demo.
+
+### 9a. C++ — GBufViewEffect (G-buffer channel grid)
+
+`GBufViewEffect` renders all 20 feature channels from `feat_tex0` / `feat_tex1`
+in a **4×5 tiled grid** so you can see the G-buffer at a glance.
+
+**Registration (already done)**
+
+| File | What changed |
+|------|--------------|
+| `cnn_v3/shaders/gbuf_view.wgsl` | New fragment shader |
+| `cnn_v3/src/gbuf_view_effect.h` | Effect class declaration |
+| `cnn_v3/src/gbuf_view_effect.cc` | Effect class implementation |
+| `workspaces/main/assets.txt` | `SHADER_GBUF_VIEW` asset |
+| `cmake/DemoSourceLists.cmake` | `gbuf_view_effect.cc` in COMMON_GPU_EFFECTS |
+| `src/gpu/demo_effects.h` | `#include "../../cnn_v3/src/gbuf_view_effect.h"` |
+| `src/effects/shaders.h/.cc` | `gbuf_view_wgsl` extern declaration + definition |
+| `src/tests/gpu/test_demo_effects.cc` | GBufViewEffect test |
+
+**Constructor signature**
+
+```cpp
+GBufViewEffect(const GpuContext& ctx,
+               const std::vector<std::string>& inputs,   // {feat_tex0, feat_tex1}
+               const std::vector<std::string>& outputs,  // {gbuf_view_out}
+               float start_time, float end_time)
+```
+
+**Wiring example** (alongside GBufferEffect):
+
+```cpp
+auto gbuf = std::make_shared<GBufferEffect>(ctx,
+    std::vector<std::string>{"prev_cnn"},
+    std::vector<std::string>{"gbuf_feat0", "gbuf_feat1"}, 0.0f, 60.0f);
+auto gview = std::make_shared<GBufViewEffect>(ctx,
+    std::vector<std::string>{"gbuf_feat0", "gbuf_feat1"},
+    std::vector<std::string>{"gbuf_view_out"}, 0.0f, 60.0f);
+```
+
+**Grid layout** (output resolution = input resolution, channel cells each 1/4 W × 1/5 H):
+
+| Row | Col 0 | Col 1 | Col 2 | Col 3 |
+|-----|-------|-------|-------|-------|
+| 0 | `alb.r` (red tint) | `alb.g` (green tint) | `alb.b` (blue tint) | `nrm.x` remap→[0,1] |
+| 1 | `nrm.y` remap→[0,1] | `depth` (inverted) | `dzdx` ×20+0.5 | `dzdy` ×20+0.5 |
+| 2 | `mat_id` | `prev.r` | `prev.g` | `prev.b` |
+| 3 | `mip1.r` | `mip1.g` | `mip1.b` | `mip2.r` |
+| 4 | `mip2.g` | `mip2.b` | `shadow` | `transp` |
+
+1-pixel gray grid lines separate cells. Dark background for out-of-range cells.
+
+**Shader binding layout** (no sampler needed — integer texture):
+
+| Binding | Type | Content |
+|---------|------|---------|
+| 0 | `texture_2d<u32>` | `feat_tex0` (8 f16 channels via `pack2x16float`) |
+| 1 | `texture_2d<u32>` | `feat_tex1` (12 u8 channels via `pack4x8unorm`) |
+| 2 | `uniform` (8 B) | `GBufViewUniforms { resolution: vec2f }` |
+
+The BGL is built manually in the constructor (no sampler) — this is an exception to the
+standard post-process pattern because `rgba32uint` textures use `WGPUTextureSampleType_Uint`
+and cannot be sampled, only loaded via `textureLoad()`.
+
+**Implementation note — bind group recreation**
+
+`render()` calls `wgpuRenderPipelineGetBindGroupLayout(pipeline_, 0)` each frame to
+extract the BGL, creates a new `BindGroup`, then immediately releases the BGL handle.
+This avoids storing a raw BGL as a member (no RAII wrapper exists for it) while
+remaining correct across ping-pong buffer swaps.
+
+---
+
+### 9b. Web tool — "Load sample directory"
+
+`cnn_v3/tools/index.html` has a **"Load sample directory"** button that:
+1. Opens a `webkitdirectory` picker to select a sample folder
+2. Loads all G-buffer component PNGs as `rgba8unorm` GPU textures
+3. Runs the `FULL_PACK_SHADER` compute shader to assemble `feat_tex0` / `feat_tex1`
+4. Runs full CNN inference (enc0 → enc1 → bottleneck → dec1 → dec0)
+5. Displays the CNN output on the main canvas
+6. If `target.png` is present, shows it side-by-side and prints PSNR
+
+**File name matching** (case-insensitive, substring):
+
+| Channel | Matched patterns | Fallback |
+|---------|------------------|----------|
+| Albedo (required) | `albedo`, `color` | — (error if missing) |
+| Normal | `normal`, `nrm` | `rgb(128,128,0,255)` — flat (0,0) oct-encoded |
+| Depth | `depth` | `0` — zero depth |
+| Mat ID | `matid`, `index`, `mat_id` | `0` — no material |
+| Shadow | `shadow` | `255` — fully lit |
+| Transparency | `transp`, `alpha` | `0` — fully opaque |
+| Target | `target`, `output`, `ground_truth` | not shown |
+
+**`FULL_PACK_SHADER`** (defined in `cnn_v3/tools/shaders.js`)
+
+WebGPU compute shader (`@workgroup_size(8,8)`) with 9 bindings:
+
+| Binding | Resource | Format |
+|---------|----------|--------|
+| 0–5 | albedo, normal, depth, matid, shadow, transp | `texture_2d<f32>` (rgba8unorm, R channel for single-channel maps) |
+| 6 | linear sampler | `sampler` |
+| 7 | feat_tex0 output | `texture_storage_2d<rgba32uint,write>` |
+| 8 | feat_tex1 output | `texture_storage_2d<rgba32uint,write>` |
+
+Packs channels identically to `gbuf_pack.wgsl`:
+- `feat_tex0`: `pack2x16float(alb.rg)`, `pack2x16float(alb.b, nrm.x)`, `pack2x16float(nrm.y, depth)`, `pack2x16float(dzdx, dzdy)`
+- `feat_tex1`: `pack4x8unorm(matid,0,0,0)`, `pack4x8unorm(mip1.rgb, mip2.r)`, `pack4x8unorm(mip2.gb, shadow, transp)`
+- Depth gradients: central differences on depth R channel
+- Mip1 / Mip2: box2 (2×2) / box4 (4×4) average filter on albedo
+
+**PSNR computation** (`computePSNR`)
+
+- CNN output (`rgba16float`) copied to CPU staging buffer via `copyTextureToBuffer`
+- f16→float32 decoded in JavaScript
+- Target drawn to offscreen `<canvas>` via `drawImage`, pixels read with `getImageData`
+- MSE and PSNR computed over all RGB pixels (alpha ignored)
+- Result displayed below target canvas as `MSE=X.XXXXX PSNR=XX.XXdB`
+
+**`runFromFeat(f0, f1, w, h)`**
+
+Called by `loadSampleDir()` after packing, or can be called directly if feat textures
+are already available. Skips the photo-pack step, runs all 5 CNN passes, and displays
+the result. Intermediate textures are stored in `this.layerTextures` so the Layer
+Visualization panel still works.
+
+---
+
+## 10. See Also
 
 - `cnn_v3/docs/CNN_V3.md` — Full architecture design (U-Net, FiLM, feature layout)
 - `doc/EFFECT_WORKFLOW.md` — General effect integration guide
 - `cnn_v2/docs/CNN_V2.md` — Reference implementation (simpler, operational)
-- `src/tests/gpu/test_demo_effects.cc` — GBufferEffect construction test
+- `src/tests/gpu/test_demo_effects.cc` — GBufferEffect + GBufViewEffect tests
diff --git a/cnn_v3/shaders/gbuf_view.wgsl b/cnn_v3/shaders/gbuf_view.wgsl
new file mode 100644
index 0000000..f2ae085
--- /dev/null
+++ b/cnn_v3/shaders/gbuf_view.wgsl
@@ -0,0 +1,102 @@
+// G-buffer channel visualization — 4×5 grid of 20 feature channels.
+// Takes feat_tex0 (rgba32uint, ch 0-7 f16) and feat_tex1 (rgba32uint, ch 8-19 unorm8).
+// Outputs tiled channel view to a standard rgba8unorm render target.
+//
+// Channel layout (row×col):
+//   Row 0: ch0(alb.r)  ch1(alb.g)  ch2(alb.b)  ch3(nrm.x)
+//   Row 1: ch4(nrm.y)  ch5(depth)  ch6(dzdx)   ch7(dzdy)
+//   Row 2: ch8(matid)  ch9(prv.r)  ch10(prv.g) ch11(prv.b)
+//   Row 3: ch12(m1.r)  ch13(m1.g)  ch14(m1.b)  ch15(m2.r)
+//   Row 4: ch16(m2.g)  ch17(m2.b)  ch18(shdw)  ch19(trns)
+
+struct GBufViewUniforms { resolution: vec2f }
+
+@group(0) @binding(0) var feat0: texture_2d<u32>;
+@group(0) @binding(1) var feat1: texture_2d<u32>;
+@group(0) @binding(2) var<uniform> u: GBufViewUniforms;
+
+@vertex
+fn vs_main(@builtin(vertex_index) vid: u32) -> @builtin(position) vec4f {
+  var corners = array<vec2f, 3>(
+      vec2f(-1.0, -1.0), vec2f(3.0, -1.0), vec2f(-1.0, 3.0));
+  return vec4f(corners[vid], 0.0, 1.0);
+}
+
+@fragment
+fn fs_main(@builtin(position) pos: vec4f) -> @location(0) vec4f {
+  let uv = pos.xy / u.resolution;
+
+  let COLS = 4.0;
+  let ROWS = 5.0;
+  let col = u32(uv.x * COLS);
+  let row = u32(uv.y * ROWS);
+  let ch = row * 4u + col;
+
+  if (col >= 4u || ch >= 20u) {
+    return vec4f(0.05, 0.05, 0.05, 1.0);
+  }
+
+  // 1-pixel grid lines (thin border per cell)
+  let lx = fract(uv.x * COLS);
+  let ly = fract(uv.y * ROWS);
+  if (lx < 0.005 || lx > 0.995 || ly < 0.005 || ly > 0.995) {
+    return vec4f(0.25, 0.25, 0.25, 1.0);
+  }
+
+  // Map local UV to texel coordinate
+  let dim = vec2i(textureDimensions(feat0));
+  let tc = clamp(vec2i(vec2f(lx, ly) * vec2f(dim)), vec2i(0), dim - vec2i(1));
+
+  var v: f32 = 0.0;
+
+  if (ch < 8u) {
+    // feat0: 4 × pack2x16float — each u32 component holds two f16 values
+    let t = textureLoad(feat0, tc, 0);
+    let pair_idx = ch >> 1u;
+    let sub = ch & 1u;
+    var p: vec2f;
+    if (pair_idx == 0u) { p = unpack2x16float(t.x); }
+    else if (pair_idx == 1u) { p = unpack2x16float(t.y); }
+    else if (pair_idx == 2u) { p = unpack2x16float(t.z); }
+    else { p = unpack2x16float(t.w); }
+    v = select(p.y, p.x, sub == 0u);
+  } else {
+    // feat1: 3 × pack4x8unorm — components .x/.y/.z hold 4 u8 values each
+    let t = textureLoad(feat1, tc, 0);
+    let ch1 = ch - 8u;
+    let comp_idx = ch1 / 4u;
+    let sub = ch1 % 4u;
+    var bytes: vec4f;
+    if (comp_idx == 0u) { bytes = unpack4x8unorm(t.x); }
+    else if (comp_idx == 1u) { bytes = unpack4x8unorm(t.y); }
+    else { bytes = unpack4x8unorm(t.z); }
+    var ba = array<f32, 4>(bytes.x, bytes.y, bytes.z, bytes.w);
+    v = ba[sub];
+  }
+
+  // Channel-specific normalization for display clarity
+  var disp: f32;
+  if (ch <= 2u) {
+    // Albedo: already [0,1]
+    disp = clamp(v, 0.0, 1.0);
+  } else if (ch == 3u || ch == 4u) {
+    // Normals oct-encoded in [-1,1] → remap to [0,1]
+    disp = clamp(v * 0.5 + 0.5, 0.0, 1.0);
+  } else if (ch == 5u) {
+    // Depth [0,1]: invert so near=white, far=dark
+    disp = clamp(1.0 - v, 0.0, 1.0);
+  } else if (ch == 6u || ch == 7u) {
+    // Depth gradients (signed, small values): amplify × 20 + 0.5 for visibility
+    disp = clamp(v * 20.0 + 0.5, 0.0, 1.0);
+  } else {
+    // Everything else: clamp to [0,1]
+    disp = clamp(v, 0.0, 1.0);
+  }
+
+  // Albedo channels: tint for identification (ch0=red, ch1=green, ch2=blue)
+  if (ch == 0u) { return vec4f(disp, 0.0, 0.0, 1.0); }
+  else if (ch == 1u) { return vec4f(0.0, disp, 0.0, 1.0); }
+  else if (ch == 2u) { return vec4f(0.0, 0.0, disp, 1.0); }
+
+  return vec4f(disp, disp, disp, 1.0);
+}
diff --git a/cnn_v3/src/gbuf_view_effect.cc b/cnn_v3/src/gbuf_view_effect.cc
new file mode 100644
index 0000000..180919d
--- /dev/null
+++ b/cnn_v3/src/gbuf_view_effect.cc
@@ -0,0 +1,144 @@
+// GBufViewEffect — G-buffer channel grid visualization
+// Renders 20 feature channels from feat_tex0/feat_tex1 in a 4×5 tiled layout.
+
+#include "gbuf_view_effect.h"
+
+#if defined(USE_TEST_ASSETS)
+#include "test_assets.h"
+#else
+#include "generated/assets.h"
+#endif
+
+#include "gpu/gpu.h"
+#include "util/asset_manager.h"
+#include "util/fatal_error.h"
+
+extern const char* gbuf_view_wgsl;
+
+// BGL entry: texture_2d<u32> read binding (fragment stage)
+static WGPUBindGroupLayoutEntry bgl_uint_tex_frag(uint32_t binding) {
+  WGPUBindGroupLayoutEntry e = {};
+  e.binding = binding;
+  e.visibility = WGPUShaderStage_Fragment;
+  e.texture.sampleType = WGPUTextureSampleType_Uint;
+  e.texture.viewDimension = WGPUTextureViewDimension_2D;
+  return e;
+}
+
+// BGL entry: uniform buffer (fragment stage)
+static WGPUBindGroupLayoutEntry bgl_uniform_frag(uint32_t binding,
+                                                 uint64_t min_size) {
+  WGPUBindGroupLayoutEntry e = {};
+  e.binding = binding;
+  e.visibility = WGPUShaderStage_Fragment;
+  e.buffer.type = WGPUBufferBindingType_Uniform;
+  e.buffer.minBindingSize = min_size;
+  return e;
+}
+
+GBufViewEffect::GBufViewEffect(const GpuContext& ctx,
+                               const std::vector<std::string>& inputs,
+                               const std::vector<std::string>& outputs,
+                               float start_time, float end_time)
+    : Effect(ctx, inputs, outputs, start_time, end_time) {
+  HEADLESS_RETURN_IF_NULL(ctx_.device);
+
+  // Build BGL: binding 0 = feat0 (u32 tex), 1 = feat1 (u32 tex), 2 = uniforms
+  WGPUBindGroupLayoutEntry entries[3] = {
+      bgl_uint_tex_frag(0),
+      bgl_uint_tex_frag(1),
+      bgl_uniform_frag(2, 8),  // only resolution (vec2f = 8 bytes) is read
+  };
+  WGPUBindGroupLayoutDescriptor bgl_desc = {};
+  bgl_desc.entryCount = 3;
+  bgl_desc.entries = entries;
+  WGPUBindGroupLayout bgl = wgpuDeviceCreateBindGroupLayout(ctx_.device, &bgl_desc);
+
+  // Pipeline layout
+  WGPUPipelineLayoutDescriptor pl_desc = {};
+  pl_desc.bindGroupLayoutCount = 1;
+  pl_desc.bindGroupLayouts = &bgl;
+  WGPUPipelineLayout pl = wgpuDeviceCreatePipelineLayout(ctx_.device, &pl_desc);
+
+  // Shader module
+  WGPUShaderSourceWGSL wgsl_src = {};
+  wgsl_src.chain.sType = WGPUSType_ShaderSourceWGSL;
+  wgsl_src.code = str_view(gbuf_view_wgsl);
+  WGPUShaderModuleDescriptor shader_desc = {};
+  shader_desc.nextInChain = &wgsl_src.chain;
+  WGPUShaderModule shader =
+      wgpuDeviceCreateShaderModule(ctx_.device, &shader_desc);
+
+  // Render pipeline
+  WGPUColorTargetState target = {};
+  target.format = WGPUTextureFormat_RGBA8Unorm;
+  target.writeMask = WGPUColorWriteMask_All;
+
+  WGPUFragmentState frag = {};
+  frag.module = shader;
+  frag.entryPoint = str_view("fs_main");
+  frag.targetCount = 1;
+  frag.targets = &target;
+
+  WGPURenderPipelineDescriptor pipe_desc = {};
+  pipe_desc.layout = pl;
+  pipe_desc.vertex.module = shader;
+  pipe_desc.vertex.entryPoint = str_view("vs_main");
+  pipe_desc.fragment = &frag;
+  pipe_desc.primitive.topology = WGPUPrimitiveTopology_TriangleList;
+  pipe_desc.multisample.count = 1;
+  pipe_desc.multisample.mask = UINT32_MAX;
+
+  pipeline_.set(wgpuDeviceCreateRenderPipeline(ctx_.device, &pipe_desc));
+
+  wgpuShaderModuleRelease(shader);
+  wgpuPipelineLayoutRelease(pl);
+  wgpuBindGroupLayoutRelease(bgl);
+}
+
+void GBufViewEffect::render(WGPUCommandEncoder encoder,
+                            const UniformsSequenceParams& params,
+                            NodeRegistry& nodes) {
+  WGPUTextureView feat0_view = nodes.get_view(input_nodes_[0]);
+  WGPUTextureView feat1_view = nodes.get_view(input_nodes_[1]);
+  WGPUTextureView output_view = nodes.get_view(output_nodes_[0]);
+
+  // Rebuild bind group (views may change with ping-pong or resize)
+  WGPUBindGroupLayout bgl =
+      wgpuRenderPipelineGetBindGroupLayout(pipeline_.get(), 0);
+
+  WGPUBindGroupEntry bg_entries[3] = {};
+  bg_entries[0].binding = 0;
+  bg_entries[0].textureView = feat0_view;
+  bg_entries[1].binding = 1;
+  bg_entries[1].textureView = feat1_view;
+  bg_entries[2].binding = 2;
+  bg_entries[2].buffer = uniforms_buffer_.get().buffer;
+  bg_entries[2].size = sizeof(UniformsSequenceParams);
+
+  WGPUBindGroupDescriptor bg_desc = {};
+  bg_desc.layout = bgl;
+  bg_desc.entryCount = 3;
+  bg_desc.entries = bg_entries;
+  bind_group_.replace(wgpuDeviceCreateBindGroup(ctx_.device, &bg_desc));
+  wgpuBindGroupLayoutRelease(bgl);
+
+  WGPURenderPassColorAttachment color_att = {};
+  color_att.view = output_view;
+  color_att.loadOp = WGPULoadOp_Clear;
+  color_att.storeOp = WGPUStoreOp_Store;
+  color_att.clearValue = {0.0f, 0.0f, 0.0f, 1.0f};
+  color_att.depthSlice = WGPU_DEPTH_SLICE_UNDEFINED;
+
+  WGPURenderPassDescriptor pass_desc = {};
+  pass_desc.colorAttachmentCount = 1;
+  pass_desc.colorAttachments = &color_att;
+
+  WGPURenderPassEncoder pass =
+      wgpuCommandEncoderBeginRenderPass(encoder, &pass_desc);
+  wgpuRenderPassEncoderSetPipeline(pass, pipeline_.get());
+  wgpuRenderPassEncoderSetBindGroup(pass, 0, bind_group_.get(), 0, nullptr);
+  wgpuRenderPassEncoderDraw(pass, 3, 1, 0, 0);
+  wgpuRenderPassEncoderEnd(pass);
+  wgpuRenderPassEncoderRelease(pass);
+}
diff --git a/cnn_v3/src/gbuf_view_effect.h b/cnn_v3/src/gbuf_view_effect.h
new file mode 100644
index 0000000..d4d8139
--- /dev/null
+++ b/cnn_v3/src/gbuf_view_effect.h
@@ -0,0 +1,25 @@
+// GBufViewEffect: Visualizes G-buffer feature textures as a 4×5 channel grid.
+// Inputs: feat_tex0 (rgba32uint, ch 0-7 f16), feat_tex1 (rgba32uint, ch 8-19 unorm8)
+// Output: rgba8unorm tiled channel visualization (downscaled 4× per channel)
+
+#pragma once
+
+#include "gpu/effect.h"
+#include "gpu/sequence.h"
+#include "gpu/wgpu_resource.h"
+
+class GBufViewEffect : public Effect {
+ public:
+  GBufViewEffect(const GpuContext& ctx,
+                 const std::vector<std::string>& inputs,
+                 const std::vector<std::string>& outputs,
+                 float start_time, float end_time);
+
+  void render(WGPUCommandEncoder encoder,
+              const UniformsSequenceParams& params,
+              NodeRegistry& nodes) override;
+
+ private:
+  RenderPipeline pipeline_;
+  BindGroup bind_group_;
+};
diff --git a/cnn_v3/tools/index.html b/cnn_v3/tools/index.html
index 8494fef..1398ca5 100644
--- a/cnn_v3/tools/index.html
+++ b/cnn_v3/tools/index.html
@@ -64,6 +64,7 @@ video{display:none}
 <div class="left">
 <input type="file" id="wFile" accept=".bin" style="display:none">
 <input type="file" id="fFile" accept=".bin" style="display:none">
+<input type="file" id="sFile" webkitdirectory style="display:none" onchange="tester.loadSampleDir(this.files)">
 <div class="dz" id="wDrop" onclick="document.getElementById('wFile').click()">Drop cnn_v3_weights.bin</div>
 <div class="dz" id="fDrop" onclick="document.getElementById('fFile').click()">Drop cnn_v3_film_mlp.bin (optional)</div>
@@ -79,6 +80,10 @@ video{display:none}
 <div id="fullHelp" style="display:none;margin-top:6px;font-size:9px;color:#555;line-height:1.6">
 Drop PNGs: *albedo*/color · *normal* · *depth* · *matid*/index · *shadow* · *transp*/alpha
 </div>
+<div style="margin-top:8px;border-top:1px solid #333;padding-top:8px">
+<button onclick="document.getElementById('sFile').click()" style="width:100%">↑ Load sample directory</button>
+<div id="sampleSt" style="font-size:9px;color:#555;margin-top:3px"></div>
+</div>
 </div>
 </div>
@@ -121,7 +126,17 @@ video{display:none}
 <div class="sep"></div>
 <button onclick="tester.savePNG()">Save PNG</button>
 </div>
-<canvas id="canvas"></canvas>
+<div style="display:flex;gap:12px;align-items:flex-start">
+<div style="display:flex;flex-direction:column;align-items:center;gap:3px">
+<canvas id="canvas"></canvas>
+<span id="cnnLabel" style="font-size:9px;color:#555"></span>
+</div>
+<div id="targetPane" style="display:none;flex-direction:column;align-items:center;gap:3px">
+<canvas id="targetCanvas" style="max-width:100%;max-height:100%;image-rendering:pixelated;box-shadow:0 4px 12px rgba(0,0,0,.5)"></canvas>
+<span style="font-size:9px;color:#555">target.png</span>
+<span id="psnrSt" style="font-size:9px;color:#4a9eff"></span>
+</div>
+</div>
 </div>
 
 <div class="right">
diff --git a/cnn_v3/tools/shaders.js b/cnn_v3/tools/shaders.js
index c3e994d..d5b1fb4 100644
--- a/cnn_v3/tools/shaders.js
+++ b/cnn_v3/tools/shaders.js
@@ -250,3 +250,51 @@ const VIZ_U32=`
 var v=array<f32,8>(a.x,a.y,b.x,b.y,c.x,c.y,d.x,d.y);
 let x=clamp(v[min(ch,7u)],0.,1.); return vec4f(x,x,x,1.); }`;
+
+// Full G-buffer pack: assembles feat_tex0/feat_tex1 from individual G-buffer images.
+// Bindings: albedo(0) normal(1) depth(2) matid(3) shadow(4) transp(5) sampler(6) f0(7) f1(8)
+// All source textures are rgba8unorm (browser-loaded images, R channel for depth/matid/shadow/transp).
+// Matches gbuf_pack.wgsl packing exactly so the CNN sees the same layout.
+const FULL_PACK_SHADER=`
+@group(0) @binding(0) var albedo: texture_2d<f32>;
+@group(0) @binding(1) var normal: texture_2d<f32>;
+@group(0) @binding(2) var depth: texture_2d<f32>;
+@group(0) @binding(3) var matid: texture_2d<f32>;
+@group(0) @binding(4) var shadow: texture_2d<f32>;
+@group(0) @binding(5) var transp: texture_2d<f32>;
+@group(0) @binding(6) var smp: sampler;
+@group(0) @binding(7) var f0: texture_storage_2d<rgba32uint,write>;
+@group(0) @binding(8) var f1: texture_storage_2d<rgba32uint,write>;
+fn ld(c:vec2i,d:vec2i)->f32{return textureLoad(depth,clamp(c,vec2i(0),d-vec2i(1)),0).r;}
+fn b2(tl:vec2i,d:vec2i)->vec3f{
+  var s=vec3f(0.);
+  for(var y:i32=0;y<2;y++){for(var x:i32=0;x<2;x++){s+=textureLoad(albedo,clamp(tl+vec2i(x,y),vec2i(0),d-vec2i(1)),0).rgb;}}
+  return s*.25;}
+fn b4(tl:vec2i,d:vec2i)->vec3f{
+  var s=vec3f(0.);
+  for(var y:i32=0;y<4;y++){for(var x:i32=0;x<4;x++){s+=textureLoad(albedo,clamp(tl+vec2i(x,y),vec2i(0),d-vec2i(1)),0).rgb;}}
+  return s*(1./16.);}
+@compute @workgroup_size(8,8)
+fn main(@builtin(global_invocation_id) id:vec3u){
+  let c=vec2i(id.xy); let d=vec2i(textureDimensions(albedo));
+  if(c.x>=d.x||c.y>=d.y){return;}
+  let alb=textureLoad(albedo,c,0).rgb;
+  let nrm=textureLoad(normal,c,0).rg;
+  let oct=nrm*2.-vec2f(1.); // [0,1] -> [-1,1]
+  let dv=ld(c,d);
+  let dzdx=(ld(c+vec2i(1,0),d)-ld(c-vec2i(1,0),d))*.5;
+  let dzdy=(ld(c+vec2i(0,1),d)-ld(c-vec2i(0,1),d))*.5;
+  textureStore(f0,c,vec4u(
+    pack2x16float(alb.rg),
+    pack2x16float(vec2f(alb.b,oct.x)),
+    pack2x16float(vec2f(oct.y,dv)),
+    pack2x16float(vec2f(dzdx,dzdy))));
+  let mid=textureLoad(matid,c,0).r;
+  let shd=textureLoad(shadow,c,0).r;
+  let trp=textureLoad(transp,c,0).r;
+  let m1=b2(c-vec2i(0),d); let m2=b4(c-vec2i(1),d);
+  textureStore(f1,c,vec4u(
+    pack4x8unorm(vec4f(mid,0.,0.,0.)),
+    pack4x8unorm(vec4f(m1.r,m1.g,m1.b,m2.r)),
+    pack4x8unorm(vec4f(m2.g,m2.b,shd,trp)),
+    0u));}`;
diff --git a/cnn_v3/tools/tester.js b/cnn_v3/tools/tester.js
index f056444..c1faec9 100644
--- a/cnn_v3/tools/tester.js
+++ b/cnn_v3/tools/tester.js
@@ -13,6 +13,7 @@ class CNNv3Tester {
   this.image   = null;
   this.isVideo = false;
   this.viewMode= 0; // 0=cnn 1=orig 2=diff
+  this.targetBitmap = null; // set when a sample dir with target.png is loaded
   this.blend   = 1.0;
   this.layerTextures = {};
   this.lastResult = null;
@@ -525,6 +526,282 @@ class CNNv3Tester {
     return(s?-1:1)*Math.pow(2,e-15)*(1+m/1024);};
     return [f(lo),f(hi)];
   }
+
+  // ── Full G-buffer pack pipeline ───────────────────────────────────────────
+
+  getFullPack() {
+    return this.pl('fullpack', () => this.computePL(FULL_PACK_SHADER, 'main'));
+  }
+
+  // Create a 1×1 rgba8unorm fallback texture with given RGBA bytes [0-255].
+  makeFallbackTex(r, g, b, a) {
+    const tex = this.device.createTexture({size:[1,1], format:'rgba8unorm',
+      usage: GPUTextureUsage.TEXTURE_BINDING|GPUTextureUsage.COPY_DST});
+    this.device.queue.writeTexture({texture:tex}, new Uint8Array([r,g,b,a]),
+      {bytesPerRow:4,rowsPerImage:1}, [1,1]);
+    return tex;
+  }
+
+  // Load an image File as a GPU rgba8unorm texture. Returns {tex, w, h}.
+  async loadGpuTex(file) {
+    const bmp = await createImageBitmap(file);
+    const w = bmp.width, h = bmp.height;
+    const tex = this.device.createTexture({size:[w,h], format:'rgba8unorm',
+      usage: GPUTextureUsage.TEXTURE_BINDING|GPUTextureUsage.COPY_DST|GPUTextureUsage.RENDER_ATTACHMENT});
+    this.device.queue.copyExternalImageToTexture({source:bmp}, {texture:tex}, [w,h]);
+    bmp.close();
+    return {tex, w, h};
+  }
+
+  // ── Load sample directory ─────────────────────────────────────────────────
+
+  async loadSampleDir(files) {
+    if (!files || files.length === 0) return;
+    if (!this.weightsU32) { this.setStatus('Load weights first', true); return; }
+
+    this.setMode('full');
+    const st = document.getElementById('sampleSt');
+    st.textContent = 'Loading…';
+
+    // Match files by name pattern
+    const match = (pat) => {
+      for (const f of files) {
+        const n = f.name.toLowerCase();
+        if (pat.some(p => n.includes(p))) return f;
+      }
+      return null;
+    };
+
+    const fAlbedo = match(['albedo', 'color']);
+    const fNormal = match(['normal', 'nrm']);
+    const fDepth  = match(['depth']);
+    const fMatid  = match(['matid', 'index', 'mat_id']);
+    const fShadow = match(['shadow']);
+    const fTransp = match(['transp', 'alpha']);
+    const fTarget = match(['target', 'output', 'ground_truth']);
+
+    if (!fAlbedo) {
+      st.textContent = '✗ No albedo.png found';
+      this.setStatus('No albedo.png in sample dir', true);
+      return;
+    }
+
+    try {
+      const t0 = performance.now();
+
+      // Load primary albedo to get dimensions
+      const {tex: albTex, w, h} = await this.loadGpuTex(fAlbedo);
+      this.canvas.width = w; this.canvas.height = h;
+      this.context.configure({device:this.device, format:this.format});
+
+      // Load optional channels — fall back to neutral 1×1 textures
+      const nrmTex = fNormal ? (await this.loadGpuTex(fNormal)).tex
+                             : this.makeFallbackTex(128, 128, 0, 255); // oct-encoded (0,0) normal
+      const dptTex = fDepth  ? (await this.loadGpuTex(fDepth)).tex
+                             : this.makeFallbackTex(0, 0, 0, 255);
+      const midTex = fMatid  ? (await this.loadGpuTex(fMatid)).tex
+                             : this.makeFallbackTex(0, 0, 0, 255);
+      const shdTex = fShadow ? (await this.loadGpuTex(fShadow)).tex
+                             : this.makeFallbackTex(255, 255, 255, 255); // fully lit
+      const trpTex = fTransp ? (await this.loadGpuTex(fTransp)).tex
+                             : this.makeFallbackTex(0, 0, 0, 255); // fully opaque
+
+      // Load target if present
+      if (this.targetBitmap) { this.targetBitmap.close(); this.targetBitmap = null; }
+      if (fTarget) {
+        this.targetBitmap = await createImageBitmap(fTarget);
+        this.showTarget();
+      } else {
+        document.getElementById('targetPane').style.display = 'none';
+      }
+
+      // Pack G-buffer into feat0/feat1
+      const mk = (fmt, tw, th) => this.device.createTexture({size:[tw,th], format:fmt,
+        usage:GPUTextureUsage.STORAGE_BINDING|GPUTextureUsage.TEXTURE_BINDING|GPUTextureUsage.COPY_SRC});
+      const f0 = mk('rgba32uint', w, h);
+      const f1 = mk('rgba32uint', w, h);
+
+      const ceil8 = (n) => Math.ceil(n/8);
+      const pl = this.getFullPack();
+      const bg = this.device.createBindGroup({layout: pl.getBindGroupLayout(0),
+        entries: [
+          {binding:0, resource: albTex.createView()},
+          {binding:1, resource: nrmTex.createView()},
+          {binding:2, resource: dptTex.createView()},
+          {binding:3, resource: midTex.createView()},
+          {binding:4, resource: shdTex.createView()},
+          {binding:5, resource: trpTex.createView()},
+          {binding:6, resource: this.linearSampler},
+          {binding:7, resource: f0.createView()},
+          {binding:8, resource: f1.createView()},
+        ]});
+
+      const enc = this.device.createCommandEncoder();
+      const cp = enc.beginComputePass();
+      cp.setPipeline(pl); cp.setBindGroup(0, bg);
+      cp.dispatchWorkgroups(ceil8(w), ceil8(h));
+      cp.end();
+      this.device.queue.submit([enc.finish()]);
+      await this.device.queue.onSubmittedWorkDone();
+
+      // Cleanup source textures
+      [albTex, nrmTex, dptTex, midTex, shdTex, trpTex].forEach(t => t.destroy());
+
+      const found = [fAlbedo, fNormal, fDepth, fMatid, fShadow, fTransp]
+        .filter(Boolean).map(f => f.name).join(', ');
+      st.textContent = `✓ ${found}`;
+      this.log(`Sample packed: ${w}×${h}, ${((performance.now()-t0)).toFixed(0)}ms`);
+
+      // Run inference from packed feat textures
+      await this.runFromFeat(f0, f1, w, h);
+      f0.destroy(); f1.destroy();
+
+    } catch(e) {
+      st.textContent = `✗ ${e.message}`;
+      this.setStatus(`Sample error: ${e.message}`, true);
+      this.log(`Sample error: ${e.message}`, 'err');
+    }
+  }
+
+  // Show target.png in the #targetPane alongside main canvas.
+  showTarget() {
+    if (!this.targetBitmap) return;
+    const tc = document.getElementById('targetCanvas');
+    tc.width = this.targetBitmap.width;
+    tc.height = this.targetBitmap.height;
+    const ctx2d = tc.getContext('2d');
+    ctx2d.drawImage(this.targetBitmap, 0, 0);
+    document.getElementById('targetPane').style.display = 'flex';
+  }
+
+  // Run CNN inference starting from pre-packed feat_tex0 / feat_tex1.
+  // Used by loadSampleDir() to skip the photo-pack step.
+  async runFromFeat(f0, f1, w, h) {
+    if (!this.weightsU32 || !this.device) return;
+    const t0 = performance.now();
+    const W2=w>>1, H2=h>>1, W4=W2>>1, H4=H2>>1;
+
+    this.context.configure({device:this.device, format:this.format});
+
+    // Create a neutral "original" texture so the display shader can still
+    // render Orig/Diff modes (just black for sample mode).
+    if (this.inputTex) this.inputTex.destroy();
+    this.inputTex = this.device.createTexture({size:[w,h], format:'rgba8unorm',
+      usage:GPUTextureUsage.TEXTURE_BINDING|GPUTextureUsage.COPY_DST|GPUTextureUsage.RENDER_ATTACHMENT});
+    // Leave it cleared to black — Diff mode against target would need more work
+
+    const mk = (fmt, tw, th) => this.device.createTexture({size:[tw,th], format:fmt,
+      usage:GPUTextureUsage.STORAGE_BINDING|GPUTextureUsage.TEXTURE_BINDING|GPUTextureUsage.COPY_SRC});
+    const e0=mk('rgba16float',w,h), e1=mk('rgba32uint',W2,H2);
+    const bn=mk('rgba32uint',W4,H4), d1=mk('rgba16float',W2,H2), ot=mk('rgba16float',w,h);
+
+    if (!this.weightsGPU) {
+      this.weightsGPU = this.device.createBuffer({size:this.weightsBuffer.byteLength,
+        usage:GPUBufferUsage.STORAGE|GPUBufferUsage.COPY_DST});
+      this.device.queue.writeBuffer(this.weightsGPU, 0, this.weightsBuffer);
+    }
+    const wg = this.weightsGPU;
+    const fp = this.filmParams();
+    const wu = (data) => {
+      const b = this.device.createBuffer({size:data.byteLength, usage:GPUBufferUsage.UNIFORM|GPUBufferUsage.COPY_DST});
+      this.device.queue.writeBuffer(b, 0, data); return b;
+    };
+    const uE0=wu(this.u4(ENC0_OFF,fp.ge0,fp.be0));
+    const uE1=wu(this.u8(ENC1_OFF,fp.ge1,fp.be1));
+    const uBN=wu(this.ubn(BN_OFF));
+    const uD1=wu(this.u4(DEC1_OFF,fp.gd1,fp.bd1));
+    const uD0=wu(this.u4(DEC0_OFF,fp.gd0,fp.bd0));
+    const dispData=new ArrayBuffer(16);
+    new DataView(dispData).setFloat32(4, this.blend, true);
+    const uDp=wu(dispData);
+
+    const enc = this.device.createCommandEncoder();
+    const bg = (pl,...entries) => this.device.createBindGroup({layout:pl.getBindGroupLayout(0),
+      entries:entries.map((r,i)=>({binding:i,resource:r}))});
+    const rv = (t) => t.createView();
+    const cp = (pl,bgr,wx,wy) => {const p=enc.beginComputePass();p.setPipeline(pl);p.setBindGroup(0,bgr);p.dispatchWorkgroups(wx,wy);p.end();};
+    const ceil8 = (n) => Math.ceil(n/8);
+
+    cp(this.getEnc0(), bg(this.getEnc0(), rv(f0),rv(f1),{buffer:wg},{buffer:uE0},rv(e0)), ceil8(w), ceil8(h));
+    cp(this.getEnc1(), bg(this.getEnc1(), rv(e0),{buffer:wg},{buffer:uE1},rv(e1)), ceil8(W2), ceil8(H2));
+    cp(this.getBN(),   bg(this.getBN(),   rv(e1),{buffer:wg},{buffer:uBN},rv(bn)), ceil8(W4), ceil8(H4));
+    cp(this.getDec1(), bg(this.getDec1(), rv(bn),rv(e1),{buffer:wg},{buffer:uD1},rv(d1)), ceil8(W2), ceil8(H2));
+    cp(this.getDec0(), bg(this.getDec0(), rv(d1),rv(e0),{buffer:wg},{buffer:uD0},rv(ot)), ceil8(w), ceil8(h));
+
+    const dbg = bg(this.getDisp(), rv(ot), rv(this.inputTex), {buffer:uDp});
+    const rp = enc.beginRenderPass({colorAttachments:[{
+      view:this.context.getCurrentTexture().createView(), loadOp:'clear', storeOp:'store'}]});
+    rp.setPipeline(this.getDisp()); rp.setBindGroup(0, dbg); rp.draw(6); rp.end();
+
+    this.device.queue.submit([enc.finish()]);
+    await this.device.queue.onSubmittedWorkDone();
+
+    [uE0,uE1,uBN,uD1,uD0].forEach(b => b.destroy());
+
+    // Compute PSNR against target if available
+    let psnrStr = '';
+    if (this.targetBitmap) {
+      this.showTarget();
+      try { psnrStr = await this.computePSNR(ot, w, h); } catch(_) {}
+    }
+
+    this.destroyLayerTex();
+    this.layerTextures = {feat0:f0, feat1:f1, enc0:e0, enc1:e1, bn, dec1:d1, output:ot};
+    this.lastResult = {ot, itex:this.inputTex, uDp, dispPL:this.getDisp(), w, h};
+    this.updateVizPanel();
+
+    const ms = (performance.now()-t0).toFixed(1);
+    document.getElementById('cnnLabel').textContent = `CNN output (${ms}ms)`;
+    if (psnrStr) document.getElementById('psnrSt').textContent = psnrStr;
+    this.setStatus(`Sample: ${ms}ms · ${w}×${h}`);
+    this.log(`runFromFeat: ${ms}ms`);
+  }
+
+  // Compute PSNR between CNN rgba16float output texture and target.png bitmap.
+  async computePSNR(outTex, w, h) {
+    const bpr = Math.ceil(w * 8 / 256) * 256;
+    const stg = this.device.createBuffer({size:bpr*h,
+      usage:GPUBufferUsage.COPY_DST|GPUBufferUsage.MAP_READ});
+    const enc = this.device.createCommandEncoder();
+    enc.copyTextureToBuffer({texture:outTex}, {buffer:stg, bytesPerRow:bpr, rowsPerImage:h}, [w,h]);
+    this.device.queue.submit([enc.finish()]);
+    await stg.mapAsync(GPUMapMode.READ);
+    const raw = new DataView(stg.getMappedRange());
+
+    // Decode output pixels from f16
+    const f16 = (bits) => {
+      const s=(bits>>15)&1, e=(bits>>10)&0x1F, m=bits&0x3FF;
+      if(e===0) return 0; if(e===31) return s?0:1;
+      return Math.max(0,Math.min(1,(s?-1:1)*Math.pow(2,e-15)*(1+m/1024)));
+    };
+    const cnnPx = new Float32Array(w*h*3);
+    for (let y=0;y<h;y++) for (let x=0;x<w;x++) {
+      const src=y*bpr+x*8, pi=(y*w+x)*3;
+      cnnPx[pi]  = f16(raw.getUint16(src,   true));
+      cnnPx[pi+1]= f16(raw.getUint16(src+2, true));
+      cnnPx[pi+2]= f16(raw.getUint16(src+4, true));
+    }
+    stg.unmap(); stg.destroy();
+
+    // Read target pixels via offscreen canvas
+    const oc = document.createElement('canvas');
+    oc.width = w; oc.height = h;
+    const ctx2d = oc.getContext('2d');
+    ctx2d.drawImage(this.targetBitmap, 0, 0, w, h);
+    const tgtData = ctx2d.getImageData(0, 0, w, h).data;
+
+    let mse = 0;
+    const n = w * h * 3;
+    for (let i=0; i<w*h; i++) {
+      const dr = cnnPx[i*3]   - tgtData[i*4]  /255;
+      const dg = cnnPx[i*3+1] - tgtData[i*4+1]/255;
+      const db = cnnPx[i*3+2] - tgtData[i*4+2]/255;
+      mse += dr*dr + dg*dg + db*db;
+    }
+    mse /= n;
+    const psnr = mse > 0 ? (10 * Math.log10(1 / mse)).toFixed(2) : '∞';
+    return `MSE=${mse.toFixed(5)} PSNR=${psnr}dB`;
+  }
 }
 
 // ── UI helpers ───────────────────────────────────────────────────────────────
diff --git a/doc/COMPLETED.md b/doc/COMPLETED.md
index 7b925f0..072c92f 100644
--- a/doc/COMPLETED.md
+++ b/doc/COMPLETED.md
@@ -36,6 +36,8 @@ Completed task archive. See `doc/archive/` for detailed historical documents.
 ## March 2026
 
+- [x] **CNN v3 Phase 7: Validation tools** — `GBufViewEffect` (C++ 4×5 channel grid, `cnn_v3/shaders/gbuf_view.wgsl`, `cnn_v3/src/gbuf_view_effect.{h,cc}`): renders all 20 G-buffer feature channels tiled on screen; custom BGL with `WGPUTextureSampleType_Uint`, bind group rebuilt per frame via `wgpuRenderPipelineGetBindGroupLayout`. Web tool "Load sample directory" (`cnn_v3/tools/tester.js` + `shaders.js`): `webkitdirectory` picker, `FULL_PACK_SHADER` compute (matches `gbuf_pack.wgsl`), `runFromFeat()` inference, PSNR vs `target.png`. 36/36 tests.
+
 - [x] **CNN v3 Phase 5: Parity validation** — `test_cnn_v3_parity.cc` (2 tests: zero_weights, random_weights). Root cause: intermediate nodes declared at full res instead of W/2, W/4. Fix: `NodeRegistry::default_width()/default_height()` getters + fractional resolution in `declare_nodes()`. Final max_err=4.88e-4 ✓. 36/36 tests.
 - [x] **CNN v3 Phase 4: C++ CNNv3Effect + FiLM uniform upload** — `cnn_v3/src/cnn_v3_effect.{h,cc}`. 5 compute passes (enc0→enc1→bottleneck→dec1→dec0), shared f16 weights buffer, per-pass uniform buffers, `set_film_params()` API. Key fix: WGSL `vec3u` has align=16, so `CnnV3Params4ch`=64B and `CnnV3ParamsEnc1`=96B (not 48/80). Weight offsets as explicit formulas. FiLM γ/β identity defaults; real values await `train_cnn_v3.py`. 35/35 tests.
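The `computePSNR()` helper added to `tester.js` decodes the rgba16float readback by hand and reports `10·log10(1/MSE)` for unit-range pixels. A standalone sketch of those two pieces — the clamped half-float decoder (subnormals flatten to 0, exponent-31 values clamp into [0, 1], mirroring the decoder in the diff) and the PSNR formula — runnable outside the browser:

```javascript
// Clamped IEEE binary16 decode, as used for the staging-buffer readback:
// sign (1 bit), exponent (5 bits, bias 15), mantissa (10 bits).
function decodeF16(bits) {
  const s = (bits >> 15) & 1, e = (bits >> 10) & 0x1F, m = bits & 0x3FF;
  if (e === 0) return 0;            // zero / subnormal -> 0
  if (e === 31) return s ? 0 : 1;   // Inf / NaN -> clamp to display range
  return Math.max(0, Math.min(1, (s ? -1 : 1) * Math.pow(2, e - 15) * (1 + m / 1024)));
}

// PSNR for unit-range data: 10 * log10(MAX^2 / MSE) with MAX = 1.
function psnrFromMSE(mse) {
  return mse > 0 ? 10 * Math.log10(1 / mse) : Infinity;
}

console.log(decodeF16(0x3C00));            // 1   (exponent 15, mantissa 0)
console.log(decodeF16(0x3800));            // 0.5
console.log(psnrFromMSE(0.01).toFixed(2)); // "20.00"
```

This is a sketch of the math only; the real helper also handles the 256-byte-aligned row stride of the GPU staging buffer.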
diff --git a/src/effects/shaders.cc b/src/effects/shaders.cc index f64e135..37b7d09 100644 --- a/src/effects/shaders.cc +++ b/src/effects/shaders.cc @@ -117,6 +117,7 @@ const char* ntsc_rgb_shader_wgsl = SafeGetAsset(AssetId::ASSET_SHADER_NTSC_RGB); const char* ntsc_yiq_shader_wgsl = SafeGetAsset(AssetId::ASSET_SHADER_NTSC_YIQ); const char* gbuf_raster_wgsl = SafeGetAsset(AssetId::ASSET_SHADER_GBUF_RASTER); const char* gbuf_pack_wgsl = SafeGetAsset(AssetId::ASSET_SHADER_GBUF_PACK); +const char* gbuf_view_wgsl = SafeGetAsset(AssetId::ASSET_SHADER_GBUF_VIEW); const char* cnn_v3_enc0_wgsl = SafeGetAsset(AssetId::ASSET_SHADER_CNN_V3_ENC0); const char* cnn_v3_enc1_wgsl = SafeGetAsset(AssetId::ASSET_SHADER_CNN_V3_ENC1); const char* cnn_v3_bottleneck_wgsl = SafeGetAsset(AssetId::ASSET_SHADER_CNN_V3_BOTTLENECK); diff --git a/src/effects/shaders.h b/src/effects/shaders.h index 4a77597..de5dda4 100644 --- a/src/effects/shaders.h +++ b/src/effects/shaders.h @@ -23,6 +23,7 @@ extern const char* ntsc_yiq_shader_wgsl; // CNN v3 G-buffer shaders extern const char* gbuf_raster_wgsl; extern const char* gbuf_pack_wgsl; +extern const char* gbuf_view_wgsl; // CNN v3 inference shaders extern const char* cnn_v3_enc0_wgsl; diff --git a/src/gpu/demo_effects.h b/src/gpu/demo_effects.h index 66b920c..edb4a23 100644 --- a/src/gpu/demo_effects.h +++ b/src/gpu/demo_effects.h @@ -32,9 +32,10 @@ #include "effects/scratch_effect.h" #include "effects/ntsc_effect.h" -// CNN v3 G-buffer + inference +// CNN v3 G-buffer + inference + debug view #include "../../cnn_v3/src/gbuffer_effect.h" #include "../../cnn_v3/src/cnn_v3_effect.h" +#include "../../cnn_v3/src/gbuf_view_effect.h" // TODO: Port CNN effects // #include "../../cnn_v1/src/cnn_v1_effect.h" diff --git a/src/tests/gpu/test_demo_effects.cc b/src/tests/gpu/test_demo_effects.cc index f5af5a9..d6e9c8a 100644 --- a/src/tests/gpu/test_demo_effects.cc +++ b/src/tests/gpu/test_demo_effects.cc @@ -89,6 +89,11 @@ static void test_effects() { 
         fixture.ctx(),
         std::vector<std::string>{"gbuf_feat0", "gbuf_feat1"},
         std::vector<std::string>{"cnn_v3_output"}, 0.0f, 1000.0f)},
+    {"GBufViewEffect",
+     std::make_shared<GBufViewEffect>(
+         fixture.ctx(),
+         std::vector<std::string>{"gbuf_feat0", "gbuf_feat1"},
+         std::vector<std::string>{"gbuf_view_out"}, 0.0f, 1000.0f)},
   };
 
   int passed = 0;
diff --git a/workspaces/main/assets.txt b/workspaces/main/assets.txt
index 0af8b7b..f22552b 100644
--- a/workspaces/main/assets.txt
+++ b/workspaces/main/assets.txt
@@ -101,6 +101,7 @@ SHADER_RENDER_NTSC_COMMON, WGSL, ../../src/shaders/render/ntsc_common.wgsl, "NTS
 # --- CNN v3 G-Buffer ---
 SHADER_GBUF_RASTER, WGSL, ../../cnn_v3/shaders/gbuf_raster.wgsl, "CNN v3 G-buffer MRT rasterization shader"
 SHADER_GBUF_PACK, WGSL, ../../cnn_v3/shaders/gbuf_pack.wgsl, "CNN v3 G-buffer feature pack compute shader"
+SHADER_GBUF_VIEW, WGSL, ../../cnn_v3/shaders/gbuf_view.wgsl, "CNN v3 G-buffer channel visualization (4x5 grid)"
 
 # --- CNN v3 Inference ---
 SHADER_CNN_V3_COMMON, WGSL, ../../cnn_v3/shaders/cnn_v3_common.wgsl, "CNN v3 shared helpers snippet (get_w, unpack_8ch)"
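The staging-buffer copy in `computePSNR()` pads `bytesPerRow` to WebGPU's required 256-byte multiple (`Math.ceil(w * 8 / 256) * 256` for 8-byte rgba16float texels), which is why the pixel loop indexes rows by `y*bpr` rather than `y*w*8`. The padding rule in isolation:

```javascript
// WebGPU copyTextureToBuffer requires bytesPerRow to be a multiple of 256.
// Round the tight row size up to the next 256-byte boundary.
function alignedBytesPerRow(width, bytesPerTexel) {
  return Math.ceil(width * bytesPerTexel / 256) * 256;
}

console.log(alignedBytesPerRow(640, 8)); // 5120 (640*8 is already aligned)
console.log(alignedBytesPerRow(100, 8)); // 1024 (800 padded up to 4*256)
```

Forgetting this padding when reading back a mapped buffer produces sheared images, since each row of `w*8` tight bytes is followed by padding up to the aligned stride.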

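The directory loader in `tester.js` matches G-buffer channels by filename substring rather than exact name, so `Albedo.png` and `albedo_0001.png` both resolve to the albedo slot. A standalone sketch of that matcher (`matchFile` is a hypothetical free-function version of the inline `match` closure in `loadSampleDir()`):

```javascript
// First file whose lowercased name contains any of the given substrings wins;
// null means the channel is absent and a neutral fallback texture is used.
function matchFile(files, patterns) {
  for (const f of files) {
    const n = f.name.toLowerCase();
    if (patterns.some(p => n.includes(p))) return f;
  }
  return null;
}

const files = [{name: 'Albedo.png'}, {name: 'depth.png'}, {name: 'mat_id.png'}];
console.log(matchFile(files, ['albedo', 'color']).name);        // "Albedo.png"
console.log(matchFile(files, ['matid', 'index', 'mat_id']).name); // "mat_id.png"
console.log(matchFile(files, ['shadow']));                       // null
```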