author skal <pascal.massimino@gmail.com> 2026-03-22 20:31:45 +0100
committer skal <pascal.massimino@gmail.com> 2026-03-22 20:31:45 +0100
commit a2697faa005337c4d8e8e6376d9e57edadf63f44
tree 8c253dd279e42c5f7f539c713794cf910e6e8bef
parent ce22f79c55e68f9fa496a47a528a6978b89e1261
docs+feat(cnn_v3): compact context, re-enable shadow in GBufDeferredEffect
- TODO/PROJECT_CONTEXT updated to reflect operational pipeline state
- GBufDeferredEffect: shadow re-enabled (albedo * (ambient + diffuse * shadow))
  feat_tex1 binding restored for shadow channel debugging

handoff(Gemini): shadow pass live again; investigate why shadow looks broken.
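
For anyone picking up the shadow debugging, the fragment-shader change can be checked on the CPU. The sketch below is not code from this repository. It mirrors the WGSL built-ins `pack4x8unorm` / `unpack4x8unorm` in plain C++ and applies the shading formula from the commit message. The helper names `shadow_from_feat1_z` and `shade_channel` are hypothetical, and `ambient` and `diffuse` are passed as parameters because the shader's `AMBIENT` and `KEY_LIGHT` values do not appear in this diff.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

// CPU stand-in for WGSL pack4x8unorm: component i is clamped to [0, 1],
// scaled to 0..255 with rounding, and stored in bits 8*i .. 8*i+7.
inline uint32_t pack4x8unorm(const std::array<float, 4>& v) {
  uint32_t packed = 0;
  for (int i = 0; i < 4; ++i) {
    const float c = std::clamp(v[i], 0.0f, 1.0f);
    const uint32_t byte = static_cast<uint32_t>(std::floor(0.5f + 255.0f * c));
    packed |= byte << (8 * i);
  }
  return packed;
}

// CPU stand-in for WGSL unpack4x8unorm: component i is byte i divided by 255.
inline std::array<float, 4> unpack4x8unorm(uint32_t e) {
  std::array<float, 4> out{};
  for (int i = 0; i < 4; ++i) {
    out[i] = static_cast<float>((e >> (8 * i)) & 0xFFu) / 255.0f;
  }
  return out;
}

// Hypothetical helper: per the shader comment, feat_tex1[2] holds
// pack4x8unorm(mip2.g, mip2.b, shadow, transp), so shadow is component z.
inline float shadow_from_feat1_z(uint32_t t1_z) {
  return unpack4x8unorm(t1_z)[2];
}

// Hypothetical helper: one color channel of
// albedo * (ambient + diffuse * shadow), as stated in the commit message.
inline float shade_channel(float albedo, float ambient, float diffuse,
                           float shadow) {
  assert(shadow >= 0.0f && shadow <= 1.0f);  // unorm8 values stay in [0, 1]
  return albedo * (ambient + diffuse * shadow);
}
```

With this layout, a fully lit texel carries `0xFF` in bits 16..23 of `t1.z`, and a fully shadowed texel reduces the result to `albedo * ambient`. If the rendered shadow looks wrong, comparing a texture readback of `feat_tex1` against these helpers is one way to tell a packing-order problem apart from a problem in the shadow pass itself.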
-rw-r--r-- PROJECT_CONTEXT.md 8
-rw-r--r-- TODO.md 34
-rw-r--r-- cnn_v3/shaders/gbuf_deferred.wgsl 7
-rw-r--r-- cnn_v3/src/gbuf_deferred_effect.cc 18
4 files changed, 31 insertions, 36 deletions
diff --git a/PROJECT_CONTEXT.md b/PROJECT_CONTEXT.md
index 3ed265a..d211cea 100644
--- a/PROJECT_CONTEXT.md
+++ b/PROJECT_CONTEXT.md
@@ -36,7 +36,7 @@
- **Audio:** Sample-accurate sync. Zero heap allocations per frame. Variable tempo. OLA-IDCT synthesis (v2 .spec): Hann analysis window, rectangular synthesis, 50% overlap, click-free. V1 (raw DCT-512) preserved for generated notes. .spec files regenerated as v2.
- **Shaders:** Parameterized effects (UniformHelper, .seq syntax). Beat-synchronized animation support (`beat_time`, `beat_phase`). Modular WGSL composition with ShaderComposer. 27 shared common shaders (math, render, compute). Reusable snippets: `render/scratch_lines`, `render/ntsc_common` (NTSC signal processing, RGB and YIQ input variants via `sample_ntsc_signal` hook), `math/color` (YIQ/NTSC), `math/color_c64` (C64 palette, Bayer dither, border animation).
- **3D:** Hybrid SDF/rasterization with BVH. Binary scene loader. Blender pipeline.
-- **Effects:** CNN post-processing: CNNEffect (v1) and CNNv2Effect operational. CNN v2: sigmoid activation, storage buffer weights (~3.2 KB), 7D static features, dynamic layers. Training stable, convergence validated. **CNN v3 Phases 1–7 complete:** `CNNv3Effect` C++ class (5 compute passes, FiLM uniform upload, identity γ/β defaults). Parity validated: max_err=4.88e-4 (≤1/255). Validation tools: `GBufViewEffect` (C++ 4×5 channel grid) + web "Load sample directory" (G-buffer pack → CNN inference → PSNR vs target.png). See `cnn_v3/docs/HOWTO.md` §9.
+- **Effects:** CNN post-processing: CNNEffect (v1) and CNNv2Effect operational. CNN v2: sigmoid activation, storage buffer weights (~3.2 KB), 7D static features, dynamic layers. Training stable, convergence validated. **CNN v3 Phases 1–7 complete** + runtime pipeline operational: `GBufferEffect` (MRT raster + sphere impostors + SDF shadow pass) → `GBufDeferredEffect` (albedo×diffuse debug view) wired in `cnn_v3_test` sequence. Shared snippets: `math/normal` (oct encode/decode), `ray_sphere`. Parity validated: max_err=4.88e-4. See `cnn_v3/docs/HOWTO.md`.
- **Tools:** CNN test tool operational. Texture readback utility functional. Timeline editor (web-based, beat-aligned, audio playback).
- **Build:** Asset dependency tracking. Size measurement. Hot-reload (debug-only). WSL (Windows 10) supported: native Linux build and cross-compile to `.exe` via `mingw-w64`.
- **Sequence:** DAG-based effect routing with explicit node system. Python compiler with topological sort and ping-pong optimization. 12 effects operational (Passthrough, Placeholder, GaussianBlur, Heptagon, Particles, RotatingCube, Hybrid3D, Flash, PeakMeter, Scene1, Scene2, Scratch). Effect times are absolute (seq_compiler adds sequence start offset). See `doc/SEQUENCE.md`.
@@ -46,9 +46,9 @@
## Next Up
-**Active:** CNN v3 training (`train_cnn_v3.py`), Spectral Brush Editor
-**Ongoing:** Test infrastructure maintenance (35/35 passing)
-**Future:** Size optimization (64k target), 3D enhancements
+**Active:** CNN v3 shadow pass debugging (`GBufDeferredEffect`), Spectral Brush Editor
+**Ongoing:** Test infrastructure maintenance (38/38 passing)
+**Future:** CNN v3 training pass, size optimization (64k target)
See `TODO.md` for details.
diff --git a/TODO.md b/TODO.md
index 66cbe76..e855384 100644
--- a/TODO.md
+++ b/TODO.md
@@ -14,7 +14,7 @@ Procedural spectrogram tool: 50-100× compression (5 KB .spec → ~100 bytes C++
## Priority 2: Test Infrastructure Maintenance [ONGOING]
-**Status:** 35/35 tests passing
+**Status:** 38/38 tests passing
**Outstanding TODOs:**
@@ -62,32 +62,18 @@ Ongoing shader code hygiene for granular, reusable snippets.
## CNN v3 — U-Net + FiLM [IN PROGRESS]
-U-Net architecture with FiLM conditioning. Runtime style control via beat/audio.
-Richer G-buffer input (normals, depth, material IDs). Per-pixel testability across
-PyTorch / HTML WebGPU / C++ WebGPU.
+**Design:** `cnn_v3/docs/CNN_V3.md` | All phases 1–7 complete. Runtime pipeline operational.
-**Design:** `cnn_v3/docs/CNN_V3.md`
+**Current pipeline:** `GBufferEffect` → `GBufDeferredEffect` → sink (debug view: albedo×diffuse)
-**Phases:**
-1. ✅ G-buffer: `GBufferEffect` integrated. SDF/shadow placeholder (shadow=1, transp=0).
-2. ✅ Training infrastructure: `blender_export.py`, `pack_blender_sample.py`, `pack_photo_sample.py`
-3. ✅ WGSL shaders: cnn_v3_common (snippet), enc0, enc1, bottleneck, dec1, dec0
-4. ✅ C++ `CNNv3Effect`: 5 compute passes, FiLM uniform upload, `set_film_params()` API
- - Params alignment fix: WGSL `vec3u` align=16 → C++ structs 64/96 bytes
- - Weight offsets as explicit formulas (e.g. `20*4*9+4`)
- - FiLM γ/β: identity defaults; real values require trained MLP (see below)
-5. ✅ Parity validation: test vectors + `test_cnn_v3_parity.cc`. max_err=4.88e-4 (≤1/255).
- - Key fix: intermediate nodes at fractional resolutions (W/2, W/4) via `NodeRegistry::default_width()/default_height()`
+**Active work:**
+- [ ] Fix/validate shadow pass (`gbuf_shadow.wgsl`) — currently disabled in deferred
+- [ ] Re-enable shadow in `GBufDeferredEffect` once validated
+- [ ] Run first real training pass — see `cnn_v3/docs/HOWTO.md` §3
-6. ✅ Training script: `train_cnn_v3.py` + `cnn_v3_utils.py` written
- - ✅ `export_cnn_v3_weights.py` — convert trained `.pth` → `.bin` (f16)
-7. ✅ Validation tools:
- - `GBufViewEffect` — C++ 4×5 channel grid (all 20 G-buffer channels)
- - Web tool "Load sample directory" — G-buffer pack → CNN inference → PSNR
- - See `cnn_v3/docs/HOWTO.md` §9
-
-**Next: run a real training pass**
-- See `cnn_v3/docs/HOWTO.md` §3 for training commands
+**Pending (lower priority):**
+- [ ] GBufferEffect: Pass 3 transparency (transp=0 placeholder)
+- [ ] GBufferEffect: `resize()` support
## Future: CNN v3 "2D Mode" (G-buffer-free)
diff --git a/cnn_v3/shaders/gbuf_deferred.wgsl b/cnn_v3/shaders/gbuf_deferred.wgsl
index dda4b27..2ed4ce3 100644
--- a/cnn_v3/shaders/gbuf_deferred.wgsl
+++ b/cnn_v3/shaders/gbuf_deferred.wgsl
@@ -5,6 +5,7 @@
#include "math/normal"
@group(0) @binding(0) var feat_tex0: texture_2d<u32>;
+@group(0) @binding(1) var feat_tex1: texture_2d<u32>;
@group(0) @binding(2) var<uniform> uniforms: GBufDeferredUniforms;
struct GBufDeferredUniforms {
@@ -39,5 +40,9 @@ fn fs_main(@builtin(position) pos: vec4f) -> @location(0) vec4f {
let normal = oct_decode(vec2f(bx.y, ny_d.x));
let diffuse = max(0.0, dot(normal, KEY_LIGHT));
- return vec4f(albedo * (AMBIENT + diffuse), 1.0);
+ // feat_tex1[2] = pack4x8unorm(mip2.g, mip2.b, shadow, transp)
+ let t1 = textureLoad(feat_tex1, coord, 0);
+ let shadow = unpack4x8unorm(t1.z).z;
+
+ return vec4f(albedo * (AMBIENT + diffuse * shadow), 1.0);
}
diff --git a/cnn_v3/src/gbuf_deferred_effect.cc b/cnn_v3/src/gbuf_deferred_effect.cc
index 1adae5e..de6bd29 100644
--- a/cnn_v3/src/gbuf_deferred_effect.cc
+++ b/cnn_v3/src/gbuf_deferred_effect.cc
@@ -37,12 +37,13 @@ GBufDeferredEffect::GBufDeferredEffect(const GpuContext& ctx,
: Effect(ctx, inputs, outputs, start_time, end_time) {
HEADLESS_RETURN_IF_NULL(ctx_.device);
- WGPUBindGroupLayoutEntry entries[2] = {
+ WGPUBindGroupLayoutEntry entries[3] = {
bgl_uint_tex(0),
+ bgl_uint_tex(1),
bgl_uniform(2, sizeof(GBufDeferredUniforms)),
};
WGPUBindGroupLayoutDescriptor bgl_desc = {};
- bgl_desc.entryCount = 2;
+ bgl_desc.entryCount = 3;
bgl_desc.entries = entries;
WGPUBindGroupLayout bgl = wgpuDeviceCreateBindGroupLayout(ctx_.device, &bgl_desc);
@@ -89,6 +90,7 @@ void GBufDeferredEffect::render(WGPUCommandEncoder encoder,
const UniformsSequenceParams& params,
NodeRegistry& nodes) {
WGPUTextureView feat0_view = nodes.get_view(input_nodes_[0]);
+ WGPUTextureView feat1_view = nodes.get_view(input_nodes_[1]);
WGPUTextureView output_view = nodes.get_view(output_nodes_[0]);
// Upload resolution uniform into the base class uniforms buffer (first 8 bytes).
@@ -101,16 +103,18 @@ void GBufDeferredEffect::render(WGPUCommandEncoder encoder,
WGPUBindGroupLayout bgl =
wgpuRenderPipelineGetBindGroupLayout(pipeline_.get(), 0);
- WGPUBindGroupEntry bg_entries[2] = {};
+ WGPUBindGroupEntry bg_entries[3] = {};
bg_entries[0].binding = 0;
bg_entries[0].textureView = feat0_view;
- bg_entries[1].binding = 2;
- bg_entries[1].buffer = uniforms_buffer_.get().buffer;
- bg_entries[1].size = sizeof(GBufDeferredUniforms);
+ bg_entries[1].binding = 1;
+ bg_entries[1].textureView = feat1_view;
+ bg_entries[2].binding = 2;
+ bg_entries[2].buffer = uniforms_buffer_.get().buffer;
+ bg_entries[2].size = sizeof(GBufDeferredUniforms);
WGPUBindGroupDescriptor bg_desc = {};
bg_desc.layout = bgl;
- bg_desc.entryCount = 2;
+ bg_desc.entryCount = 3;
bg_desc.entries = bg_entries;
bind_group_.replace(wgpuDeviceCreateBindGroup(ctx_.device, &bg_desc));
wgpuBindGroupLayoutRelease(bgl);