| -rw-r--r-- | PROJECT_CONTEXT.md | 8 |
| -rw-r--r-- | TODO.md | 33 |
| -rw-r--r-- | cmake/DemoSourceLists.cmake | 1 |
| -rw-r--r-- | cnn_v3/shaders/gbuf_deferred.wgsl | 48 |
| -rw-r--r-- | cnn_v3/shaders/gbuf_pack.wgsl | 12 |
| -rw-r--r-- | cnn_v3/shaders/gbuf_raster.wgsl | 54 |
| -rw-r--r-- | cnn_v3/shaders/gbuf_shadow.wgsl | 35 |
| -rw-r--r-- | cnn_v3/shaders/gbuf_view.wgsl | 32 |
| -rw-r--r-- | cnn_v3/src/gbuf_deferred_effect.cc | 140 |
| -rw-r--r-- | cnn_v3/src/gbuf_deferred_effect.h | 23 |
| -rw-r--r-- | cnn_v3/src/gbuf_view_effect.cc | 4 |
| -rw-r--r-- | cnn_v3/src/gbuffer_effect.cc | 168 |
| -rw-r--r-- | cnn_v3/src/gbuffer_effect.h | 4 |
| -rw-r--r-- | doc/COMPLETED.md | 8 |
| -rw-r--r-- | src/effects/shaders.cc | 3 |
| -rw-r--r-- | src/effects/shaders.h | 1 |
| -rw-r--r-- | src/gpu/demo_effects.h | 1 |
| -rw-r--r-- | src/shaders/math/normal.wgsl | 34 |
| -rw-r--r-- | src/shaders/ray_sphere.wgsl | 21 |
| -rw-r--r-- | src/tests/gpu/test_demo_effects.cc | 5 |
| -rwxr-xr-x | tools/seq_compiler.py | 3 |
| -rw-r--r-- | workspaces/main/assets.txt | 3 |
| -rw-r--r-- | workspaces/main/timeline.seq | 12 |
23 files changed, 476 insertions, 177 deletions
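Note on the octahedral normal encoding that this change moves into the shared `math/normal` snippet: the following is a minimal C++ sketch of the round trip, intended only as a reading aid. It mirrors the WGSL `oct_encode` and `decode_oct_normal` bodies removed from `gbuf_raster.wgsl` and `gbuf_pack.wgsl` in this diff. The `_unorm` wrappers are assumed to be the `* 0.5 + 0.5` and `* 2.0 - 1.0` remaps seen at the old call sites, since the contents of `src/shaders/math/normal.wgsl` are not shown in this part of the diff. `Vec2`, `Vec3`, `sign_not_zero`, and `roundtrip_error` are illustrative names, not project code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

// Sign that never returns 0, matching select(-1.0, 1.0, v >= 0.0) in the WGSL.
static float sign_not_zero(float v) { return v >= 0.0f ? 1.0f : -1.0f; }

// Unit normal -> [-1,1]^2: project onto the octahedron |x|+|y|+|z| = 1,
// then fold the lower hemisphere (z < 0) into the outer triangles.
Vec2 oct_encode(Vec3 n) {
  const float l1 = std::fabs(n.x) + std::fabs(n.y) + std::fabs(n.z);
  assert(l1 > 0.0f);  // a zero vector has no direction to encode
  Vec2 p = {n.x / l1, n.y / l1};
  if (n.z < 0.0f) {
    const Vec2 q = p;  // keep the unfolded value; the fold swaps components
    p.x = (1.0f - std::fabs(q.y)) * sign_not_zero(q.x);
    p.y = (1.0f - std::fabs(q.x)) * sign_not_zero(q.y);
  }
  return p;
}

// [-1,1]^2 -> unit normal: reconstruct z, undo the fold, renormalize.
Vec3 oct_decode(Vec2 f) {
  Vec3 n = {f.x, f.y, 1.0f - std::fabs(f.x) - std::fabs(f.y)};
  const float t = std::max(-n.z, 0.0f);
  n.x += (n.x >= 0.0f) ? -t : t;
  n.y += (n.y >= 0.0f) ? -t : t;
  const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
  return {n.x / len, n.y / len, n.z / len};
}

// Assumed storage variants: remap [-1,1] <-> [0,1] for unorm-style storage.
Vec2 oct_encode_unorm(Vec3 n) {
  const Vec2 p = oct_encode(n);
  return {p.x * 0.5f + 0.5f, p.y * 0.5f + 0.5f};
}

Vec3 oct_decode_unorm(Vec2 rg) {
  return oct_decode({rg.x * 2.0f - 1.0f, rg.y * 2.0f - 1.0f});
}

// Largest per-component difference after an encode/decode round trip.
float roundtrip_error(Vec3 n) {
  const Vec3 d = oct_decode_unorm(oct_encode_unorm(n));
  return std::max({std::fabs(d.x - n.x), std::fabs(d.y - n.y),
                   std::fabs(d.z - n.z)});
}
```

Passing a unit vector with negative z, such as (1/3, 2/3, -2/3), to `roundtrip_error` exercises the lower-hemisphere fold; in float arithmetic the error should stay at rounding level, before any f16 or 8-bit quantization applied by the G-buffer formats.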
diff --git a/PROJECT_CONTEXT.md b/PROJECT_CONTEXT.md index 3ed265a..9a710f1 100644 --- a/PROJECT_CONTEXT.md +++ b/PROJECT_CONTEXT.md @@ -36,7 +36,7 @@ - **Audio:** Sample-accurate sync. Zero heap allocations per frame. Variable tempo. OLA-IDCT synthesis (v2 .spec): Hann analysis window, rectangular synthesis, 50% overlap, click-free. V1 (raw DCT-512) preserved for generated notes. .spec files regenerated as v2. - **Shaders:** Parameterized effects (UniformHelper, .seq syntax). Beat-synchronized animation support (`beat_time`, `beat_phase`). Modular WGSL composition with ShaderComposer. 27 shared common shaders (math, render, compute). Reusable snippets: `render/scratch_lines`, `render/ntsc_common` (NTSC signal processing, RGB and YIQ input variants via `sample_ntsc_signal` hook), `math/color` (YIQ/NTSC), `math/color_c64` (C64 palette, Bayer dither, border animation). - **3D:** Hybrid SDF/rasterization with BVH. Binary scene loader. Blender pipeline. -- **Effects:** CNN post-processing: CNNEffect (v1) and CNNv2Effect operational. CNN v2: sigmoid activation, storage buffer weights (~3.2 KB), 7D static features, dynamic layers. Training stable, convergence validated. **CNN v3 Phases 1–7 complete:** `CNNv3Effect` C++ class (5 compute passes, FiLM uniform upload, identity γ/β defaults). Parity validated: max_err=4.88e-4 (≤1/255). Validation tools: `GBufViewEffect` (C++ 4×5 channel grid) + web "Load sample directory" (G-buffer pack → CNN inference → PSNR vs target.png). See `cnn_v3/docs/HOWTO.md` §9. +- **Effects:** CNN post-processing: CNNEffect (v1) and CNNv2Effect operational. CNN v2: sigmoid activation, storage buffer weights (~3.2 KB), 7D static features, dynamic layers. Training stable, convergence validated. **CNN v3 Phases 1–7 complete** + runtime pipeline operational: `GBufferEffect` (MRT raster + sphere impostors + SDF shadow pass) → `GBufDeferredEffect` (albedo×diffuse debug view) wired in `cnn_v3_test` sequence. 
Shared snippets: `math/normal` (oct encode/decode), `ray_sphere`. Parity validated: max_err=4.88e-4. See `cnn_v3/docs/HOWTO.md`. - **Tools:** CNN test tool operational. Texture readback utility functional. Timeline editor (web-based, beat-aligned, audio playback). - **Build:** Asset dependency tracking. Size measurement. Hot-reload (debug-only). WSL (Windows 10) supported: native Linux build and cross-compile to `.exe` via `mingw-w64`. - **Sequence:** DAG-based effect routing with explicit node system. Python compiler with topological sort and ping-pong optimization. 12 effects operational (Passthrough, Placeholder, GaussianBlur, Heptagon, Particles, RotatingCube, Hybrid3D, Flash, PeakMeter, Scene1, Scene2, Scratch). Effect times are absolute (seq_compiler adds sequence start offset). See `doc/SEQUENCE.md`. @@ -46,9 +46,9 @@ ## Next Up -**Active:** CNN v3 training (`train_cnn_v3.py`), Spectral Brush Editor -**Ongoing:** Test infrastructure maintenance (35/35 passing) -**Future:** Size optimization (64k target), 3D enhancements +**Active:** CNN v3 shadow ✅ fixed — restore full scene, then training pass. Spectral Brush Editor. +**Ongoing:** Test infrastructure maintenance (38/38 passing) +**Future:** CNN v3 training pass, size optimization (64k target) See `TODO.md` for details. @@ -14,7 +14,7 @@ Procedural spectrogram tool: 50-100× compression (5 KB .spec → ~100 bytes C++ ## Priority 2: Test Infrastructure Maintenance [ONGOING] -**Status:** 35/35 tests passing +**Status:** 38/38 tests passing **Outstanding TODOs:** @@ -62,32 +62,19 @@ Ongoing shader code hygiene for granular, reusable snippets. ## CNN v3 — U-Net + FiLM [IN PROGRESS] -U-Net architecture with FiLM conditioning. Runtime style control via beat/audio. -Richer G-buffer input (normals, depth, material IDs). Per-pixel testability across -PyTorch / HTML WebGPU / C++ WebGPU. +**Design:** `cnn_v3/docs/CNN_V3.md` | All phases 1–7 complete. Runtime pipeline operational. 
-**Design:** `cnn_v3/docs/CNN_V3.md` +**Current pipeline:** `GBufferEffect` → `GBufDeferredEffect` → `GBufViewEffect` → sink -**Phases:** -1. ✅ G-buffer: `GBufferEffect` integrated. SDF/shadow placeholder (shadow=1, transp=0). -2. ✅ Training infrastructure: `blender_export.py`, `pack_blender_sample.py`, `pack_photo_sample.py` -3. ✅ WGSL shaders: cnn_v3_common (snippet), enc0, enc1, bottleneck, dec1, dec0 -4. ✅ C++ `CNNv3Effect`: 5 compute passes, FiLM uniform upload, `set_film_params()` API - - Params alignment fix: WGSL `vec3u` align=16 → C++ structs 64/96 bytes - - Weight offsets as explicit formulas (e.g. `20*4*9+4`) - - FiLM γ/β: identity defaults; real values require trained MLP (see below) -5. ✅ Parity validation: test vectors + `test_cnn_v3_parity.cc`. max_err=4.88e-4 (≤1/255). - - Key fix: intermediate nodes at fractional resolutions (W/2, W/4) via `NodeRegistry::default_width()/default_height()` +**Shadow pass status:** ✅ Fixed and re-enabled. Cube + sphere shadows correct. Pulsating sphere scale confirmed correct end-to-end. Scene is currently simplified (1 cube + 1 sphere, 1 light) for debugging. -6. ✅ Training script: `train_cnn_v3.py` + `cnn_v3_utils.py` written - - ✅ `export_cnn_v3_weights.py` — convert trained `.pth` → `.bin` (f16) -7. 
✅ Validation tools: - - `GBufViewEffect` — C++ 4×5 channel grid (all 20 G-buffer channels) - - Web tool "Load sample directory" — G-buffer pack → CNN inference → PSNR - - See `cnn_v3/docs/HOWTO.md` §9 +**Active work:** +- [ ] Restore full scene in `GBufferEffect::set_scene()` (20 cubes + 4 spheres, 2 lights) +- [ ] Run first real training pass — see `cnn_v3/docs/HOWTO.md` §3 -**Next: run a real training pass** -- See `cnn_v3/docs/HOWTO.md` §3 for training commands +**Pending (lower priority):** +- [ ] GBufferEffect: Pass 3 transparency (transp=0 placeholder) +- [ ] GBufferEffect: `resize()` support ## Future: CNN v3 "2D Mode" (G-buffer-free) diff --git a/cmake/DemoSourceLists.cmake b/cmake/DemoSourceLists.cmake index e5c7339..52278ee 100644 --- a/cmake/DemoSourceLists.cmake +++ b/cmake/DemoSourceLists.cmake @@ -43,6 +43,7 @@ set(COMMON_GPU_EFFECTS cnn_v3/src/gbuffer_effect.cc cnn_v3/src/cnn_v3_effect.cc cnn_v3/src/gbuf_view_effect.cc + cnn_v3/src/gbuf_deferred_effect.cc # TODO: Port CNN effects to v2 (complex v1 dependencies) # cnn_v1/src/cnn_v1_effect.cc # cnn_v2/src/cnn_v2_effect.cc diff --git a/cnn_v3/shaders/gbuf_deferred.wgsl b/cnn_v3/shaders/gbuf_deferred.wgsl new file mode 100644 index 0000000..2ed4ce3 --- /dev/null +++ b/cnn_v3/shaders/gbuf_deferred.wgsl @@ -0,0 +1,48 @@ +// G-buffer deferred render — albedo * diffuse +// Reads feat_tex0 (rgba32uint, f16 geometric). +// Outputs albedo * (ambient + diffuse). 
+ +#include "math/normal" + +@group(0) @binding(0) var feat_tex0: texture_2d<u32>; +@group(0) @binding(1) var feat_tex1: texture_2d<u32>; +@group(0) @binding(2) var<uniform> uniforms: GBufDeferredUniforms; + +struct GBufDeferredUniforms { + resolution: vec2f, +} + +const KEY_LIGHT: vec3f = vec3f(0.408, 0.816, 0.408); // normalize(1,2,1) +const AMBIENT: f32 = 0.15; + +@vertex +fn vs_main(@builtin(vertex_index) vid: u32) -> @builtin(position) vec4f { + let x = f32((vid & 1u) << 2u) - 1.0; + let y = f32((vid & 2u) << 1u) - 1.0; + return vec4f(x, y, 0.0, 1.0); +} + +@fragment +fn fs_main(@builtin(position) pos: vec4f) -> @location(0) vec4f { + let coord = vec2i(pos.xy); + + // feat_tex0 layout: + // [0] pack2x16float(albedo.r, albedo.g) + // [1] pack2x16float(albedo.b, normal.x) oct [-1,1] + // [2] pack2x16float(normal.y, depth) oct [-1,1] + let t0 = textureLoad(feat_tex0, coord, 0); + let rg = unpack2x16float(t0.x); + let bx = unpack2x16float(t0.y); + let ny_d = unpack2x16float(t0.z); + let albedo = vec3f(rg.x, rg.y, bx.x); + + // Decode oct-normal (stored in [-1,1] — use oct_decode directly) + let normal = oct_decode(vec2f(bx.y, ny_d.x)); + let diffuse = max(0.0, dot(normal, KEY_LIGHT)); + + // feat_tex1[2] = pack4x8unorm(mip2.g, mip2.b, shadow, transp) + let t1 = textureLoad(feat_tex1, coord, 0); + let shadow = unpack4x8unorm(t1.z).z; + + return vec4f(albedo * (AMBIENT + diffuse * shadow), 1.0); +} diff --git a/cnn_v3/shaders/gbuf_pack.wgsl b/cnn_v3/shaders/gbuf_pack.wgsl index 71d8471..333589c 100644 --- a/cnn_v3/shaders/gbuf_pack.wgsl +++ b/cnn_v3/shaders/gbuf_pack.wgsl @@ -1,4 +1,5 @@ // G-buffer pack compute shader for CNN v3 +#include "math/normal" // Pass 4: Pack all G-buffer channels into two rgba32uint feature textures (32 bytes/pixel) // Output feat_tex0 holds 8×f16 geometric channels; feat_tex1 holds 12×u8 context channels. 
@@ -44,15 +45,6 @@ fn box4(tl: vec2i) -> vec3f { return acc * (1.0 / 16.0); } -// Decode oct-normal from [0,1] storage → [-1,1] encoded xy → reconstruct z. -fn decode_oct_normal(rg: vec2f) -> vec3f { - let f = rg * 2.0 - vec2f(1.0); - var n = vec3f(f.x, f.y, 1.0 - abs(f.x) - abs(f.y)); - let t = max(-n.z, 0.0); - n.x += select(t, -t, n.x >= 0.0); - n.y += select(t, -t, n.y >= 0.0); - return normalize(n); -} @compute @workgroup_size(8, 8) fn pack_features(@builtin(global_invocation_id) id: vec3u) { @@ -73,7 +65,7 @@ fn pack_features(@builtin(global_invocation_id) id: vec3u) { // Normal: stored as oct-encoded [0,1] in RG; extract just the encoded xy for feat_tex0 let normal_enc = nm.rg; // already in [0,1] — decode to get the xy for CNN input - let n3 = decode_oct_normal(normal_enc); + let n3 = oct_decode_unorm(normal_enc); // Store oct-encoded in [-1,1] remapped back to what CNN expects (the [-1,1] oct xy) let oct_xy = normal_enc * 2.0 - vec2f(1.0); // remap [0,1] → [-1,1] diff --git a/cnn_v3/shaders/gbuf_raster.wgsl b/cnn_v3/shaders/gbuf_raster.wgsl index c762db2..aed0e72 100644 --- a/cnn_v3/shaders/gbuf_raster.wgsl +++ b/cnn_v3/shaders/gbuf_raster.wgsl @@ -1,8 +1,11 @@ // G-buffer rasterization shader for CNN v3 // Pass 1: Proxy geometry → MRT (albedo rgba16float, normal_mat rgba16float, depth32) // Uses GlobalUniforms, ObjectData, ObjectsBuffer from common_uniforms. +// SPHERE objects use ray-sphere impostor (correct silhouette + normal + depth). 
#include "common_uniforms" +#include "math/normal" +#include "ray_sphere" @group(0) @binding(0) var<uniform> globals: GlobalUniforms; @group(0) @binding(1) var<storage, read> object_data: ObjectsBuffer; @@ -15,21 +18,6 @@ struct VertexOutput { @location(3) @interpolate(flat) instance_index: u32, } -// Octahedral encoding: maps unit normal to [-1,1]^2 -fn oct_encode(n: vec3f) -> vec2f { - let inv_l1 = 1.0 / (abs(n.x) + abs(n.y) + abs(n.z)); - var p = n.xy * inv_l1; - // Fold lower hemisphere - if (n.z < 0.0) { - let s = vec2f( - select(-1.0, 1.0, p.x >= 0.0), - select(-1.0, 1.0, p.y >= 0.0) - ); - p = (1.0 - abs(p.yx)) * s; - } - return p; // in [-1, 1] -} - @vertex fn vs_main( @builtin(vertex_index) vertex_index: u32, @@ -73,8 +61,8 @@ fn vs_main( let world_pos = obj.model * vec4f(p, 1.0); let clip_pos = globals.view_proj * world_pos; - // Transform normal by inverse-transpose (upper-left 3×3 of inv_model^T) - let world_normal = normalize((obj.inv_model * vec4f(n, 0.0)).xyz); + // Transform normal: use model matrix (correct for uniform scale + rotation). 
+ let world_normal = normalize((obj.model * vec4f(n, 0.0)).xyz); var out: VertexOutput; out.position = clip_pos; @@ -86,20 +74,40 @@ fn vs_main( } struct GBufOutput { - @location(0) albedo: vec4f, // rgba16float: material color - @location(1) normal_mat: vec4f, // rgba16float: oct-normal XY in RG, mat_id/255 in B + @location(0) albedo: vec4f, // rgba16float: material color + @location(1) normal_mat: vec4f, // rgba16float: oct-normal XY in RG, mat_id/255 in B + @builtin(frag_depth) depth: f32, // corrected depth (sphere impostor) } @fragment fn fs_main(in: VertexOutput) -> GBufOutput { - let obj = object_data.objects[in.instance_index]; - let mat_id = f32(in.instance_index) / 255.0; + let obj = object_data.objects[in.instance_index]; + let obj_type = u32(obj.params.x); + let mat_id = f32(in.instance_index) / 255.0; + + var world_normal = normalize(in.world_normal); + var frag_depth = in.position.z; // default: hardware depth + + // Sphere impostor: ray-sphere intersection for correct silhouette and normal. + if (obj_type == 1u) { + let sphere_center = obj.model[3].xyz; + let sphere_radius = length(obj.model[0].xyz); // uniform scale in col0 + let cam_pos = globals.camera_pos_time.xyz; + let rd = normalize(in.world_pos - cam_pos); + let isect = ray_sphere_intersection(cam_pos, rd, sphere_center, sphere_radius); + if (!isect.hit) { discard; } + let hit = cam_pos + rd * isect.t; + world_normal = normalize(hit - sphere_center); + // Reproject hit point to get correct clip-space depth. 
+ let clip_hit = globals.view_proj * vec4f(hit, 1.0); + frag_depth = clip_hit.z / clip_hit.w; + } - // Oct-encode world normal, remap [-1,1] → [0,1] for storage - let oct = oct_encode(normalize(in.world_normal)) * 0.5 + vec2f(0.5); + let oct = oct_encode_unorm(world_normal); var out: GBufOutput; out.albedo = vec4f(in.color.rgb, 1.0); out.normal_mat = vec4f(oct.x, oct.y, mat_id, 0.0); + out.depth = frag_depth; return out; } diff --git a/cnn_v3/shaders/gbuf_shadow.wgsl b/cnn_v3/shaders/gbuf_shadow.wgsl index 0f5f8b4..735e47c 100644 --- a/cnn_v3/shaders/gbuf_shadow.wgsl +++ b/cnn_v3/shaders/gbuf_shadow.wgsl @@ -5,11 +5,13 @@ #include "common_uniforms" #include "camera_common" #include "math/sdf_shapes" +#include "math/normal" #include "render/raymarching_id" @group(0) @binding(0) var<uniform> globals: GlobalUniforms; @group(0) @binding(1) var<storage, read> object_data: ObjectsBuffer; @group(0) @binding(2) var depth_tex: texture_depth_2d; +@group(0) @binding(4) var normal_mat_tex: texture_2d<f32>; struct GBufLight { direction: vec4f, // xyz = toward light (world space, normalized) @@ -38,12 +40,14 @@ fn dfWithID(p: vec3f) -> RayMarchResult { let obj = object_data.objects[i]; let lp = (obj.inv_model * vec4f(p, 1.0)).xyz; let obj_type = u32(obj.params.x); + // Scale factor: convert local-space SDF to world-space distance. 
+ let scale = length(obj.model[0].xyz); var d: f32; switch obj_type { - case 1u: { d = sdSphere(lp, 1.0); } // SPHERE - case 2u: { d = sdPlane(lp, vec3f(0.0, 1.0, 0.0), obj.params.y); } // PLANE - case 3u: { d = sdTorus(lp, vec2f(0.8, 0.2)); } // TORUS - default: { d = sdBox(lp, vec3f(1.0)); } // CUBE (0) + fallback + case 1u: { d = sdSphere(lp, 1.0) * scale; } // SPHERE + case 2u: { d = sdPlane(lp, vec3f(0.0, 1.0, 0.0), obj.params.y); } // PLANE + case 3u: { d = sdTorus(lp, vec2f(0.8, 0.2)) * scale; } // TORUS + default: { d = sdBox(lp, vec3f(1.0)) * scale; } // CUBE (0) + fallback } if (d < res.distance) { res.distance = d; @@ -53,6 +57,20 @@ fn dfWithID(p: vec3f) -> RayMarchResult { return res; } +// Soft shadow march (IQ formula). Returns 1=lit, 0=shadow. +// No dmin/dmax bounds: in open space d grows large so 8*d/t >> 1, res stays 1 naturally. +fn soft_shadow(ro: vec3f, rd: vec3f) -> f32 { + var t = 0.001; + var res = 1.0; + for (var i = 0; i < 64; i++) { + let d = dfWithID(ro + rd * t).distance; + if (d < 0.0005) { return 0.0; } + res = min(res, 8.0 * d / t); + t += d; + } + return clamp(res, 0.0, 1.0); +} + // ---- Vertex: fullscreen triangle ---- @vertex @@ -82,16 +100,17 @@ fn fs_main(@builtin(position) pos: vec4f) -> @location(0) vec4f { let clip = globals.inv_view_proj * vec4f(ndc, depth, 1.0); let world = clip.xyz / clip.w; - // Surface normal estimated from SDF gradient. - let nor = normalWithID(world); - let bias_pos = world + nor * 0.02; + // Use rasterized surface normal for bias — correct for sphere impostors. + let nm = textureLoad(normal_mat_tex, vec2i(pos.xy), 0); + let nor = oct_decode_unorm(nm.rg); + let bias_pos = world + nor * 0.05; // March shadow rays toward each light; take the darkest value. 
var shadow_val = 1.0; let num_lights = u32(lights.params.x); for (var i = 0u; i < num_lights; i++) { let ld = lights.lights[i].direction.xyz; - let s = shadowWithStoredDistance(bias_pos, ld, MAX_RAY_LENGTH); + let s = soft_shadow(bias_pos, ld); shadow_val = min(shadow_val, s); } diff --git a/cnn_v3/shaders/gbuf_view.wgsl b/cnn_v3/shaders/gbuf_view.wgsl index a5e6c91..3e7d1ff 100644 --- a/cnn_v3/shaders/gbuf_view.wgsl +++ b/cnn_v3/shaders/gbuf_view.wgsl @@ -9,6 +9,8 @@ // Row 3: ch12(m1.r) ch13(m1.g) ch14(m1.b) ch15(m2.r) // Row 4: ch16(m2.g) ch17(m2.b) ch18(shdw) ch19(trns) +#include "debug/debug_print" + struct GBufViewUniforms { resolution: vec2f } @group(0) @binding(0) var feat0: texture_2d<u32>; @@ -93,5 +95,33 @@ fn fs_main(@builtin(position) pos: vec4f) -> @location(0) vec4f { disp = clamp(v, 0.0, 1.0); } - return vec4f(disp, disp, disp, 1.0); + var out = vec4f(disp, disp, disp, 1.0); + + // Label at top-left of each tile + let tile_w = u.resolution.x / 4.0; + let tile_h = u.resolution.y / 5.0; + let origin = vec2f(f32(col) * tile_w + 4.0, f32(row) * tile_h + 4.0); + switch ch { + case 0u: { out = debug_str(out, pos.xy, origin, vec4u(0x616C622Eu, 0x72000000u, 0u, 0u), 5u); } // alb.r + case 1u: { out = debug_str(out, pos.xy, origin, vec4u(0x616C622Eu, 0x67000000u, 0u, 0u), 5u); } // alb.g + case 2u: { out = debug_str(out, pos.xy, origin, vec4u(0x616C622Eu, 0x62000000u, 0u, 0u), 5u); } // alb.b + case 3u: { out = debug_str(out, pos.xy, origin, vec4u(0x6E726D2Eu, 0x78000000u, 0u, 0u), 5u); } // nrm.x + case 4u: { out = debug_str(out, pos.xy, origin, vec4u(0x6E726D2Eu, 0x79000000u, 0u, 0u), 5u); } // nrm.y + case 5u: { out = debug_str(out, pos.xy, origin, vec4u(0x64657074u, 0x68000000u, 0u, 0u), 5u); } // depth + case 6u: { out = debug_str(out, pos.xy, origin, vec4u(0x647A6478u, 0u, 0u, 0u), 4u); } // dzdx + case 7u: { out = debug_str(out, pos.xy, origin, vec4u(0x647A6479u, 0u, 0u, 0u), 4u); } // dzdy + case 8u: { out = debug_str(out, pos.xy, origin, 
vec4u(0x6D617469u, 0x64000000u, 0u, 0u), 5u); } // matid + case 9u: { out = debug_str(out, pos.xy, origin, vec4u(0x7072762Eu, 0x72000000u, 0u, 0u), 5u); } // prv.r + case 10u: { out = debug_str(out, pos.xy, origin, vec4u(0x7072762Eu, 0x67000000u, 0u, 0u), 5u); } // prv.g + case 11u: { out = debug_str(out, pos.xy, origin, vec4u(0x7072762Eu, 0x62000000u, 0u, 0u), 5u); } // prv.b + case 12u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D312E72u, 0u, 0u, 0u), 4u); } // m1.r + case 13u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D312E67u, 0u, 0u, 0u), 4u); } // m1.g + case 14u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D312E62u, 0u, 0u, 0u), 4u); } // m1.b + case 15u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D322E72u, 0u, 0u, 0u), 4u); } // m2.r + case 16u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D322E67u, 0u, 0u, 0u), 4u); } // m2.g + case 17u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D322E62u, 0u, 0u, 0u), 4u); } // m2.b + case 18u: { out = debug_str(out, pos.xy, origin, vec4u(0x73686477u, 0u, 0u, 0u), 4u); } // shdw + default: { out = debug_str(out, pos.xy, origin, vec4u(0x74726E73u, 0u, 0u, 0u), 4u); } // trns + } + return out; } diff --git a/cnn_v3/src/gbuf_deferred_effect.cc b/cnn_v3/src/gbuf_deferred_effect.cc new file mode 100644 index 0000000..de6bd29 --- /dev/null +++ b/cnn_v3/src/gbuf_deferred_effect.cc @@ -0,0 +1,140 @@ +// GBufDeferredEffect — simple deferred render: albedo * shadow from packed G-buffer. 
+ +#include "gbuf_deferred_effect.h" +#include "gpu/gpu.h" +#include "gpu/shader_composer.h" +#include "util/fatal_error.h" + +extern const char* gbuf_deferred_wgsl; + +struct GBufDeferredUniforms { + float resolution[2]; +}; +static_assert(sizeof(GBufDeferredUniforms) == 8, "GBufDeferredUniforms must be 8 bytes"); + +static WGPUBindGroupLayoutEntry bgl_uint_tex(uint32_t binding) { + WGPUBindGroupLayoutEntry e = {}; + e.binding = binding; + e.visibility = WGPUShaderStage_Fragment; + e.texture.sampleType = WGPUTextureSampleType_Uint; + e.texture.viewDimension = WGPUTextureViewDimension_2D; + return e; +} + +static WGPUBindGroupLayoutEntry bgl_uniform(uint32_t binding, uint64_t min_size) { + WGPUBindGroupLayoutEntry e = {}; + e.binding = binding; + e.visibility = WGPUShaderStage_Fragment; + e.buffer.type = WGPUBufferBindingType_Uniform; + e.buffer.minBindingSize = min_size; + return e; +} + +GBufDeferredEffect::GBufDeferredEffect(const GpuContext& ctx, + const std::vector<std::string>& inputs, + const std::vector<std::string>& outputs, + float start_time, float end_time) + : Effect(ctx, inputs, outputs, start_time, end_time) { + HEADLESS_RETURN_IF_NULL(ctx_.device); + + WGPUBindGroupLayoutEntry entries[3] = { + bgl_uint_tex(0), + bgl_uint_tex(1), + bgl_uniform(2, sizeof(GBufDeferredUniforms)), + }; + WGPUBindGroupLayoutDescriptor bgl_desc = {}; + bgl_desc.entryCount = 3; + bgl_desc.entries = entries; + WGPUBindGroupLayout bgl = wgpuDeviceCreateBindGroupLayout(ctx_.device, &bgl_desc); + + WGPUPipelineLayoutDescriptor pl_desc = {}; + pl_desc.bindGroupLayoutCount = 1; + pl_desc.bindGroupLayouts = &bgl; + WGPUPipelineLayout pl = wgpuDeviceCreatePipelineLayout(ctx_.device, &pl_desc); + + WGPUShaderSourceWGSL wgsl_src = {}; + wgsl_src.chain.sType = WGPUSType_ShaderSourceWGSL; + const std::string composed = ShaderComposer::Get().Compose({}, gbuf_deferred_wgsl); + wgsl_src.code = str_view(composed.c_str()); + WGPUShaderModuleDescriptor shader_desc = {}; + 
 shader_desc.nextInChain = &wgsl_src.chain; + WGPUShaderModule shader = wgpuDeviceCreateShaderModule(ctx_.device, &shader_desc); + + WGPUColorTargetState target = {}; + target.format = WGPUTextureFormat_RGBA8Unorm; + target.writeMask = WGPUColorWriteMask_All; + + WGPUFragmentState frag = {}; + frag.module = shader; + frag.entryPoint = str_view("fs_main"); + frag.targetCount = 1; + frag.targets = &target; + + WGPURenderPipelineDescriptor pipe_desc = {}; + pipe_desc.layout = pl; + pipe_desc.vertex.module = shader; + pipe_desc.vertex.entryPoint = str_view("vs_main"); + pipe_desc.fragment = &frag; + pipe_desc.primitive.topology = WGPUPrimitiveTopology_TriangleList; + pipe_desc.multisample.count = 1; + pipe_desc.multisample.mask = UINT32_MAX; + + pipeline_.set(wgpuDeviceCreateRenderPipeline(ctx_.device, &pipe_desc)); + + wgpuShaderModuleRelease(shader); + wgpuPipelineLayoutRelease(pl); + wgpuBindGroupLayoutRelease(bgl); +} + +void GBufDeferredEffect::render(WGPUCommandEncoder encoder, + const UniformsSequenceParams& params, + NodeRegistry& nodes) { + WGPUTextureView feat0_view = nodes.get_view(input_nodes_[0]); + WGPUTextureView feat1_view = nodes.get_view(input_nodes_[1]); + WGPUTextureView output_view = nodes.get_view(output_nodes_[0]); + + // Upload resolution uniform into the base class uniforms buffer (first 8 bytes). 
+ GBufDeferredUniforms u; + u.resolution[0] = params.resolution.x; + u.resolution[1] = params.resolution.y; + wgpuQueueWriteBuffer(ctx_.queue, uniforms_buffer_.get().buffer, 0, + &u, sizeof(u)); + + WGPUBindGroupLayout bgl = + wgpuRenderPipelineGetBindGroupLayout(pipeline_.get(), 0); + + WGPUBindGroupEntry bg_entries[3] = {}; + bg_entries[0].binding = 0; + bg_entries[0].textureView = feat0_view; + bg_entries[1].binding = 1; + bg_entries[1].textureView = feat1_view; + bg_entries[2].binding = 2; + bg_entries[2].buffer = uniforms_buffer_.get().buffer; + bg_entries[2].size = sizeof(GBufDeferredUniforms); + + WGPUBindGroupDescriptor bg_desc = {}; + bg_desc.layout = bgl; + bg_desc.entryCount = 3; + bg_desc.entries = bg_entries; + bind_group_.replace(wgpuDeviceCreateBindGroup(ctx_.device, &bg_desc)); + wgpuBindGroupLayoutRelease(bgl); + + WGPURenderPassColorAttachment color_att = {}; + color_att.view = output_view; + color_att.loadOp = WGPULoadOp_Clear; + color_att.storeOp = WGPUStoreOp_Store; + color_att.clearValue = {0.0f, 0.0f, 0.0f, 1.0f}; + color_att.depthSlice = WGPU_DEPTH_SLICE_UNDEFINED; + + WGPURenderPassDescriptor pass_desc = {}; + pass_desc.colorAttachmentCount = 1; + pass_desc.colorAttachments = &color_att; + + WGPURenderPassEncoder pass = + wgpuCommandEncoderBeginRenderPass(encoder, &pass_desc); + wgpuRenderPassEncoderSetPipeline(pass, pipeline_.get()); + wgpuRenderPassEncoderSetBindGroup(pass, 0, bind_group_.get(), 0, nullptr); + wgpuRenderPassEncoderDraw(pass, 3, 1, 0, 0); + wgpuRenderPassEncoderEnd(pass); + wgpuRenderPassEncoderRelease(pass); +} diff --git a/cnn_v3/src/gbuf_deferred_effect.h b/cnn_v3/src/gbuf_deferred_effect.h new file mode 100644 index 0000000..4daf13d --- /dev/null +++ b/cnn_v3/src/gbuf_deferred_effect.h @@ -0,0 +1,23 @@ +// GBufDeferredEffect — simple deferred render from packed G-buffer. +// Inputs: feat_tex0, feat_tex1 (rgba32uint). Output: albedo * shadow (rgba8unorm). 
+ +#pragma once +#include "gpu/effect.h" +#include "gpu/uniform_helper.h" +#include "gpu/wgpu_resource.h" + +class GBufDeferredEffect : public Effect { + public: + GBufDeferredEffect(const GpuContext& ctx, + const std::vector<std::string>& inputs, + const std::vector<std::string>& outputs, + float start_time, float end_time); + + void render(WGPUCommandEncoder encoder, + const UniformsSequenceParams& params, + NodeRegistry& nodes) override; + + private: + RenderPipeline pipeline_; + BindGroup bind_group_; +}; diff --git a/cnn_v3/src/gbuf_view_effect.cc b/cnn_v3/src/gbuf_view_effect.cc index 180919d..ccf80b0 100644 --- a/cnn_v3/src/gbuf_view_effect.cc +++ b/cnn_v3/src/gbuf_view_effect.cc @@ -10,6 +10,7 @@ #endif #include "gpu/gpu.h" +#include "gpu/shader_composer.h" #include "util/asset_manager.h" #include "util/fatal_error.h" @@ -63,7 +64,8 @@ GBufViewEffect::GBufViewEffect(const GpuContext& ctx, // Shader module WGPUShaderSourceWGSL wgsl_src = {}; wgsl_src.chain.sType = WGPUSType_ShaderSourceWGSL; - wgsl_src.code = str_view(gbuf_view_wgsl); + const std::string composed = ShaderComposer::Get().Compose({}, gbuf_view_wgsl); + wgsl_src.code = str_view(composed.c_str()); WGPUShaderModuleDescriptor shader_desc = {}; shader_desc.nextInChain = &wgsl_src.chain; WGPUShaderModule shader = diff --git a/cnn_v3/src/gbuffer_effect.cc b/cnn_v3/src/gbuffer_effect.cc index 89ed8fc..b059915 100644 --- a/cnn_v3/src/gbuffer_effect.cc +++ b/cnn_v3/src/gbuffer_effect.cc @@ -42,18 +42,6 @@ struct GBufGlobalUniforms { static_assert(sizeof(GBufGlobalUniforms) == sizeof(float) * 44, "GBufGlobalUniforms must be 176 bytes"); -// Create bilinear sampler. 
-static WGPUSampler create_bilinear_sampler(WGPUDevice device) { - WGPUSamplerDescriptor desc = {}; - desc.addressModeU = WGPUAddressMode_ClampToEdge; - desc.addressModeV = WGPUAddressMode_ClampToEdge; - desc.magFilter = WGPUFilterMode_Linear; - desc.minFilter = WGPUFilterMode_Linear; - desc.mipmapFilter = WGPUMipmapFilterMode_Linear; - desc.maxAnisotropy = 1; - return wgpuDeviceCreateSampler(device, &desc); -} - // ---- GBufferEffect ---- GBufferEffect::GBufferEffect(const GpuContext& ctx, @@ -70,9 +58,6 @@ GBufferEffect::GBufferEffect(const GpuContext& ctx, node_depth_ = prefix + "_depth"; node_shadow_ = prefix + "_shadow"; node_transp_ = prefix + "_transp"; - node_feat0_ = outputs.size() > 0 ? outputs[0] : prefix + "_feat0"; - node_feat1_ = outputs.size() > 1 ? outputs[1] : prefix + "_feat1"; - // Allocate GPU buffers for scene data. global_uniforms_buf_ = gpu_create_buffer(ctx_.device, sizeof(GBufGlobalUniforms), @@ -89,6 +74,8 @@ GBufferEffect::GBufferEffect(const GpuContext& ctx, create_raster_pipeline(); create_shadow_pipeline(); create_pack_pipeline(); + + set_scene(); } void GBufferEffect::declare_nodes(NodeRegistry& registry) { @@ -99,11 +86,11 @@ void GBufferEffect::declare_nodes(NodeRegistry& registry) { registry.declare_node(node_transp_, NodeType::GBUF_R8, -1, -1); // feat_tex0 / feat_tex1 are the declared output_nodes_ — they get registered // by the sequence infrastructure; declare them here as well if not already. 
- if (!registry.has_node(node_feat0_)) { - registry.declare_node(node_feat0_, NodeType::GBUF_RGBA32UINT, -1, -1); + if (!registry.has_node(output_nodes_[0])) { + registry.declare_node(output_nodes_[0], NodeType::GBUF_RGBA32UINT, -1, -1); } - if (!registry.has_node(node_feat1_)) { - registry.declare_node(node_feat1_, NodeType::GBUF_RGBA32UINT, -1, -1); + if (!registry.has_node(output_nodes_[1])) { + registry.declare_node(output_nodes_[1], NodeType::GBUF_RGBA32UINT, -1, -1); } } @@ -122,50 +109,29 @@ void GBufferEffect::set_scene() { }; auto rrange = [&](float lo, float hi) { return lo + rnd() * (hi - lo); }; - // 20 small cubes scattered in a [-2,2]×[-1.5,1.5]×[-1.5,1.5] volume. - static const int kNumCubes = 20; - for (int i = 0; i < kNumCubes; ++i) { + // 2 large cubes. + // 2 large static cubes for shadow debugging. + { Object3D obj(ObjectType::CUBE); - obj.position = vec3(rrange(-2.0f, 2.0f), - rrange(-1.5f, 1.5f), - rrange(-1.5f, 1.5f)); - const float s = rrange(0.10f, 0.25f); - obj.scale = vec3(s, s, s); - obj.color = vec4(rrange(0.4f, 1.0f), - rrange(0.4f, 1.0f), - rrange(0.4f, 1.0f), 1.0f); - - // Random rotation axis (avoid degenerate zero-length axis). - vec3 axis = vec3(rrange(-1.0f, 1.0f), - rrange(-1.0f, 1.0f), - rrange(-1.0f, 1.0f)); - if (axis.len() < 0.01f) axis = vec3(0.0f, 1.0f, 0.0f); - axis = axis.normalize(); - const float speed = rrange(0.3f, 1.5f) * (rnd() > 0.5f ? 1.0f : -1.0f); - + obj.position = vec3(-1.0f, 0.0f, 0.0f); + obj.scale = vec3(0.6f, 0.6f, 0.6f); + obj.color = vec4(0.9f, 0.5f, 0.3f, 1.0f); scene_.add_object(obj); - cube_anims_.push_back({axis, speed}); + cube_anims_.push_back({{0.0f, 1.0f, 0.0f}, 0.0f}); } - - // 4 pumping spheres at fixed positions; radius modulated by audio_intensity. 
- static const vec3 kSpherePos[4] = { - { 0.0f, 0.0f, 0.0f}, - { 1.5f, 0.5f, -0.5f}, - {-1.5f, -0.5f, 0.5f}, - { 0.0f, 1.0f, 1.0f}, - }; - static const float kBaseSphereRadius[4] = {0.35f, 0.28f, 0.30f, 0.25f}; - for (int i = 0; i < 4; ++i) { + { Object3D obj(ObjectType::SPHERE); - obj.position = kSpherePos[i]; - const float r = kBaseSphereRadius[i]; - obj.scale = vec3(r, r, r); - obj.color = vec4(0.85f, 0.60f, 0.95f, 1.0f); + obj.position = vec3(1.0f, 0.0f, 0.0f); + const float r = 0.9f; + obj.scale = vec3(r, r, r); + obj.color = vec4(0.3f, 0.6f, 0.9f, 1.0f); const int idx = (int)scene_.objects.size(); scene_.add_object(obj); sphere_anims_.push_back({idx, r}); } + // (sphere removed for shadow debugging) + // Camera: above and in front of the scene, looking at origin. camera_.set_look_at(vec3(0.0f, 2.5f, 6.0f), vec3(0.0f, 0.0f, 0.0f), @@ -178,6 +144,22 @@ void GBufferEffect::set_scene() { scene_ready_ = true; } +static void clear_r8_node(WGPUCommandEncoder encoder, WGPUTextureView view, + float value) { + WGPURenderPassColorAttachment att = {}; + att.view = view; + att.loadOp = WGPULoadOp_Clear; + att.storeOp = WGPUStoreOp_Store; + att.clearValue = {value, value, value, value}; + att.depthSlice = WGPU_DEPTH_SLICE_UNDEFINED; + WGPURenderPassDescriptor pd = {}; + pd.colorAttachmentCount = 1; + pd.colorAttachments = &att; + WGPURenderPassEncoder p = wgpuCommandEncoderBeginRenderPass(encoder, &pd); + wgpuRenderPassEncoderEnd(p); + wgpuRenderPassEncoderRelease(p); +} + void GBufferEffect::render(WGPUCommandEncoder encoder, const UniformsSequenceParams& params, NodeRegistry& nodes) { @@ -188,6 +170,13 @@ void GBufferEffect::render(WGPUCommandEncoder encoder, // Update camera aspect ratio from current resolution. camera_.aspect_ratio = params.aspect_ratio; + // Slowly orbit around the scene. 
+ const float angle = params.time * 0.3f; + const float R = 6.0f; + camera_.set_look_at(vec3(R * sinf(angle), 2.5f, R * cosf(angle)), + vec3(0.0f, 0.0f, 0.0f), + vec3(0.0f, 1.0f, 0.0f)); + // Animate cubes: axis-angle rotation driven by physical time. for (int i = 0; i < (int)cube_anims_.size(); ++i) { const CubeAnim& a = cube_anims_[(size_t)i]; @@ -203,13 +192,13 @@ void GBufferEffect::render(WGPUCommandEncoder encoder, // Upload two directional lights. { GBufLightsUniforms lu = {}; - lu.params = vec4(2.0f, 0.0f, 0.0f, 0.0f); + lu.params = vec4(1.0f, 0.0f, 0.0f, 0.0f); // Key: warm sun, upper-right-front. lu.lights[0].direction = vec4(0.408f, 0.816f, 0.408f, 0.0f); // norm(1,2,1) lu.lights[0].color = vec4(1.00f, 0.92f, 0.78f, 1.0f); - // Fill: cool sky, upper-left-back. - lu.lights[1].direction = vec4(-0.577f, 0.577f, -0.577f, 0.0f); // norm(-1,1,-1) - lu.lights[1].color = vec4(0.40f, 0.45f, 0.80f, 0.4f); + // Fill: cool sky, upper-left-back. (disabled for debugging) + // lu.lights[1].direction = vec4(-0.577f, 0.577f, -0.577f, 0.0f); + // lu.lights[1].color = vec4(0.40f, 0.45f, 0.80f, 0.4f); lights_uniform_.update(ctx_.queue, lu); } @@ -225,8 +214,8 @@ void GBufferEffect::render(WGPUCommandEncoder encoder, WGPUTextureView albedo_view = nodes.get_view(node_albedo_); WGPUTextureView normal_mat_view = nodes.get_view(node_normal_mat_); WGPUTextureView depth_view = nodes.get_view(node_depth_); - WGPUTextureView feat0_view = nodes.get_view(node_feat0_); - WGPUTextureView feat1_view = nodes.get_view(node_feat1_); + WGPUTextureView feat0_view = nodes.get_view(output_nodes_[0]); + WGPUTextureView feat1_view = nodes.get_view(output_nodes_[1]); // prev_cnn: first input node if available, else dummy. 
WGPUTextureView prev_view = nullptr; @@ -291,7 +280,7 @@ void GBufferEffect::render(WGPUCommandEncoder encoder, // --- Pass 2: SDF shadow raymarching --- if (shadow_pipeline_.get() != nullptr) { - WGPUBindGroupEntry shadow_entries[4] = {}; + WGPUBindGroupEntry shadow_entries[5] = {}; shadow_entries[0].binding = 0; shadow_entries[0].buffer = global_uniforms_buf_.buffer; shadow_entries[0].size = sizeof(GBufGlobalUniforms); @@ -307,12 +296,15 @@ void GBufferEffect::render(WGPUCommandEncoder encoder, shadow_entries[3].buffer = lights_uniform_.get().buffer; shadow_entries[3].size = sizeof(GBufLightsUniforms); + shadow_entries[4].binding = 4; + shadow_entries[4].textureView = normal_mat_view; + WGPUBindGroupLayout shadow_bgl = wgpuRenderPipelineGetBindGroupLayout(shadow_pipeline_.get(), 0); WGPUBindGroupDescriptor shadow_bg_desc = {}; shadow_bg_desc.layout = shadow_bgl; - shadow_bg_desc.entryCount = 4; + shadow_bg_desc.entryCount = 5; shadow_bg_desc.entries = shadow_entries; WGPUBindGroup shadow_bg = @@ -340,36 +332,11 @@ void GBufferEffect::render(WGPUCommandEncoder encoder, wgpuBindGroupRelease(shadow_bg); } else { // Fallback: clear to 1.0 (fully lit) if pipeline not ready. - WGPURenderPassColorAttachment att = {}; - att.view = nodes.get_view(node_shadow_); - att.loadOp = WGPULoadOp_Clear; - att.storeOp = WGPUStoreOp_Store; - att.clearValue = {1.0f, 1.0f, 1.0f, 1.0f}; - att.depthSlice = WGPU_DEPTH_SLICE_UNDEFINED; - WGPURenderPassDescriptor pd = {}; - pd.colorAttachmentCount = 1; - pd.colorAttachments = &att; - WGPURenderPassEncoder p = wgpuCommandEncoderBeginRenderPass(encoder, &pd); - wgpuRenderPassEncoderEnd(p); - wgpuRenderPassEncoderRelease(p); + clear_r8_node(encoder, nodes.get_view(node_shadow_), 1.0f); } // Pass 3: Transparency — TODO (deferred; opaque scenes only) - // Clear transp node to 0.0 (fully opaque) until pass 3 is implemented. 
- { - WGPURenderPassColorAttachment att = {}; - att.view = nodes.get_view(node_transp_); - att.loadOp = WGPULoadOp_Clear; - att.storeOp = WGPUStoreOp_Store; - att.clearValue = {0.0f, 0.0f, 0.0f, 0.0f}; - att.depthSlice = WGPU_DEPTH_SLICE_UNDEFINED; - WGPURenderPassDescriptor pd = {}; - pd.colorAttachmentCount = 1; - pd.colorAttachments = &att; - WGPURenderPassEncoder p = wgpuCommandEncoderBeginRenderPass(encoder, &pd); - wgpuRenderPassEncoderEnd(p); - wgpuRenderPassEncoderRelease(p); - } + clear_r8_node(encoder, nodes.get_view(node_transp_), 0.0f); // --- Pass 4: Pack compute --- // Rebuild pack bind group with current node views. @@ -463,7 +430,8 @@ void GBufferEffect::upload_scene_data(const Scene& scene, : (size_t)kGBufMaxObjects); const mat4 view = camera.get_view_matrix(); - const mat4 proj = camera.get_projection_matrix(); + mat4 proj = camera.get_projection_matrix(); + proj.m[5] = -proj.m[5]; // undo post-process Y flip: G-buffer uses integer reads const mat4 vp = proj * view; GBufGlobalUniforms gu = {}; @@ -509,7 +477,7 @@ void GBufferEffect::create_raster_pipeline() { } const std::string composed = - ShaderComposer::Get().Compose({"common_uniforms"}, src); + ShaderComposer::Get().Compose({}, src); WGPUShaderSourceWGSL wgsl_src = {}; wgsl_src.chain.sType = WGPUSType_ShaderSourceWGSL; @@ -567,8 +535,9 @@ void GBufferEffect::create_raster_pipeline() { pipe_desc.vertex.entryPoint = str_view("vs_main"); pipe_desc.fragment = &frag; pipe_desc.depthStencil = &ds; - pipe_desc.primitive.topology = WGPUPrimitiveTopology_TriangleList; - pipe_desc.primitive.cullMode = WGPUCullMode_Back; + pipe_desc.primitive.topology = WGPUPrimitiveTopology_TriangleList; + pipe_desc.primitive.cullMode = WGPUCullMode_Back; + pipe_desc.primitive.frontFace = WGPUFrontFace_CCW; // standard (no Y flip) pipe_desc.multisample.count = 1; pipe_desc.multisample.mask = 0xFFFFFFFF; @@ -598,7 +567,7 @@ void GBufferEffect::create_shadow_pipeline() { WGPUShaderModule shader = 
wgpuDeviceCreateShaderModule(ctx_.device, &shader_desc); // BGL: B0=GlobalUniforms, B1=ObjectsBuffer, B2=texture_depth_2d, B3=GBufLightsUniforms - WGPUBindGroupLayoutEntry bgl_entries[4] = {}; + WGPUBindGroupLayoutEntry bgl_entries[5] = {}; bgl_entries[0].binding = 0; bgl_entries[0].visibility = @@ -621,8 +590,13 @@ void GBufferEffect::create_shadow_pipeline() { bgl_entries[3].buffer.type = WGPUBufferBindingType_Uniform; bgl_entries[3].buffer.minBindingSize = sizeof(GBufLightsUniforms); + bgl_entries[4].binding = 4; + bgl_entries[4].visibility = WGPUShaderStage_Fragment; + bgl_entries[4].texture.sampleType = WGPUTextureSampleType_Float; + bgl_entries[4].texture.viewDimension = WGPUTextureViewDimension_2D; + WGPUBindGroupLayoutDescriptor bgl_desc = {}; - bgl_desc.entryCount = 4; + bgl_desc.entryCount = 5; bgl_desc.entries = bgl_entries; WGPUBindGroupLayout bgl = wgpuDeviceCreateBindGroupLayout(ctx_.device, &bgl_desc); @@ -793,7 +767,3 @@ void GBufferEffect::update_raster_bind_group(NodeRegistry& nodes) { wgpuBindGroupLayoutRelease(bgl); } -void GBufferEffect::update_pack_bind_group(NodeRegistry& nodes) { - (void)nodes; - // Pack bind group is rebuilt inline in render() to use current node views. 
-}
diff --git a/cnn_v3/src/gbuffer_effect.h b/cnn_v3/src/gbuffer_effect.h
index c39219b..13d394d 100644
--- a/cnn_v3/src/gbuffer_effect.h
+++ b/cnn_v3/src/gbuffer_effect.h
@@ -68,8 +68,6 @@ class GBufferEffect : public Effect {
   std::string node_depth_;
   std::string node_shadow_;
   std::string node_transp_;
-  std::string node_feat0_;
-  std::string node_feat1_;

   // Owned scene and camera — populated by set_scene()
   Scene scene_;
@@ -88,7 +86,6 @@ class GBufferEffect : public Effect {

   // Pass 4: Pack compute pipeline
   ComputePipeline pack_pipeline_;
-  BindGroup pack_bind_group_;
   UniformBuffer<GBufResUniforms> pack_res_uniform_;
   UniformBuffer<GBufLightsUniforms> lights_uniform_;

@@ -102,7 +99,6 @@ class GBufferEffect : public Effect {
   void create_pack_pipeline();

   void update_raster_bind_group(NodeRegistry& nodes);
-  void update_pack_bind_group(NodeRegistry& nodes);

   void upload_scene_data(const Scene& scene, const Camera& camera,
                          float time);
diff --git a/doc/COMPLETED.md b/doc/COMPLETED.md
index 072c92f..a3a988c 100644
--- a/doc/COMPLETED.md
+++ b/doc/COMPLETED.md
@@ -36,6 +36,14 @@ Completed task archive. See `doc/archive/` for detailed historical documents.

 ## March 2026

+- [x] **CNN v3 shadow pass debugging** — Fixed 5 independent bugs in `gbuf_shadow.wgsl` + `gbuffer_effect.cc`:
+  1. **Camera Y-inversion**: `mat4::perspective` negates Y for post-process chain; fixed with `proj.m[5] = -proj.m[5]` in `upload_scene_data` + `WGPUFrontFace_CCW` on raster pipeline.
+  2. **Shadow formula**: replaced `shadowWithStoredDistance` (20 steps, bounded) with 64-step IQ soft shadow (`res = min(res, 8.0*d/t)`, unbounded march).
+  3. **Local→world SDF scale**: `sdBox/sdSphere` return local-space distance; fixed with `d *= length(obj.model[0].xyz)`.
+  4. **Shadow bias**: replaced light-direction bias (fails at terminator) with rasterized surface normal from `normal_mat_tex` (binding 4); `bias_pos = world + nor * 0.05`.
+  5. **ShaderComposer**: `GBufViewEffect` needed `ShaderComposer::Get().Compose()` to resolve `#include "debug/debug_print"`.
+  - Added per-tile labels to `gbuf_view.wgsl` via `debug_str`. Scale propagation for pulsating sphere confirmed correct end-to-end. 36/36 tests.
+
 - [x] **CNN v3 Phase 7: Validation tools** — `GBufViewEffect` (C++ 4×5 channel grid, `cnn_v3/shaders/gbuf_view.wgsl`, `cnn_v3/src/gbuf_view_effect.{h,cc}`): renders all 20 G-buffer feature channels tiled on screen; custom BGL with `WGPUTextureSampleType_Uint`, bind group rebuilt per frame via `wgpuRenderPipelineGetBindGroupLayout`. Web tool "Load sample directory" (`cnn_v3/tools/tester.js` + `shaders.js`): `webkitdirectory` picker, `FULL_PACK_SHADER` compute (matches `gbuf_pack.wgsl`), `runFromFeat()` inference, PSNR vs `target.png`. 36/36 tests.

 - [x] **CNN v3 Phase 5: Parity validation** — `test_cnn_v3_parity.cc` (2 tests: zero_weights, random_weights). Root cause: intermediate nodes declared at full res instead of W/2, W/4. Fix: `NodeRegistry::default_width()/default_height()` getters + fractional resolution in `declare_nodes()`. Final max_err=4.88e-4 ✓. 36/36 tests.
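Shadow fixes 2 to 4 in the COMPLETED.md entry above can be illustrated on the CPU. The following is a minimal C++ sketch, not the contents of `gbuf_shadow.wgsl`: the `Vec3` helpers, the function names, the hit threshold `0.001f`, the start distance `0.02f`, and the single uniformly scaled sphere are all assumptions made for illustration. Only the 64-step loop, the `8.0*d/t` penumbra term, the local-to-world distance scaling, and the `0.05` normal bias come from the changelog.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical minimal vector type; the project has its own vec3.
struct Vec3 { float x, y, z; };
static Vec3 v_add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 v_sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 v_mul(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
static float v_len(Vec3 a) { return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z); }

// Unit sphere SDF evaluated in object-local space.
static float sd_unit_sphere_local(Vec3 p) { return v_len(p) - 1.0f; }

// Fix 3: the local-space distance must be multiplied by the object's
// (uniform) scale to become a world-space distance.
static float sd_sphere_world(Vec3 p, Vec3 center, float scale) {
  const Vec3 local = v_mul(v_sub(p, center), 1.0f / scale);
  return sd_unit_sphere_local(local) * scale;
}

// Fixes 2 and 4: 64-step soft shadow with res = min(res, 8*d/t),
// started from a position biased along the surface normal.
static float soft_shadow(Vec3 world_pos, Vec3 normal, Vec3 light_dir,
                         Vec3 center, float scale) {
  const Vec3 origin = v_add(world_pos, v_mul(normal, 0.05f));
  float res = 1.0f;
  float t = 0.02f;  // assumed start distance, avoids dividing by zero
  for (int i = 0; i < 64; ++i) {
    const float d =
        sd_sphere_world(v_add(origin, v_mul(light_dir, t)), center, scale);
    if (d < 0.001f) return 0.0f;  // assumed hit threshold: fully occluded
    res = std::min(res, 8.0f * d / t);
    t += d;  // no maximum distance: the march is unbounded
  }
  return std::clamp(res, 0.0f, 1.0f);
}
```

With a sphere of scale 0.9 at the origin and the light pointing straight up, a receiver directly below the sphere returns 0 and a receiver far to the side returns 1. Skipping the `* scale` in `sd_sphere_world` would make the march overshoot or undershoot whenever the scale differs from 1.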
diff --git a/src/effects/shaders.cc b/src/effects/shaders.cc index a74d920..8b625ee 100644 --- a/src/effects/shaders.cc +++ b/src/effects/shaders.cc @@ -35,6 +35,7 @@ void InitShaderComposer() { register_if_exists("math/common_utils", AssetId::ASSET_SHADER_MATH_COMMON_UTILS); register_if_exists("math/noise", AssetId::ASSET_SHADER_MATH_NOISE); + register_if_exists("math/normal", AssetId::ASSET_SHADER_MATH_NORMAL); register_if_exists("render/shadows", AssetId::ASSET_SHADER_RENDER_SHADOWS); register_if_exists("render/scene_query_bvh", AssetId::ASSET_SHADER_RENDER_SCENE_QUERY_BVH); @@ -50,6 +51,7 @@ void InitShaderComposer() { register_if_exists("ray_box", AssetId::ASSET_SHADER_RAY_BOX); register_if_exists("ray_triangle", AssetId::ASSET_SHADER_RAY_TRIANGLE); + register_if_exists("ray_sphere", AssetId::ASSET_SHADER_RAY_SPHERE); register_if_exists("debug/debug_print", AssetId::ASSET_SHADER_DEBUG_DEBUG_PRINT); @@ -119,6 +121,7 @@ const char* gbuf_raster_wgsl = SafeGetAsset(AssetId::ASSET_SHADER_GBUF_RASTE const char* gbuf_pack_wgsl = SafeGetAsset(AssetId::ASSET_SHADER_GBUF_PACK); const char* gbuf_shadow_wgsl = SafeGetAsset(AssetId::ASSET_SHADER_GBUF_SHADOW); const char* gbuf_view_wgsl = SafeGetAsset(AssetId::ASSET_SHADER_GBUF_VIEW); +const char* gbuf_deferred_wgsl = SafeGetAsset(AssetId::ASSET_SHADER_GBUF_DEFERRED); const char* cnn_v3_enc0_wgsl = SafeGetAsset(AssetId::ASSET_SHADER_CNN_V3_ENC0); const char* cnn_v3_enc1_wgsl = SafeGetAsset(AssetId::ASSET_SHADER_CNN_V3_ENC1); const char* cnn_v3_bottleneck_wgsl = SafeGetAsset(AssetId::ASSET_SHADER_CNN_V3_BOTTLENECK); diff --git a/src/effects/shaders.h b/src/effects/shaders.h index 1664706..7e1cf02 100644 --- a/src/effects/shaders.h +++ b/src/effects/shaders.h @@ -25,6 +25,7 @@ extern const char* gbuf_raster_wgsl; extern const char* gbuf_pack_wgsl; extern const char* gbuf_shadow_wgsl; extern const char* gbuf_view_wgsl; +extern const char* gbuf_deferred_wgsl; // CNN v3 inference shaders extern const char* cnn_v3_enc0_wgsl; diff 
--git a/src/gpu/demo_effects.h b/src/gpu/demo_effects.h index edb4a23..77e5c76 100644 --- a/src/gpu/demo_effects.h +++ b/src/gpu/demo_effects.h @@ -36,6 +36,7 @@ #include "../../cnn_v3/src/gbuffer_effect.h" #include "../../cnn_v3/src/cnn_v3_effect.h" #include "../../cnn_v3/src/gbuf_view_effect.h" +#include "../../cnn_v3/src/gbuf_deferred_effect.h" // TODO: Port CNN effects // #include "../../cnn_v1/src/cnn_v1_effect.h" diff --git a/src/shaders/math/normal.wgsl b/src/shaders/math/normal.wgsl new file mode 100644 index 0000000..5a9a1a9 --- /dev/null +++ b/src/shaders/math/normal.wgsl @@ -0,0 +1,34 @@ +// Normal encoding/decoding utilities. +// Octahedral encoding: losslessly maps unit normals ↔ vec2f in [-1,1]². +// Storage convention: remap to [0,1] before writing to u8/f16 textures. + +// Encode a unit normal to octahedral XY in [-1, 1]. +fn oct_encode(n: vec3f) -> vec2f { + let inv_l1 = 1.0 / (abs(n.x) + abs(n.y) + abs(n.z)); + var p = n.xy * inv_l1; + if (n.z < 0.0) { + let s = vec2f(select(-1.0, 1.0, p.x >= 0.0), + select(-1.0, 1.0, p.y >= 0.0)); + p = (1.0 - abs(p.yx)) * s; + } + return p; // [-1, 1] +} + +// Decode octahedral XY in [-1, 1] back to a unit normal. +fn oct_decode(f: vec2f) -> vec3f { + var n = vec3f(f.x, f.y, 1.0 - abs(f.x) - abs(f.y)); + let t = max(-n.z, 0.0); + n.x += select(t, -t, n.x >= 0.0); + n.y += select(t, -t, n.y >= 0.0); + return normalize(n); +} + +// Convenience: encode + remap to [0, 1] for texture storage. +fn oct_encode_unorm(n: vec3f) -> vec2f { + return oct_encode(n) * 0.5 + vec2f(0.5); +} + +// Convenience: undo [0, 1] remap then decode. +fn oct_decode_unorm(rg: vec2f) -> vec3f { + return oct_decode(rg * 2.0 - vec2f(1.0)); +} diff --git a/src/shaders/ray_sphere.wgsl b/src/shaders/ray_sphere.wgsl new file mode 100644 index 0000000..659e144 --- /dev/null +++ b/src/shaders/ray_sphere.wgsl @@ -0,0 +1,21 @@ +// Ray-sphere intersection. +// ro: ray origin, rd: ray direction (must be normalized). 
+// center: sphere center, radius: sphere radius. +// Returns t of the nearest positive intersection, or -1.0 if no hit. + +struct RaySphereHit { + t: f32, // distance along ray to nearest hit (negative = miss) + hit: bool, +}; + +fn ray_sphere_intersection(ro: vec3f, rd: vec3f, + center: vec3f, radius: f32) -> RaySphereHit { + let oc = ro - center; + let b = dot(oc, rd); + let c = dot(oc, oc) - radius * radius; + let disc = b * b - c; + if (disc < 0.0) { return RaySphereHit(-1.0, false); } + let t = -b - sqrt(disc); + if (t < 0.0) { return RaySphereHit(-1.0, false); } + return RaySphereHit(t, true); +} diff --git a/src/tests/gpu/test_demo_effects.cc b/src/tests/gpu/test_demo_effects.cc index d6e9c8a..1bb89f9 100644 --- a/src/tests/gpu/test_demo_effects.cc +++ b/src/tests/gpu/test_demo_effects.cc @@ -94,6 +94,11 @@ static void test_effects() { fixture.ctx(), std::vector<std::string>{"gbuf_feat0", "gbuf_feat1"}, std::vector<std::string>{"gbuf_view_out"}, 0.0f, 1000.0f)}, + {"GBufDeferredEffect", + std::make_shared<GBufDeferredEffect>( + fixture.ctx(), + std::vector<std::string>{"gbuf_feat0", "gbuf_feat1"}, + std::vector<std::string>{"gbuf_deferred_out"}, 0.0f, 1000.0f)}, }; int passed = 0; diff --git a/tools/seq_compiler.py b/tools/seq_compiler.py index 2d802b2..2de0f46 100755 --- a/tools/seq_compiler.py +++ b/tools/seq_compiler.py @@ -399,7 +399,8 @@ def generate_cpp(seq: SequenceDecl, sorted_effects: List[EffectDecl], 'NtscYiq': 'ntsc', 'GBufferEffect': '#include "../../cnn_v3/src/gbuffer_effect.h"', 'CNNv3Effect': '#include "../../cnn_v3/src/cnn_v3_effect.h"', - 'GBufViewEffect': '#include "../../cnn_v3/src/gbuf_view_effect.h"', + 'GBufViewEffect': '#include "../../cnn_v3/src/gbuf_view_effect.h"', + 'GBufDeferredEffect': '#include "../../cnn_v3/src/gbuf_deferred_effect.h"', } includes = set() for effect in seq.effects: diff --git a/workspaces/main/assets.txt b/workspaces/main/assets.txt index 3045ab6..b96b4c4 100644 --- a/workspaces/main/assets.txt +++ 
b/workspaces/main/assets.txt @@ -38,6 +38,7 @@ SHADER_SDF_SHAPES, WGSL, ../../src/shaders/math/sdf_shapes.wgsl, "SDF Shapes (2D SHADER_LIGHTING, WGSL, ../../src/shaders/lighting.wgsl, "Lighting Snippet" SHADER_RAY_BOX, WGSL, ../../src/shaders/ray_box.wgsl, "Ray-Box Intersection Snippet" SHADER_RAY_TRIANGLE, WGSL, ../../src/shaders/ray_triangle.wgsl, "Ray-Triangle Intersection Snippet (Möller-Trumbore)" +SHADER_RAY_SPHERE, WGSL, ../../src/shaders/ray_sphere.wgsl, "Ray-Sphere Intersection Snippet" SHADER_MAIN, WGSL, shaders/main_shader.wgsl, "Main Heptagon Shader" SHADER_PARTICLE_COMPUTE, WGSL, ../../src/effects/particle_compute.wgsl, "Particle Compute Shader" SHADER_PARTICLE_RENDER, WGSL, ../../src/effects/particle_render.wgsl, "Particle Render Shader" @@ -79,6 +80,7 @@ SHADER_RENDER_FULLSCREEN_UV_VS, WGSL, ../../src/shaders/render/fullscreen_uv_vs. SHADER_MATH_COLOR, WGSL, ../../src/shaders/math/color.wgsl, "Color Functions" SHADER_MATH_COLOR_C64, WGSL, ../../src/shaders/math/color_c64.wgsl, "C64 Palette and Bayer Dither" SHADER_MATH_UTILS, WGSL, ../../src/shaders/math/utils.wgsl, "Math Utilities" +SHADER_MATH_NORMAL, WGSL, ../../src/shaders/math/normal.wgsl, "Octahedral normal encode/decode" SHADER_RENDER_RAYMARCHING, WGSL, ../../src/shaders/render/raymarching.wgsl, "Raymarching Functions" SHADER_RENDER_RAYMARCHING_ID, WGSL, ../../src/shaders/render/raymarching_id.wgsl, "Raymarching-ID Functions" SHADER_VIGNETTE, WGSL, ../../src/effects/vignette.wgsl, "Vignette Shader" @@ -103,6 +105,7 @@ SHADER_GBUF_RASTER, WGSL, ../../cnn_v3/shaders/gbuf_raster.wgsl, "CNN v3 G-buffe SHADER_GBUF_PACK, WGSL, ../../cnn_v3/shaders/gbuf_pack.wgsl, "CNN v3 G-buffer feature pack compute shader" SHADER_GBUF_VIEW, WGSL, ../../cnn_v3/shaders/gbuf_view.wgsl, "CNN v3 G-buffer channel visualization (4x5 grid)" SHADER_GBUF_SHADOW, WGSL, ../../cnn_v3/shaders/gbuf_shadow.wgsl, "CNN v3 G-buffer SDF shadow raymarching pass" +SHADER_GBUF_DEFERRED, WGSL, ../../cnn_v3/shaders/gbuf_deferred.wgsl, 
"CNN v3 simple deferred render (albedo * shadow)" # --- CNN v3 Inference --- SHADER_CNN_V3_COMMON, WGSL, ../../cnn_v3/shaders/cnn_v3_common.wgsl, "CNN v3 shared helpers snippet (get_w, unpack_8ch)" diff --git a/workspaces/main/timeline.seq b/workspaces/main/timeline.seq index 0e1ca74..1b75cdc 100644 --- a/workspaces/main/timeline.seq +++ b/workspaces/main/timeline.seq @@ -14,15 +14,15 @@ SEQUENCE 12.00 0 "cnn_v3_test" NODE gbuf_feat0 gbuf_rgba32uint NODE gbuf_feat1 gbuf_rgba32uint EFFECT + GBufferEffect source -> gbuf_feat0 gbuf_feat1 0.00 8.00 - EFFECT + GBufViewEffect gbuf_feat0 gbuf_feat1 -> sink 0.00 8.00 + EFFECT + GBufDeferredEffect gbuf_feat0 gbuf_feat1 -> sink 0.00 8.00 -SEQUENCE 20.00 2 "hybrid_heptagon" +SEQUENCE 28.00 2 "hybrid_heptagon" # Heptagon -> Hybrid3D -> sink EFFECT + Heptagon source -> temp1 0.00 4.00 EFFECT + Hybrid3D temp1 -> temp2 0.00 4.00 EFFECT + Ntsc temp2 -> sink 0.00 4.00 -SEQUENCE 24.00 0 "heptagon_scene" +SEQUENCE 28.00 0 "heptagon_scene" EFFECT + Scene1 source -> temp1 0.00 4.00 EFFECT + Ntsc temp1 -> sink 0.00 4.00 @@ -48,3 +48,9 @@ SEQUENCE 48.00 1 "particles" EFFECT + Particles source -> temp1 0.00 4.00 EFFECT = GaussianBlur temp1 -> temp2 0.00 4.00 EFFECT + Ntsc temp2 -> sink 0.00 4.00 + +SEQUENCE 52.00 0 "cnn_v3_debug" + NODE gbuf_feat0 gbuf_rgba32uint + NODE gbuf_feat1 gbuf_rgba32uint + EFFECT + GBufferEffect source -> gbuf_feat0 gbuf_feat1 0.00 120.00 + EFFECT + GBufViewEffect gbuf_feat0 gbuf_feat1 -> sink 0.00 120.00 |

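The octahedral helpers added in `src/shaders/math/normal.wgsl` can be checked on the CPU with a direct port. The sketch below is an assumed C++ translation for illustration only: the `Vec2`/`Vec3` structs are hypothetical, and WGSL's `select(f, t, cond)` is written as `cond ? t : f`.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical minimal types standing in for WGSL vec2f / vec3f.
struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

// Encode a unit normal to octahedral XY in [-1, 1].
static Vec2 oct_encode(Vec3 n) {
  const float inv_l1 = 1.0f / (std::fabs(n.x) + std::fabs(n.y) + std::fabs(n.z));
  Vec2 p{n.x * inv_l1, n.y * inv_l1};
  if (n.z < 0.0f) {
    // Fold the lower hemisphere over the diagonals (uses the old p.yx).
    const float sx = p.x >= 0.0f ? 1.0f : -1.0f;
    const float sy = p.y >= 0.0f ? 1.0f : -1.0f;
    p = Vec2{(1.0f - std::fabs(p.y)) * sx, (1.0f - std::fabs(p.x)) * sy};
  }
  return p;
}

// Decode octahedral XY in [-1, 1] back to a unit normal.
static Vec3 oct_decode(Vec2 f) {
  Vec3 n{f.x, f.y, 1.0f - std::fabs(f.x) - std::fabs(f.y)};
  const float t = std::max(-n.z, 0.0f);
  n.x += n.x >= 0.0f ? -t : t;
  n.y += n.y >= 0.0f ? -t : t;
  const float inv_len = 1.0f / std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
  return Vec3{n.x * inv_len, n.y * inv_len, n.z * inv_len};
}

// Encode and remap to [0, 1] for texture storage.
static Vec2 oct_encode_unorm(Vec3 n) {
  const Vec2 p = oct_encode(n);
  return Vec2{p.x * 0.5f + 0.5f, p.y * 0.5f + 0.5f};
}

// Undo the [0, 1] remap, then decode.
static Vec3 oct_decode_unorm(Vec2 rg) {
  return oct_decode(Vec2{rg.x * 2.0f - 1.0f, rg.y * 2.0f - 1.0f});
}
```

A round trip through `oct_encode` and `oct_decode` should reproduce a unit normal to within float rounding, including normals with negative Z, which is the case the fold in `oct_encode` exists for.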