Diffstat (limited to 'cnn_v3')
44 files changed, 1683 insertions, 191 deletions
diff --git a/cnn_v3/README.md b/cnn_v3/README.md index f161bf4..a844b1b 100644 --- a/cnn_v3/README.md +++ b/cnn_v3/README.md @@ -31,9 +31,18 @@ Add images directly to these directories and commit them. ## Status -**Phase 1 complete.** G-buffer integrated (raster + pack), 35/35 tests pass. -Training infrastructure ready. U-Net WGSL shaders are next. +**Phases 1–7 complete.** 36/36 tests pass. -See `cnn_v3/docs/HOWTO.md` for the practical playbook. +| Phase | Status | +|-------|--------| +| 1 — G-buffer (raster + pack) | ✅ | +| 2 — Training infrastructure | ✅ | +| 3 — WGSL U-Net shaders | ✅ | +| 4 — C++ CNNv3Effect + FiLM | ✅ | +| 5 — Parity validation | ✅ max_err=4.88e-4 | +| 6 — Training script | ✅ train_cnn_v3.py | +| 7 — Validation tools | ✅ GBufViewEffect + web sample loader | + +See `cnn_v3/docs/HOWTO.md` for the practical playbook (§9 covers validation tools). See `cnn_v3/docs/CNN_V3.md` for full design. See `cnn_v2/` for reference implementation. diff --git a/cnn_v3/docs/HOWTO.md b/cnn_v3/docs/HOWTO.md index 983e8b7..5c5cc2a 100644 --- a/cnn_v3/docs/HOWTO.md +++ b/cnn_v3/docs/HOWTO.md @@ -22,57 +22,141 @@ It rasterizes proxy geometry to MRT G-buffer textures and packs them into two ### Adding to a Sequence -`GBufferEffect` does not exist in `seq_compiler.py` as a named effect yet -(no `.seq` syntax integration for Phase 1). Wire it directly in C++ alongside -your scene code, or add it to the timeline when the full CNNv3Effect is ready. +Both `GBufferEffect` and `GBufViewEffect` are registered in `seq_compiler.py` +(`CLASS_TO_HEADER`) and can be wired directly in `timeline.seq`. -**C++ wiring example** (e.g. 
inside a Sequence or main.cc): +**Debug view (G-buffer → sink)**: +```seq +SEQUENCE 12.00 0 "cnn_v3_test" + NODE gbuf_feat0 gbuf_rgba32uint + NODE gbuf_feat1 gbuf_rgba32uint + EFFECT + GBufferEffect source -> gbuf_feat0 gbuf_feat1 0.00 8.00 + EFFECT + GBufViewEffect gbuf_feat0 gbuf_feat1 -> sink 0.00 8.00 +``` -```cpp -#include "../../cnn_v3/src/gbuffer_effect.h" +**Full CNN pipeline**: +```seq +SEQUENCE 12.00 0 "cnn_v3_test" + NODE gbuf_feat0 gbuf_rgba32uint + NODE gbuf_feat1 gbuf_rgba32uint + NODE cnn_v3_out gbuf_albedo + EFFECT + GBufferEffect source -> gbuf_feat0 gbuf_feat1 0.00 8.00 + EFFECT + CNNv3Effect gbuf_feat0 gbuf_feat1 -> cnn_v3_out 0.00 8.00 + EFFECT + Passthrough cnn_v3_out -> sink 0.00 8.00 +``` -// Allocate once alongside your scene -auto gbuf = std::make_shared<GBufferEffect>( - ctx, /*inputs=*/{"prev_cnn"}, // or any dummy node - /*outputs=*/{"gbuf_feat0", "gbuf_feat1"}, - /*start=*/0.0f, /*end=*/60.0f); +### Internal scene -gbuf->set_scene(&my_scene, &my_camera); +Call `set_scene()` once before the first render to populate the built-in demo +scene. No external `Scene` or `Camera` pointer is required — the effect owns +them. -// In render loop, call before CNN pass: -gbuf->render(encoder, params, nodes); -``` +**What `set_scene()` creates:** +- **20 small cubes** — random positions in [-2,2]×[-1.5,1.5]³, scale 0.1–0.25, + random colors. Each has a random rotation axis and speed; animated each frame + via `quat::from_axis(axis, time * speed)`. +- **4 pumping spheres** — at fixed world positions, base radii 0.25–0.35. + Scale driven by `audio_intensity`: `r = base_r * (1 + audio_intensity * 0.8)`. +- **Camera** — position (0, 2.5, 6), target (0, 0, 0), 45° FOV. + Aspect ratio updated each frame from `params.aspect_ratio`. 
+- **Two directional lights** (uploaded to `lights_uniform_`, ready for shadow pass): + - Key: warm white (1.0, 0.92, 0.78), direction `normalize(1, 2, 1)` (upper-right-front) + - Fill: cool blue (0.4, 0.45, 0.8 × 0.4), direction `normalize(-1, 1, -1)` (upper-left-back) ### Internal passes Each frame, `GBufferEffect::render()` executes: -1. **Pass 1 — MRT rasterization** (`gbuf_raster.wgsl`) +1. **Pass 1 — MRT rasterization** (`gbuf_raster.wgsl`) ✅ - Proxy box (36 verts) × N objects, instanced - MRT outputs: `gbuf_albedo` (rgba16float), `gbuf_normal_mat` (rgba16float) - Depth test + write into `gbuf_depth` (depth32float) + - `obj.type` written to `ObjectData.params.x` for future SDF branching -2. **Pass 2/3 — SDF + Lighting** — TODO (placeholder: shadow=1, transp=0) +2. **Pass 2 — SDF shadow raymarching** (`gbuf_shadow.wgsl`) ✅ + - See implementation plan below. -3. **Pass 4 — Pack compute** (`gbuf_pack.wgsl`) +3. **Pass 3 — Transparency** — TODO (deferred; transp=0 for opaque scenes) + +4. **Pass 4 — Pack compute** (`gbuf_pack.wgsl`) ✅ - Reads all G-buffer textures + `prev_cnn` input - Writes `feat_tex0` + `feat_tex1` (rgba32uint, 20 channels, 32 bytes/pixel) + - Shadow / transp nodes cleared to 1.0 / 0.0 via zero-draw render passes + until Pass 2/3 are implemented. ### Output node names -By default the outputs are named from the `outputs` vector passed to the -constructor. Use these names when binding the CNN effect input: +Outputs are named from the `outputs` vector passed to the constructor: ``` outputs[0] → feat_tex0 (rgba32uint: albedo.rgb, normal.xy, depth, depth_grad.xy) outputs[1] → feat_tex1 (rgba32uint: mat_id, prev.rgb, mip1.rgb, mip2.rgb, shadow, transp) ``` -### Scene data +--- + +## 1b. 
GBufferEffect — Implementation Plan (Pass 2: SDF Shadow) + +### What remains + +| Item | Status | Notes | +|------|--------|-------| +| Pass 1: MRT raster | ✅ Done | proxy box, all object types | +| Pass 4: Pack compute | ✅ Done | 20 channels packed | +| Internal scene + animation | ✅ Done | cubes + spheres + 2 lights | +| Pass 2: SDF shadow | ✅ Done | `gbuf_shadow.wgsl`, proxy-box SDF per object | +| Pass 3: Transparency | ❌ TODO | low priority, opaque scenes only | +| Phase 4: type-aware SDF | ✅ Done | switch on `obj.params.x` in `dfWithID` | + +### Pass 2: SDF shadow raymarching -Call `set_scene(scene, camera)` before the first render. The effect uploads -`GlobalUniforms` (view-proj, camera pos, resolution) and `ObjectData` (model -matrix, color) to GPU storage buffers each frame. +**New file: `cnn_v3/shaders/gbuf_shadow.wgsl`** — fullscreen render pass. + +Bind layout: + +| Binding | Type | Content | +|---------|------|---------| +| 0 | `uniform` | `GlobalUniforms` (`#include "common_uniforms"`) | +| 1 | `storage read` | `ObjectsBuffer` | +| 2 | `texture_depth_2d` | depth from Pass 1 | +| 3 | `sampler` (non-filtering) | depth load | +| 4 | `uniform` | `GBufLightsUniforms` (2 lights) | + +Algorithm per fragment: +1. Reconstruct world position from NDC depth + `globals.inv_view_proj` +2. For each object: `sdBox((inv_model * world_pos).xyz, vec3(1.0))` — proxy box in local space +3. For each light: offset ray origin by `0.02 * surface_normal`; march shadow ray toward `light.direction` +4. Soft shadow via `shadowWithStoredDistance()` from `render/raymarching_id` +5. Combine lights: `shadow = min(shadow_light0, shadow_light1)` +6. Discard fragments where depth == 1.0 (sky/background → shadow = 1.0) +7. 
Output shadow factor to RGBA8Unorm render target (`.r` = shadow) + +**C++ additions (`gbuffer_effect.h/.cc`):** +```cpp +RenderPipeline shadow_pipeline_; +void create_shadow_pipeline(); +``` +In `render()` between Pass 1 and the shadow/transp node clears: +- Build bind group (global_uniforms_buf_, objects_buf_, depth_view, sampler_, lights_uniform_) +- Run fullscreen triangle → `node_shadow_` color attachment +- Remove the `clear_node(node_shadow_, 1.0f)` placeholder once the pass is live + +**Register:** +- `cnn_v3/shaders/gbuf_shadow.wgsl` → `SHADER_GBUF_SHADOW` in `assets.txt` +- `extern const char* gbuf_shadow_wgsl;` in `gbuffer_effect.cc` + +### Phase 4: Object-type-aware SDF (optional) + +Branch on `obj.params.x` (populated since this commit) using `math/sdf_shapes`: + +| Type value | ObjectType | SDF | +|------------|-----------|-----| +| 0 | CUBE | `sdBox(local_p, vec3(1))` | +| 1 | SPHERE | `sdSphere(local_p, 1.0)` | +| 2 | PLANE | `sdPlane(local_p, vec3(0,1,0), obj.params.y)` | +| 3 | TORUS | `sdTorus(local_p, vec2(0.8, 0.2))` | + +Only worth adding after Pass 2 is validated visually. 
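As a CPU-side illustration of the type table above, the following C++ sketch restates the distance functions using their widely used closed forms. It is only a sketch under stated assumptions: `Vec3`, `sd_sphere`, `sd_box`, `sd_torus`, and `sd_by_type` are hypothetical names, the repository's `math/sdf_shapes` snippet may differ in signature or convention, and the multiplication by `scale` assumes uniform object scale.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical minimal vector type for this sketch (not the project's math types).
struct Vec3 { float x, y, z; };

// Signed distance to a sphere of radius r centred at the origin.
inline float sd_sphere(Vec3 p, float r) {
  return std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z) - r;
}

// Signed distance to an axis-aligned box with half-extents b.
inline float sd_box(Vec3 p, Vec3 b) {
  const float qx = std::fabs(p.x) - b.x;
  const float qy = std::fabs(p.y) - b.y;
  const float qz = std::fabs(p.z) - b.z;
  const float ox = std::max(qx, 0.0f);
  const float oy = std::max(qy, 0.0f);
  const float oz = std::max(qz, 0.0f);
  const float outside = std::sqrt(ox * ox + oy * oy + oz * oz);
  const float inside = std::min(std::max(qx, std::max(qy, qz)), 0.0f);
  return outside + inside;
}

// Signed distance to a torus lying in the XZ plane.
// major = ring radius, minor = tube radius.
inline float sd_torus(Vec3 p, float major, float minor) {
  const float ring = std::sqrt(p.x * p.x + p.z * p.z) - major;
  return std::sqrt(ring * ring + p.y * p.y) - minor;
}

// Dispatch on the type value from the table (0 = CUBE, 1 = SPHERE, 3 = TORUS).
// local_p is in object-local space; scale converts the result back to a
// world-space distance. Type 2 (PLANE) is intentionally left out: in this
// sketch it falls through to the unit box like any other unhandled value.
inline float sd_by_type(unsigned type, Vec3 local_p, float scale) {
  switch (type) {
    case 1u: return sd_sphere(local_p, 1.0f) * scale;
    case 3u: return sd_torus(local_p, 0.8f, 0.2f) * scale;
    default: return sd_box(local_p, Vec3{1.0f, 1.0f, 1.0f}) * scale;
  }
}
```

A negative result means the sample point is inside the shape, which is the condition a shadow march treats as occluded.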
--- @@ -253,12 +337,14 @@ Test vectors generated by `cnn_v3/training/gen_test_vectors.py` (PyTorch referen | Phase | Status | Notes | |-------|--------|-------| | 1 — G-buffer (raster + pack) | ✅ Done | Integrated, 36/36 tests pass | -| 1 — G-buffer (SDF + shadow passes) | TODO | Placeholder: shadow=1, transp=0 | +| 1 — G-buffer (SDF shadow pass) | ✅ Done | `gbuf_shadow.wgsl`, proxy-box SDF | | 2 — Training infrastructure | ✅ Done | blender_export.py, pack_*_sample.py | | 3 — WGSL U-Net shaders | ✅ Done | 5 compute shaders + cnn_v3/common snippet | | 4 — C++ CNNv3Effect | ✅ Done | FiLM uniform upload, 36/36 tests pass | | 5 — Parity validation | ✅ Done | test_cnn_v3_parity.cc, max_err=4.88e-4 | | 6 — FiLM MLP training | ✅ Done | train_cnn_v3.py + cnn_v3_utils.py written | +| 7 — G-buffer visualizer (C++) | ✅ Done | GBufViewEffect, 36/36 tests pass | +| 7 — Sample loader (web tool) | ✅ Done | "Load sample directory" in cnn_v3/tools/ | --- @@ -337,9 +423,142 @@ auto src = ShaderComposer::Get().Compose({"cnn_v3/common"}, raw_wgsl); --- -## 9. See Also +## 9. Validation Workflow + +Two complementary tools let you verify each stage of the pipeline before training +or integrating into the demo. + +### 9a. C++ — GBufViewEffect (G-buffer channel grid) + +`GBufViewEffect` renders all 20 feature channels from `feat_tex0` / `feat_tex1` +in a **4×5 tiled grid** so you can see the G-buffer at a glance. 
+ +**Registration (already done)** + +| File | What changed | +|------|-------------| +| `cnn_v3/shaders/gbuf_view.wgsl` | New fragment shader | +| `cnn_v3/src/gbuf_view_effect.h` | Effect class declaration | +| `cnn_v3/src/gbuf_view_effect.cc` | Effect class implementation | +| `workspaces/main/assets.txt` | `SHADER_GBUF_VIEW` asset | +| `cmake/DemoSourceLists.cmake` | `gbuf_view_effect.cc` in COMMON_GPU_EFFECTS | +| `src/gpu/demo_effects.h` | `#include "../../cnn_v3/src/gbuf_view_effect.h"` | +| `src/effects/shaders.h/.cc` | `gbuf_view_wgsl` extern declaration + definition | +| `src/tests/gpu/test_demo_effects.cc` | GBufViewEffect test | + +**Constructor signature** + +```cpp +GBufViewEffect(const GpuContext& ctx, + const std::vector<std::string>& inputs, // {feat_tex0, feat_tex1} + const std::vector<std::string>& outputs, // {gbuf_view_out} + float start_time, float end_time) +``` + +**Wiring example** (alongside GBufferEffect): + +```cpp +auto gbuf = std::make_shared<GBufferEffect>(ctx, + std::vector<std::string>{"prev_cnn"}, + std::vector<std::string>{"gbuf_feat0", "gbuf_feat1"}, 0.0f, 60.0f); +auto gview = std::make_shared<GBufViewEffect>(ctx, + std::vector<std::string>{"gbuf_feat0", "gbuf_feat1"}, + std::vector<std::string>{"gbuf_view_out"}, 0.0f, 60.0f); +``` + +**Grid layout** (output resolution = input resolution, channel cells each 1/4 W × 1/5 H): + +| Row | Col 0 | Col 1 | Col 2 | Col 3 | +|-----|-------|-------|-------|-------| +| 0 | `alb.r` | `alb.g` | `alb.b` | `nrm.x` remap→[0,1] | +| 1 | `nrm.y` remap→[0,1] | `depth` (inverted) | `dzdx` ×20+0.5 | `dzdy` ×20+0.5 | +| 2 | `mat_id` | `prev.r` | `prev.g` | `prev.b` | +| 3 | `mip1.r` | `mip1.g` | `mip1.b` | `mip2.r` | +| 4 | `mip2.g` | `mip2.b` | `shadow` | `transp` | + +All channels displayed as grayscale. 1-pixel gray grid lines separate cells. Dark background for out-of-range cells. 
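The tile lookup behind the grid table can be sketched on the CPU as follows. This is an illustrative C++ restatement of the table, not the shader itself: `channel_at` and `display_value` are hypothetical helper names, and `u`, `v` are assumed to be normalized output coordinates in [0, 1) with `v` increasing downward.

```cpp
#include <algorithm>

// Map a normalized output coordinate to a channel index in the 4x5 grid.
// Row-major: row 0 holds channels 0..3, row 4 holds channels 16..19.
// Assumes u and v are non-negative.
inline unsigned channel_at(float u, float v) {
  const unsigned col = std::min(static_cast<unsigned>(u * 4.0f), 3u);
  const unsigned row = std::min(static_cast<unsigned>(v * 5.0f), 4u);
  return row * 4u + col;
}

// Per-channel display remap following the table; returns a gray level in [0,1].
inline float display_value(unsigned ch, float value) {
  float d = value;  // albedo, mat_id, prev, mips, shadow, transp: shown as-is
  if (ch == 3u || ch == 4u) {
    d = value * 0.5f + 0.5f;   // oct-encoded normal: [-1,1] -> [0,1]
  } else if (ch == 5u) {
    d = 1.0f - value;          // depth: inverted so near is bright
  } else if (ch == 6u || ch == 7u) {
    d = value * 20.0f + 0.5f;  // depth gradients: amplified around mid-gray
  }
  return std::min(std::max(d, 0.0f), 1.0f);
}
```

For instance, a pixel 60% across and 30% down the output falls in column 2 of row 1, which the table lists as `dzdx`.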
+ +**Shader binding layout** (no sampler needed — integer texture): + +| Binding | Type | Content | +|---------|------|---------| +| 0 | `texture_2d<u32>` | `feat_tex0` (8 f16 channels via `pack2x16float`) | +| 1 | `texture_2d<u32>` | `feat_tex1` (12 u8 channels via `pack4x8unorm`) | +| 2 | `uniform` (8 B) | `GBufViewUniforms { resolution: vec2f }` | + +The BGL is built manually in the constructor (no sampler) — this is an exception to the +standard post-process pattern because `rgba32uint` textures use `WGPUTextureSampleType_Uint` +and cannot be sampled, only loaded via `textureLoad()`. + +**Implementation note — bind group recreation** + +`render()` calls `wgpuRenderPipelineGetBindGroupLayout(pipeline_, 0)` each frame to +extract the BGL, creates a new `BindGroup`, then immediately releases the BGL handle. +This avoids storing a raw BGL as a member (no RAII wrapper exists for it) while +remaining correct across ping-pong buffer swaps. + +--- + +### 9b. Web tool — "Load sample directory" + +`cnn_v3/tools/index.html` has a **"Load sample directory"** button that: +1. Opens a `webkitdirectory` picker to select a sample folder +2. Loads all G-buffer component PNGs as `rgba8unorm` GPU textures +3. Runs the `FULL_PACK_SHADER` compute shader to assemble `feat_tex0` / `feat_tex1` +4. Runs full CNN inference (enc0 → enc1 → bottleneck → dec1 → dec0) +5. Displays the CNN output on the main canvas +6. 
If `target.png` is present, shows it side-by-side and prints PSNR + +**File name matching** (case-insensitive, substring): + +| Channel | Matched patterns | Fallback | +|---------|-----------------|---------| +| Albedo (required) | `albedo`, `color` | — (error if missing) | +| Normal | `normal`, `nrm` | `rgb(128,128,0,255)` — flat (0,0) oct-encoded | +| Depth | `depth` | `0` — zero depth | +| Mat ID | `matid`, `index`, `mat_id` | `0` — no material | +| Shadow | `shadow` | `255` — fully lit | +| Transparency | `transp`, `alpha` | `0` — fully opaque | +| Target | `target`, `output`, `ground_truth` | not shown | + +**`FULL_PACK_SHADER`** (defined in `cnn_v3/tools/shaders.js`) + +WebGPU compute shader (`@workgroup_size(8,8)`) with 9 bindings: + +| Binding | Resource | Format | +|---------|----------|--------| +| 0–5 | albedo, normal, depth, matid, shadow, transp | `texture_2d<f32>` (rgba8unorm, R channel for single-channel maps) | +| 6 | feat_tex0 output | `texture_storage_2d<rgba32uint,write>` | +| 7 | feat_tex1 output | `texture_storage_2d<rgba32uint,write>` | + +No sampler — all reads use `textureLoad()` (integer texel coordinates). 
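The `rgba32uint` outputs store byte-sized channels four to a `u32` component. As a hedged CPU-side illustration of that packing, the C++ sketch below follows the WGSL definition of `pack4x8unorm` / `unpack4x8unorm` (first component in the lowest byte). The function names `pack4x8unorm_cpu` and `unpack4x8unorm_cpu` are hypothetical and are not part of the tool.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstdint>

// Quantize four floats to 8 bits each after clamping to [0,1].
// Component i lands in bits 8*i .. 8*i+7 of the result.
inline std::uint32_t pack4x8unorm_cpu(const std::array<float, 4>& v) {
  std::uint32_t packed = 0u;
  for (int i = 0; i < 4; ++i) {
    const float c = std::min(1.0f, std::max(0.0f, v[i]));
    const std::uint32_t byte =
        static_cast<std::uint32_t>(std::floor(0.5f + 255.0f * c));
    packed |= byte << (8 * i);
  }
  return packed;
}

// Inverse: split a u32 into four bytes and rescale each to [0,1].
inline std::array<float, 4> unpack4x8unorm_cpu(std::uint32_t packed) {
  std::array<float, 4> v{};
  for (int i = 0; i < 4; ++i) {
    v[i] = static_cast<float>((packed >> (8 * i)) & 0xFFu) / 255.0f;
  }
  return v;
}
```

Because each channel keeps only 8 bits, a round trip through these two functions is exact for byte values but quantizes arbitrary floats to steps of 1/255.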
+ +Packs channels identically to `gbuf_pack.wgsl`: +- `feat_tex0`: `pack2x16float(alb.rg)`, `pack2x16float(alb.b, nrm.x)`, `pack2x16float(nrm.y, depth)`, `pack2x16float(dzdx, dzdy)` +- `feat_tex1`: `pack4x8unorm(matid,0,0,0)`, `pack4x8unorm(mip1.rgb, mip2.r)`, `pack4x8unorm(mip2.gb, shadow, transp)` +- Depth gradients: central differences on depth R channel +- Mip1 / Mip2: box2 (2×2) / box4 (4×4) average filter on albedo + +**PSNR computation** (`computePSNR`) + +- CNN output (`rgba16float`) copied to CPU staging buffer via `copyTextureToBuffer` +- f16→float32 decoded in JavaScript +- Target drawn to offscreen `<canvas>` via `drawImage`, pixels read with `getImageData` +- MSE and PSNR computed over all RGB pixels (alpha ignored) +- Result displayed below target canvas as `MSE=X.XXXXX PSNR=XX.XXdB` + +**`runFromFeat(f0, f1, w, h)`** + +Called by `loadSampleDir()` after packing, or can be called directly if feat textures +are already available. Skips the photo-pack step, runs all 5 CNN passes, and displays +the result. Intermediate textures are stored in `this.layerTextures` so the Layer +Visualization panel still works. + +--- + +## 10. See Also - `cnn_v3/docs/CNN_V3.md` — Full architecture design (U-Net, FiLM, feature layout) - `doc/EFFECT_WORKFLOW.md` — General effect integration guide - `cnn_v2/docs/CNN_V2.md` — Reference implementation (simpler, operational) -- `src/tests/gpu/test_demo_effects.cc` — GBufferEffect construction test +- `src/tests/gpu/test_demo_effects.cc` — GBufferEffect + GBufViewEffect tests diff --git a/cnn_v3/shaders/gbuf_deferred.wgsl b/cnn_v3/shaders/gbuf_deferred.wgsl new file mode 100644 index 0000000..2ed4ce3 --- /dev/null +++ b/cnn_v3/shaders/gbuf_deferred.wgsl @@ -0,0 +1,48 @@ +// G-buffer deferred render — albedo * diffuse +// Reads feat_tex0 (rgba32uint, f16 geometric). +// Outputs albedo * (ambient + diffuse). 
+ +#include "math/normal" + +@group(0) @binding(0) var feat_tex0: texture_2d<u32>; +@group(0) @binding(1) var feat_tex1: texture_2d<u32>; +@group(0) @binding(2) var<uniform> uniforms: GBufDeferredUniforms; + +struct GBufDeferredUniforms { + resolution: vec2f, +} + +const KEY_LIGHT: vec3f = vec3f(0.408, 0.816, 0.408); // normalize(1,2,1) +const AMBIENT: f32 = 0.15; + +@vertex +fn vs_main(@builtin(vertex_index) vid: u32) -> @builtin(position) vec4f { + let x = f32((vid & 1u) << 2u) - 1.0; + let y = f32((vid & 2u) << 1u) - 1.0; + return vec4f(x, y, 0.0, 1.0); +} + +@fragment +fn fs_main(@builtin(position) pos: vec4f) -> @location(0) vec4f { + let coord = vec2i(pos.xy); + + // feat_tex0 layout: + // [0] pack2x16float(albedo.r, albedo.g) + // [1] pack2x16float(albedo.b, normal.x) oct [-1,1] + // [2] pack2x16float(normal.y, depth) oct [-1,1] + let t0 = textureLoad(feat_tex0, coord, 0); + let rg = unpack2x16float(t0.x); + let bx = unpack2x16float(t0.y); + let ny_d = unpack2x16float(t0.z); + let albedo = vec3f(rg.x, rg.y, bx.x); + + // Decode oct-normal (stored in [-1,1] — use oct_decode directly) + let normal = oct_decode(vec2f(bx.y, ny_d.x)); + let diffuse = max(0.0, dot(normal, KEY_LIGHT)); + + // feat_tex1[2] = pack4x8unorm(mip2.g, mip2.b, shadow, transp) + let t1 = textureLoad(feat_tex1, coord, 0); + let shadow = unpack4x8unorm(t1.z).z; + + return vec4f(albedo * (AMBIENT + diffuse * shadow), 1.0); +} diff --git a/cnn_v3/shaders/gbuf_pack.wgsl b/cnn_v3/shaders/gbuf_pack.wgsl index 71d8471..333589c 100644 --- a/cnn_v3/shaders/gbuf_pack.wgsl +++ b/cnn_v3/shaders/gbuf_pack.wgsl @@ -1,4 +1,5 @@ // G-buffer pack compute shader for CNN v3 +#include "math/normal" // Pass 4: Pack all G-buffer channels into two rgba32uint feature textures (32 bytes/pixel) // Output feat_tex0 holds 8×f16 geometric channels; feat_tex1 holds 12×u8 context channels. 
@@ -44,15 +45,6 @@ fn box4(tl: vec2i) -> vec3f { return acc * (1.0 / 16.0); } -// Decode oct-normal from [0,1] storage → [-1,1] encoded xy → reconstruct z. -fn decode_oct_normal(rg: vec2f) -> vec3f { - let f = rg * 2.0 - vec2f(1.0); - var n = vec3f(f.x, f.y, 1.0 - abs(f.x) - abs(f.y)); - let t = max(-n.z, 0.0); - n.x += select(t, -t, n.x >= 0.0); - n.y += select(t, -t, n.y >= 0.0); - return normalize(n); -} @compute @workgroup_size(8, 8) fn pack_features(@builtin(global_invocation_id) id: vec3u) { @@ -73,7 +65,7 @@ fn pack_features(@builtin(global_invocation_id) id: vec3u) { // Normal: stored as oct-encoded [0,1] in RG; extract just the encoded xy for feat_tex0 let normal_enc = nm.rg; // already in [0,1] — decode to get the xy for CNN input - let n3 = decode_oct_normal(normal_enc); + let n3 = oct_decode_unorm(normal_enc); // Store oct-encoded in [-1,1] remapped back to what CNN expects (the [-1,1] oct xy) let oct_xy = normal_enc * 2.0 - vec2f(1.0); // remap [0,1] → [-1,1] diff --git a/cnn_v3/shaders/gbuf_raster.wgsl b/cnn_v3/shaders/gbuf_raster.wgsl index c762db2..aed0e72 100644 --- a/cnn_v3/shaders/gbuf_raster.wgsl +++ b/cnn_v3/shaders/gbuf_raster.wgsl @@ -1,8 +1,11 @@ // G-buffer rasterization shader for CNN v3 // Pass 1: Proxy geometry → MRT (albedo rgba16float, normal_mat rgba16float, depth32) // Uses GlobalUniforms, ObjectData, ObjectsBuffer from common_uniforms. +// SPHERE objects use ray-sphere impostor (correct silhouette + normal + depth). 
#include "common_uniforms" +#include "math/normal" +#include "ray_sphere" @group(0) @binding(0) var<uniform> globals: GlobalUniforms; @group(0) @binding(1) var<storage, read> object_data: ObjectsBuffer; @@ -15,21 +18,6 @@ struct VertexOutput { @location(3) @interpolate(flat) instance_index: u32, } -// Octahedral encoding: maps unit normal to [-1,1]^2 -fn oct_encode(n: vec3f) -> vec2f { - let inv_l1 = 1.0 / (abs(n.x) + abs(n.y) + abs(n.z)); - var p = n.xy * inv_l1; - // Fold lower hemisphere - if (n.z < 0.0) { - let s = vec2f( - select(-1.0, 1.0, p.x >= 0.0), - select(-1.0, 1.0, p.y >= 0.0) - ); - p = (1.0 - abs(p.yx)) * s; - } - return p; // in [-1, 1] -} - @vertex fn vs_main( @builtin(vertex_index) vertex_index: u32, @@ -73,8 +61,8 @@ fn vs_main( let world_pos = obj.model * vec4f(p, 1.0); let clip_pos = globals.view_proj * world_pos; - // Transform normal by inverse-transpose (upper-left 3×3 of inv_model^T) - let world_normal = normalize((obj.inv_model * vec4f(n, 0.0)).xyz); + // Transform normal: use model matrix (correct for uniform scale + rotation). 
+ let world_normal = normalize((obj.model * vec4f(n, 0.0)).xyz); var out: VertexOutput; out.position = clip_pos; @@ -86,20 +74,40 @@ fn vs_main( } struct GBufOutput { - @location(0) albedo: vec4f, // rgba16float: material color - @location(1) normal_mat: vec4f, // rgba16float: oct-normal XY in RG, mat_id/255 in B + @location(0) albedo: vec4f, // rgba16float: material color + @location(1) normal_mat: vec4f, // rgba16float: oct-normal XY in RG, mat_id/255 in B + @builtin(frag_depth) depth: f32, // corrected depth (sphere impostor) } @fragment fn fs_main(in: VertexOutput) -> GBufOutput { - let obj = object_data.objects[in.instance_index]; - let mat_id = f32(in.instance_index) / 255.0; + let obj = object_data.objects[in.instance_index]; + let obj_type = u32(obj.params.x); + let mat_id = f32(in.instance_index) / 255.0; + + var world_normal = normalize(in.world_normal); + var frag_depth = in.position.z; // default: hardware depth + + // Sphere impostor: ray-sphere intersection for correct silhouette and normal. + if (obj_type == 1u) { + let sphere_center = obj.model[3].xyz; + let sphere_radius = length(obj.model[0].xyz); // uniform scale in col0 + let cam_pos = globals.camera_pos_time.xyz; + let rd = normalize(in.world_pos - cam_pos); + let isect = ray_sphere_intersection(cam_pos, rd, sphere_center, sphere_radius); + if (!isect.hit) { discard; } + let hit = cam_pos + rd * isect.t; + world_normal = normalize(hit - sphere_center); + // Reproject hit point to get correct clip-space depth. 
+ let clip_hit = globals.view_proj * vec4f(hit, 1.0); + frag_depth = clip_hit.z / clip_hit.w; + } - // Oct-encode world normal, remap [-1,1] → [0,1] for storage - let oct = oct_encode(normalize(in.world_normal)) * 0.5 + vec2f(0.5); + let oct = oct_encode_unorm(world_normal); var out: GBufOutput; out.albedo = vec4f(in.color.rgb, 1.0); out.normal_mat = vec4f(oct.x, oct.y, mat_id, 0.0); + out.depth = frag_depth; return out; } diff --git a/cnn_v3/shaders/gbuf_shadow.wgsl b/cnn_v3/shaders/gbuf_shadow.wgsl new file mode 100644 index 0000000..735e47c --- /dev/null +++ b/cnn_v3/shaders/gbuf_shadow.wgsl @@ -0,0 +1,118 @@ +// G-buffer shadow raymarching shader for CNN v3 +// Pass 2: Reads depth from Pass 1, marches shadow rays toward lights, +// outputs shadow factor (1.0=lit, 0.0=shadow) to RGBA8Unorm render target (.r). + +#include "common_uniforms" +#include "camera_common" +#include "math/sdf_shapes" +#include "math/normal" +#include "render/raymarching_id" + +@group(0) @binding(0) var<uniform> globals: GlobalUniforms; +@group(0) @binding(1) var<storage, read> object_data: ObjectsBuffer; +@group(0) @binding(2) var depth_tex: texture_depth_2d; +@group(0) @binding(4) var normal_mat_tex: texture_2d<f32>; + +struct GBufLight { + direction: vec4f, // xyz = toward light (world space, normalized) + color: vec4f, // rgb = color, a = intensity +} +struct GBufLightsUniforms { + lights: array<GBufLight, 2>, + params: vec4f, // x = num_lights +} +@group(0) @binding(3) var<uniform> lights: GBufLightsUniforms; + +// ---- SDF scene (proxy box per object in local space) ---- + +// Stub required by render/raymarching (shadow() / rayMarch() call df()). +fn df(p: vec3f) -> f32 { return MAX_RAY_LENGTH; } + +// SDF of the full scene: proxy box for each object transformed to local space. 
+fn dfWithID(p: vec3f) -> RayMarchResult { + var res: RayMarchResult; + res.distance = MAX_RAY_LENGTH; + res.distance_max = MAX_RAY_LENGTH; + res.object_id = 0.0; + + let n = u32(globals.params.x); + for (var i = 0u; i < n; i++) { + let obj = object_data.objects[i]; + let lp = (obj.inv_model * vec4f(p, 1.0)).xyz; + let obj_type = u32(obj.params.x); + // Scale factor: convert local-space SDF to world-space distance. + let scale = length(obj.model[0].xyz); + var d: f32; + switch obj_type { + case 1u: { d = sdSphere(lp, 1.0) * scale; } // SPHERE + case 2u: { d = sdPlane(lp, vec3f(0.0, 1.0, 0.0), obj.params.y); } // PLANE + case 3u: { d = sdTorus(lp, vec2f(0.8, 0.2)) * scale; } // TORUS + default: { d = sdBox(lp, vec3f(1.0)) * scale; } // CUBE (0) + fallback + } + if (d < res.distance) { + res.distance = d; + res.object_id = f32(i + 1u); + } + } + return res; +} + +// Soft shadow march (IQ formula). Returns 1=lit, 0=shadow. +// No dmin/dmax bounds: in open space d grows large so 8*d/t >> 1, res stays 1 naturally. +fn soft_shadow(ro: vec3f, rd: vec3f) -> f32 { + var t = 0.001; + var res = 1.0; + for (var i = 0; i < 64; i++) { + let d = dfWithID(ro + rd * t).distance; + if (d < 0.0005) { return 0.0; } + res = min(res, 8.0 * d / t); + t += d; + } + return clamp(res, 0.0, 1.0); +} + +// ---- Vertex: fullscreen triangle ---- + +@vertex +fn vs_main(@builtin(vertex_index) vid: u32) -> @builtin(position) vec4f { + let x = f32((vid & 1u) << 2u) - 1.0; + let y = f32((vid & 2u) << 1u) - 1.0; + return vec4f(x, y, 0.0, 1.0); +} + +// ---- Fragment: shadow factor per pixel ---- + +@fragment +fn fs_main(@builtin(position) pos: vec4f) -> @location(0) vec4f { + let depth = textureLoad(depth_tex, vec2i(pos.xy), 0); + + // Sky / background: fully lit. + if (depth >= 1.0) { + return vec4f(1.0); + } + + // Reconstruct world-space position from NDC + depth. 
+ let res = globals.resolution; + let ndc = vec2f( + (pos.x / res.x) * 2.0 - 1.0, + 1.0 - (pos.y / res.y) * 2.0 + ); + let clip = globals.inv_view_proj * vec4f(ndc, depth, 1.0); + let world = clip.xyz / clip.w; + + // Use rasterized surface normal for bias — correct for sphere impostors. + let nm = textureLoad(normal_mat_tex, vec2i(pos.xy), 0); + let nor = oct_decode_unorm(nm.rg); + let bias_pos = world + nor * 0.05; + + // March shadow rays toward each light; take the darkest value. + var shadow_val = 1.0; + let num_lights = u32(lights.params.x); + for (var i = 0u; i < num_lights; i++) { + let ld = lights.lights[i].direction.xyz; + let s = soft_shadow(bias_pos, ld); + shadow_val = min(shadow_val, s); + } + + return vec4f(shadow_val, shadow_val, shadow_val, 1.0); +} diff --git a/cnn_v3/shaders/gbuf_view.wgsl b/cnn_v3/shaders/gbuf_view.wgsl new file mode 100644 index 0000000..3e7d1ff --- /dev/null +++ b/cnn_v3/shaders/gbuf_view.wgsl @@ -0,0 +1,127 @@ +// G-buffer channel visualization — 4×5 grid of 20 feature channels. +// Takes feat_tex0 (rgba32uint, ch 0-7 f16) and feat_tex1 (rgba32uint, ch 8-19 unorm8). +// Outputs tiled channel view to a standard rgba8unorm render target. 
+// +// Channel layout (row×col): +// Row 0: ch0(alb.r) ch1(alb.g) ch2(alb.b) ch3(nrm.x) +// Row 1: ch4(nrm.y) ch5(depth) ch6(dzdx) ch7(dzdy) +// Row 2: ch8(matid) ch9(prv.r) ch10(prv.g) ch11(prv.b) +// Row 3: ch12(m1.r) ch13(m1.g) ch14(m1.b) ch15(m2.r) +// Row 4: ch16(m2.g) ch17(m2.b) ch18(shdw) ch19(trns) + +#include "debug/debug_print" + +struct GBufViewUniforms { resolution: vec2f } + +@group(0) @binding(0) var feat0: texture_2d<u32>; +@group(0) @binding(1) var feat1: texture_2d<u32>; +@group(0) @binding(2) var<uniform> u: GBufViewUniforms; + +@vertex +fn vs_main(@builtin(vertex_index) vid: u32) -> @builtin(position) vec4f { + var corners = array<vec2f, 3>( + vec2f(-1.0, -1.0), vec2f(3.0, -1.0), vec2f(-1.0, 3.0)); + return vec4f(corners[vid], 0.0, 1.0); +} + +@fragment +fn fs_main(@builtin(position) pos: vec4f) -> @location(0) vec4f { + let uv = pos.xy / u.resolution; + + let COLS = 4.0; + let ROWS = 5.0; + let col = u32(uv.x * COLS); + let row = u32(uv.y * ROWS); + let ch = row * 4u + col; + + if (col >= 4u || ch >= 20u) { + return vec4f(0.05, 0.05, 0.05, 1.0); + } + + // 1-pixel grid lines (thin border per cell) + let lx = fract(uv.x * COLS); + let ly = fract(uv.y * ROWS); + if (lx < 0.005 || lx > 0.995 || ly < 0.005 || ly > 0.995) { + return vec4f(0.25, 0.25, 0.25, 1.0); + } + + // Map local UV to texel coordinate + let dim = vec2i(textureDimensions(feat0)); + let tc = clamp(vec2i(vec2f(lx, ly) * vec2f(dim)), vec2i(0), dim - vec2i(1)); + + var v: f32 = 0.0; + + if (ch < 8u) { + // feat0: 4 × pack2x16float — each u32 component holds two f16 values + let t = textureLoad(feat0, tc, 0); + let pair_idx = ch >> 1u; + let sub = ch & 1u; + var p: vec2f; + if (pair_idx == 0u) { p = unpack2x16float(t.x); } + else if (pair_idx == 1u) { p = unpack2x16float(t.y); } + else if (pair_idx == 2u) { p = unpack2x16float(t.z); } + else { p = unpack2x16float(t.w); } + v = select(p.y, p.x, sub == 0u); + } else { + // feat1: 3 × pack4x8unorm — components .x/.y/.z hold 4 u8 values 
each + let t = textureLoad(feat1, tc, 0); + let ch1 = ch - 8u; + let comp_idx = ch1 / 4u; + let sub = ch1 % 4u; + var bytes: vec4f; + if (comp_idx == 0u) { bytes = unpack4x8unorm(t.x); } + else if (comp_idx == 1u) { bytes = unpack4x8unorm(t.y); } + else { bytes = unpack4x8unorm(t.z); } + var ba = array<f32, 4>(bytes.x, bytes.y, bytes.z, bytes.w); + v = ba[sub]; + } + + // Channel-specific normalization for display clarity + var disp: f32; + if (ch <= 2u) { + // Albedo: already [0,1] + disp = clamp(v, 0.0, 1.0); + } else if (ch == 3u || ch == 4u) { + // Normals oct-encoded in [-1,1] → remap to [0,1] + disp = clamp(v * 0.5 + 0.5, 0.0, 1.0); + } else if (ch == 5u) { + // Depth [0,1]: invert so near=white, far=dark + disp = clamp(1.0 - v, 0.0, 1.0); + } else if (ch == 6u || ch == 7u) { + // Depth gradients (signed, small values): amplify × 20 + 0.5 for visibility + disp = clamp(v * 20.0 + 0.5, 0.0, 1.0); + } else { + // Everything else: clamp to [0,1] + disp = clamp(v, 0.0, 1.0); + } + + var out = vec4f(disp, disp, disp, 1.0); + + // Label at top-left of each tile + let tile_w = u.resolution.x / 4.0; + let tile_h = u.resolution.y / 5.0; + let origin = vec2f(f32(col) * tile_w + 4.0, f32(row) * tile_h + 4.0); + switch ch { + case 0u: { out = debug_str(out, pos.xy, origin, vec4u(0x616C622Eu, 0x72000000u, 0u, 0u), 5u); } // alb.r + case 1u: { out = debug_str(out, pos.xy, origin, vec4u(0x616C622Eu, 0x67000000u, 0u, 0u), 5u); } // alb.g + case 2u: { out = debug_str(out, pos.xy, origin, vec4u(0x616C622Eu, 0x62000000u, 0u, 0u), 5u); } // alb.b + case 3u: { out = debug_str(out, pos.xy, origin, vec4u(0x6E726D2Eu, 0x78000000u, 0u, 0u), 5u); } // nrm.x + case 4u: { out = debug_str(out, pos.xy, origin, vec4u(0x6E726D2Eu, 0x79000000u, 0u, 0u), 5u); } // nrm.y + case 5u: { out = debug_str(out, pos.xy, origin, vec4u(0x64657074u, 0x68000000u, 0u, 0u), 5u); } // depth + case 6u: { out = debug_str(out, pos.xy, origin, vec4u(0x647A6478u, 0u, 0u, 0u), 4u); } // dzdx + case 7u: { out = 
debug_str(out, pos.xy, origin, vec4u(0x647A6479u, 0u, 0u, 0u), 4u); } // dzdy + case 8u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D617469u, 0x64000000u, 0u, 0u), 5u); } // matid + case 9u: { out = debug_str(out, pos.xy, origin, vec4u(0x7072762Eu, 0x72000000u, 0u, 0u), 5u); } // prv.r + case 10u: { out = debug_str(out, pos.xy, origin, vec4u(0x7072762Eu, 0x67000000u, 0u, 0u), 5u); } // prv.g + case 11u: { out = debug_str(out, pos.xy, origin, vec4u(0x7072762Eu, 0x62000000u, 0u, 0u), 5u); } // prv.b + case 12u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D312E72u, 0u, 0u, 0u), 4u); } // m1.r + case 13u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D312E67u, 0u, 0u, 0u), 4u); } // m1.g + case 14u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D312E62u, 0u, 0u, 0u), 4u); } // m1.b + case 15u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D322E72u, 0u, 0u, 0u), 4u); } // m2.r + case 16u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D322E67u, 0u, 0u, 0u), 4u); } // m2.g + case 17u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D322E62u, 0u, 0u, 0u), 4u); } // m2.b + case 18u: { out = debug_str(out, pos.xy, origin, vec4u(0x73686477u, 0u, 0u, 0u), 4u); } // shdw + default: { out = debug_str(out, pos.xy, origin, vec4u(0x74726E73u, 0u, 0u, 0u), 4u); } // trns + } + return out; +} diff --git a/cnn_v3/src/gbuf_deferred_effect.cc b/cnn_v3/src/gbuf_deferred_effect.cc new file mode 100644 index 0000000..de6bd29 --- /dev/null +++ b/cnn_v3/src/gbuf_deferred_effect.cc @@ -0,0 +1,140 @@ +// GBufDeferredEffect — simple deferred render: albedo * shadow from packed G-buffer. 
+ +#include "gbuf_deferred_effect.h" +#include "gpu/gpu.h" +#include "gpu/shader_composer.h" +#include "util/fatal_error.h" + +extern const char* gbuf_deferred_wgsl; + +struct GBufDeferredUniforms { + float resolution[2]; +}; +static_assert(sizeof(GBufDeferredUniforms) == 8, "GBufDeferredUniforms must be 8 bytes"); + +static WGPUBindGroupLayoutEntry bgl_uint_tex(uint32_t binding) { + WGPUBindGroupLayoutEntry e = {}; + e.binding = binding; + e.visibility = WGPUShaderStage_Fragment; + e.texture.sampleType = WGPUTextureSampleType_Uint; + e.texture.viewDimension = WGPUTextureViewDimension_2D; + return e; +} + +static WGPUBindGroupLayoutEntry bgl_uniform(uint32_t binding, uint64_t min_size) { + WGPUBindGroupLayoutEntry e = {}; + e.binding = binding; + e.visibility = WGPUShaderStage_Fragment; + e.buffer.type = WGPUBufferBindingType_Uniform; + e.buffer.minBindingSize = min_size; + return e; +} + +GBufDeferredEffect::GBufDeferredEffect(const GpuContext& ctx, + const std::vector<std::string>& inputs, + const std::vector<std::string>& outputs, + float start_time, float end_time) + : Effect(ctx, inputs, outputs, start_time, end_time) { + HEADLESS_RETURN_IF_NULL(ctx_.device); + + WGPUBindGroupLayoutEntry entries[3] = { + bgl_uint_tex(0), + bgl_uint_tex(1), + bgl_uniform(2, sizeof(GBufDeferredUniforms)), + }; + WGPUBindGroupLayoutDescriptor bgl_desc = {}; + bgl_desc.entryCount = 3; + bgl_desc.entries = entries; + WGPUBindGroupLayout bgl = wgpuDeviceCreateBindGroupLayout(ctx_.device, &bgl_desc); + + WGPUPipelineLayoutDescriptor pl_desc = {}; + pl_desc.bindGroupLayoutCount = 1; + pl_desc.bindGroupLayouts = &bgl; + WGPUPipelineLayout pl = wgpuDeviceCreatePipelineLayout(ctx_.device, &pl_desc); + + WGPUShaderSourceWGSL wgsl_src = {}; + wgsl_src.chain.sType = WGPUSType_ShaderSourceWGSL; + const std::string composed = ShaderComposer::Get().Compose({}, gbuf_deferred_wgsl); + wgsl_src.code = str_view(composed.c_str()); + WGPUShaderModuleDescriptor shader_desc = {}; + 
shader_desc.nextInChain = &wgsl_src.chain; + WGPUShaderModule shader = wgpuDeviceCreateShaderModule(ctx_.device, &shader_desc); + + WGPUColorTargetState target = {}; + target.format = WGPUTextureFormat_RGBA8Unorm; + target.writeMask = WGPUColorWriteMask_All; + + WGPUFragmentState frag = {}; + frag.module = shader; + frag.entryPoint = str_view("fs_main"); + frag.targetCount = 1; + frag.targets = &target; + + WGPURenderPipelineDescriptor pipe_desc = {}; + pipe_desc.layout = pl; + pipe_desc.vertex.module = shader; + pipe_desc.vertex.entryPoint = str_view("vs_main"); + pipe_desc.fragment = &frag; + pipe_desc.primitive.topology = WGPUPrimitiveTopology_TriangleList; + pipe_desc.multisample.count = 1; + pipe_desc.multisample.mask = UINT32_MAX; + + pipeline_.set(wgpuDeviceCreateRenderPipeline(ctx_.device, &pipe_desc)); + + wgpuShaderModuleRelease(shader); + wgpuPipelineLayoutRelease(pl); + wgpuBindGroupLayoutRelease(bgl); +} + +void GBufDeferredEffect::render(WGPUCommandEncoder encoder, + const UniformsSequenceParams& params, + NodeRegistry& nodes) { + WGPUTextureView feat0_view = nodes.get_view(input_nodes_[0]); + WGPUTextureView feat1_view = nodes.get_view(input_nodes_[1]); + WGPUTextureView output_view = nodes.get_view(output_nodes_[0]); + + // Upload resolution uniform into the base class uniforms buffer (first 8 bytes). 
+ GBufDeferredUniforms u; + u.resolution[0] = params.resolution.x; + u.resolution[1] = params.resolution.y; + wgpuQueueWriteBuffer(ctx_.queue, uniforms_buffer_.get().buffer, 0, + &u, sizeof(u)); + + WGPUBindGroupLayout bgl = + wgpuRenderPipelineGetBindGroupLayout(pipeline_.get(), 0); + + WGPUBindGroupEntry bg_entries[3] = {}; + bg_entries[0].binding = 0; + bg_entries[0].textureView = feat0_view; + bg_entries[1].binding = 1; + bg_entries[1].textureView = feat1_view; + bg_entries[2].binding = 2; + bg_entries[2].buffer = uniforms_buffer_.get().buffer; + bg_entries[2].size = sizeof(GBufDeferredUniforms); + + WGPUBindGroupDescriptor bg_desc = {}; + bg_desc.layout = bgl; + bg_desc.entryCount = 3; + bg_desc.entries = bg_entries; + bind_group_.replace(wgpuDeviceCreateBindGroup(ctx_.device, &bg_desc)); + wgpuBindGroupLayoutRelease(bgl); + + WGPURenderPassColorAttachment color_att = {}; + color_att.view = output_view; + color_att.loadOp = WGPULoadOp_Clear; + color_att.storeOp = WGPUStoreOp_Store; + color_att.clearValue = {0.0f, 0.0f, 0.0f, 1.0f}; + color_att.depthSlice = WGPU_DEPTH_SLICE_UNDEFINED; + + WGPURenderPassDescriptor pass_desc = {}; + pass_desc.colorAttachmentCount = 1; + pass_desc.colorAttachments = &color_att; + + WGPURenderPassEncoder pass = + wgpuCommandEncoderBeginRenderPass(encoder, &pass_desc); + wgpuRenderPassEncoderSetPipeline(pass, pipeline_.get()); + wgpuRenderPassEncoderSetBindGroup(pass, 0, bind_group_.get(), 0, nullptr); + wgpuRenderPassEncoderDraw(pass, 3, 1, 0, 0); + wgpuRenderPassEncoderEnd(pass); + wgpuRenderPassEncoderRelease(pass); +} diff --git a/cnn_v3/src/gbuf_deferred_effect.h b/cnn_v3/src/gbuf_deferred_effect.h new file mode 100644 index 0000000..4daf13d --- /dev/null +++ b/cnn_v3/src/gbuf_deferred_effect.h @@ -0,0 +1,23 @@ +// GBufDeferredEffect — simple deferred render from packed G-buffer. +// Inputs: feat_tex0, feat_tex1 (rgba32uint). Output: albedo * shadow (rgba8unorm). 
+ +#pragma once +#include "gpu/effect.h" +#include "gpu/uniform_helper.h" +#include "gpu/wgpu_resource.h" + +class GBufDeferredEffect : public Effect { + public: + GBufDeferredEffect(const GpuContext& ctx, + const std::vector<std::string>& inputs, + const std::vector<std::string>& outputs, + float start_time, float end_time); + + void render(WGPUCommandEncoder encoder, + const UniformsSequenceParams& params, + NodeRegistry& nodes) override; + + private: + RenderPipeline pipeline_; + BindGroup bind_group_; +}; diff --git a/cnn_v3/src/gbuf_view_effect.cc b/cnn_v3/src/gbuf_view_effect.cc new file mode 100644 index 0000000..ccf80b0 --- /dev/null +++ b/cnn_v3/src/gbuf_view_effect.cc @@ -0,0 +1,146 @@ +// GBufViewEffect — G-buffer channel grid visualization +// Renders 20 feature channels from feat_tex0/feat_tex1 in a 4×5 tiled layout. + +#include "gbuf_view_effect.h" + +#if defined(USE_TEST_ASSETS) +#include "test_assets.h" +#else +#include "generated/assets.h" +#endif + +#include "gpu/gpu.h" +#include "gpu/shader_composer.h" +#include "util/asset_manager.h" +#include "util/fatal_error.h" + +extern const char* gbuf_view_wgsl; + +// BGL entry: texture_2d<u32> read binding (fragment stage) +static WGPUBindGroupLayoutEntry bgl_uint_tex_frag(uint32_t binding) { + WGPUBindGroupLayoutEntry e = {}; + e.binding = binding; + e.visibility = WGPUShaderStage_Fragment; + e.texture.sampleType = WGPUTextureSampleType_Uint; + e.texture.viewDimension = WGPUTextureViewDimension_2D; + return e; +} + +// BGL entry: uniform buffer (fragment stage) +static WGPUBindGroupLayoutEntry bgl_uniform_frag(uint32_t binding, + uint64_t min_size) { + WGPUBindGroupLayoutEntry e = {}; + e.binding = binding; + e.visibility = WGPUShaderStage_Fragment; + e.buffer.type = WGPUBufferBindingType_Uniform; + e.buffer.minBindingSize = min_size; + return e; +} + +GBufViewEffect::GBufViewEffect(const GpuContext& ctx, + const std::vector<std::string>& inputs, + const std::vector<std::string>& outputs, + float 
start_time, float end_time) + : Effect(ctx, inputs, outputs, start_time, end_time) { + HEADLESS_RETURN_IF_NULL(ctx_.device); + + // Build BGL: binding 0 = feat0 (u32 tex), 1 = feat1 (u32 tex), 2 = uniforms + WGPUBindGroupLayoutEntry entries[3] = { + bgl_uint_tex_frag(0), + bgl_uint_tex_frag(1), + bgl_uniform_frag(2, 8), // only resolution (vec2f = 8 bytes) is read + }; + WGPUBindGroupLayoutDescriptor bgl_desc = {}; + bgl_desc.entryCount = 3; + bgl_desc.entries = entries; + WGPUBindGroupLayout bgl = wgpuDeviceCreateBindGroupLayout(ctx_.device, &bgl_desc); + + // Pipeline layout + WGPUPipelineLayoutDescriptor pl_desc = {}; + pl_desc.bindGroupLayoutCount = 1; + pl_desc.bindGroupLayouts = &bgl; + WGPUPipelineLayout pl = wgpuDeviceCreatePipelineLayout(ctx_.device, &pl_desc); + + // Shader module + WGPUShaderSourceWGSL wgsl_src = {}; + wgsl_src.chain.sType = WGPUSType_ShaderSourceWGSL; + const std::string composed = ShaderComposer::Get().Compose({}, gbuf_view_wgsl); + wgsl_src.code = str_view(composed.c_str()); + WGPUShaderModuleDescriptor shader_desc = {}; + shader_desc.nextInChain = &wgsl_src.chain; + WGPUShaderModule shader = + wgpuDeviceCreateShaderModule(ctx_.device, &shader_desc); + + // Render pipeline + WGPUColorTargetState target = {}; + target.format = WGPUTextureFormat_RGBA8Unorm; + target.writeMask = WGPUColorWriteMask_All; + + WGPUFragmentState frag = {}; + frag.module = shader; + frag.entryPoint = str_view("fs_main"); + frag.targetCount = 1; + frag.targets = &target; + + WGPURenderPipelineDescriptor pipe_desc = {}; + pipe_desc.layout = pl; + pipe_desc.vertex.module = shader; + pipe_desc.vertex.entryPoint = str_view("vs_main"); + pipe_desc.fragment = &frag; + pipe_desc.primitive.topology = WGPUPrimitiveTopology_TriangleList; + pipe_desc.multisample.count = 1; + pipe_desc.multisample.mask = UINT32_MAX; + + pipeline_.set(wgpuDeviceCreateRenderPipeline(ctx_.device, &pipe_desc)); + + wgpuShaderModuleRelease(shader); + wgpuPipelineLayoutRelease(pl); + 
wgpuBindGroupLayoutRelease(bgl); +} + +void GBufViewEffect::render(WGPUCommandEncoder encoder, + const UniformsSequenceParams& params, + NodeRegistry& nodes) { + WGPUTextureView feat0_view = nodes.get_view(input_nodes_[0]); + WGPUTextureView feat1_view = nodes.get_view(input_nodes_[1]); + WGPUTextureView output_view = nodes.get_view(output_nodes_[0]); + + // Rebuild bind group (views may change with ping-pong or resize) + WGPUBindGroupLayout bgl = + wgpuRenderPipelineGetBindGroupLayout(pipeline_.get(), 0); + + WGPUBindGroupEntry bg_entries[3] = {}; + bg_entries[0].binding = 0; + bg_entries[0].textureView = feat0_view; + bg_entries[1].binding = 1; + bg_entries[1].textureView = feat1_view; + bg_entries[2].binding = 2; + bg_entries[2].buffer = uniforms_buffer_.get().buffer; + bg_entries[2].size = sizeof(UniformsSequenceParams); + + WGPUBindGroupDescriptor bg_desc = {}; + bg_desc.layout = bgl; + bg_desc.entryCount = 3; + bg_desc.entries = bg_entries; + bind_group_.replace(wgpuDeviceCreateBindGroup(ctx_.device, &bg_desc)); + wgpuBindGroupLayoutRelease(bgl); + + WGPURenderPassColorAttachment color_att = {}; + color_att.view = output_view; + color_att.loadOp = WGPULoadOp_Clear; + color_att.storeOp = WGPUStoreOp_Store; + color_att.clearValue = {0.0f, 0.0f, 0.0f, 1.0f}; + color_att.depthSlice = WGPU_DEPTH_SLICE_UNDEFINED; + + WGPURenderPassDescriptor pass_desc = {}; + pass_desc.colorAttachmentCount = 1; + pass_desc.colorAttachments = &color_att; + + WGPURenderPassEncoder pass = + wgpuCommandEncoderBeginRenderPass(encoder, &pass_desc); + wgpuRenderPassEncoderSetPipeline(pass, pipeline_.get()); + wgpuRenderPassEncoderSetBindGroup(pass, 0, bind_group_.get(), 0, nullptr); + wgpuRenderPassEncoderDraw(pass, 3, 1, 0, 0); + wgpuRenderPassEncoderEnd(pass); + wgpuRenderPassEncoderRelease(pass); +} diff --git a/cnn_v3/src/gbuf_view_effect.h b/cnn_v3/src/gbuf_view_effect.h new file mode 100644 index 0000000..d4d8139 --- /dev/null +++ b/cnn_v3/src/gbuf_view_effect.h @@ -0,0 +1,25 @@ +// 
GBufViewEffect: Visualizes G-buffer feature textures as a 4×5 channel grid. +// Inputs: feat_tex0 (rgba32uint, ch 0-7 f16), feat_tex1 (rgba32uint, ch 8-19 unorm8) +// Output: rgba8unorm tiled channel visualization (downscaled 4× per channel) + +#pragma once + +#include "gpu/effect.h" +#include "gpu/sequence.h" +#include "gpu/wgpu_resource.h" + +class GBufViewEffect : public Effect { + public: + GBufViewEffect(const GpuContext& ctx, + const std::vector<std::string>& inputs, + const std::vector<std::string>& outputs, + float start_time, float end_time); + + void render(WGPUCommandEncoder encoder, + const UniformsSequenceParams& params, + NodeRegistry& nodes) override; + + private: + RenderPipeline pipeline_; + BindGroup bind_group_; +}; diff --git a/cnn_v3/src/gbuffer_effect.cc b/cnn_v3/src/gbuffer_effect.cc index 750188f..b059915 100644 --- a/cnn_v3/src/gbuffer_effect.cc +++ b/cnn_v3/src/gbuffer_effect.cc @@ -14,6 +14,7 @@ // For standalone use outside the asset system, the caller must ensure the WGSL // source strings are available. They are declared here as weak-linkable externs. extern const char* gbuf_raster_wgsl; +extern const char* gbuf_shadow_wgsl; extern const char* gbuf_pack_wgsl; // Maximum number of objects the G-buffer supports per frame. @@ -41,53 +42,6 @@ struct GBufGlobalUniforms { static_assert(sizeof(GBufGlobalUniforms) == sizeof(float) * 44, "GBufGlobalUniforms must be 176 bytes"); -// Helper: create a 1×1 placeholder texture of a given format cleared to `value`. 
-static WGPUTexture create_placeholder_tex(WGPUDevice device, - WGPUTextureFormat format, - float value) { - WGPUTextureDescriptor desc = {}; - desc.usage = (WGPUTextureUsage)(WGPUTextureUsage_TextureBinding | - WGPUTextureUsage_CopyDst); - desc.dimension = WGPUTextureDimension_2D; - desc.size = {1, 1, 1}; - desc.format = format; - desc.mipLevelCount = 1; - desc.sampleCount = 1; - WGPUTexture tex = wgpuDeviceCreateTexture(device, &desc); - return tex; -} - -// Helper: write a single RGBA float pixel to a texture via queue. -static void write_placeholder_pixel(WGPUQueue queue, WGPUTexture tex, - float r, float g, float b, float a) { - const float data[4] = {r, g, b, a}; - WGPUTexelCopyTextureInfo dst = {}; - dst.texture = tex; - dst.mipLevel = 0; - dst.origin = {0, 0, 0}; - dst.aspect = WGPUTextureAspect_All; - - WGPUTexelCopyBufferLayout layout = {}; - layout.offset = 0; - layout.bytesPerRow = 16; // 4 × sizeof(float) - layout.rowsPerImage = 1; - - const WGPUExtent3D extent = {1, 1, 1}; - wgpuQueueWriteTexture(queue, &dst, data, sizeof(data), &layout, &extent); -} - -// Create bilinear sampler. -static WGPUSampler create_bilinear_sampler(WGPUDevice device) { - WGPUSamplerDescriptor desc = {}; - desc.addressModeU = WGPUAddressMode_ClampToEdge; - desc.addressModeV = WGPUAddressMode_ClampToEdge; - desc.magFilter = WGPUFilterMode_Linear; - desc.minFilter = WGPUFilterMode_Linear; - desc.mipmapFilter = WGPUMipmapFilterMode_Linear; - desc.maxAnisotropy = 1; - return wgpuDeviceCreateSampler(device, &desc); -} - // ---- GBufferEffect ---- GBufferEffect::GBufferEffect(const GpuContext& ctx, @@ -104,9 +58,6 @@ GBufferEffect::GBufferEffect(const GpuContext& ctx, node_depth_ = prefix + "_depth"; node_shadow_ = prefix + "_shadow"; node_transp_ = prefix + "_transp"; - node_feat0_ = outputs.size() > 0 ? outputs[0] : prefix + "_feat0"; - node_feat1_ = outputs.size() > 1 ? outputs[1] : prefix + "_feat1"; - // Allocate GPU buffers for scene data. 
global_uniforms_buf_ = gpu_create_buffer(ctx_.device, sizeof(GBufGlobalUniforms), @@ -116,34 +67,15 @@ GBufferEffect::GBufferEffect(const GpuContext& ctx, // Resolution uniform for pack shader. pack_res_uniform_.init(ctx_.device); + lights_uniform_.init(ctx_.device); - // Placeholder shadow (1.0 = fully lit) and transp (0.0 = opaque) textures. - shadow_placeholder_tex_.set( - create_placeholder_tex(ctx_.device, WGPUTextureFormat_RGBA32Float, 1.0f)); - write_placeholder_pixel(ctx_.queue, - shadow_placeholder_tex_.get(), 1.0f, 0.0f, 0.0f, 1.0f); - - transp_placeholder_tex_.set( - create_placeholder_tex(ctx_.device, WGPUTextureFormat_RGBA32Float, 0.0f)); - write_placeholder_pixel(ctx_.queue, - transp_placeholder_tex_.get(), 0.0f, 0.0f, 0.0f, 1.0f); - - WGPUTextureViewDescriptor vd = {}; - vd.format = WGPUTextureFormat_RGBA32Float; - vd.dimension = WGPUTextureViewDimension_2D; - vd.baseMipLevel = 0; - vd.mipLevelCount = 1; - vd.baseArrayLayer = 0; - vd.arrayLayerCount = 1; - vd.aspect = WGPUTextureAspect_All; - - shadow_placeholder_view_.set( - wgpuTextureCreateView(shadow_placeholder_tex_.get(), &vd)); - transp_placeholder_view_.set( - wgpuTextureCreateView(transp_placeholder_tex_.get(), &vd)); + create_linear_sampler(); create_raster_pipeline(); + create_shadow_pipeline(); create_pack_pipeline(); + + set_scene(); } void GBufferEffect::declare_nodes(NodeRegistry& registry) { @@ -154,27 +86,123 @@ void GBufferEffect::declare_nodes(NodeRegistry& registry) { registry.declare_node(node_transp_, NodeType::GBUF_R8, -1, -1); // feat_tex0 / feat_tex1 are the declared output_nodes_ — they get registered // by the sequence infrastructure; declare them here as well if not already. 
- if (!registry.has_node(node_feat0_)) { - registry.declare_node(node_feat0_, NodeType::GBUF_RGBA32UINT, -1, -1); + if (!registry.has_node(output_nodes_[0])) { + registry.declare_node(output_nodes_[0], NodeType::GBUF_RGBA32UINT, -1, -1); + } + if (!registry.has_node(output_nodes_[1])) { + registry.declare_node(output_nodes_[1], NodeType::GBUF_RGBA32UINT, -1, -1); } - if (!registry.has_node(node_feat1_)) { - registry.declare_node(node_feat1_, NodeType::GBUF_RGBA32UINT, -1, -1); +} + +void GBufferEffect::set_scene() { + scene_.clear(); + cube_anims_.clear(); + sphere_anims_.clear(); + + // Deterministic pseudo-random (xorshift32). + uint32_t seed = 0xBEEF1234u; + auto rnd = [&]() -> float { + seed ^= seed << 13; + seed ^= seed >> 17; + seed ^= seed << 5; + return (float)(seed >> 8) / 16777216.0f; // [0, 1) + }; + auto rrange = [&](float lo, float hi) { return lo + rnd() * (hi - lo); }; + + // 2 large cubes. + // 2 large static cubes for shadow debugging. + { + Object3D obj(ObjectType::CUBE); + obj.position = vec3(-1.0f, 0.0f, 0.0f); + obj.scale = vec3(0.6f, 0.6f, 0.6f); + obj.color = vec4(0.9f, 0.5f, 0.3f, 1.0f); + scene_.add_object(obj); + cube_anims_.push_back({{0.0f, 1.0f, 0.0f}, 0.0f}); } + { + Object3D obj(ObjectType::SPHERE); + obj.position = vec3(1.0f, 0.0f, 0.0f); + const float r = 0.9f; + obj.scale = vec3(r, r, r); + obj.color = vec4(0.3f, 0.6f, 0.9f, 1.0f); + const int idx = (int)scene_.objects.size(); + scene_.add_object(obj); + sphere_anims_.push_back({idx, r}); + } + + // (sphere removed for shadow debugging) + + // Camera: above and in front of the scene, looking at origin. + camera_.set_look_at(vec3(0.0f, 2.5f, 6.0f), + vec3(0.0f, 0.0f, 0.0f), + vec3(0.0f, 1.0f, 0.0f)); + camera_.fov_y_rad = 0.7854f; // 45° + camera_.near_plane = 0.1f; + camera_.far_plane = 20.0f; + // aspect_ratio is updated each frame from params.resolution. 
+ + scene_ready_ = true; } -void GBufferEffect::set_scene(const Scene* scene, const Camera* camera) { - scene_ = scene; - camera_ = camera; +static void clear_r8_node(WGPUCommandEncoder encoder, WGPUTextureView view, + float value) { + WGPURenderPassColorAttachment att = {}; + att.view = view; + att.loadOp = WGPULoadOp_Clear; + att.storeOp = WGPUStoreOp_Store; + att.clearValue = {value, value, value, value}; + att.depthSlice = WGPU_DEPTH_SLICE_UNDEFINED; + WGPURenderPassDescriptor pd = {}; + pd.colorAttachmentCount = 1; + pd.colorAttachments = &att; + WGPURenderPassEncoder p = wgpuCommandEncoderBeginRenderPass(encoder, &pd); + wgpuRenderPassEncoderEnd(p); + wgpuRenderPassEncoderRelease(p); } void GBufferEffect::render(WGPUCommandEncoder encoder, const UniformsSequenceParams& params, NodeRegistry& nodes) { - if (!scene_ || !camera_) { + if (!scene_ready_) { return; } - upload_scene_data(*scene_, *camera_, params.time); + // Update camera aspect ratio from current resolution. + camera_.aspect_ratio = params.aspect_ratio; + + // Slowly orbit around the scene. + const float angle = params.time * 0.3f; + const float R = 6.0f; + camera_.set_look_at(vec3(R * sinf(angle), 2.5f, R * cosf(angle)), + vec3(0.0f, 0.0f, 0.0f), + vec3(0.0f, 1.0f, 0.0f)); + + // Animate cubes: axis-angle rotation driven by physical time. + for (int i = 0; i < (int)cube_anims_.size(); ++i) { + const CubeAnim& a = cube_anims_[(size_t)i]; + scene_.objects[(size_t)i].rotation = + quat::from_axis(a.axis, params.time * a.speed); + } + // Pump spheres: scale with audio_intensity. + for (const SphereAnim& a : sphere_anims_) { + const float r = a.base_radius * (1.0f + params.audio_intensity * 0.8f); + scene_.objects[(size_t)a.obj_idx].scale = vec3(r, r, r); + } + + // Upload two directional lights. + { + GBufLightsUniforms lu = {}; + lu.params = vec4(1.0f, 0.0f, 0.0f, 0.0f); + // Key: warm sun, upper-right-front. 
+ lu.lights[0].direction = vec4(0.408f, 0.816f, 0.408f, 0.0f); // norm(1,2,1) + lu.lights[0].color = vec4(1.00f, 0.92f, 0.78f, 1.0f); + // Fill: cool sky, upper-left-back. (disabled for debugging) + // lu.lights[1].direction = vec4(-0.577f, 0.577f, -0.577f, 0.0f); + // lu.lights[1].color = vec4(0.40f, 0.45f, 0.80f, 0.4f); + lights_uniform_.update(ctx_.queue, lu); + } + + upload_scene_data(scene_, camera_, params.time); // Update resolution uniform for pack shader. GBufResUniforms res_uni; @@ -186,8 +214,8 @@ void GBufferEffect::render(WGPUCommandEncoder encoder, WGPUTextureView albedo_view = nodes.get_view(node_albedo_); WGPUTextureView normal_mat_view = nodes.get_view(node_normal_mat_); WGPUTextureView depth_view = nodes.get_view(node_depth_); - WGPUTextureView feat0_view = nodes.get_view(node_feat0_); - WGPUTextureView feat1_view = nodes.get_view(node_feat1_); + WGPUTextureView feat0_view = nodes.get_view(output_nodes_[0]); + WGPUTextureView feat1_view = nodes.get_view(output_nodes_[1]); // prev_cnn: first input node if available, else dummy. WGPUTextureView prev_view = nullptr; @@ -228,8 +256,8 @@ void GBufferEffect::render(WGPUCommandEncoder encoder, raster_pass_desc.depthStencilAttachment = &depth_attachment; const int num_objects = - (int)(scene_->objects.size() < (size_t)kGBufMaxObjects - ? scene_->objects.size() + (int)(scene_.objects.size() < (size_t)kGBufMaxObjects + ? 
scene_.objects.size() : (size_t)kGBufMaxObjects); if (num_objects > 0 && raster_pipeline_.get() != nullptr) { @@ -250,13 +278,69 @@ void GBufferEffect::render(WGPUCommandEncoder encoder, wgpuRenderPassEncoderRelease(raster_pass); } - // Pass 2: SDF raymarching — TODO (placeholder: shadow=1, transp=0 already set) - // Pass 3: Lighting/shadow — TODO + // --- Pass 2: SDF shadow raymarching --- + if (shadow_pipeline_.get() != nullptr) { + WGPUBindGroupEntry shadow_entries[5] = {}; + shadow_entries[0].binding = 0; + shadow_entries[0].buffer = global_uniforms_buf_.buffer; + shadow_entries[0].size = sizeof(GBufGlobalUniforms); + + shadow_entries[1].binding = 1; + shadow_entries[1].buffer = objects_buf_.buffer; + shadow_entries[1].size = (size_t)objects_buf_capacity_ * sizeof(GBufObjectData); + + shadow_entries[2].binding = 2; + shadow_entries[2].textureView = depth_view; + + shadow_entries[3].binding = 3; + shadow_entries[3].buffer = lights_uniform_.get().buffer; + shadow_entries[3].size = sizeof(GBufLightsUniforms); + + shadow_entries[4].binding = 4; + shadow_entries[4].textureView = normal_mat_view; + + WGPUBindGroupLayout shadow_bgl = + wgpuRenderPipelineGetBindGroupLayout(shadow_pipeline_.get(), 0); + + WGPUBindGroupDescriptor shadow_bg_desc = {}; + shadow_bg_desc.layout = shadow_bgl; + shadow_bg_desc.entryCount = 5; + shadow_bg_desc.entries = shadow_entries; + + WGPUBindGroup shadow_bg = + wgpuDeviceCreateBindGroup(ctx_.device, &shadow_bg_desc); + wgpuBindGroupLayoutRelease(shadow_bgl); + + WGPURenderPassColorAttachment shadow_att = {}; + shadow_att.view = nodes.get_view(node_shadow_); + shadow_att.loadOp = WGPULoadOp_Clear; + shadow_att.storeOp = WGPUStoreOp_Store; + shadow_att.clearValue = {1.0f, 1.0f, 1.0f, 1.0f}; + shadow_att.depthSlice = WGPU_DEPTH_SLICE_UNDEFINED; + + WGPURenderPassDescriptor shadow_pass_desc = {}; + shadow_pass_desc.colorAttachmentCount = 1; + shadow_pass_desc.colorAttachments = &shadow_att; + + WGPURenderPassEncoder shadow_pass = + 
wgpuCommandEncoderBeginRenderPass(encoder, &shadow_pass_desc); + wgpuRenderPassEncoderSetPipeline(shadow_pass, shadow_pipeline_.get()); + wgpuRenderPassEncoderSetBindGroup(shadow_pass, 0, shadow_bg, 0, nullptr); + wgpuRenderPassEncoderDraw(shadow_pass, 3, 1, 0, 0); + wgpuRenderPassEncoderEnd(shadow_pass); + wgpuRenderPassEncoderRelease(shadow_pass); + wgpuBindGroupRelease(shadow_bg); + } else { + // Fallback: clear to 1.0 (fully lit) if pipeline not ready. + clear_r8_node(encoder, nodes.get_view(node_shadow_), 1.0f); + } + + // Pass 3: Transparency — TODO (deferred; opaque scenes only) + clear_r8_node(encoder, nodes.get_view(node_transp_), 0.0f); // --- Pass 4: Pack compute --- // Rebuild pack bind group with current node views. - // Construct a temporary bilinear sampler for this pass. - WGPUSampler bilinear = create_bilinear_sampler(ctx_.device); + WGPUSampler bilinear = sampler_.get(); // Get texture views from nodes. // shadow / transp are GBUF_R8 nodes; use their views. @@ -320,7 +404,7 @@ void GBufferEffect::render(WGPUCommandEncoder encoder, wgpuComputePassEncoderRelease(compute_pass); wgpuBindGroupRelease(pack_bg); - wgpuSamplerRelease(bilinear); + // bilinear is owned by sampler_ — no release here. 
} // ---- private helpers ---- @@ -346,7 +430,8 @@ void GBufferEffect::upload_scene_data(const Scene& scene, : (size_t)kGBufMaxObjects); const mat4 view = camera.get_view_matrix(); - const mat4 proj = camera.get_projection_matrix(); + mat4 proj = camera.get_projection_matrix(); + proj.m[5] = -proj.m[5]; // undo post-process Y flip: G-buffer uses integer reads const mat4 vp = proj * view; GBufGlobalUniforms gu = {}; @@ -373,7 +458,7 @@ void GBufferEffect::upload_scene_data(const Scene& scene, d.model = m; d.inv_model = m.inverse(); d.color = obj.color; - d.params = vec4(0.0f, 0.0f, 0.0f, 0.0f); + d.params = vec4((float)(int)obj.type, 0.0f, 0.0f, 0.0f); obj_data.push_back(d); } wgpuQueueWriteBuffer(ctx_.queue, objects_buf_.buffer, 0, @@ -392,7 +477,7 @@ void GBufferEffect::create_raster_pipeline() { } const std::string composed = - ShaderComposer::Get().Compose({"common_uniforms"}, src); + ShaderComposer::Get().Compose({}, src); WGPUShaderSourceWGSL wgsl_src = {}; wgsl_src.chain.sType = WGPUSType_ShaderSourceWGSL; @@ -450,8 +535,9 @@ void GBufferEffect::create_raster_pipeline() { pipe_desc.vertex.entryPoint = str_view("vs_main"); pipe_desc.fragment = &frag; pipe_desc.depthStencil = &ds; - pipe_desc.primitive.topology = WGPUPrimitiveTopology_TriangleList; - pipe_desc.primitive.cullMode = WGPUCullMode_Back; + pipe_desc.primitive.topology = WGPUPrimitiveTopology_TriangleList; + pipe_desc.primitive.cullMode = WGPUCullMode_Back; + pipe_desc.primitive.frontFace = WGPUFrontFace_CCW; // standard (no Y flip) pipe_desc.multisample.count = 1; pipe_desc.multisample.mask = 0xFFFFFFFF; @@ -462,6 +548,91 @@ void GBufferEffect::create_raster_pipeline() { wgpuShaderModuleRelease(shader); } +void GBufferEffect::create_shadow_pipeline() { + HEADLESS_RETURN_IF_NULL(ctx_.device); + + const char* src = gbuf_shadow_wgsl; + if (!src) { + return; + } + + const std::string composed = ShaderComposer::Get().Compose({}, src); + + WGPUShaderSourceWGSL wgsl_src = {}; + wgsl_src.chain.sType = 
WGPUSType_ShaderSourceWGSL; + wgsl_src.code = str_view(composed.c_str()); + + WGPUShaderModuleDescriptor shader_desc = {}; + shader_desc.nextInChain = &wgsl_src.chain; + WGPUShaderModule shader = wgpuDeviceCreateShaderModule(ctx_.device, &shader_desc); + + // BGL: B0=GlobalUniforms, B1=ObjectsBuffer, B2=texture_depth_2d, B3=GBufLightsUniforms + WGPUBindGroupLayoutEntry bgl_entries[5] = {}; + + bgl_entries[0].binding = 0; + bgl_entries[0].visibility = + (WGPUShaderStage)(WGPUShaderStage_Vertex | WGPUShaderStage_Fragment); + bgl_entries[0].buffer.type = WGPUBufferBindingType_Uniform; + bgl_entries[0].buffer.minBindingSize = sizeof(GBufGlobalUniforms); + + bgl_entries[1].binding = 1; + bgl_entries[1].visibility = WGPUShaderStage_Fragment; + bgl_entries[1].buffer.type = WGPUBufferBindingType_ReadOnlyStorage; + bgl_entries[1].buffer.minBindingSize = sizeof(GBufObjectData); + + bgl_entries[2].binding = 2; + bgl_entries[2].visibility = WGPUShaderStage_Fragment; + bgl_entries[2].texture.sampleType = WGPUTextureSampleType_Depth; + bgl_entries[2].texture.viewDimension = WGPUTextureViewDimension_2D; + + bgl_entries[3].binding = 3; + bgl_entries[3].visibility = WGPUShaderStage_Fragment; + bgl_entries[3].buffer.type = WGPUBufferBindingType_Uniform; + bgl_entries[3].buffer.minBindingSize = sizeof(GBufLightsUniforms); + + bgl_entries[4].binding = 4; + bgl_entries[4].visibility = WGPUShaderStage_Fragment; + bgl_entries[4].texture.sampleType = WGPUTextureSampleType_Float; + bgl_entries[4].texture.viewDimension = WGPUTextureViewDimension_2D; + + WGPUBindGroupLayoutDescriptor bgl_desc = {}; + bgl_desc.entryCount = 5; + bgl_desc.entries = bgl_entries; + WGPUBindGroupLayout bgl = wgpuDeviceCreateBindGroupLayout(ctx_.device, &bgl_desc); + + WGPUPipelineLayoutDescriptor pl_desc = {}; + pl_desc.bindGroupLayoutCount = 1; + pl_desc.bindGroupLayouts = &bgl; + WGPUPipelineLayout pl = wgpuDeviceCreatePipelineLayout(ctx_.device, &pl_desc); + + // Color target: RGBA8Unorm (NodeType::GBUF_R8) + 
WGPUColorTargetState color_target = {}; + color_target.format = WGPUTextureFormat_RGBA8Unorm; + color_target.writeMask = WGPUColorWriteMask_All; + + WGPUFragmentState frag = {}; + frag.module = shader; + frag.entryPoint = str_view("fs_main"); + frag.targetCount = 1; + frag.targets = &color_target; + + WGPURenderPipelineDescriptor pipe_desc = {}; + pipe_desc.layout = pl; + pipe_desc.vertex.module = shader; + pipe_desc.vertex.entryPoint = str_view("vs_main"); + pipe_desc.fragment = &frag; + pipe_desc.primitive.topology = WGPUPrimitiveTopology_TriangleList; + pipe_desc.primitive.cullMode = WGPUCullMode_None; + pipe_desc.multisample.count = 1; + pipe_desc.multisample.mask = 0xFFFFFFFF; + + shadow_pipeline_.set(wgpuDeviceCreateRenderPipeline(ctx_.device, &pipe_desc)); + + wgpuPipelineLayoutRelease(pl); + wgpuBindGroupLayoutRelease(bgl); + wgpuShaderModuleRelease(shader); +} + void GBufferEffect::create_pack_pipeline() { HEADLESS_RETURN_IF_NULL(ctx_.device); @@ -596,7 +767,3 @@ void GBufferEffect::update_raster_bind_group(NodeRegistry& nodes) { wgpuBindGroupLayoutRelease(bgl); } -void GBufferEffect::update_pack_bind_group(NodeRegistry& nodes) { - (void)nodes; - // Pack bind group is rebuilt inline in render() to use current node views. -} diff --git a/cnn_v3/src/gbuffer_effect.h b/cnn_v3/src/gbuffer_effect.h index 42fb0ec..13d394d 100644 --- a/cnn_v3/src/gbuffer_effect.h +++ b/cnn_v3/src/gbuffer_effect.h @@ -10,6 +10,7 @@ #include "gpu/uniform_helper.h" #include "gpu/wgpu_resource.h" #include "util/mini_math.h" +#include <vector> // Uniform for the pack compute shader struct GBufResUniforms { @@ -20,6 +21,20 @@ struct GBufResUniforms { static_assert(sizeof(GBufResUniforms) == 16, "GBufResUniforms must be 16 bytes"); +// Single directional light: direction points *toward* the light source (world space). 
+struct GBufLight { + vec4 direction; // xyz = normalized direction toward light, w = unused + vec4 color; // rgb = color, a = intensity +}; +static_assert(sizeof(GBufLight) == 32, "GBufLight must be 32 bytes"); + +struct GBufLightsUniforms { + GBufLight lights[2]; + vec4 params; // x = num_lights +}; +static_assert(sizeof(GBufLightsUniforms) == 80, + "GBufLightsUniforms must be 80 bytes"); + class GBufferEffect : public Effect { public: GBufferEffect(const GpuContext& ctx, const std::vector<std::string>& inputs, @@ -31,47 +46,59 @@ class GBufferEffect : public Effect { void render(WGPUCommandEncoder encoder, const UniformsSequenceParams& params, NodeRegistry& nodes) override; - void set_scene(const Scene* scene, const Camera* camera); + // Populate the internal scene with ~20 rotating cubes and a few pumping + // spheres. Must be called once before the first render(). + void set_scene(); private: + // Per-cube animation state (axis-angle rotation) + struct CubeAnim { + vec3 axis; + float speed; // radians/second, may be negative + }; + // Per-sphere animation state (radius driven by audio_intensity) + struct SphereAnim { + int obj_idx; // index into scene_.objects + float base_radius; + }; + // Internal G-buffer node names std::string node_albedo_; std::string node_normal_mat_; std::string node_depth_; std::string node_shadow_; std::string node_transp_; - std::string node_feat0_; - std::string node_feat1_; - const Scene* scene_ = nullptr; - const Camera* camera_ = nullptr; + // Owned scene and camera — populated by set_scene() + Scene scene_; + Camera camera_; + bool scene_ready_ = false; + + std::vector<CubeAnim> cube_anims_; + std::vector<SphereAnim> sphere_anims_; // Pass 1: MRT rasterization pipeline RenderPipeline raster_pipeline_; BindGroup raster_bind_group_; + // Pass 2: SDF shadow pipeline + RenderPipeline shadow_pipeline_; + // Pass 4: Pack compute pipeline ComputePipeline pack_pipeline_; - BindGroup pack_bind_group_; - UniformBuffer<GBufResUniforms> 
pack_res_uniform_; - - // Placeholder textures for shadow/transp (white/black cleared once) - Texture shadow_placeholder_tex_; - TextureView shadow_placeholder_view_; - Texture transp_placeholder_tex_; - TextureView transp_placeholder_view_; + UniformBuffer<GBufResUniforms> pack_res_uniform_; + UniformBuffer<GBufLightsUniforms> lights_uniform_; // GPU-side object data buffers (global uniforms + objects storage) - // These mirror the layout expected by gbuf_raster.wgsl GpuBuffer global_uniforms_buf_; GpuBuffer objects_buf_; - int objects_buf_capacity_ = 0; // number of ObjectData slots allocated + int objects_buf_capacity_ = 0; void create_raster_pipeline(); + void create_shadow_pipeline(); void create_pack_pipeline(); void update_raster_bind_group(NodeRegistry& nodes); - void update_pack_bind_group(NodeRegistry& nodes); void upload_scene_data(const Scene& scene, const Camera& camera, float time); diff --git a/cnn_v3/tools/index.html b/cnn_v3/tools/index.html index eba532e..26fee9b 100644 --- a/cnn_v3/tools/index.html +++ b/cnn_v3/tools/index.html @@ -64,9 +64,11 @@ video{display:none} <div class="left"> <input type="file" id="wFile" accept=".bin" style="display:none"> <input type="file" id="fFile" accept=".bin" style="display:none"> + <input type="file" id="sFile" webkitdirectory style="display:none" onchange="tester.loadSampleDir(this.files)"> <div class="dz" id="wDrop" onclick="document.getElementById('wFile').click()">Drop cnn_v3_weights.bin</div> <div class="dz" id="fDrop" onclick="document.getElementById('fFile').click()">Drop cnn_v3_film_mlp.bin (optional)</div> + <button onclick="tester.preload()" style="font-size:9px;margin-top:2px">↺ Reload from workspace weights/</button> <div class="panel"> <div class="ph" onclick="togglePanel(this)">Input Mode <span>▼</span></div> @@ -78,6 +80,10 @@ video{display:none} <div id="fullHelp" style="display:none;margin-top:6px;font-size:9px;color:#555;line-height:1.6"> Drop PNGs: *albedo*/color · *normal* · *depth* · 
*matid*/index · *shadow* · *transp*/alpha </div> + <div style="margin-top:8px;border-top:1px solid #333;padding-top:8px"> + <button onclick="document.getElementById('sFile').click()" style="width:100%">↑ Load sample directory</button> + <div id="sampleSt" style="font-size:9px;color:#555;margin-top:3px"></div> + </div> </div> </div> @@ -120,15 +126,29 @@ video{display:none} <div class="sep"></div> <button onclick="tester.savePNG()">Save PNG</button> </div> - <canvas id="canvas"></canvas> + <div style="display:flex;gap:12px;align-items:flex-start"> + <div style="display:flex;flex-direction:column;align-items:center;gap:3px"> + <canvas id="canvas"></canvas> + <span id="cnnLabel" style="font-size:9px;color:#555"></span> + </div> + <div id="targetPane" style="display:none;flex-direction:column;align-items:center;gap:3px"> + <canvas id="targetCanvas" style="max-width:100%;max-height:100%;image-rendering:pixelated;box-shadow:0 4px 12px rgba(0,0,0,.5)"></canvas> + <span style="font-size:9px;color:#555">target.png</span> + <span id="psnrSt" style="font-size:9px;color:#4a9eff"></span> + </div> + </div> </div> <div class="right"> <div class="panel" style="flex:1;display:flex;flex-direction:column;min-height:0"> <div class="ph">Layer Visualization</div> - <div class="pc" id="layerViz" style="flex:1;overflow:auto"> + <div class="pc" id="layerViz" style="flex:1;min-height:0;overflow:auto"> <p style="color:#444;text-align:center">Load image + weights</p> </div> + <div id="chzoomWrap" style="display:none;flex-direction:column;align-items:center;justify-content:center;gap:3px;padding:6px;border-top:1px solid #333;background:#1a1a1a;flex:1;min-height:0;overflow:hidden"> + <span id="chzoomLbl" style="font-size:9px;color:#666;flex-shrink:0"></span> + <canvas id="chzoom" style="image-rendering:pixelated;display:block"></canvas> + </div> </div> </div> </div> diff --git a/cnn_v3/tools/shaders.js b/cnn_v3/tools/shaders.js index c3e994d..f178637 100644 --- a/cnn_v3/tools/shaders.js +++ 
b/cnn_v3/tools/shaders.js @@ -223,30 +223,85 @@ const DISP_SHADER=` }`; // Viz f32: show one channel of rgba16float layer +// Uniform layout: ch(u32) _p(u32) ox(i32) oy(i32) — 16 bytes +// ox/oy = texel offset (top-left of view); 0,0 for full-texture vignettes. const VIZ_F32=` +struct Vu{ch:u32,_p:u32,ox:i32,oy:i32} @group(0) @binding(0) var t:texture_2d<f32>; -@group(0) @binding(1) var<uniform> ch:u32; +@group(0) @binding(1) var<uniform> u:Vu; @vertex fn vs(@builtin(vertex_index) i:u32)->@builtin(position) vec4f{ var p=array<vec2f,6>(vec2f(-1.,-1.),vec2f(1.,-1.),vec2f(-1.,1.),vec2f(-1.,1.),vec2f(1.,-1.),vec2f(1.,1.)); return vec4f(p[i],0.,1.); } @fragment fn fs(@builtin(position) pos:vec4f)->@location(0) vec4f{ - let v=textureLoad(t,vec2i(pos.xy),0); var a=array<f32,4>(v.x,v.y,v.z,v.w); - let x=clamp(a[min(ch,3u)],0.,1.); return vec4f(x,x,x,1.); + let dim=vec2i(textureDimensions(t)); + let tc=clamp(vec2i(i32(pos.x)+u.ox,i32(pos.y)+u.oy),vec2i(0),dim-vec2i(1)); + let v=textureLoad(t,tc,0); var a=array<f32,4>(v.x,v.y,v.z,v.w); + let x=clamp(a[min(u.ch,3u)],0.,1.); return vec4f(x,x,x,1.); }`; // Viz u32: show one f16 channel of rgba32uint layer (8 channels packed) const VIZ_U32=` +struct Vu{ch:u32,_p:u32,ox:i32,oy:i32} @group(0) @binding(0) var t:texture_2d<u32>; -@group(0) @binding(1) var<uniform> ch:u32; +@group(0) @binding(1) var<uniform> u:Vu; @vertex fn vs(@builtin(vertex_index) i:u32)->@builtin(position) vec4f{ var p=array<vec2f,6>(vec2f(-1.,-1.),vec2f(1.,-1.),vec2f(-1.,1.),vec2f(-1.,1.),vec2f(1.,-1.),vec2f(1.,1.)); return vec4f(p[i],0.,1.); } @fragment fn fs(@builtin(position) pos:vec4f)->@location(0) vec4f{ - let t2=textureLoad(t,vec2i(pos.xy),0); + let dim=vec2i(textureDimensions(t)); + let tc=clamp(vec2i(i32(pos.x)+u.ox,i32(pos.y)+u.oy),vec2i(0),dim-vec2i(1)); + let t2=textureLoad(t,tc,0); let a=unpack2x16float(t2.x);let b=unpack2x16float(t2.y); let c=unpack2x16float(t2.z);let d=unpack2x16float(t2.w); var v=array<f32,8>(a.x,a.y,b.x,b.y,c.x,c.y,d.x,d.y); - 
let x=clamp(v[min(ch,7u)],0.,1.); return vec4f(x,x,x,1.); + let x=clamp(v[min(u.ch,7u)],0.,1.); return vec4f(x,x,x,1.); }`; + +// Full G-buffer pack: assembles feat_tex0/feat_tex1 from individual G-buffer images. +// Bindings: albedo(0) normal(1) depth(2) matid(3) shadow(4) transp(5) f0(6) f1(7) +// All source textures are rgba8unorm (browser-loaded images, R channel for depth/matid/shadow/transp). +// Uses textureLoad() only (no sampler needed). Matches gbuf_pack.wgsl packing exactly. +const FULL_PACK_SHADER=` +@group(0) @binding(0) var albedo: texture_2d<f32>; +@group(0) @binding(1) var normal: texture_2d<f32>; +@group(0) @binding(2) var depth: texture_2d<f32>; +@group(0) @binding(3) var matid: texture_2d<f32>; +@group(0) @binding(4) var shadow: texture_2d<f32>; +@group(0) @binding(5) var transp: texture_2d<f32>; +@group(0) @binding(6) var f0: texture_storage_2d<rgba32uint,write>; +@group(0) @binding(7) var f1: texture_storage_2d<rgba32uint,write>; +fn ld(c:vec2i,d:vec2i)->f32{return textureLoad(depth,clamp(c,vec2i(0),d-vec2i(1)),0).r;} +fn b2(tl:vec2i,d:vec2i)->vec3f{ + var s=vec3f(0.); + for(var y:i32=0;y<2;y++){for(var x:i32=0;x<2;x++){s+=textureLoad(albedo,clamp(tl+vec2i(x,y),vec2i(0),d-vec2i(1)),0).rgb;}} + return s*.25;} +fn b4(tl:vec2i,d:vec2i)->vec3f{ + var s=vec3f(0.); + for(var y:i32=0;y<4;y++){for(var x:i32=0;x<4;x++){s+=textureLoad(albedo,clamp(tl+vec2i(x,y),vec2i(0),d-vec2i(1)),0).rgb;}} + return s*(1./16.);} +@compute @workgroup_size(8,8) +fn main(@builtin(global_invocation_id) id:vec3u){ + let c=vec2i(id.xy); let d=vec2i(textureDimensions(albedo)); + if(c.x>=d.x||c.y>=d.y){return;} + let alb=textureLoad(albedo,c,0).rgb; + let nrm=textureLoad(normal,c,0).rg; + let oct=nrm*2.-vec2f(1.); // [0,1] -> [-1,1] + let dv=ld(c,d); + let dzdx=(ld(c+vec2i(1,0),d)-ld(c-vec2i(1,0),d))*.5; + let dzdy=(ld(c+vec2i(0,1),d)-ld(c-vec2i(0,1),d))*.5; + textureStore(f0,c,vec4u( + pack2x16float(alb.rg), + pack2x16float(vec2f(alb.b,oct.x)), + 
pack2x16float(vec2f(oct.y,dv)), + pack2x16float(vec2f(dzdx,dzdy)))); + let mid=textureLoad(matid,c,0).r; + let shd=textureLoad(shadow,c,0).r; + let trp=textureLoad(transp,c,0).r; + let m1=b2(c-vec2i(0),d); let m2=b4(c-vec2i(1),d); + textureStore(f1,c,vec4u( + pack4x8unorm(vec4f(mid,0.,0.,0.)), + pack4x8unorm(vec4f(m1.r,m1.g,m1.b,m2.r)), + pack4x8unorm(vec4f(m2.g,m2.b,shd,trp)), + 0u));}`; diff --git a/cnn_v3/tools/tester.js b/cnn_v3/tools/tester.js index aa765a1..0412cae 100644 --- a/cnn_v3/tools/tester.js +++ b/cnn_v3/tools/tester.js @@ -13,6 +13,7 @@ class CNNv3Tester { this.image = null; this.isVideo = false; this.viewMode= 0; // 0=cnn 1=orig 2=diff + this.targetBitmap = null; // set when a sample dir with target.png is loaded this.blend = 1.0; this.layerTextures = {}; this.lastResult = null; @@ -44,9 +45,43 @@ class CNNv3Tester { this.format = navigator.gpu.getPreferredCanvasFormat(); this.linearSampler = this.device.createSampler({magFilter:'linear',minFilter:'linear',mipmapFilter:'linear'}); this.log('WebGPU ready'); + this.preload(); } catch(e) { this.setStatus(`GPU error: ${e.message}`,true); } } + async preload() { + const base = '../../workspaces/main/weights/'; + const files = [ + {url: base+'cnn_v3_weights.bin', isFilm: false}, + {url: base+'cnn_v3_film_mlp.bin', isFilm: true}, + ]; + for (const {url, isFilm} of files) { + try { + const r = await fetch(url); + if (!r.ok) { this.log(`preload skip: ${url.split('/').pop()} (${r.status})`); continue; } + const buf = await r.arrayBuffer(); + const name = url.split('/').pop(); + if (isFilm) { + this.filmMlp = this.parseFilm(buf); + const el = document.getElementById('fDrop'); + el.textContent = `✓ ${name}`; el.classList.add('ok'); + document.getElementById('fSt').textContent = 'FiLM MLP loaded'; + document.getElementById('fSt').style.color = '#28a745'; + } else { + this.weightsU32 = this.parseWeights(buf); this.weightsBuffer = buf; + if (this.weightsGPU) { this.weightsGPU.destroy(); this.weightsGPU = null; } 
+ const el = document.getElementById('wDrop'); + el.textContent = `✓ ${name}`; el.classList.add('ok'); + } + this.log(`Preloaded: ${name}`); + } catch(e) { this.log(`preload error (${url.split('/').pop()}): ${e.message}`, 'err'); } + } + if (this.weightsU32) { + if (this.image || this.isVideo) this.run(); + else this.setStatus('Weights loaded — drop image/video'); + } + } + getDims() { return this.isVideo ? {w:this.video.videoWidth, h:this.video.videoHeight} @@ -106,7 +141,7 @@ class CNNv3Tester { filmParams() { const I4=[1,1,1,1],Z4=[0,0,0,0],I8=[1,1,1,1,1,1,1,1],Z8=[0,0,0,0,0,0,0,0]; if (!this.filmMlp) return {ge0:I4,be0:Z4,ge1:I8,be1:Z8,gd1:I4,bd1:Z4,gd0:I4,bd0:Z4}; - const v=document.getElementById; + const v=document.getElementById.bind(document); const cond=[v('sBP').value,v('sBN').value,v('sAI').value,v('sP0').value,v('sP1').value].map(Number); const f=this.filmFwd(cond); return { @@ -350,6 +385,7 @@ class CNNv3Tester { this.layerTextures={feat0:f0,feat1:f1,enc0:e0,enc1:e1,bn,dec1:d1,output:ot}; this.lastResult={ot,itex:this.inputTex,uDp,dispPL:this.getDisp(),w,h}; this.updateVizPanel(); + this.refreshZoom(); } destroyLayerTex(){for(const t of Object.values(this.layerTextures||{}))try{t.destroy();}catch(_){} this.layerTextures={};} @@ -422,10 +458,19 @@ class CNNv3Tester { document.getElementById(`vb_${id}`)?.classList.add('act'); const def=this.vizDefs.find(d=>d.id===id); if(!def)return; const grid=document.getElementById('chgrid'); grid.innerHTML=''; + const chName = (c) => `${def.lbl} → ${def.ch[c]||'c'+c}`; for(let c=0;c<def.nch;c++){ const cell=document.createElement('div'); cell.className='chcell'; const lbl=document.createElement('div'); lbl.className='chcell-lbl'; lbl.textContent=def.ch[c]||`c${c}`; const cvs=document.createElement('canvas'); + const name=chName(c); + cvs.title=name; + cvs.onclick=(e)=>{ + const r=cvs.getBoundingClientRect(); + const tx=Math.round(e.offsetX/r.width*tex.width); + const ty=Math.round(e.offsetY/r.height*tex.height); + 
tester.zoomChannel(id,c,name,tx,ty); + }; cell.appendChild(lbl); cell.appendChild(cvs); grid.appendChild(cell); } const pl=def.t==='f32'?this.getVizF32():this.getVizU32(); @@ -434,8 +479,8 @@ class CNNv3Tester { cvs.width=tex.width; cvs.height=tex.height; const ctx=cvs.getContext('webgpu'); if(!ctx)continue; try{ctx.configure({device:this.device,format:this.format});}catch(_){continue;} - const chBuf=this.device.createBuffer({size:4,usage:GPUBufferUsage.UNIFORM|GPUBufferUsage.COPY_DST}); - this.device.queue.writeBuffer(chBuf,0,new Uint32Array([c])); + const chBuf=this.device.createBuffer({size:16,usage:GPUBufferUsage.UNIFORM|GPUBufferUsage.COPY_DST}); + this.device.queue.writeBuffer(chBuf,0,new Int32Array([c,0,0,0])); const bg=this.device.createBindGroup({layout:pl.getBindGroupLayout(0), entries:[{binding:0,resource:tex.createView()},{binding:1,resource:{buffer:chBuf}}]}); const enc=this.device.createCommandEncoder(); @@ -447,6 +492,54 @@ class CNNv3Tester { await this.device.queue.onSubmittedWorkDone(); } + zoomChannel(layerId, ch, label, clickTx=0, clickTy=0) { + const def = this.vizDefs?.find(d => d.id === layerId); + const tex = this.layerTextures[layerId]; + if (!def || !tex || !this.device) return; + const wrap = document.getElementById('chzoomWrap'); + const lbl = document.getElementById('chzoomLbl'); + this.activeZoom = {layerId, ch, label, clickTx, clickTy}; + lbl.textContent = label; + wrap.style.display = 'flex'; + // Wait for layout so clientWidth/clientHeight reflect the flex-distributed size + requestAnimationFrame(() => { + const dst = document.getElementById('chzoom'); + const pad = 12; + const lblH = lbl.offsetHeight + 6; + const availW = wrap.clientWidth - pad; + const availH = wrap.clientHeight - pad - lblH; + const scale = Math.min(1, availW / tex.width, availH / tex.height); + dst.width = Math.round(tex.width * scale); + dst.height = Math.round(tex.height * scale); + // Re-render via WebGPU centered on the clicked texel + const ox = clickTx - 
Math.floor(dst.width / 2); + const oy = clickTy - Math.floor(dst.height / 2); + const pl = def.t === 'f32' ? this.getVizF32() : this.getVizU32(); + const ctx = dst.getContext('webgpu'); + try { ctx.configure({device: this.device, format: this.format}); } catch(_) { return; } + const uData = new ArrayBuffer(16); + const dv = new DataView(uData); + dv.setUint32(0, ch, true); dv.setInt32(8, ox, true); dv.setInt32(12, oy, true); + const chBuf = this.device.createBuffer({size:16, usage:GPUBufferUsage.UNIFORM|GPUBufferUsage.COPY_DST}); + this.device.queue.writeBuffer(chBuf, 0, uData); + const bg = this.device.createBindGroup({layout: pl.getBindGroupLayout(0), + entries:[{binding:0, resource:tex.createView()}, {binding:1, resource:{buffer:chBuf}}]}); + const enc = this.device.createCommandEncoder(); + const rp = enc.beginRenderPass({colorAttachments:[{ + view:ctx.getCurrentTexture().createView(), loadOp:'clear', storeOp:'store'}]}); + rp.setPipeline(pl); rp.setBindGroup(0, bg); rp.draw(6); rp.end(); + this.device.queue.submit([enc.finish()]); + chBuf.destroy(); + }); + } + + refreshZoom() { + if (this.activeZoom) { + const {layerId, ch, label, clickTx, clickTy} = this.activeZoom; + this.zoomChannel(layerId, ch, label, clickTx, clickTy); + } + } + // ── Save PNG ───────────────────────────────────────────────────────────── async savePNG() { @@ -491,6 +584,281 @@ class CNNv3Tester { return(s?-1:1)*Math.pow(2,e-15)*(1+m/1024);}; return [f(lo),f(hi)]; } + + // ── Full G-buffer pack pipeline ─────────────────────────────────────────── + + getFullPack() { + return this.pl('fullpack', () => this.computePL(FULL_PACK_SHADER, 'main')); + } + + // Create a 1×1 rgba8unorm fallback texture with given RGBA bytes [0-255]. 
+ makeFallbackTex(r, g, b, a) { + const tex = this.device.createTexture({size:[1,1], format:'rgba8unorm', + usage: GPUTextureUsage.TEXTURE_BINDING|GPUTextureUsage.COPY_DST}); + this.device.queue.writeTexture({texture:tex}, new Uint8Array([r,g,b,a]), + {bytesPerRow:4,rowsPerImage:1}, [1,1]); + return tex; + } + + // Load an image File as a GPU rgba8unorm texture. Returns {tex, w, h}. + async loadGpuTex(file) { + const bmp = await createImageBitmap(file); + const w = bmp.width, h = bmp.height; + const tex = this.device.createTexture({size:[w,h], format:'rgba8unorm', + usage: GPUTextureUsage.TEXTURE_BINDING|GPUTextureUsage.COPY_DST|GPUTextureUsage.RENDER_ATTACHMENT}); + this.device.queue.copyExternalImageToTexture({source:bmp}, {texture:tex}, [w,h]); + bmp.close(); + return {tex, w, h}; + } + + // ── Load sample directory ───────────────────────────────────────────────── + + async loadSampleDir(files) { + if (!files || files.length === 0) return; + if (!this.weightsU32) { this.setStatus('Load weights first', true); return; } + + this.setMode('full'); + const st = document.getElementById('sampleSt'); + st.textContent = 'Loading…'; + + // Match files by name pattern + const match = (pat) => { + for (const f of files) { + const n = f.name.toLowerCase(); + if (pat.some(p => n.includes(p))) return f; + } + return null; + }; + + const fAlbedo = match(['albedo', 'color']); + const fNormal = match(['normal', 'nrm']); + const fDepth = match(['depth']); + const fMatid = match(['matid', 'index', 'mat_id']); + const fShadow = match(['shadow']); + const fTransp = match(['transp', 'alpha']); + const fTarget = match(['target', 'output', 'ground_truth']); + + if (!fAlbedo) { + st.textContent = '✗ No albedo.png found'; + this.setStatus('No albedo.png in sample dir', true); + return; + } + + try { + const t0 = performance.now(); + + // Load primary albedo to get dimensions + const {tex: albTex, w, h} = await this.loadGpuTex(fAlbedo); + this.canvas.width = w; this.canvas.height = h; + 
this.context.configure({device:this.device, format:this.format}); + + // Load optional channels — fall back to neutral 1×1 textures + const nrmTex = fNormal ? (await this.loadGpuTex(fNormal)).tex + : this.makeFallbackTex(128, 128, 0, 255); // oct-encoded (0,0) normal + const dptTex = fDepth ? (await this.loadGpuTex(fDepth)).tex + : this.makeFallbackTex(0, 0, 0, 255); + const midTex = fMatid ? (await this.loadGpuTex(fMatid)).tex + : this.makeFallbackTex(0, 0, 0, 255); + const shdTex = fShadow ? (await this.loadGpuTex(fShadow)).tex + : this.makeFallbackTex(255, 255, 255, 255); // fully lit + const trpTex = fTransp ? (await this.loadGpuTex(fTransp)).tex + : this.makeFallbackTex(0, 0, 0, 255); // fully opaque + + // Load target if present + if (this.targetBitmap) { this.targetBitmap.close(); this.targetBitmap = null; } + if (fTarget) { + this.targetBitmap = await createImageBitmap(fTarget); + this.showTarget(); + } else { + document.getElementById('targetPane').style.display = 'none'; + } + + // Pack G-buffer into feat0/feat1 + const mk = (fmt, tw, th) => this.device.createTexture({size:[tw,th], format:fmt, + usage:GPUTextureUsage.STORAGE_BINDING|GPUTextureUsage.TEXTURE_BINDING|GPUTextureUsage.COPY_SRC}); + const f0 = mk('rgba32uint', w, h); + const f1 = mk('rgba32uint', w, h); + + const ceil8 = (n) => Math.ceil(n/8); + const pl = this.getFullPack(); + const bg = this.device.createBindGroup({layout: pl.getBindGroupLayout(0), + entries: [ + {binding:0, resource: albTex.createView()}, + {binding:1, resource: nrmTex.createView()}, + {binding:2, resource: dptTex.createView()}, + {binding:3, resource: midTex.createView()}, + {binding:4, resource: shdTex.createView()}, + {binding:5, resource: trpTex.createView()}, + {binding:6, resource: f0.createView()}, + {binding:7, resource: f1.createView()}, + ]}); + + const enc = this.device.createCommandEncoder(); + const cp = enc.beginComputePass(); + cp.setPipeline(pl); cp.setBindGroup(0, bg); + cp.dispatchWorkgroups(ceil8(w), 
ceil8(h)); + cp.end(); + this.device.queue.submit([enc.finish()]); + await this.device.queue.onSubmittedWorkDone(); + + // Cleanup source textures + [albTex, nrmTex, dptTex, midTex, shdTex, trpTex].forEach(t => t.destroy()); + + const found = [fAlbedo, fNormal, fDepth, fMatid, fShadow, fTransp] + .filter(Boolean).map(f => f.name).join(', '); + st.textContent = `✓ ${found}`; + this.log(`Sample packed: ${w}×${h}, ${((performance.now()-t0)).toFixed(0)}ms`); + + // Run inference — runFromFeat takes ownership of f0/f1 (stored in layerTextures) + await this.runFromFeat(f0, f1, w, h); + + } catch(e) { + st.textContent = `✗ ${e.message}`; + this.setStatus(`Sample error: ${e.message}`, true); + this.log(`Sample error: ${e.message}`, 'err'); + } + } + + // Show target.png in the #targetPane alongside main canvas. + showTarget() { + if (!this.targetBitmap) return; + const tc = document.getElementById('targetCanvas'); + tc.width = this.targetBitmap.width; + tc.height = this.targetBitmap.height; + const ctx2d = tc.getContext('2d'); + ctx2d.drawImage(this.targetBitmap, 0, 0); + document.getElementById('targetPane').style.display = 'flex'; + } + + // Run CNN inference starting from pre-packed feat_tex0 / feat_tex1. + // Used by loadSampleDir() to skip the photo-pack step. + async runFromFeat(f0, f1, w, h) { + if (!this.weightsU32 || !this.device) return; + const t0 = performance.now(); + const W2=w>>1, H2=h>>1, W4=W2>>1, H4=H2>>1; + + this.context.configure({device:this.device, format:this.format}); + + // Create a neutral "original" texture so the display shader can still + // render Orig/Diff modes (just black for sample mode). 
+ if (this.inputTex) this.inputTex.destroy(); + this.inputTex = this.device.createTexture({size:[w,h], format:'rgba8unorm', + usage:GPUTextureUsage.TEXTURE_BINDING|GPUTextureUsage.COPY_DST|GPUTextureUsage.RENDER_ATTACHMENT}); + // Leave it cleared to black — Diff mode against target would need more work + + const mk = (fmt, tw, th) => this.device.createTexture({size:[tw,th], format:fmt, + usage:GPUTextureUsage.STORAGE_BINDING|GPUTextureUsage.TEXTURE_BINDING|GPUTextureUsage.COPY_SRC}); + const e0=mk('rgba16float',w,h), e1=mk('rgba32uint',W2,H2); + const bn=mk('rgba32uint',W4,H4), d1=mk('rgba16float',W2,H2), ot=mk('rgba16float',w,h); + + if (!this.weightsGPU) { + this.weightsGPU = this.device.createBuffer({size:this.weightsBuffer.byteLength, + usage:GPUBufferUsage.STORAGE|GPUBufferUsage.COPY_DST}); + this.device.queue.writeBuffer(this.weightsGPU, 0, this.weightsBuffer); + } + const wg = this.weightsGPU; + const fp = this.filmParams(); + const wu = (data) => { + const b = this.device.createBuffer({size:data.byteLength, usage:GPUBufferUsage.UNIFORM|GPUBufferUsage.COPY_DST}); + this.device.queue.writeBuffer(b, 0, data); return b; + }; + const uE0=wu(this.u4(ENC0_OFF,fp.ge0,fp.be0)); + const uE1=wu(this.u8(ENC1_OFF,fp.ge1,fp.be1)); + const uBN=wu(this.ubn(BN_OFF)); + const uD1=wu(this.u4(DEC1_OFF,fp.gd1,fp.bd1)); + const uD0=wu(this.u4(DEC0_OFF,fp.gd0,fp.bd0)); + const dispData=new ArrayBuffer(16); + new DataView(dispData).setFloat32(4, this.blend, true); + const uDp=wu(dispData); + + const enc = this.device.createCommandEncoder(); + const bg = (pl,...entries) => this.device.createBindGroup({layout:pl.getBindGroupLayout(0), + entries:entries.map((r,i)=>({binding:i,resource:r}))}); + const rv = (t) => t.createView(); + const cp = (pl,bgr,wx,wy) => {const p=enc.beginComputePass();p.setPipeline(pl);p.setBindGroup(0,bgr);p.dispatchWorkgroups(wx,wy);p.end();}; + const ceil8 = (n) => Math.ceil(n/8); + + cp(this.getEnc0(), bg(this.getEnc0(), 
rv(f0),rv(f1),{buffer:wg},{buffer:uE0},rv(e0)), ceil8(w), ceil8(h)); + cp(this.getEnc1(), bg(this.getEnc1(), rv(e0),{buffer:wg},{buffer:uE1},rv(e1)), ceil8(W2), ceil8(H2)); + cp(this.getBN(), bg(this.getBN(), rv(e1),{buffer:wg},{buffer:uBN},rv(bn)), ceil8(W4), ceil8(H4)); + cp(this.getDec1(), bg(this.getDec1(), rv(bn),rv(e1),{buffer:wg},{buffer:uD1},rv(d1)), ceil8(W2), ceil8(H2)); + cp(this.getDec0(), bg(this.getDec0(), rv(d1),rv(e0),{buffer:wg},{buffer:uD0},rv(ot)), ceil8(w), ceil8(h)); + + const dbg = bg(this.getDisp(), rv(ot), rv(this.inputTex), {buffer:uDp}); + const rp = enc.beginRenderPass({colorAttachments:[{ + view:this.context.getCurrentTexture().createView(), loadOp:'clear', storeOp:'store'}]}); + rp.setPipeline(this.getDisp()); rp.setBindGroup(0, dbg); rp.draw(6); rp.end(); + + this.device.queue.submit([enc.finish()]); + await this.device.queue.onSubmittedWorkDone(); + + [uE0,uE1,uBN,uD1,uD0].forEach(b => b.destroy()); + + // Compute PSNR against target if available + let psnrStr = ''; + if (this.targetBitmap) { + this.showTarget(); + try { psnrStr = await this.computePSNR(ot, w, h); } catch(_) {} + } + + this.destroyLayerTex(); + this.layerTextures = {feat0:f0, feat1:f1, enc0:e0, enc1:e1, bn, dec1:d1, output:ot}; + this.lastResult = {ot, itex:this.inputTex, uDp, dispPL:this.getDisp(), w, h}; + this.updateVizPanel(); + this.refreshZoom(); + + const ms = (performance.now()-t0).toFixed(1); + document.getElementById('cnnLabel').textContent = `CNN output (${ms}ms)`; + if (psnrStr) document.getElementById('psnrSt').textContent = psnrStr; + this.setStatus(`Sample: ${ms}ms · ${w}×${h}`); + this.log(`runFromFeat: ${ms}ms`); + } + + // Compute PSNR between CNN rgba16float output texture and target.png bitmap. 
+ async computePSNR(outTex, w, h) { + const bpr = Math.ceil(w * 8 / 256) * 256; + const stg = this.device.createBuffer({size:bpr*h, + usage:GPUBufferUsage.COPY_DST|GPUBufferUsage.MAP_READ}); + const enc = this.device.createCommandEncoder(); + enc.copyTextureToBuffer({texture:outTex}, {buffer:stg, bytesPerRow:bpr, rowsPerImage:h}, [w,h]); + this.device.queue.submit([enc.finish()]); + await stg.mapAsync(GPUMapMode.READ); + const raw = new DataView(stg.getMappedRange()); + + // Decode output pixels from f16 + const f16 = (bits) => { + const s=(bits>>15)&1, e=(bits>>10)&0x1F, m=bits&0x3FF; + if(e===0) return 0; if(e===31) return s?0:1; + return Math.max(0,Math.min(1,(s?-1:1)*Math.pow(2,e-15)*(1+m/1024))); + }; + const cnnPx = new Float32Array(w*h*3); + for (let y=0;y<h;y++) for (let x=0;x<w;x++) { + const src=y*bpr+x*8, pi=(y*w+x)*3; + cnnPx[pi] = f16(raw.getUint16(src, true)); + cnnPx[pi+1]= f16(raw.getUint16(src+2, true)); + cnnPx[pi+2]= f16(raw.getUint16(src+4, true)); + } + stg.unmap(); stg.destroy(); + + // Read target pixels via offscreen canvas + const oc = document.createElement('canvas'); + oc.width = w; oc.height = h; + const ctx2d = oc.getContext('2d'); + ctx2d.drawImage(this.targetBitmap, 0, 0, w, h); + const tgtData = ctx2d.getImageData(0, 0, w, h).data; + + let mse = 0; + const n = w * h * 3; + for (let i=0; i<w*h; i++) { + const dr = cnnPx[i*3] - tgtData[i*4] /255; + const dg = cnnPx[i*3+1] - tgtData[i*4+1]/255; + const db = cnnPx[i*3+2] - tgtData[i*4+2]/255; + mse += dr*dr + dg*dg + db*db; + } + mse /= n; + const psnr = mse > 0 ? 
(10 * Math.log10(1 / mse)).toFixed(2) : '∞';
+    return `MSE=${mse.toFixed(5)} PSNR=${psnr}dB`;
+  }
 }

 // ── UI helpers ───────────────────────────────────────────────────────────────
diff --git a/cnn_v3/training/dataset/full/0001/albedo.png b/cnn_v3/training/dataset/full/0001/albedo.png
Binary files differ
new file mode 100644
index 0000000..8f64b38
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0001/albedo.png
diff --git a/cnn_v3/training/dataset/full/0001/depth.png b/cnn_v3/training/dataset/full/0001/depth.png
Binary files differ
new file mode 100644
index 0000000..c58fcd9
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0001/depth.png
diff --git a/cnn_v3/training/dataset/full/0001/matid.png b/cnn_v3/training/dataset/full/0001/matid.png
Binary files differ
new file mode 100644
index 0000000..b4fa98f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0001/matid.png
diff --git a/cnn_v3/training/dataset/full/0001/normal.png b/cnn_v3/training/dataset/full/0001/normal.png
Binary files differ
new file mode 100644
index 0000000..62f26e3
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0001/normal.png
diff --git a/cnn_v3/training/dataset/full/0001/shadow.png b/cnn_v3/training/dataset/full/0001/shadow.png
Binary files differ
new file mode 100644
index 0000000..0471e7f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0001/shadow.png
diff --git a/cnn_v3/training/dataset/full/0001/target.png b/cnn_v3/training/dataset/full/0001/target.png
Binary files differ
new file mode 100644
index 0000000..587d54a
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0001/target.png
diff --git a/cnn_v3/training/dataset/full/0001/transp.png b/cnn_v3/training/dataset/full/0001/transp.png
Binary files differ
new file mode 100644
index 0000000..b4fa98f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0001/transp.png
diff --git a/cnn_v3/training/dataset/full/0002/albedo.png b/cnn_v3/training/dataset/full/0002/albedo.png
Binary files differ
new file mode 100644
index 0000000..8f64b38
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0002/albedo.png
diff --git a/cnn_v3/training/dataset/full/0002/depth.png b/cnn_v3/training/dataset/full/0002/depth.png
Binary files differ
new file mode 100644
index 0000000..c58fcd9
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0002/depth.png
diff --git a/cnn_v3/training/dataset/full/0002/matid.png b/cnn_v3/training/dataset/full/0002/matid.png
Binary files differ
new file mode 100644
index 0000000..b4fa98f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0002/matid.png
diff --git a/cnn_v3/training/dataset/full/0002/normal.png b/cnn_v3/training/dataset/full/0002/normal.png
Binary files differ
new file mode 100644
index 0000000..62f26e3
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0002/normal.png
diff --git a/cnn_v3/training/dataset/full/0002/shadow.png b/cnn_v3/training/dataset/full/0002/shadow.png
Binary files differ
new file mode 100644
index 0000000..0471e7f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0002/shadow.png
diff --git a/cnn_v3/training/dataset/full/0002/target.png b/cnn_v3/training/dataset/full/0002/target.png
Binary files differ
new file mode 100644
index 0000000..587d54a
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0002/target.png
diff --git a/cnn_v3/training/dataset/full/0002/transp.png b/cnn_v3/training/dataset/full/0002/transp.png
Binary files differ
new file mode 100644
index 0000000..b4fa98f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0002/transp.png
diff --git a/cnn_v3/training/dataset/full/0003/albedo.png b/cnn_v3/training/dataset/full/0003/albedo.png
Binary files differ
new file mode 100644
index 0000000..8f64b38
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0003/albedo.png
diff --git a/cnn_v3/training/dataset/full/0003/depth.png b/cnn_v3/training/dataset/full/0003/depth.png
Binary files differ
new file mode 100644
index 0000000..c58fcd9
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0003/depth.png
diff --git a/cnn_v3/training/dataset/full/0003/matid.png b/cnn_v3/training/dataset/full/0003/matid.png
Binary files differ
new file mode 100644
index 0000000..b4fa98f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0003/matid.png
diff --git a/cnn_v3/training/dataset/full/0003/normal.png b/cnn_v3/training/dataset/full/0003/normal.png
Binary files differ
new file mode 100644
index 0000000..62f26e3
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0003/normal.png
diff --git a/cnn_v3/training/dataset/full/0003/shadow.png b/cnn_v3/training/dataset/full/0003/shadow.png
Binary files differ
new file mode 100644
index 0000000..0471e7f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0003/shadow.png
diff --git a/cnn_v3/training/dataset/full/0003/target.png b/cnn_v3/training/dataset/full/0003/target.png
Binary files differ
new file mode 100644
index 0000000..587d54a
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0003/target.png
diff --git a/cnn_v3/training/dataset/full/0003/transp.png b/cnn_v3/training/dataset/full/0003/transp.png
Binary files differ
new file mode 100644
index 0000000..b4fa98f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0003/transp.png
diff --git a/cnn_v3/training/dataset/full/0004/albedo.png b/cnn_v3/training/dataset/full/0004/albedo.png
Binary files differ
new file mode 100644
index 0000000..8f64b38
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0004/albedo.png
diff --git a/cnn_v3/training/dataset/full/0004/depth.png b/cnn_v3/training/dataset/full/0004/depth.png
Binary files differ
new file mode 100644
index 0000000..c58fcd9
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0004/depth.png
diff --git a/cnn_v3/training/dataset/full/0004/matid.png b/cnn_v3/training/dataset/full/0004/matid.png
Binary files differ
new file mode 100644
index 0000000..b4fa98f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0004/matid.png
diff --git a/cnn_v3/training/dataset/full/0004/normal.png b/cnn_v3/training/dataset/full/0004/normal.png
Binary files differ
new file mode 100644
index 0000000..62f26e3
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0004/normal.png
diff --git a/cnn_v3/training/dataset/full/0004/shadow.png b/cnn_v3/training/dataset/full/0004/shadow.png
Binary files differ
new file mode 100644
index 0000000..0471e7f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0004/shadow.png
diff --git a/cnn_v3/training/dataset/full/0004/target.png b/cnn_v3/training/dataset/full/0004/target.png
Binary files differ
new file mode 100644
index 0000000..587d54a
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0004/target.png
diff --git a/cnn_v3/training/dataset/full/0004/transp.png b/cnn_v3/training/dataset/full/0004/transp.png
Binary files differ
new file mode 100644
index 0000000..b4fa98f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0004/transp.png
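
Note on the `computePSNR` hunk in `cnn_v3/tools/tester.js`: the method averages squared RGB differences over `w*h*3` values and reports `10 * log10(1 / MSE)`, treating 1.0 as the peak signal value. The arithmetic can be checked in isolation with a minimal sketch like the one below. `psnrRGB` is a hypothetical helper that is not part of this commit, and it assumes both inputs are already flat arrays of floats in [0, 1] (the GPU readback and f16 decoding done by `computePSNR` are left out).

```javascript
// Hypothetical helper, NOT part of tester.js: PSNR between two images given as
// flat arrays of channel values in [0, 1] (for RGB, three values per pixel).
function psnrRGB(a, b) {
  if (a.length !== b.length || a.length === 0) {
    throw new Error('inputs must be non-empty and of equal length');
  }
  // Accumulate the squared error over every channel value.
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  const mse = sum / a.length;
  // Peak value is 1.0, so PSNR = 10 * log10(1 / MSE); identical inputs give Infinity.
  const psnr = mse > 0 ? 10 * Math.log10(1 / mse) : Infinity;
  return { mse, psnr };
}
```

For example, a uniform error of 0.1 on every channel gives an MSE of about 0.01 and therefore a PSNR of about 20 dB, which is a quick sanity check for the value shown in `#psnrSt`.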
