44 files changed, 1683 insertions, 191 deletions
diff --git a/cnn_v3/README.md b/cnn_v3/README.md
index f161bf4..a844b1b 100644
--- a/cnn_v3/README.md
+++ b/cnn_v3/README.md
@@ -31,9 +31,18 @@ Add images directly to these directories and commit them.
 
 ## Status
 
-**Phase 1 complete.** G-buffer integrated (raster + pack), 35/35 tests pass.
-Training infrastructure ready. U-Net WGSL shaders are next.
+**Phases 1–7 complete.** 36/36 tests pass.
 
-See `cnn_v3/docs/HOWTO.md` for the practical playbook.
+| Phase | Status |
+|-------|--------|
+| 1 — G-buffer (raster + pack) | ✅ |
+| 2 — Training infrastructure | ✅ |
+| 3 — WGSL U-Net shaders | ✅ |
+| 4 — C++ CNNv3Effect + FiLM | ✅ |
+| 5 — Parity validation | ✅ max_err=4.88e-4 |
+| 6 — Training script | ✅ train_cnn_v3.py |
+| 7 — Validation tools | ✅ GBufViewEffect + web sample loader |
+
+See `cnn_v3/docs/HOWTO.md` for the practical playbook (§9 covers validation tools).
 See `cnn_v3/docs/CNN_V3.md` for full design.
 See `cnn_v2/` for reference implementation.
diff --git a/cnn_v3/docs/HOWTO.md b/cnn_v3/docs/HOWTO.md
index 983e8b7..5c5cc2a 100644
--- a/cnn_v3/docs/HOWTO.md
+++ b/cnn_v3/docs/HOWTO.md
@@ -22,57 +22,141 @@ It rasterizes proxy geometry to MRT G-buffer textures and packs them into two
 
 ### Adding to a Sequence
 
-`GBufferEffect` does not exist in `seq_compiler.py` as a named effect yet
-(no `.seq` syntax integration for Phase 1). Wire it directly in C++ alongside
-your scene code, or add it to the timeline when the full CNNv3Effect is ready.
+Both `GBufferEffect` and `GBufViewEffect` are registered in `seq_compiler.py`
+(`CLASS_TO_HEADER`) and can be wired directly in `timeline.seq`.
 
-**C++ wiring example** (e.g. inside a Sequence or main.cc):
+**Debug view (G-buffer → sink)**:
+```seq
+SEQUENCE 12.00 0 "cnn_v3_test"
+  NODE gbuf_feat0 gbuf_rgba32uint
+  NODE gbuf_feat1 gbuf_rgba32uint
+  EFFECT + GBufferEffect source -> gbuf_feat0 gbuf_feat1 0.00 8.00
+  EFFECT + GBufViewEffect gbuf_feat0 gbuf_feat1 -> sink 0.00 8.00
+```
 
-```cpp
-#include "../../cnn_v3/src/gbuffer_effect.h"
+**Full CNN pipeline**:
+```seq
+SEQUENCE 12.00 0 "cnn_v3_test"
+  NODE gbuf_feat0 gbuf_rgba32uint
+  NODE gbuf_feat1 gbuf_rgba32uint
+  NODE cnn_v3_out gbuf_albedo
+  EFFECT + GBufferEffect source -> gbuf_feat0 gbuf_feat1 0.00 8.00
+  EFFECT + CNNv3Effect gbuf_feat0 gbuf_feat1 -> cnn_v3_out 0.00 8.00
+  EFFECT + Passthrough cnn_v3_out -> sink 0.00 8.00
+```
 
-// Allocate once alongside your scene
-auto gbuf = std::make_shared<GBufferEffect>(
-    ctx, /*inputs=*/{"prev_cnn"},  // or any dummy node
-    /*outputs=*/{"gbuf_feat0", "gbuf_feat1"},
-    /*start=*/0.0f, /*end=*/60.0f);
+### Internal scene
 
-gbuf->set_scene(&my_scene, &my_camera);
+Call `set_scene()` once before the first render to populate the built-in demo
+scene. No external `Scene` or `Camera` pointer is required — the effect owns
+them.
 
-// In render loop, call before CNN pass:
-gbuf->render(encoder, params, nodes);
-```
+**What `set_scene()` creates:**
+- **20 small cubes** — random positions in [-2,2]×[-1.5,1.5]³, scale 0.1–0.25,
+  random colors. Each has a random rotation axis and speed; animated each frame
+  via `quat::from_axis(axis, time * speed)`.
+- **4 pumping spheres** — at fixed world positions, base radii 0.25–0.35.
+  Scale driven by `audio_intensity`: `r = base_r * (1 + audio_intensity * 0.8)`.
+- **Camera** — position (0, 2.5, 6), target (0, 0, 0), 45° FOV.
+  Aspect ratio updated each frame from `params.aspect_ratio`.
+- **Two directional lights** (uploaded to `lights_uniform_`, ready for shadow pass):
+  - Key: warm white (1.0, 0.92, 0.78), direction `normalize(1, 2, 1)` (upper-right-front)
+  - Fill: cool blue (0.4, 0.45, 0.8) scaled by 0.4, direction `normalize(-1, 1, -1)` (upper-left-back)
 
 ### Internal passes
 
 Each frame, `GBufferEffect::render()` executes:
 
-1. **Pass 1 — MRT rasterization** (`gbuf_raster.wgsl`)
+1. **Pass 1 — MRT rasterization** (`gbuf_raster.wgsl`) ✅
    - Proxy box (36 verts) × N objects, instanced
    - MRT outputs: `gbuf_albedo` (rgba16float), `gbuf_normal_mat` (rgba16float)
    - Depth test + write into `gbuf_depth` (depth32float)
+   - `obj.type` written to `ObjectData.params.x`; read here for the sphere impostor and in Pass 2 for SDF branching
 
-2. **Pass 2/3 — SDF + Lighting** — TODO (placeholder: shadow=1, transp=0)
+2. **Pass 2 — SDF shadow raymarching** (`gbuf_shadow.wgsl`) ✅
+   - See implementation plan below.
 
-3. **Pass 4 — Pack compute** (`gbuf_pack.wgsl`)
+3. **Pass 3 — Transparency** — TODO (deferred; transp=0 for opaque scenes)
+
+4. **Pass 4 — Pack compute** (`gbuf_pack.wgsl`) ✅
    - Reads all G-buffer textures + `prev_cnn` input
    - Writes `feat_tex0` + `feat_tex1` (rgba32uint, 20 channels, 32 bytes/pixel)
+   - Transp node cleared to 0.0 via a zero-draw render pass until Pass 3 is
+     implemented; the shadow node is written by Pass 2.
 
 ### Output node names
 
-By default the outputs are named from the `outputs` vector passed to the
-constructor. Use these names when binding the CNN effect input:
+Outputs are named from the `outputs` vector passed to the constructor:
 
 ```
 outputs[0]  → feat_tex0   (rgba32uint: albedo.rgb, normal.xy, depth, depth_grad.xy)
 outputs[1]  → feat_tex1   (rgba32uint: mat_id, prev.rgb, mip1.rgb, mip2.rgb, shadow, transp)
 ```
 
-### Scene data
+---
+
+## 1b. GBufferEffect — Implementation Plan (Pass 2: SDF Shadow)
+
+### What remains
+
+| Item | Status | Notes |
+|------|--------|-------|
+| Pass 1: MRT raster | ✅ Done | proxy box, all object types |
+| Pass 4: Pack compute | ✅ Done | 20 channels packed |
+| Internal scene + animation | ✅ Done | cubes + spheres + 2 lights |
+| Pass 2: SDF shadow | ✅ Done | `gbuf_shadow.wgsl`, proxy-box SDF per object |
+| Pass 3: Transparency | ❌ TODO | low priority, opaque scenes only |
+| Phase 4: type-aware SDF | ✅ Done | switch on `obj.params.x` in `dfWithID` |
+
+### Pass 2: SDF shadow raymarching
 
-Call `set_scene(scene, camera)` before the first render. The effect uploads
-`GlobalUniforms` (view-proj, camera pos, resolution) and `ObjectData` (model
-matrix, color) to GPU storage buffers each frame.
+**New file: `cnn_v3/shaders/gbuf_shadow.wgsl`** — fullscreen render pass.
+
+Bind layout:
+
+| Binding | Type | Content |
+|---------|------|---------|
+| 0 | `uniform` | `GlobalUniforms` (`#include "common_uniforms"`) |
+| 1 | `storage read` | `ObjectsBuffer` |
+| 2 | `texture_depth_2d` | depth from Pass 1 |
+| 3 | `uniform` | `GBufLightsUniforms` (2 lights) |
+| 4 | `texture_2d<f32>` | `normal_mat` from Pass 1 (bias normal, read via `textureLoad()`, no sampler) |
+
+Algorithm per fragment:
+1. Reconstruct world position from NDC depth + `globals.inv_view_proj`
+2. For each object: evaluate its SDF at `(inv_model * world_pos).xyz` in local space (proxy box `sdBox(lp, vec3(1.0))` by default)
+3. For each light: offset ray origin by `0.05 * surface_normal` (normal decoded from the Pass 1 `normal_mat` target); march shadow ray toward `light.direction`
+4. Soft shadow via the local `soft_shadow()` march in `gbuf_shadow.wgsl` (IQ formula, up to 64 steps, penumbra term `8.0 * d / t`)
+5. Combine lights: `shadow = min(shadow_light0, shadow_light1)`
+6. Skip marching where depth >= 1.0 (sky/background): return shadow = 1.0 immediately
+7. Output shadow factor to RGBA8Unorm render target (`.r` = shadow)
+
+**C++ additions (`gbuffer_effect.h/.cc`):**
+```cpp
+RenderPipeline shadow_pipeline_;
+void create_shadow_pipeline();
+```
+In `render()` between Pass 1 and the shadow/transp node clears:
+- Build bind group (global_uniforms_buf_, objects_buf_, depth_view, lights_uniform_, normal_mat view); no sampler is bound
+- Run fullscreen triangle → `node_shadow_` color attachment
+- Remove the `clear_node(node_shadow_, 1.0f)` placeholder once the pass is live
+
+**Register:**
+- `cnn_v3/shaders/gbuf_shadow.wgsl` → `SHADER_GBUF_SHADOW` in `assets.txt`
+- `extern const char* gbuf_shadow_wgsl;` in `gbuffer_effect.cc`
+
+### Phase 4: Object-type-aware SDF (optional)
+
+Branch on `obj.params.x` (populated since this commit) using `math/sdf_shapes`:
+
+| Type value | ObjectType | SDF |
+|------------|-----------|-----|
+| 0 | CUBE | `sdBox(local_p, vec3(1))` |
+| 1 | SPHERE | `sdSphere(local_p, 1.0)` |
+| 2 | PLANE | `sdPlane(local_p, vec3(0,1,0), obj.params.y)` |
+| 3 | TORUS | `sdTorus(local_p, vec2(0.8, 0.2))` |
+
+Only worth adding after Pass 2 is validated visually.
 
 ---
 
@@ -253,12 +337,14 @@ Test vectors generated by `cnn_v3/training/gen_test_vectors.py` (PyTorch referen
 | Phase | Status | Notes |
 |-------|--------|-------|
 | 1 — G-buffer (raster + pack) | ✅ Done | Integrated, 36/36 tests pass |
-| 1 — G-buffer (SDF + shadow passes) | TODO | Placeholder: shadow=1, transp=0 |
+| 1 — G-buffer (SDF shadow pass) | ✅ Done | `gbuf_shadow.wgsl`, proxy-box SDF |
 | 2 — Training infrastructure | ✅ Done | blender_export.py, pack_*_sample.py |
 | 3 — WGSL U-Net shaders | ✅ Done | 5 compute shaders + cnn_v3/common snippet |
 | 4 — C++ CNNv3Effect | ✅ Done | FiLM uniform upload, 36/36 tests pass |
 | 5 — Parity validation | ✅ Done | test_cnn_v3_parity.cc, max_err=4.88e-4 |
 | 6 — FiLM MLP training | ✅ Done | train_cnn_v3.py + cnn_v3_utils.py written |
+| 7 — G-buffer visualizer (C++) | ✅ Done | GBufViewEffect, 36/36 tests pass |
+| 7 — Sample loader (web tool) | ✅ Done | "Load sample directory" in cnn_v3/tools/ |
 
 ---
 
@@ -337,9 +423,142 @@ auto src = ShaderComposer::Get().Compose({"cnn_v3/common"}, raw_wgsl);
 
 ---
 
-## 9. See Also
+## 9. Validation Workflow
+
+Two complementary tools let you verify each stage of the pipeline before training
+or integrating into the demo.
+
+### 9a. C++ — GBufViewEffect (G-buffer channel grid)
+
+`GBufViewEffect` renders all 20 feature channels from `feat_tex0` / `feat_tex1`
+in a **4×5 tiled grid** so you can see the G-buffer at a glance.
+
+**Registration (already done)**
+
+| File | What changed |
+|------|-------------|
+| `cnn_v3/shaders/gbuf_view.wgsl` | New fragment shader |
+| `cnn_v3/src/gbuf_view_effect.h` | Effect class declaration |
+| `cnn_v3/src/gbuf_view_effect.cc` | Effect class implementation |
+| `workspaces/main/assets.txt` | `SHADER_GBUF_VIEW` asset |
+| `cmake/DemoSourceLists.cmake` | `gbuf_view_effect.cc` in COMMON_GPU_EFFECTS |
+| `src/gpu/demo_effects.h` | `#include "../../cnn_v3/src/gbuf_view_effect.h"` |
+| `src/effects/shaders.h/.cc` | `gbuf_view_wgsl` extern declaration + definition |
+| `src/tests/gpu/test_demo_effects.cc` | GBufViewEffect test |
+
+**Constructor signature**
+
+```cpp
+GBufViewEffect(const GpuContext& ctx,
+               const std::vector<std::string>& inputs,   // {feat_tex0, feat_tex1}
+               const std::vector<std::string>& outputs,  // {gbuf_view_out}
+               float start_time, float end_time)
+```
+
+**Wiring example** (alongside GBufferEffect):
+
+```cpp
+auto gbuf  = std::make_shared<GBufferEffect>(ctx,
+    std::vector<std::string>{"prev_cnn"},
+    std::vector<std::string>{"gbuf_feat0", "gbuf_feat1"}, 0.0f, 60.0f);
+auto gview = std::make_shared<GBufViewEffect>(ctx,
+    std::vector<std::string>{"gbuf_feat0", "gbuf_feat1"},
+    std::vector<std::string>{"gbuf_view_out"}, 0.0f, 60.0f);
+```
+
+**Grid layout** (output resolution = input resolution, channel cells each 1/4 W × 1/5 H):
+
+| Row | Col 0 | Col 1 | Col 2 | Col 3 |
+|-----|-------|-------|-------|-------|
+| 0 | `alb.r` | `alb.g` | `alb.b` | `nrm.x` remap→[0,1] |
+| 1 | `nrm.y` remap→[0,1] | `depth` (inverted) | `dzdx` ×20+0.5 | `dzdy` ×20+0.5 |
+| 2 | `mat_id` | `prev.r` | `prev.g` | `prev.b` |
+| 3 | `mip1.r` | `mip1.g` | `mip1.b` | `mip2.r` |
+| 4 | `mip2.g` | `mip2.b` | `shadow` | `transp` |
+
+All channels are displayed as grayscale. Thin gray grid lines (0.5% of each cell edge) separate cells. Out-of-range cells get a dark background.
+
+**Shader binding layout** (no sampler needed — integer texture):
+
+| Binding | Type | Content |
+|---------|------|---------|
+| 0 | `texture_2d<u32>` | `feat_tex0` (8 f16 channels via `pack2x16float`) |
+| 1 | `texture_2d<u32>` | `feat_tex1` (12 u8 channels via `pack4x8unorm`) |
+| 2 | `uniform` (8 B) | `GBufViewUniforms { resolution: vec2f }` |
+
+The BGL is built manually in the constructor (no sampler) — this is an exception to the
+standard post-process pattern because `rgba32uint` textures use `WGPUTextureSampleType_Uint`
+and cannot be sampled, only loaded via `textureLoad()`.
+
+**Implementation note — bind group recreation**
+
+`render()` calls `wgpuRenderPipelineGetBindGroupLayout(pipeline_, 0)` each frame to
+extract the BGL, creates a new `BindGroup`, then immediately releases the BGL handle.
+This avoids storing a raw BGL as a member (no RAII wrapper exists for it) while
+remaining correct across ping-pong buffer swaps.
+
+---
+
+### 9b. Web tool — "Load sample directory"
+
+`cnn_v3/tools/index.html` has a **"Load sample directory"** button that:
+1. Opens a `webkitdirectory` picker to select a sample folder
+2. Loads all G-buffer component PNGs as `rgba8unorm` GPU textures
+3. Runs the `FULL_PACK_SHADER` compute shader to assemble `feat_tex0` / `feat_tex1`
+4. Runs full CNN inference (enc0 → enc1 → bottleneck → dec1 → dec0)
+5. Displays the CNN output on the main canvas
+6. If `target.png` is present, shows it side-by-side and prints PSNR
+
+**File name matching** (case-insensitive, substring):
+
+| Channel | Matched patterns | Fallback |
+|---------|-----------------|---------|
+| Albedo (required) | `albedo`, `color` | — (error if missing) |
+| Normal | `normal`, `nrm` | `rgba(128,128,0,255)`, i.e. flat (0,0) oct-encoded |
+| Depth | `depth` | `0` — zero depth |
+| Mat ID | `matid`, `index`, `mat_id` | `0` — no material |
+| Shadow | `shadow` | `255` — fully lit |
+| Transparency | `transp`, `alpha` | `0` — fully opaque |
+| Target | `target`, `output`, `ground_truth` | not shown |
+
+**`FULL_PACK_SHADER`** (defined in `cnn_v3/tools/shaders.js`)
+
+WebGPU compute shader (`@workgroup_size(8,8)`) with 8 bindings:
+
+| Binding | Resource | Format |
+|---------|----------|--------|
+| 0–5 | albedo, normal, depth, matid, shadow, transp | `texture_2d<f32>` (rgba8unorm, R channel for single-channel maps) |
+| 6 | feat_tex0 output | `texture_storage_2d<rgba32uint,write>` |
+| 7 | feat_tex1 output | `texture_storage_2d<rgba32uint,write>` |
+
+No sampler — all reads use `textureLoad()` (integer texel coordinates).
+
+Packs channels identically to `gbuf_pack.wgsl`:
+- `feat_tex0`: `pack2x16float(alb.rg)`, `pack2x16float(alb.b, nrm.x)`, `pack2x16float(nrm.y, depth)`, `pack2x16float(dzdx, dzdy)`
+- `feat_tex1`: `pack4x8unorm(matid,0,0,0)`, `pack4x8unorm(mip1.rgb, mip2.r)`, `pack4x8unorm(mip2.gb, shadow, transp)`
+- Depth gradients: central differences on depth R channel
+- Mip1 / Mip2: box2 (2×2) / box4 (4×4) average filter on albedo
+
+**PSNR computation** (`computePSNR`)
+
+- CNN output (`rgba16float`) copied to CPU staging buffer via `copyTextureToBuffer`
+- f16→float32 decoded in JavaScript
+- Target drawn to offscreen `<canvas>` via `drawImage`, pixels read with `getImageData`
+- MSE and PSNR computed over all RGB pixels (alpha ignored)
+- Result displayed below target canvas as `MSE=X.XXXXX  PSNR=XX.XXdB`
+
+**`runFromFeat(f0, f1, w, h)`**
+
+Called by `loadSampleDir()` after packing, or can be called directly if feat textures
+are already available. Skips the photo-pack step, runs all 5 CNN passes, and displays
+the result. Intermediate textures are stored in `this.layerTextures` so the Layer
+Visualization panel still works.
+
+---
+
+## 10. See Also
 
 - `cnn_v3/docs/CNN_V3.md` — Full architecture design (U-Net, FiLM, feature layout)
 - `doc/EFFECT_WORKFLOW.md` — General effect integration guide
 - `cnn_v2/docs/CNN_V2.md` — Reference implementation (simpler, operational)
-- `src/tests/gpu/test_demo_effects.cc` — GBufferEffect construction test
+- `src/tests/gpu/test_demo_effects.cc` — GBufferEffect + GBufViewEffect tests
diff --git a/cnn_v3/shaders/gbuf_deferred.wgsl b/cnn_v3/shaders/gbuf_deferred.wgsl
new file mode 100644
index 0000000..2ed4ce3
--- /dev/null
+++ b/cnn_v3/shaders/gbuf_deferred.wgsl
@@ -0,0 +1,48 @@
+// G-buffer deferred render: albedo * (ambient + diffuse * shadow)
+// Reads feat_tex0 (rgba32uint, f16 geometric) and feat_tex1 (rgba32uint, shadow byte).
+// Outputs albedo * (AMBIENT + diffuse * shadow).
+
+#include "math/normal"
+
+@group(0) @binding(0) var feat_tex0: texture_2d<u32>;
+@group(0) @binding(1) var feat_tex1: texture_2d<u32>;
+@group(0) @binding(2) var<uniform> uniforms: GBufDeferredUniforms;
+
+struct GBufDeferredUniforms {
+    resolution: vec2f,
+}
+
+const KEY_LIGHT: vec3f = vec3f(0.408, 0.816, 0.408);  // normalize(1,2,1)
+const AMBIENT:   f32   = 0.15;
+
+@vertex
+fn vs_main(@builtin(vertex_index) vid: u32) -> @builtin(position) vec4f {
+    let x = f32((vid & 1u) << 2u) - 1.0;
+    let y = f32((vid & 2u) << 1u) - 1.0;
+    return vec4f(x, y, 0.0, 1.0);
+}
+
+@fragment
+fn fs_main(@builtin(position) pos: vec4f) -> @location(0) vec4f {
+    let coord = vec2i(pos.xy);
+
+    // feat_tex0 layout:
+    //   [0] pack2x16float(albedo.r, albedo.g)
+    //   [1] pack2x16float(albedo.b, normal.x)   oct [-1,1]
+    //   [2] pack2x16float(normal.y, depth)       oct [-1,1]
+    let t0     = textureLoad(feat_tex0, coord, 0);
+    let rg     = unpack2x16float(t0.x);
+    let bx     = unpack2x16float(t0.y);
+    let ny_d   = unpack2x16float(t0.z);
+    let albedo = vec3f(rg.x, rg.y, bx.x);
+
+    // Decode oct-normal (stored in [-1,1] — use oct_decode directly)
+    let normal  = oct_decode(vec2f(bx.y, ny_d.x));
+    let diffuse = max(0.0, dot(normal, KEY_LIGHT));
+
+    // feat_tex1[2] = pack4x8unorm(mip2.g, mip2.b, shadow, transp)
+    let t1     = textureLoad(feat_tex1, coord, 0);
+    let shadow = unpack4x8unorm(t1.z).z;
+
+    return vec4f(albedo * (AMBIENT + diffuse * shadow), 1.0);
+}
diff --git a/cnn_v3/shaders/gbuf_pack.wgsl b/cnn_v3/shaders/gbuf_pack.wgsl
index 71d8471..333589c 100644
--- a/cnn_v3/shaders/gbuf_pack.wgsl
+++ b/cnn_v3/shaders/gbuf_pack.wgsl
@@ -1,4 +1,5 @@
 // G-buffer pack compute shader for CNN v3
+#include "math/normal"
 // Pass 4: Pack all G-buffer channels into two rgba32uint feature textures (32 bytes/pixel)
 // Output feat_tex0 holds 8×f16 geometric channels; feat_tex1 holds 12×u8 context channels.
 
@@ -44,15 +45,6 @@ fn box4(tl: vec2i) -> vec3f {
     return acc * (1.0 / 16.0);
 }
 
-// Decode oct-normal from [0,1] storage → [-1,1] encoded xy → reconstruct z.
-fn decode_oct_normal(rg: vec2f) -> vec3f {
-    let f = rg * 2.0 - vec2f(1.0);
-    var n = vec3f(f.x, f.y, 1.0 - abs(f.x) - abs(f.y));
-    let t = max(-n.z, 0.0);
-    n.x += select(t, -t, n.x >= 0.0);
-    n.y += select(t, -t, n.y >= 0.0);
-    return normalize(n);
-}
 
 @compute @workgroup_size(8, 8)
 fn pack_features(@builtin(global_invocation_id) id: vec3u) {
@@ -73,7 +65,7 @@ fn pack_features(@builtin(global_invocation_id) id: vec3u) {
 
     // Normal: stored as oct-encoded [0,1] in RG; extract just the encoded xy for feat_tex0
     let normal_enc = nm.rg; // already in [0,1] — decode to get the xy for CNN input
-    let n3         = decode_oct_normal(normal_enc);
+    let n3         = oct_decode_unorm(normal_enc);
     // Store oct-encoded in [-1,1] remapped back to what CNN expects (the [-1,1] oct xy)
     let oct_xy     = normal_enc * 2.0 - vec2f(1.0); // remap [0,1] → [-1,1]
 
diff --git a/cnn_v3/shaders/gbuf_raster.wgsl b/cnn_v3/shaders/gbuf_raster.wgsl
index c762db2..aed0e72 100644
--- a/cnn_v3/shaders/gbuf_raster.wgsl
+++ b/cnn_v3/shaders/gbuf_raster.wgsl
@@ -1,8 +1,11 @@
 // G-buffer rasterization shader for CNN v3
 // Pass 1: Proxy geometry → MRT (albedo rgba16float, normal_mat rgba16float, depth32)
 // Uses GlobalUniforms, ObjectData, ObjectsBuffer from common_uniforms.
+// SPHERE objects use ray-sphere impostor (correct silhouette + normal + depth).
 
 #include "common_uniforms"
+#include "math/normal"
+#include "ray_sphere"
 
 @group(0) @binding(0) var<uniform> globals: GlobalUniforms;
 @group(0) @binding(1) var<storage, read> object_data: ObjectsBuffer;
@@ -15,21 +18,6 @@ struct VertexOutput {
     @location(3) @interpolate(flat) instance_index: u32,
 }
 
-// Octahedral encoding: maps unit normal to [-1,1]^2
-fn oct_encode(n: vec3f) -> vec2f {
-    let inv_l1 = 1.0 / (abs(n.x) + abs(n.y) + abs(n.z));
-    var p = n.xy * inv_l1;
-    // Fold lower hemisphere
-    if (n.z < 0.0) {
-        let s = vec2f(
-            select(-1.0, 1.0, p.x >= 0.0),
-            select(-1.0, 1.0, p.y >= 0.0)
-        );
-        p = (1.0 - abs(p.yx)) * s;
-    }
-    return p; // in [-1, 1]
-}
-
 @vertex
 fn vs_main(
     @builtin(vertex_index)   vertex_index:   u32,
@@ -73,8 +61,8 @@ fn vs_main(
 
     let world_pos    = obj.model * vec4f(p, 1.0);
     let clip_pos     = globals.view_proj * world_pos;
-    // Transform normal by inverse-transpose (upper-left 3×3 of inv_model^T)
-    let world_normal = normalize((obj.inv_model * vec4f(n, 0.0)).xyz);
+    // Transform normal: use model matrix (correct for uniform scale + rotation).
+    let world_normal = normalize((obj.model * vec4f(n, 0.0)).xyz);
 
     var out: VertexOutput;
     out.position       = clip_pos;
@@ -86,20 +74,40 @@ fn vs_main(
 }
 
 struct GBufOutput {
-    @location(0) albedo:     vec4f,  // rgba16float: material color
-    @location(1) normal_mat: vec4f,  // rgba16float: oct-normal XY in RG, mat_id/255 in B
+    @location(0)            albedo:     vec4f,  // rgba16float: material color
+    @location(1)            normal_mat: vec4f,  // rgba16float: oct-normal XY in RG, mat_id/255 in B
+    @builtin(frag_depth)    depth:      f32,    // corrected depth (sphere impostor)
 }
 
 @fragment
 fn fs_main(in: VertexOutput) -> GBufOutput {
-    let obj    = object_data.objects[in.instance_index];
-    let mat_id = f32(in.instance_index) / 255.0;
+    let obj      = object_data.objects[in.instance_index];
+    let obj_type = u32(obj.params.x);
+    let mat_id   = f32(in.instance_index) / 255.0;
+
+    var world_normal = normalize(in.world_normal);
+    var frag_depth   = in.position.z;  // default: hardware depth
+
+    // Sphere impostor: ray-sphere intersection for correct silhouette and normal.
+    if (obj_type == 1u) {
+        let sphere_center = obj.model[3].xyz;
+        let sphere_radius = length(obj.model[0].xyz);  // uniform scale in col0
+        let cam_pos = globals.camera_pos_time.xyz;
+        let rd = normalize(in.world_pos - cam_pos);
+        let isect = ray_sphere_intersection(cam_pos, rd, sphere_center, sphere_radius);
+        if (!isect.hit) { discard; }
+        let hit = cam_pos + rd * isect.t;
+        world_normal = normalize(hit - sphere_center);
+        // Reproject hit point to get correct clip-space depth.
+        let clip_hit = globals.view_proj * vec4f(hit, 1.0);
+        frag_depth   = clip_hit.z / clip_hit.w;
+    }
 
-    // Oct-encode world normal, remap [-1,1] → [0,1] for storage
-    let oct = oct_encode(normalize(in.world_normal)) * 0.5 + vec2f(0.5);
+    let oct = oct_encode_unorm(world_normal);
 
     var out: GBufOutput;
     out.albedo     = vec4f(in.color.rgb, 1.0);
     out.normal_mat = vec4f(oct.x, oct.y, mat_id, 0.0);
+    out.depth      = frag_depth;
     return out;
 }
diff --git a/cnn_v3/shaders/gbuf_shadow.wgsl b/cnn_v3/shaders/gbuf_shadow.wgsl
new file mode 100644
index 0000000..735e47c
--- /dev/null
+++ b/cnn_v3/shaders/gbuf_shadow.wgsl
@@ -0,0 +1,118 @@
+// G-buffer shadow raymarching shader for CNN v3
+// Pass 2: Reads depth from Pass 1, marches shadow rays toward lights,
+// outputs shadow factor (1.0=lit, 0.0=shadow) to RGBA8Unorm render target (.r).
+
+#include "common_uniforms"
+#include "camera_common"
+#include "math/sdf_shapes"
+#include "math/normal"
+#include "render/raymarching_id"
+
+@group(0) @binding(0) var<uniform>         globals:     GlobalUniforms;
+@group(0) @binding(1) var<storage, read>   object_data: ObjectsBuffer;
+@group(0) @binding(2) var                  depth_tex:   texture_depth_2d;
+@group(0) @binding(4) var                  normal_mat_tex: texture_2d<f32>;
+
+struct GBufLight {
+    direction: vec4f,  // xyz = toward light (world space, normalized)
+    color:     vec4f,  // rgb = color, a = intensity
+}
+struct GBufLightsUniforms {
+    lights: array<GBufLight, 2>,
+    params: vec4f,  // x = num_lights
+}
+@group(0) @binding(3) var<uniform> lights: GBufLightsUniforms;
+
+// ---- SDF scene (proxy box per object in local space) ----
+
+// Stub required by render/raymarching (shadow() / rayMarch() call df()).
+fn df(p: vec3f) -> f32 { return MAX_RAY_LENGTH; }
+
+// SDF of the full scene: proxy box for each object transformed to local space.
+fn dfWithID(p: vec3f) -> RayMarchResult {
+    var res: RayMarchResult;
+    res.distance     = MAX_RAY_LENGTH;
+    res.distance_max = MAX_RAY_LENGTH;
+    res.object_id    = 0.0;
+
+    let n = u32(globals.params.x);
+    for (var i = 0u; i < n; i++) {
+        let obj      = object_data.objects[i];
+        let lp       = (obj.inv_model * vec4f(p, 1.0)).xyz;
+        let obj_type = u32(obj.params.x);
+        // Scale factor: convert local-space SDF to world-space distance.
+        let scale    = length(obj.model[0].xyz);
+        var d: f32;
+        switch obj_type {
+            case 1u: { d = sdSphere(lp, 1.0) * scale; }                              // SPHERE
+            case 2u: { d = sdPlane(lp, vec3f(0.0, 1.0, 0.0), obj.params.y); }       // PLANE
+            case 3u: { d = sdTorus(lp, vec2f(0.8, 0.2)) * scale; }                   // TORUS
+            default: { d = sdBox(lp, vec3f(1.0)) * scale; }                          // CUBE (0) + fallback
+        }
+        if (d < res.distance) {
+            res.distance  = d;
+            res.object_id = f32(i + 1u);
+        }
+    }
+    return res;
+}
+
+// Soft shadow march (IQ formula). Returns 1=lit, 0=shadow.
+// No dmin/dmax bounds: in open space d grows large so 8*d/t >> 1, res stays 1 naturally.
+fn soft_shadow(ro: vec3f, rd: vec3f) -> f32 {
+    var t   = 0.001;
+    var res = 1.0;
+    for (var i = 0; i < 64; i++) {
+        let d = dfWithID(ro + rd * t).distance;
+        if (d < 0.0005) { return 0.0; }
+        res = min(res, 8.0 * d / t);
+        t  += d;
+    }
+    return clamp(res, 0.0, 1.0);
+}
+
+// ---- Vertex: fullscreen triangle ----
+
+@vertex
+fn vs_main(@builtin(vertex_index) vid: u32) -> @builtin(position) vec4f {
+    let x = f32((vid & 1u) << 2u) - 1.0;
+    let y = f32((vid & 2u) << 1u) - 1.0;
+    return vec4f(x, y, 0.0, 1.0);
+}
+
+// ---- Fragment: shadow factor per pixel ----
+
+@fragment
+fn fs_main(@builtin(position) pos: vec4f) -> @location(0) vec4f {
+    let depth = textureLoad(depth_tex, vec2i(pos.xy), 0);
+
+    // Sky / background: fully lit.
+    if (depth >= 1.0) {
+        return vec4f(1.0);
+    }
+
+    // Reconstruct world-space position from NDC + depth.
+    let res = globals.resolution;
+    let ndc = vec2f(
+        (pos.x / res.x) * 2.0 - 1.0,
+        1.0 - (pos.y / res.y) * 2.0
+    );
+    let clip  = globals.inv_view_proj * vec4f(ndc, depth, 1.0);
+    let world = clip.xyz / clip.w;
+
+    // Use rasterized surface normal for bias — correct for sphere impostors.
+    let nm       = textureLoad(normal_mat_tex, vec2i(pos.xy), 0);
+    let nor      = oct_decode_unorm(nm.rg);
+    let bias_pos = world + nor * 0.05;
+
+    // March shadow rays toward each light; take the darkest value.
+    var shadow_val  = 1.0;
+    let num_lights  = u32(lights.params.x);
+    for (var i = 0u; i < num_lights; i++) {
+        let ld = lights.lights[i].direction.xyz;
+        let s  = soft_shadow(bias_pos, ld);
+        shadow_val = min(shadow_val, s);
+    }
+
+    return vec4f(shadow_val, shadow_val, shadow_val, 1.0);
+}
diff --git a/cnn_v3/shaders/gbuf_view.wgsl b/cnn_v3/shaders/gbuf_view.wgsl
new file mode 100644
index 0000000..3e7d1ff
--- /dev/null
+++ b/cnn_v3/shaders/gbuf_view.wgsl
@@ -0,0 +1,127 @@
+// G-buffer channel visualization — 4×5 grid of 20 feature channels.
+// Takes feat_tex0 (rgba32uint, ch 0-7 f16) and feat_tex1 (rgba32uint, ch 8-19 unorm8).
+// Outputs tiled channel view to a standard rgba8unorm render target.
+//
+// Channel layout (row×col):
+//   Row 0: ch0(alb.r)  ch1(alb.g)  ch2(alb.b)  ch3(nrm.x)
+//   Row 1: ch4(nrm.y)  ch5(depth)  ch6(dzdx)   ch7(dzdy)
+//   Row 2: ch8(matid)  ch9(prv.r)  ch10(prv.g) ch11(prv.b)
+//   Row 3: ch12(m1.r)  ch13(m1.g)  ch14(m1.b)  ch15(m2.r)
+//   Row 4: ch16(m2.g)  ch17(m2.b)  ch18(shdw)  ch19(trns)
+
+#include "debug/debug_print"
+
+struct GBufViewUniforms { resolution: vec2f }
+
+@group(0) @binding(0) var feat0:   texture_2d<u32>;
+@group(0) @binding(1) var feat1:   texture_2d<u32>;
+@group(0) @binding(2) var<uniform> u: GBufViewUniforms;
+
+@vertex
+fn vs_main(@builtin(vertex_index) vid: u32) -> @builtin(position) vec4f {
+    var corners = array<vec2f, 3>(
+        vec2f(-1.0, -1.0), vec2f(3.0, -1.0), vec2f(-1.0, 3.0));
+    return vec4f(corners[vid], 0.0, 1.0);
+}
+
+@fragment
+fn fs_main(@builtin(position) pos: vec4f) -> @location(0) vec4f {
+    let uv  = pos.xy / u.resolution;
+
+    let COLS = 4.0;
+    let ROWS = 5.0;
+    let col  = u32(uv.x * COLS);
+    let row  = u32(uv.y * ROWS);
+    let ch   = row * 4u + col;
+
+    if (col >= 4u || ch >= 20u) {
+        return vec4f(0.05, 0.05, 0.05, 1.0);
+    }
+
+    // Thin grid lines: 0.5% of the cell extent on each edge (not a fixed pixel width)
+    let lx = fract(uv.x * COLS);
+    let ly = fract(uv.y * ROWS);
+    if (lx < 0.005 || lx > 0.995 || ly < 0.005 || ly > 0.995) {
+        return vec4f(0.25, 0.25, 0.25, 1.0);
+    }
+
+    // Map local UV to texel coordinate
+    let dim = vec2i(textureDimensions(feat0));
+    let tc  = clamp(vec2i(vec2f(lx, ly) * vec2f(dim)), vec2i(0), dim - vec2i(1));
+
+    var v: f32 = 0.0;
+
+    if (ch < 8u) {
+        // feat0: 4 × pack2x16float — each u32 component holds two f16 values
+        let t       = textureLoad(feat0, tc, 0);
+        let pair_idx = ch >> 1u;
+        let sub      = ch & 1u;
+        var p: vec2f;
+        if      (pair_idx == 0u) { p = unpack2x16float(t.x); }
+        else if (pair_idx == 1u) { p = unpack2x16float(t.y); }
+        else if (pair_idx == 2u) { p = unpack2x16float(t.z); }
+        else                     { p = unpack2x16float(t.w); }
+        v = select(p.y, p.x, sub == 0u);
+    } else {
+        // feat1: 3 × pack4x8unorm — components .x/.y/.z hold 4 u8 values each
+        let t        = textureLoad(feat1, tc, 0);
+        let ch1      = ch - 8u;
+        let comp_idx = ch1 / 4u;
+        let sub      = ch1 % 4u;
+        var bytes: vec4f;
+        if      (comp_idx == 0u) { bytes = unpack4x8unorm(t.x); }
+        else if (comp_idx == 1u) { bytes = unpack4x8unorm(t.y); }
+        else                     { bytes = unpack4x8unorm(t.z); }
+        var ba = array<f32, 4>(bytes.x, bytes.y, bytes.z, bytes.w);
+        v = ba[sub];
+    }
+
+    // Channel-specific normalization for display clarity
+    var disp: f32;
+    if (ch <= 2u) {
+        // Albedo: already [0,1]
+        disp = clamp(v, 0.0, 1.0);
+    } else if (ch == 3u || ch == 4u) {
+        // Normals oct-encoded in [-1,1] → remap to [0,1]
+        disp = clamp(v * 0.5 + 0.5, 0.0, 1.0);
+    } else if (ch == 5u) {
+        // Depth [0,1]: invert so near=white, far=dark
+        disp = clamp(1.0 - v, 0.0, 1.0);
+    } else if (ch == 6u || ch == 7u) {
+        // Depth gradients (signed, small values): amplify × 20 + 0.5 for visibility
+        disp = clamp(v * 20.0 + 0.5, 0.0, 1.0);
+    } else {
+        // Everything else: clamp to [0,1]
+        disp = clamp(v, 0.0, 1.0);
+    }
+
+    var out = vec4f(disp, disp, disp, 1.0);
+
+    // Label at top-left of each tile
+    let tile_w = u.resolution.x / 4.0;
+    let tile_h = u.resolution.y / 5.0;
+    let origin = vec2f(f32(col) * tile_w + 4.0, f32(row) * tile_h + 4.0);
+    switch ch {
+        case  0u: { out = debug_str(out, pos.xy, origin, vec4u(0x616C622Eu, 0x72000000u, 0u, 0u), 5u); } // alb.r
+        case  1u: { out = debug_str(out, pos.xy, origin, vec4u(0x616C622Eu, 0x67000000u, 0u, 0u), 5u); } // alb.g
+        case  2u: { out = debug_str(out, pos.xy, origin, vec4u(0x616C622Eu, 0x62000000u, 0u, 0u), 5u); } // alb.b
+        case  3u: { out = debug_str(out, pos.xy, origin, vec4u(0x6E726D2Eu, 0x78000000u, 0u, 0u), 5u); } // nrm.x
+        case  4u: { out = debug_str(out, pos.xy, origin, vec4u(0x6E726D2Eu, 0x79000000u, 0u, 0u), 5u); } // nrm.y
+        case  5u: { out = debug_str(out, pos.xy, origin, vec4u(0x64657074u, 0x68000000u, 0u, 0u), 5u); } // depth
+        case  6u: { out = debug_str(out, pos.xy, origin, vec4u(0x647A6478u, 0u, 0u, 0u), 4u); }          // dzdx
+        case  7u: { out = debug_str(out, pos.xy, origin, vec4u(0x647A6479u, 0u, 0u, 0u), 4u); }          // dzdy
+        case  8u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D617469u, 0x64000000u, 0u, 0u), 5u); } // matid
+        case  9u: { out = debug_str(out, pos.xy, origin, vec4u(0x7072762Eu, 0x72000000u, 0u, 0u), 5u); } // prv.r
+        case 10u: { out = debug_str(out, pos.xy, origin, vec4u(0x7072762Eu, 0x67000000u, 0u, 0u), 5u); } // prv.g
+        case 11u: { out = debug_str(out, pos.xy, origin, vec4u(0x7072762Eu, 0x62000000u, 0u, 0u), 5u); } // prv.b
+        case 12u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D312E72u, 0u, 0u, 0u), 4u); }          // m1.r
+        case 13u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D312E67u, 0u, 0u, 0u), 4u); }          // m1.g
+        case 14u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D312E62u, 0u, 0u, 0u), 4u); }          // m1.b
+        case 15u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D322E72u, 0u, 0u, 0u), 4u); }          // m2.r
+        case 16u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D322E67u, 0u, 0u, 0u), 4u); }          // m2.g
+        case 17u: { out = debug_str(out, pos.xy, origin, vec4u(0x6D322E62u, 0u, 0u, 0u), 4u); }          // m2.b
+        case 18u: { out = debug_str(out, pos.xy, origin, vec4u(0x73686477u, 0u, 0u, 0u), 4u); }          // shdw
+        default:  { out = debug_str(out, pos.xy, origin, vec4u(0x74726E73u, 0u, 0u, 0u), 4u); }          // trns
+    }
+    return out;
+}
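Review note on the label constants above: judging from the hex values and their trailing comments, each label appears to be ASCII packed four characters per `u32`, with the first character in the most significant byte and unused bytes left at zero. The sketch below reproduces that packing on the CPU so new labels can be generated rather than hand-encoded. `pack_label` and `check_label_packing` are hypothetical helpers, not part of this patch, and the byte order is inferred from the constants only, since the body of `debug_str` is not shown here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not in this patch): packs up to 16 ASCII characters
// into four u32 words, first character in the most significant byte.
// Byte order is an assumption inferred from the shader constants.
inline std::array<uint32_t, 4> pack_label(const std::string& s) {
  std::array<uint32_t, 4> words = {{0u, 0u, 0u, 0u}};
  for (std::size_t i = 0; i < s.size() && i < 16; ++i) {
    const uint32_t byte = static_cast<unsigned char>(s[i]);
    const uint32_t shift = 24u - 8u * static_cast<uint32_t>(i % 4);
    words[i / 4] |= byte << shift;
  }
  return words;
}

// Spot checks against constants that appear in the shader's switch.
inline void check_label_packing() {
  assert(pack_label("alb.r")[0] == 0x616C622Eu);  // "alb."
  assert(pack_label("alb.r")[1] == 0x72000000u);  // "r" plus zero padding
  assert(pack_label("dzdx")[0] == 0x647A6478u);
  assert(pack_label("shdw")[0] == 0x73686477u);
}
```

The length argument passed to `debug_str` in the shader (4u or 5u) matches the character count of each label, so a generator would emit `s.size()` alongside the packed words.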
diff --git a/cnn_v3/src/gbuf_deferred_effect.cc b/cnn_v3/src/gbuf_deferred_effect.cc
new file mode 100644
index 0000000..de6bd29
--- /dev/null
+++ b/cnn_v3/src/gbuf_deferred_effect.cc
@@ -0,0 +1,140 @@
+// GBufDeferredEffect — simple deferred render: albedo * shadow from packed G-buffer.
+
+#include "gbuf_deferred_effect.h"
+#include "gpu/gpu.h"
+#include "gpu/shader_composer.h"
+#include "util/fatal_error.h"
+
+extern const char* gbuf_deferred_wgsl;
+
+struct GBufDeferredUniforms {
+  float resolution[2];
+};
+static_assert(sizeof(GBufDeferredUniforms) == 8, "GBufDeferredUniforms must be 8 bytes");
+
+static WGPUBindGroupLayoutEntry bgl_uint_tex(uint32_t binding) {
+  WGPUBindGroupLayoutEntry e = {};
+  e.binding               = binding;
+  e.visibility            = WGPUShaderStage_Fragment;
+  e.texture.sampleType    = WGPUTextureSampleType_Uint;
+  e.texture.viewDimension = WGPUTextureViewDimension_2D;
+  return e;
+}
+
+static WGPUBindGroupLayoutEntry bgl_uniform(uint32_t binding, uint64_t min_size) {
+  WGPUBindGroupLayoutEntry e = {};
+  e.binding               = binding;
+  e.visibility            = WGPUShaderStage_Fragment;
+  e.buffer.type           = WGPUBufferBindingType_Uniform;
+  e.buffer.minBindingSize = min_size;
+  return e;
+}
+
+GBufDeferredEffect::GBufDeferredEffect(const GpuContext& ctx,
+                                       const std::vector<std::string>& inputs,
+                                       const std::vector<std::string>& outputs,
+                                       float start_time, float end_time)
+    : Effect(ctx, inputs, outputs, start_time, end_time) {
+  HEADLESS_RETURN_IF_NULL(ctx_.device);
+
+  WGPUBindGroupLayoutEntry entries[3] = {
+      bgl_uint_tex(0),
+      bgl_uint_tex(1),
+      bgl_uniform(2, sizeof(GBufDeferredUniforms)),
+  };
+  WGPUBindGroupLayoutDescriptor bgl_desc = {};
+  bgl_desc.entryCount = 3;
+  bgl_desc.entries    = entries;
+  WGPUBindGroupLayout bgl = wgpuDeviceCreateBindGroupLayout(ctx_.device, &bgl_desc);
+
+  WGPUPipelineLayoutDescriptor pl_desc = {};
+  pl_desc.bindGroupLayoutCount = 1;
+  pl_desc.bindGroupLayouts     = &bgl;
+  WGPUPipelineLayout pl = wgpuDeviceCreatePipelineLayout(ctx_.device, &pl_desc);
+
+  WGPUShaderSourceWGSL wgsl_src = {};
+  wgsl_src.chain.sType = WGPUSType_ShaderSourceWGSL;
+  const std::string composed = ShaderComposer::Get().Compose({}, gbuf_deferred_wgsl);
+  wgsl_src.code        = str_view(composed.c_str());
+  WGPUShaderModuleDescriptor shader_desc = {};
+  shader_desc.nextInChain = &wgsl_src.chain;
+  WGPUShaderModule shader = wgpuDeviceCreateShaderModule(ctx_.device, &shader_desc);
+
+  WGPUColorTargetState target = {};
+  target.format    = WGPUTextureFormat_RGBA8Unorm;
+  target.writeMask = WGPUColorWriteMask_All;
+
+  WGPUFragmentState frag = {};
+  frag.module      = shader;
+  frag.entryPoint  = str_view("fs_main");
+  frag.targetCount = 1;
+  frag.targets     = &target;
+
+  WGPURenderPipelineDescriptor pipe_desc = {};
+  pipe_desc.layout               = pl;
+  pipe_desc.vertex.module        = shader;
+  pipe_desc.vertex.entryPoint    = str_view("vs_main");
+  pipe_desc.fragment             = &frag;
+  pipe_desc.primitive.topology   = WGPUPrimitiveTopology_TriangleList;
+  pipe_desc.multisample.count    = 1;
+  pipe_desc.multisample.mask     = UINT32_MAX;
+
+  pipeline_.set(wgpuDeviceCreateRenderPipeline(ctx_.device, &pipe_desc));
+
+  wgpuShaderModuleRelease(shader);
+  wgpuPipelineLayoutRelease(pl);
+  wgpuBindGroupLayoutRelease(bgl);
+}
+
+void GBufDeferredEffect::render(WGPUCommandEncoder encoder,
+                                const UniformsSequenceParams& params,
+                                NodeRegistry& nodes) {
+  WGPUTextureView feat0_view  = nodes.get_view(input_nodes_[0]);
+  WGPUTextureView feat1_view  = nodes.get_view(input_nodes_[1]);
+  WGPUTextureView output_view = nodes.get_view(output_nodes_[0]);
+
+  // Upload resolution uniform into the base class uniforms buffer (first 8 bytes).
+  GBufDeferredUniforms u;
+  u.resolution[0] = params.resolution.x;
+  u.resolution[1] = params.resolution.y;
+  wgpuQueueWriteBuffer(ctx_.queue, uniforms_buffer_.get().buffer, 0,
+                       &u, sizeof(u));
+
+  WGPUBindGroupLayout bgl =
+      wgpuRenderPipelineGetBindGroupLayout(pipeline_.get(), 0);
+
+  WGPUBindGroupEntry bg_entries[3] = {};
+  bg_entries[0].binding     = 0;
+  bg_entries[0].textureView = feat0_view;
+  bg_entries[1].binding     = 1;
+  bg_entries[1].textureView = feat1_view;
+  bg_entries[2].binding     = 2;
+  bg_entries[2].buffer      = uniforms_buffer_.get().buffer;
+  bg_entries[2].size        = sizeof(GBufDeferredUniforms);
+
+  WGPUBindGroupDescriptor bg_desc = {};
+  bg_desc.layout     = bgl;
+  bg_desc.entryCount = 3;
+  bg_desc.entries    = bg_entries;
+  bind_group_.replace(wgpuDeviceCreateBindGroup(ctx_.device, &bg_desc));
+  wgpuBindGroupLayoutRelease(bgl);
+
+  WGPURenderPassColorAttachment color_att = {};
+  color_att.view       = output_view;
+  color_att.loadOp     = WGPULoadOp_Clear;
+  color_att.storeOp    = WGPUStoreOp_Store;
+  color_att.clearValue = {0.0f, 0.0f, 0.0f, 1.0f};
+  color_att.depthSlice = WGPU_DEPTH_SLICE_UNDEFINED;
+
+  WGPURenderPassDescriptor pass_desc = {};
+  pass_desc.colorAttachmentCount = 1;
+  pass_desc.colorAttachments     = &color_att;
+
+  WGPURenderPassEncoder pass =
+      wgpuCommandEncoderBeginRenderPass(encoder, &pass_desc);
+  wgpuRenderPassEncoderSetPipeline(pass, pipeline_.get());
+  wgpuRenderPassEncoderSetBindGroup(pass, 0, bind_group_.get(), 0, nullptr);
+  wgpuRenderPassEncoderDraw(pass, 3, 1, 0, 0);
+  wgpuRenderPassEncoderEnd(pass);
+  wgpuRenderPassEncoderRelease(pass);
+}
diff --git a/cnn_v3/src/gbuf_deferred_effect.h b/cnn_v3/src/gbuf_deferred_effect.h
new file mode 100644
index 0000000..4daf13d
--- /dev/null
+++ b/cnn_v3/src/gbuf_deferred_effect.h
@@ -0,0 +1,23 @@
+// GBufDeferredEffect — simple deferred render from packed G-buffer.
+// Inputs: feat_tex0, feat_tex1 (rgba32uint). Output: albedo * shadow (rgba8unorm).
+
+#pragma once
+#include "gpu/effect.h"
+#include "gpu/uniform_helper.h"
+#include "gpu/wgpu_resource.h"
+
+class GBufDeferredEffect : public Effect {
+ public:
+  GBufDeferredEffect(const GpuContext& ctx,
+                     const std::vector<std::string>& inputs,
+                     const std::vector<std::string>& outputs,
+                     float start_time, float end_time);
+
+  void render(WGPUCommandEncoder encoder,
+              const UniformsSequenceParams& params,
+              NodeRegistry& nodes) override;
+
+ private:
+  RenderPipeline pipeline_;
+  BindGroup bind_group_;
+};
diff --git a/cnn_v3/src/gbuf_view_effect.cc b/cnn_v3/src/gbuf_view_effect.cc
new file mode 100644
index 0000000..ccf80b0
--- /dev/null
+++ b/cnn_v3/src/gbuf_view_effect.cc
@@ -0,0 +1,146 @@
+// GBufViewEffect — G-buffer channel grid visualization
+// Renders 20 feature channels from feat_tex0/feat_tex1 in a 4×5 tiled layout.
+
+#include "gbuf_view_effect.h"
+
+#if defined(USE_TEST_ASSETS)
+#include "test_assets.h"
+#else
+#include "generated/assets.h"
+#endif
+
+#include "gpu/gpu.h"
+#include "gpu/shader_composer.h"
+#include "util/asset_manager.h"
+#include "util/fatal_error.h"
+
+extern const char* gbuf_view_wgsl;
+
+// BGL entry: texture_2d<u32> read binding (fragment stage)
+static WGPUBindGroupLayoutEntry bgl_uint_tex_frag(uint32_t binding) {
+  WGPUBindGroupLayoutEntry e = {};
+  e.binding               = binding;
+  e.visibility            = WGPUShaderStage_Fragment;
+  e.texture.sampleType    = WGPUTextureSampleType_Uint;
+  e.texture.viewDimension = WGPUTextureViewDimension_2D;
+  return e;
+}
+
+// BGL entry: uniform buffer (fragment stage)
+static WGPUBindGroupLayoutEntry bgl_uniform_frag(uint32_t binding,
+                                                  uint64_t min_size) {
+  WGPUBindGroupLayoutEntry e = {};
+  e.binding               = binding;
+  e.visibility            = WGPUShaderStage_Fragment;
+  e.buffer.type           = WGPUBufferBindingType_Uniform;
+  e.buffer.minBindingSize = min_size;
+  return e;
+}
+
+GBufViewEffect::GBufViewEffect(const GpuContext& ctx,
+                                const std::vector<std::string>& inputs,
+                                const std::vector<std::string>& outputs,
+                                float start_time, float end_time)
+    : Effect(ctx, inputs, outputs, start_time, end_time) {
+  HEADLESS_RETURN_IF_NULL(ctx_.device);
+
+  // Build BGL: binding 0 = feat0 (u32 tex), 1 = feat1 (u32 tex), 2 = uniforms
+  WGPUBindGroupLayoutEntry entries[3] = {
+      bgl_uint_tex_frag(0),
+      bgl_uint_tex_frag(1),
+      bgl_uniform_frag(2, 8),  // only resolution (vec2f = 8 bytes) is read
+  };
+  WGPUBindGroupLayoutDescriptor bgl_desc = {};
+  bgl_desc.entryCount = 3;
+  bgl_desc.entries    = entries;
+  WGPUBindGroupLayout bgl = wgpuDeviceCreateBindGroupLayout(ctx_.device, &bgl_desc);
+
+  // Pipeline layout
+  WGPUPipelineLayoutDescriptor pl_desc = {};
+  pl_desc.bindGroupLayoutCount = 1;
+  pl_desc.bindGroupLayouts     = &bgl;
+  WGPUPipelineLayout pl = wgpuDeviceCreatePipelineLayout(ctx_.device, &pl_desc);
+
+  // Shader module
+  WGPUShaderSourceWGSL wgsl_src = {};
+  wgsl_src.chain.sType = WGPUSType_ShaderSourceWGSL;
+  const std::string composed = ShaderComposer::Get().Compose({}, gbuf_view_wgsl);
+  wgsl_src.code        = str_view(composed.c_str());
+  WGPUShaderModuleDescriptor shader_desc = {};
+  shader_desc.nextInChain = &wgsl_src.chain;
+  WGPUShaderModule shader =
+      wgpuDeviceCreateShaderModule(ctx_.device, &shader_desc);
+
+  // Render pipeline
+  WGPUColorTargetState target = {};
+  target.format    = WGPUTextureFormat_RGBA8Unorm;
+  target.writeMask = WGPUColorWriteMask_All;
+
+  WGPUFragmentState frag = {};
+  frag.module      = shader;
+  frag.entryPoint  = str_view("fs_main");
+  frag.targetCount = 1;
+  frag.targets     = &target;
+
+  WGPURenderPipelineDescriptor pipe_desc = {};
+  pipe_desc.layout               = pl;
+  pipe_desc.vertex.module        = shader;
+  pipe_desc.vertex.entryPoint    = str_view("vs_main");
+  pipe_desc.fragment             = &frag;
+  pipe_desc.primitive.topology   = WGPUPrimitiveTopology_TriangleList;
+  pipe_desc.multisample.count    = 1;
+  pipe_desc.multisample.mask     = UINT32_MAX;
+
+  pipeline_.set(wgpuDeviceCreateRenderPipeline(ctx_.device, &pipe_desc));
+
+  wgpuShaderModuleRelease(shader);
+  wgpuPipelineLayoutRelease(pl);
+  wgpuBindGroupLayoutRelease(bgl);
+}
+
+void GBufViewEffect::render(WGPUCommandEncoder encoder,
+                             const UniformsSequenceParams& params,
+                             NodeRegistry& nodes) {
+  WGPUTextureView feat0_view  = nodes.get_view(input_nodes_[0]);
+  WGPUTextureView feat1_view  = nodes.get_view(input_nodes_[1]);
+  WGPUTextureView output_view = nodes.get_view(output_nodes_[0]);
+
+  // Rebuild bind group (views may change with ping-pong or resize)
+  WGPUBindGroupLayout bgl =
+      wgpuRenderPipelineGetBindGroupLayout(pipeline_.get(), 0);
+
+  WGPUBindGroupEntry bg_entries[3] = {};
+  bg_entries[0].binding     = 0;
+  bg_entries[0].textureView = feat0_view;
+  bg_entries[1].binding     = 1;
+  bg_entries[1].textureView = feat1_view;
+  bg_entries[2].binding     = 2;
+  bg_entries[2].buffer      = uniforms_buffer_.get().buffer;
+  bg_entries[2].size        = sizeof(UniformsSequenceParams);
+
+  WGPUBindGroupDescriptor bg_desc = {};
+  bg_desc.layout     = bgl;
+  bg_desc.entryCount = 3;
+  bg_desc.entries    = bg_entries;
+  bind_group_.replace(wgpuDeviceCreateBindGroup(ctx_.device, &bg_desc));
+  wgpuBindGroupLayoutRelease(bgl);
+
+  WGPURenderPassColorAttachment color_att = {};
+  color_att.view         = output_view;
+  color_att.loadOp       = WGPULoadOp_Clear;
+  color_att.storeOp      = WGPUStoreOp_Store;
+  color_att.clearValue   = {0.0f, 0.0f, 0.0f, 1.0f};
+  color_att.depthSlice   = WGPU_DEPTH_SLICE_UNDEFINED;
+
+  WGPURenderPassDescriptor pass_desc = {};
+  pass_desc.colorAttachmentCount = 1;
+  pass_desc.colorAttachments     = &color_att;
+
+  WGPURenderPassEncoder pass =
+      wgpuCommandEncoderBeginRenderPass(encoder, &pass_desc);
+  wgpuRenderPassEncoderSetPipeline(pass, pipeline_.get());
+  wgpuRenderPassEncoderSetBindGroup(pass, 0, bind_group_.get(), 0, nullptr);
+  wgpuRenderPassEncoderDraw(pass, 3, 1, 0, 0);
+  wgpuRenderPassEncoderEnd(pass);
+  wgpuRenderPassEncoderRelease(pass);
+}
diff --git a/cnn_v3/src/gbuf_view_effect.h b/cnn_v3/src/gbuf_view_effect.h
new file mode 100644
index 0000000..d4d8139
--- /dev/null
+++ b/cnn_v3/src/gbuf_view_effect.h
@@ -0,0 +1,25 @@
+// GBufViewEffect: Visualizes G-buffer feature textures as a 4×5 channel grid.
+// Inputs: feat_tex0 (rgba32uint, ch 0-7 f16), feat_tex1 (rgba32uint, ch 8-19 unorm8)
+// Output: rgba8unorm tiled channel visualization (each tile is 1/4 width × 1/5 height)
+
+#pragma once
+
+#include "gpu/effect.h"
+#include "gpu/sequence.h"
+#include "gpu/wgpu_resource.h"
+
+class GBufViewEffect : public Effect {
+ public:
+  GBufViewEffect(const GpuContext& ctx,
+                 const std::vector<std::string>& inputs,
+                 const std::vector<std::string>& outputs,
+                 float start_time, float end_time);
+
+  void render(WGPUCommandEncoder encoder,
+              const UniformsSequenceParams& params,
+              NodeRegistry& nodes) override;
+
+ private:
+  RenderPipeline pipeline_;
+  BindGroup       bind_group_;
+};
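Review note on the 4×5 grid described in the header above: the shader derives the tile size as `resolution.x / 4` by `resolution.y / 5` and places each label 4 pixels inside the tile's top-left corner. A minimal CPU-side sketch of that arithmetic follows. The row-major mapping from channel index to `(col, row)` is an assumption, because the code that computes `col` and `row` lies outside this part of the diff. `TileInfo` and `tile_for_channel` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical mirror of the shader's tile arithmetic (not in this patch).
struct TileInfo {
  int col;
  int row;
  float origin_x;  // label origin, in pixels
  float origin_y;
};

inline TileInfo tile_for_channel(int ch, float res_x, float res_y) {
  assert(ch >= 0 && ch < 20);
  TileInfo t;
  // Assumption: channels are laid out row-major, 4 columns by 5 rows.
  t.col = ch % 4;
  t.row = ch / 4;
  const float tile_w = res_x / 4.0f;
  const float tile_h = res_y / 5.0f;
  // Same +4 pixel inset the shader uses for the label origin.
  t.origin_x = static_cast<float>(t.col) * tile_w + 4.0f;
  t.origin_y = static_cast<float>(t.row) * tile_h + 4.0f;
  return t;
}
```

At 1280×720 each tile is 320×144 pixels, so under the row-major assumption channel 5 (depth) would sit in column 1, row 1, with its label at (324, 148).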
diff --git a/cnn_v3/src/gbuffer_effect.cc b/cnn_v3/src/gbuffer_effect.cc
index 750188f..b059915 100644
--- a/cnn_v3/src/gbuffer_effect.cc
+++ b/cnn_v3/src/gbuffer_effect.cc
@@ -14,6 +14,7 @@
 // For standalone use outside the asset system, the caller must ensure the WGSL
 // source strings are available.  They are declared here as weak-linkable externs.
 extern const char* gbuf_raster_wgsl;
+extern const char* gbuf_shadow_wgsl;
 extern const char* gbuf_pack_wgsl;
 
 // Maximum number of objects the G-buffer supports per frame.
@@ -41,53 +42,6 @@ struct GBufGlobalUniforms {
 static_assert(sizeof(GBufGlobalUniforms) == sizeof(float) * 44,
               "GBufGlobalUniforms must be 176 bytes");
 
-// Helper: create a 1×1 placeholder texture of a given format cleared to `value`.
-static WGPUTexture create_placeholder_tex(WGPUDevice device,
-                                          WGPUTextureFormat format,
-                                          float value) {
-  WGPUTextureDescriptor desc = {};
-  desc.usage = (WGPUTextureUsage)(WGPUTextureUsage_TextureBinding |
-                                  WGPUTextureUsage_CopyDst);
-  desc.dimension = WGPUTextureDimension_2D;
-  desc.size = {1, 1, 1};
-  desc.format = format;
-  desc.mipLevelCount = 1;
-  desc.sampleCount = 1;
-  WGPUTexture tex = wgpuDeviceCreateTexture(device, &desc);
-  return tex;
-}
-
-// Helper: write a single RGBA float pixel to a texture via queue.
-static void write_placeholder_pixel(WGPUQueue queue, WGPUTexture tex,
-                                    float r, float g, float b, float a) {
-  const float data[4] = {r, g, b, a};
-  WGPUTexelCopyTextureInfo dst = {};
-  dst.texture = tex;
-  dst.mipLevel = 0;
-  dst.origin = {0, 0, 0};
-  dst.aspect = WGPUTextureAspect_All;
-
-  WGPUTexelCopyBufferLayout layout = {};
-  layout.offset = 0;
-  layout.bytesPerRow = 16; // 4 × sizeof(float)
-  layout.rowsPerImage = 1;
-
-  const WGPUExtent3D extent = {1, 1, 1};
-  wgpuQueueWriteTexture(queue, &dst, data, sizeof(data), &layout, &extent);
-}
-
-// Create bilinear sampler.
-static WGPUSampler create_bilinear_sampler(WGPUDevice device) {
-  WGPUSamplerDescriptor desc = {};
-  desc.addressModeU = WGPUAddressMode_ClampToEdge;
-  desc.addressModeV = WGPUAddressMode_ClampToEdge;
-  desc.magFilter = WGPUFilterMode_Linear;
-  desc.minFilter = WGPUFilterMode_Linear;
-  desc.mipmapFilter = WGPUMipmapFilterMode_Linear;
-  desc.maxAnisotropy = 1;
-  return wgpuDeviceCreateSampler(device, &desc);
-}
-
 // ---- GBufferEffect ----
 
 GBufferEffect::GBufferEffect(const GpuContext& ctx,
@@ -104,9 +58,6 @@ GBufferEffect::GBufferEffect(const GpuContext& ctx,
   node_depth_      = prefix + "_depth";
   node_shadow_     = prefix + "_shadow";
   node_transp_     = prefix + "_transp";
-  node_feat0_      = outputs.size() > 0 ? outputs[0] : prefix + "_feat0";
-  node_feat1_      = outputs.size() > 1 ? outputs[1] : prefix + "_feat1";
-
   // Allocate GPU buffers for scene data.
   global_uniforms_buf_ =
       gpu_create_buffer(ctx_.device, sizeof(GBufGlobalUniforms),
@@ -116,34 +67,15 @@ GBufferEffect::GBufferEffect(const GpuContext& ctx,
 
   // Resolution uniform for pack shader.
   pack_res_uniform_.init(ctx_.device);
+  lights_uniform_.init(ctx_.device);
 
-  // Placeholder shadow (1.0 = fully lit) and transp (0.0 = opaque) textures.
-  shadow_placeholder_tex_.set(
-      create_placeholder_tex(ctx_.device, WGPUTextureFormat_RGBA32Float, 1.0f));
-  write_placeholder_pixel(ctx_.queue,
-                          shadow_placeholder_tex_.get(), 1.0f, 0.0f, 0.0f, 1.0f);
-
-  transp_placeholder_tex_.set(
-      create_placeholder_tex(ctx_.device, WGPUTextureFormat_RGBA32Float, 0.0f));
-  write_placeholder_pixel(ctx_.queue,
-                          transp_placeholder_tex_.get(), 0.0f, 0.0f, 0.0f, 1.0f);
-
-  WGPUTextureViewDescriptor vd = {};
-  vd.format = WGPUTextureFormat_RGBA32Float;
-  vd.dimension = WGPUTextureViewDimension_2D;
-  vd.baseMipLevel = 0;
-  vd.mipLevelCount = 1;
-  vd.baseArrayLayer = 0;
-  vd.arrayLayerCount = 1;
-  vd.aspect = WGPUTextureAspect_All;
-
-  shadow_placeholder_view_.set(
-      wgpuTextureCreateView(shadow_placeholder_tex_.get(), &vd));
-  transp_placeholder_view_.set(
-      wgpuTextureCreateView(transp_placeholder_tex_.get(), &vd));
+  create_linear_sampler();
 
   create_raster_pipeline();
+  create_shadow_pipeline();
   create_pack_pipeline();
+
+  set_scene();
 }
 
 void GBufferEffect::declare_nodes(NodeRegistry& registry) {
@@ -154,27 +86,123 @@ void GBufferEffect::declare_nodes(NodeRegistry& registry) {
   registry.declare_node(node_transp_,     NodeType::GBUF_R8,        -1, -1);
   // feat_tex0 / feat_tex1 are the declared output_nodes_ — they get registered
   // by the sequence infrastructure; declare them here as well if not already.
-  if (!registry.has_node(node_feat0_)) {
-    registry.declare_node(node_feat0_, NodeType::GBUF_RGBA32UINT, -1, -1);
+  if (!registry.has_node(output_nodes_[0])) {
+    registry.declare_node(output_nodes_[0], NodeType::GBUF_RGBA32UINT, -1, -1);
+  }
+  if (!registry.has_node(output_nodes_[1])) {
+    registry.declare_node(output_nodes_[1], NodeType::GBUF_RGBA32UINT, -1, -1);
   }
-  if (!registry.has_node(node_feat1_)) {
-    registry.declare_node(node_feat1_, NodeType::GBUF_RGBA32UINT, -1, -1);
+}
+
+void GBufferEffect::set_scene() {
+  scene_.clear();
+  cube_anims_.clear();
+  sphere_anims_.clear();
+
+  // Deterministic pseudo-random (xorshift32).
+  uint32_t seed = 0xBEEF1234u;
+  auto rnd = [&]() -> float {
+    seed ^= seed << 13;
+    seed ^= seed >> 17;
+    seed ^= seed << 5;
+    return (float)(seed >> 8) / 16777216.0f;  // [0, 1)
+  };
+  auto rrange = [&](float lo, float hi) { return lo + rnd() * (hi - lo); };
+
+  // Two large objects for shadow debugging: a static cube (rotation speed 0)
+  // and an audio-reactive sphere.
+  {
+    Object3D obj(ObjectType::CUBE);
+    obj.position = vec3(-1.0f, 0.0f, 0.0f);
+    obj.scale    = vec3(0.6f, 0.6f, 0.6f);
+    obj.color    = vec4(0.9f, 0.5f, 0.3f, 1.0f);
+    scene_.add_object(obj);
+    cube_anims_.push_back({{0.0f, 1.0f, 0.0f}, 0.0f});
   }
+  {
+    Object3D obj(ObjectType::SPHERE);
+    obj.position = vec3(1.0f, 0.0f, 0.0f);
+    const float r = 0.9f;
+    obj.scale    = vec3(r, r, r);
+    obj.color    = vec4(0.3f, 0.6f, 0.9f, 1.0f);
+    const int idx = (int)scene_.objects.size();
+    scene_.add_object(obj);
+    sphere_anims_.push_back({idx, r});
+  }
+
+  // (no other objects while debugging shadows; rnd/rrange are currently unused)
+
+  // Camera: above and in front of the scene, looking at origin.
+  camera_.set_look_at(vec3(0.0f, 2.5f, 6.0f),
+                      vec3(0.0f, 0.0f, 0.0f),
+                      vec3(0.0f, 1.0f, 0.0f));
+  camera_.fov_y_rad  = 0.7854f;  // 45°
+  camera_.near_plane = 0.1f;
+  camera_.far_plane  = 20.0f;
+  // aspect_ratio is updated each frame from params.resolution.
+
+  scene_ready_ = true;
 }
 
-void GBufferEffect::set_scene(const Scene* scene, const Camera* camera) {
-  scene_  = scene;
-  camera_ = camera;
+static void clear_r8_node(WGPUCommandEncoder encoder, WGPUTextureView view,
+                           float value) {
+  WGPURenderPassColorAttachment att = {};
+  att.view       = view;
+  att.loadOp     = WGPULoadOp_Clear;
+  att.storeOp    = WGPUStoreOp_Store;
+  att.clearValue = {value, value, value, value};
+  att.depthSlice = WGPU_DEPTH_SLICE_UNDEFINED;
+  WGPURenderPassDescriptor pd = {};
+  pd.colorAttachmentCount = 1;
+  pd.colorAttachments     = &att;
+  WGPURenderPassEncoder p = wgpuCommandEncoderBeginRenderPass(encoder, &pd);
+  wgpuRenderPassEncoderEnd(p);
+  wgpuRenderPassEncoderRelease(p);
 }
 
 void GBufferEffect::render(WGPUCommandEncoder encoder,
                            const UniformsSequenceParams& params,
                            NodeRegistry& nodes) {
-  if (!scene_ || !camera_) {
+  if (!scene_ready_) {
     return;
   }
 
-  upload_scene_data(*scene_, *camera_, params.time);
+  // Update camera aspect ratio from current resolution.
+  camera_.aspect_ratio = params.aspect_ratio;
+
+  // Slowly orbit around the scene.
+  const float angle = params.time * 0.3f;
+  const float R = 6.0f;
+  camera_.set_look_at(vec3(R * sinf(angle), 2.5f, R * cosf(angle)),
+                      vec3(0.0f, 0.0f, 0.0f),
+                      vec3(0.0f, 1.0f, 0.0f));
+
+  // Animate cubes: axis-angle rotation driven by time; cube i is assumed to be scene_.objects[i].
+  for (int i = 0; i < (int)cube_anims_.size(); ++i) {
+    const CubeAnim& a = cube_anims_[(size_t)i];
+    scene_.objects[(size_t)i].rotation =
+        quat::from_axis(a.axis, params.time * a.speed);
+  }
+  // Pump spheres: scale with audio_intensity.
+  for (const SphereAnim& a : sphere_anims_) {
+    const float r = a.base_radius * (1.0f + params.audio_intensity * 0.8f);
+    scene_.objects[(size_t)a.obj_idx].scale = vec3(r, r, r);
+  }
+
+  // Upload directional lights (key light only; the fill light is disabled below).
+  {
+    GBufLightsUniforms lu = {};
+    lu.params = vec4(1.0f, 0.0f, 0.0f, 0.0f);
+    // Key: warm sun, upper-right-front.
+    lu.lights[0].direction = vec4(0.408f, 0.816f, 0.408f, 0.0f);  // norm(1,2,1)
+    lu.lights[0].color     = vec4(1.00f, 0.92f, 0.78f, 1.0f);
+    // Fill: cool sky, upper-left-back. (disabled for debugging)
+    // lu.lights[1].direction = vec4(-0.577f, 0.577f, -0.577f, 0.0f);
+    // lu.lights[1].color     = vec4(0.40f, 0.45f, 0.80f, 0.4f);
+    lights_uniform_.update(ctx_.queue, lu);
+  }
+
+  upload_scene_data(scene_, camera_, params.time);
 
   // Update resolution uniform for pack shader.
   GBufResUniforms res_uni;
@@ -186,8 +214,8 @@ void GBufferEffect::render(WGPUCommandEncoder encoder,
   WGPUTextureView albedo_view     = nodes.get_view(node_albedo_);
   WGPUTextureView normal_mat_view = nodes.get_view(node_normal_mat_);
   WGPUTextureView depth_view      = nodes.get_view(node_depth_);
-  WGPUTextureView feat0_view      = nodes.get_view(node_feat0_);
-  WGPUTextureView feat1_view      = nodes.get_view(node_feat1_);
+  WGPUTextureView feat0_view      = nodes.get_view(output_nodes_[0]);
+  WGPUTextureView feat1_view      = nodes.get_view(output_nodes_[1]);
 
   // prev_cnn: first input node if available, else dummy.
   WGPUTextureView prev_view = nullptr;
@@ -228,8 +256,8 @@ void GBufferEffect::render(WGPUCommandEncoder encoder,
   raster_pass_desc.depthStencilAttachment = &depth_attachment;
 
   const int num_objects =
-      (int)(scene_->objects.size() < (size_t)kGBufMaxObjects
-                ? scene_->objects.size()
+      (int)(scene_.objects.size() < (size_t)kGBufMaxObjects
+                ? scene_.objects.size()
                 : (size_t)kGBufMaxObjects);
 
   if (num_objects > 0 && raster_pipeline_.get() != nullptr) {
@@ -250,13 +278,69 @@ void GBufferEffect::render(WGPUCommandEncoder encoder,
     wgpuRenderPassEncoderRelease(raster_pass);
   }
 
-  // Pass 2: SDF raymarching — TODO (placeholder: shadow=1, transp=0 already set)
-  // Pass 3: Lighting/shadow — TODO
+  // --- Pass 2: SDF shadow raymarching ---
+  if (shadow_pipeline_.get() != nullptr) {
+    WGPUBindGroupEntry shadow_entries[5] = {};
+    shadow_entries[0].binding = 0;
+    shadow_entries[0].buffer  = global_uniforms_buf_.buffer;
+    shadow_entries[0].size    = sizeof(GBufGlobalUniforms);
+
+    shadow_entries[1].binding = 1;
+    shadow_entries[1].buffer  = objects_buf_.buffer;
+    shadow_entries[1].size    = (size_t)objects_buf_capacity_ * sizeof(GBufObjectData);
+
+    shadow_entries[2].binding     = 2;
+    shadow_entries[2].textureView = depth_view;
+
+    shadow_entries[3].binding = 3;
+    shadow_entries[3].buffer  = lights_uniform_.get().buffer;
+    shadow_entries[3].size    = sizeof(GBufLightsUniforms);
+
+    shadow_entries[4].binding     = 4;
+    shadow_entries[4].textureView = normal_mat_view;
+
+    WGPUBindGroupLayout shadow_bgl =
+        wgpuRenderPipelineGetBindGroupLayout(shadow_pipeline_.get(), 0);
+
+    WGPUBindGroupDescriptor shadow_bg_desc = {};
+    shadow_bg_desc.layout     = shadow_bgl;
+    shadow_bg_desc.entryCount = 5;
+    shadow_bg_desc.entries    = shadow_entries;
+
+    WGPUBindGroup shadow_bg =
+        wgpuDeviceCreateBindGroup(ctx_.device, &shadow_bg_desc);
+    wgpuBindGroupLayoutRelease(shadow_bgl);
+
+    WGPURenderPassColorAttachment shadow_att = {};
+    shadow_att.view       = nodes.get_view(node_shadow_);
+    shadow_att.loadOp     = WGPULoadOp_Clear;
+    shadow_att.storeOp    = WGPUStoreOp_Store;
+    shadow_att.clearValue = {1.0f, 1.0f, 1.0f, 1.0f};
+    shadow_att.depthSlice = WGPU_DEPTH_SLICE_UNDEFINED;
+
+    WGPURenderPassDescriptor shadow_pass_desc = {};
+    shadow_pass_desc.colorAttachmentCount = 1;
+    shadow_pass_desc.colorAttachments     = &shadow_att;
+
+    WGPURenderPassEncoder shadow_pass =
+        wgpuCommandEncoderBeginRenderPass(encoder, &shadow_pass_desc);
+    wgpuRenderPassEncoderSetPipeline(shadow_pass, shadow_pipeline_.get());
+    wgpuRenderPassEncoderSetBindGroup(shadow_pass, 0, shadow_bg, 0, nullptr);
+    wgpuRenderPassEncoderDraw(shadow_pass, 3, 1, 0, 0);
+    wgpuRenderPassEncoderEnd(shadow_pass);
+    wgpuRenderPassEncoderRelease(shadow_pass);
+    wgpuBindGroupRelease(shadow_bg);
+  } else {
+    // Fallback: clear to 1.0 (fully lit) if pipeline not ready.
+    clear_r8_node(encoder, nodes.get_view(node_shadow_), 1.0f);
+  }
+
+  // Pass 3: Transparency — TODO (deferred; opaque scenes only)
+  clear_r8_node(encoder, nodes.get_view(node_transp_), 0.0f);
 
   // --- Pass 4: Pack compute ---
   // Rebuild pack bind group with current node views.
-  // Construct a temporary bilinear sampler for this pass.
-  WGPUSampler bilinear = create_bilinear_sampler(ctx_.device);
+  WGPUSampler bilinear = sampler_.get();
 
   // Get texture views from nodes.
   // shadow / transp are GBUF_R8 nodes; use their views.
@@ -320,7 +404,7 @@ void GBufferEffect::render(WGPUCommandEncoder encoder,
   wgpuComputePassEncoderRelease(compute_pass);
 
   wgpuBindGroupRelease(pack_bg);
-  wgpuSamplerRelease(bilinear);
+  // bilinear is owned by sampler_ — no release here.
 }
 
 // ---- private helpers ----
@@ -346,7 +430,8 @@ void GBufferEffect::upload_scene_data(const Scene& scene,
                 : (size_t)kGBufMaxObjects);
 
   const mat4 view = camera.get_view_matrix();
-  const mat4 proj = camera.get_projection_matrix();
+  mat4 proj = camera.get_projection_matrix();
+  proj.m[5] = -proj.m[5];  // undo post-process Y flip: G-buffer uses integer reads
   const mat4 vp   = proj * view;
 
   GBufGlobalUniforms gu = {};
@@ -373,7 +458,7 @@ void GBufferEffect::upload_scene_data(const Scene& scene,
       d.model     = m;
       d.inv_model = m.inverse();
       d.color     = obj.color;
-      d.params    = vec4(0.0f, 0.0f, 0.0f, 0.0f);
+      d.params    = vec4((float)(int)obj.type, 0.0f, 0.0f, 0.0f);
       obj_data.push_back(d);
     }
     wgpuQueueWriteBuffer(ctx_.queue, objects_buf_.buffer, 0,
@@ -392,7 +477,7 @@ void GBufferEffect::create_raster_pipeline() {
   }
 
   const std::string composed =
-      ShaderComposer::Get().Compose({"common_uniforms"}, src);
+      ShaderComposer::Get().Compose({}, src);
 
   WGPUShaderSourceWGSL wgsl_src = {};
   wgsl_src.chain.sType = WGPUSType_ShaderSourceWGSL;
@@ -450,8 +535,9 @@ void GBufferEffect::create_raster_pipeline() {
   pipe_desc.vertex.entryPoint = str_view("vs_main");
   pipe_desc.fragment = &frag;
   pipe_desc.depthStencil = &ds;
-  pipe_desc.primitive.topology = WGPUPrimitiveTopology_TriangleList;
-  pipe_desc.primitive.cullMode = WGPUCullMode_Back;
+  pipe_desc.primitive.topology  = WGPUPrimitiveTopology_TriangleList;
+  pipe_desc.primitive.cullMode  = WGPUCullMode_Back;
+  pipe_desc.primitive.frontFace = WGPUFrontFace_CCW;  // standard (no Y flip)
   pipe_desc.multisample.count = 1;
   pipe_desc.multisample.mask = 0xFFFFFFFF;
 
@@ -462,6 +548,91 @@ void GBufferEffect::create_raster_pipeline() {
   wgpuShaderModuleRelease(shader);
 }
 
+void GBufferEffect::create_shadow_pipeline() {
+  HEADLESS_RETURN_IF_NULL(ctx_.device);
+
+  const char* src = gbuf_shadow_wgsl;
+  if (!src) {
+    return;
+  }
+
+  const std::string composed = ShaderComposer::Get().Compose({}, src);
+
+  WGPUShaderSourceWGSL wgsl_src = {};
+  wgsl_src.chain.sType = WGPUSType_ShaderSourceWGSL;
+  wgsl_src.code = str_view(composed.c_str());
+
+  WGPUShaderModuleDescriptor shader_desc = {};
+  shader_desc.nextInChain = &wgsl_src.chain;
+  WGPUShaderModule shader = wgpuDeviceCreateShaderModule(ctx_.device, &shader_desc);
+
+  // BGL: B0=GlobalUniforms, B1=ObjectsBuffer, B2=texture_depth_2d, B3=GBufLightsUniforms, B4=normal/material texture (float)
+  WGPUBindGroupLayoutEntry bgl_entries[5] = {};
+
+  bgl_entries[0].binding = 0;
+  bgl_entries[0].visibility =
+      (WGPUShaderStage)(WGPUShaderStage_Vertex | WGPUShaderStage_Fragment);
+  bgl_entries[0].buffer.type = WGPUBufferBindingType_Uniform;
+  bgl_entries[0].buffer.minBindingSize = sizeof(GBufGlobalUniforms);
+
+  bgl_entries[1].binding = 1;
+  bgl_entries[1].visibility = WGPUShaderStage_Fragment;
+  bgl_entries[1].buffer.type = WGPUBufferBindingType_ReadOnlyStorage;
+  bgl_entries[1].buffer.minBindingSize = sizeof(GBufObjectData);
+
+  bgl_entries[2].binding = 2;
+  bgl_entries[2].visibility = WGPUShaderStage_Fragment;
+  bgl_entries[2].texture.sampleType = WGPUTextureSampleType_Depth;
+  bgl_entries[2].texture.viewDimension = WGPUTextureViewDimension_2D;
+
+  bgl_entries[3].binding = 3;
+  bgl_entries[3].visibility = WGPUShaderStage_Fragment;
+  bgl_entries[3].buffer.type = WGPUBufferBindingType_Uniform;
+  bgl_entries[3].buffer.minBindingSize = sizeof(GBufLightsUniforms);
+
+  bgl_entries[4].binding = 4;
+  bgl_entries[4].visibility = WGPUShaderStage_Fragment;
+  bgl_entries[4].texture.sampleType = WGPUTextureSampleType_Float;
+  bgl_entries[4].texture.viewDimension = WGPUTextureViewDimension_2D;
+
+  WGPUBindGroupLayoutDescriptor bgl_desc = {};
+  bgl_desc.entryCount = 5;
+  bgl_desc.entries = bgl_entries;
+  WGPUBindGroupLayout bgl = wgpuDeviceCreateBindGroupLayout(ctx_.device, &bgl_desc);
+
+  WGPUPipelineLayoutDescriptor pl_desc = {};
+  pl_desc.bindGroupLayoutCount = 1;
+  pl_desc.bindGroupLayouts = &bgl;
+  WGPUPipelineLayout pl = wgpuDeviceCreatePipelineLayout(ctx_.device, &pl_desc);
+
+  // Color target: RGBA8Unorm (NodeType::GBUF_R8)
+  WGPUColorTargetState color_target = {};
+  color_target.format = WGPUTextureFormat_RGBA8Unorm;
+  color_target.writeMask = WGPUColorWriteMask_All;
+
+  WGPUFragmentState frag = {};
+  frag.module = shader;
+  frag.entryPoint = str_view("fs_main");
+  frag.targetCount = 1;
+  frag.targets = &color_target;
+
+  WGPURenderPipelineDescriptor pipe_desc = {};
+  pipe_desc.layout = pl;
+  pipe_desc.vertex.module = shader;
+  pipe_desc.vertex.entryPoint = str_view("vs_main");
+  pipe_desc.fragment = &frag;
+  pipe_desc.primitive.topology = WGPUPrimitiveTopology_TriangleList;
+  pipe_desc.primitive.cullMode = WGPUCullMode_None;
+  pipe_desc.multisample.count = 1;
+  pipe_desc.multisample.mask = 0xFFFFFFFF;
+
+  shadow_pipeline_.set(wgpuDeviceCreateRenderPipeline(ctx_.device, &pipe_desc));
+
+  wgpuPipelineLayoutRelease(pl);
+  wgpuBindGroupLayoutRelease(bgl);
+  wgpuShaderModuleRelease(shader);
+}
+
 void GBufferEffect::create_pack_pipeline() {
   HEADLESS_RETURN_IF_NULL(ctx_.device);
 
@@ -596,7 +767,3 @@ void GBufferEffect::update_raster_bind_group(NodeRegistry& nodes) {
   wgpuBindGroupLayoutRelease(bgl);
 }
 
-void GBufferEffect::update_pack_bind_group(NodeRegistry& nodes) {
-  (void)nodes;
-  // Pack bind group is rebuilt inline in render() to use current node views.
-}
diff --git a/cnn_v3/src/gbuffer_effect.h b/cnn_v3/src/gbuffer_effect.h
index 42fb0ec..13d394d 100644
--- a/cnn_v3/src/gbuffer_effect.h
+++ b/cnn_v3/src/gbuffer_effect.h
@@ -10,6 +10,7 @@
 #include "gpu/uniform_helper.h"
 #include "gpu/wgpu_resource.h"
 #include "util/mini_math.h"
+#include <vector>
 
 // Uniform for the pack compute shader
 struct GBufResUniforms {
@@ -20,6 +21,20 @@ struct GBufResUniforms {
 static_assert(sizeof(GBufResUniforms) == 16,
               "GBufResUniforms must be 16 bytes");
 
+// Single directional light: direction points *toward* the light source (world space).
+struct GBufLight {
+  vec4 direction;  // xyz = normalized direction toward light, w = unused
+  vec4 color;      // rgb = color, a = intensity
+};
+static_assert(sizeof(GBufLight) == 32, "GBufLight must be 32 bytes");
+
+struct GBufLightsUniforms {
+  GBufLight lights[2];
+  vec4      params;  // x = num_lights
+};
+static_assert(sizeof(GBufLightsUniforms) == 80,
+              "GBufLightsUniforms must be 80 bytes");
+
 class GBufferEffect : public Effect {
  public:
   GBufferEffect(const GpuContext& ctx, const std::vector<std::string>& inputs,
@@ -31,47 +46,59 @@ class GBufferEffect : public Effect {
   void render(WGPUCommandEncoder encoder, const UniformsSequenceParams& params,
               NodeRegistry& nodes) override;
 
-  void set_scene(const Scene* scene, const Camera* camera);
+  // Populate the internal scene with ~20 rotating cubes and a few pumping
+  // spheres.  Must be called once before the first render().
+  void set_scene();
 
  private:
+  // Per-cube animation state (axis-angle rotation)
+  struct CubeAnim {
+    vec3  axis;
+    float speed;  // radians/second, may be negative
+  };
+  // Per-sphere animation state (radius driven by audio_intensity)
+  struct SphereAnim {
+    int   obj_idx;     // index into scene_.objects
+    float base_radius;
+  };
+
   // Internal G-buffer node names
   std::string node_albedo_;
   std::string node_normal_mat_;
   std::string node_depth_;
   std::string node_shadow_;
   std::string node_transp_;
-  std::string node_feat0_;
-  std::string node_feat1_;
 
-  const Scene*  scene_  = nullptr;
-  const Camera* camera_ = nullptr;
+  // Owned scene and camera — populated by set_scene()
+  Scene  scene_;
+  Camera camera_;
+  bool   scene_ready_ = false;
+
+  std::vector<CubeAnim>  cube_anims_;
+  std::vector<SphereAnim> sphere_anims_;
 
   // Pass 1: MRT rasterization pipeline
   RenderPipeline raster_pipeline_;
   BindGroup      raster_bind_group_;
 
+  // Pass 2: SDF shadow pipeline
+  RenderPipeline shadow_pipeline_;
+
   // Pass 4: Pack compute pipeline
   ComputePipeline  pack_pipeline_;
-  BindGroup        pack_bind_group_;
-  UniformBuffer<GBufResUniforms> pack_res_uniform_;
-
-  // Placeholder textures for shadow/transp (white/black cleared once)
-  Texture     shadow_placeholder_tex_;
-  TextureView shadow_placeholder_view_;
-  Texture     transp_placeholder_tex_;
-  TextureView transp_placeholder_view_;
+  UniformBuffer<GBufResUniforms>    pack_res_uniform_;
+  UniformBuffer<GBufLightsUniforms> lights_uniform_;
 
   // GPU-side object data buffers (global uniforms + objects storage)
-  // These mirror the layout expected by gbuf_raster.wgsl
   GpuBuffer global_uniforms_buf_;
   GpuBuffer objects_buf_;
-  int       objects_buf_capacity_ = 0; // number of ObjectData slots allocated
+  int       objects_buf_capacity_ = 0;
 
   void create_raster_pipeline();
+  void create_shadow_pipeline();
   void create_pack_pipeline();
 
   void update_raster_bind_group(NodeRegistry& nodes);
-  void update_pack_bind_group(NodeRegistry& nodes);
 
   void upload_scene_data(const Scene& scene, const Camera& camera,
                          float time);
diff --git a/cnn_v3/tools/index.html b/cnn_v3/tools/index.html
index eba532e..26fee9b 100644
--- a/cnn_v3/tools/index.html
+++ b/cnn_v3/tools/index.html
@@ -64,9 +64,11 @@ video{display:none}
   <div class="left">
     <input type="file" id="wFile"  accept=".bin" style="display:none">
     <input type="file" id="fFile"  accept=".bin" style="display:none">
+    <input type="file" id="sFile"  webkitdirectory style="display:none" onchange="tester.loadSampleDir(this.files)">
 
     <div class="dz" id="wDrop"  onclick="document.getElementById('wFile').click()">Drop cnn_v3_weights.bin</div>
     <div class="dz" id="fDrop"  onclick="document.getElementById('fFile').click()">Drop cnn_v3_film_mlp.bin (optional)</div>
+    <button onclick="tester.preload()" style="font-size:9px;margin-top:2px">↺ Reload from workspace weights/</button>
 
     <div class="panel">
       <div class="ph" onclick="togglePanel(this)">Input Mode <span>▼</span></div>
@@ -78,6 +80,10 @@ video{display:none}
         <div id="fullHelp" style="display:none;margin-top:6px;font-size:9px;color:#555;line-height:1.6">
           Drop PNGs: *albedo*/color · *normal* · *depth* · *matid*/index · *shadow* · *transp*/alpha
         </div>
+        <div style="margin-top:8px;border-top:1px solid #333;padding-top:8px">
+          <button onclick="document.getElementById('sFile').click()" style="width:100%">↑ Load sample directory</button>
+          <div id="sampleSt" style="font-size:9px;color:#555;margin-top:3px"></div>
+        </div>
       </div>
     </div>
 
@@ -120,15 +126,29 @@ video{display:none}
       <div class="sep"></div>
       <button onclick="tester.savePNG()">Save PNG</button>
     </div>
-    <canvas id="canvas"></canvas>
+    <div style="display:flex;gap:12px;align-items:flex-start">
+      <div style="display:flex;flex-direction:column;align-items:center;gap:3px">
+        <canvas id="canvas"></canvas>
+        <span id="cnnLabel" style="font-size:9px;color:#555"></span>
+      </div>
+      <div id="targetPane" style="display:none;flex-direction:column;align-items:center;gap:3px">
+        <canvas id="targetCanvas" style="max-width:100%;max-height:100%;image-rendering:pixelated;box-shadow:0 4px 12px rgba(0,0,0,.5)"></canvas>
+        <span style="font-size:9px;color:#555">target.png</span>
+        <span id="psnrSt" style="font-size:9px;color:#4a9eff"></span>
+      </div>
+    </div>
   </div>
 
   <div class="right">
     <div class="panel" style="flex:1;display:flex;flex-direction:column;min-height:0">
       <div class="ph">Layer Visualization</div>
-      <div class="pc" id="layerViz" style="flex:1;overflow:auto">
+      <div class="pc" id="layerViz" style="flex:1;min-height:0;overflow:auto">
         <p style="color:#444;text-align:center">Load image + weights</p>
       </div>
+      <div id="chzoomWrap" style="display:none;flex-direction:column;align-items:center;justify-content:center;gap:3px;padding:6px;border-top:1px solid #333;background:#1a1a1a;flex:1;min-height:0;overflow:hidden">
+        <span id="chzoomLbl" style="font-size:9px;color:#666;flex-shrink:0"></span>
+        <canvas id="chzoom" style="image-rendering:pixelated;display:block"></canvas>
+      </div>
     </div>
   </div>
 </div>
diff --git a/cnn_v3/tools/shaders.js b/cnn_v3/tools/shaders.js
index c3e994d..f178637 100644
--- a/cnn_v3/tools/shaders.js
+++ b/cnn_v3/tools/shaders.js
@@ -223,30 +223,85 @@ const DISP_SHADER=`
 }`;
 
 // Viz f32: show one channel of rgba16float layer
+// Uniform layout: ch(u32) _p(u32) ox(i32) oy(i32) — 16 bytes
+// ox/oy = texel offset (top-left of view); 0,0 for full-texture vignettes.
 const VIZ_F32=`
+struct Vu{ch:u32,_p:u32,ox:i32,oy:i32}
 @group(0) @binding(0) var t:texture_2d<f32>;
-@group(0) @binding(1) var<uniform> ch:u32;
+@group(0) @binding(1) var<uniform> u:Vu;
 @vertex fn vs(@builtin(vertex_index) i:u32)->@builtin(position) vec4f{
   var p=array<vec2f,6>(vec2f(-1.,-1.),vec2f(1.,-1.),vec2f(-1.,1.),vec2f(-1.,1.),vec2f(1.,-1.),vec2f(1.,1.));
   return vec4f(p[i],0.,1.);
 }
 @fragment fn fs(@builtin(position) pos:vec4f)->@location(0) vec4f{
-  let v=textureLoad(t,vec2i(pos.xy),0); var a=array<f32,4>(v.x,v.y,v.z,v.w);
-  let x=clamp(a[min(ch,3u)],0.,1.); return vec4f(x,x,x,1.);
+  let dim=vec2i(textureDimensions(t));
+  let tc=clamp(vec2i(i32(pos.x)+u.ox,i32(pos.y)+u.oy),vec2i(0),dim-vec2i(1));
+  let v=textureLoad(t,tc,0); var a=array<f32,4>(v.x,v.y,v.z,v.w);
+  let x=clamp(a[min(u.ch,3u)],0.,1.); return vec4f(x,x,x,1.);
 }`;
 
 // Viz u32: show one f16 channel of rgba32uint layer (8 channels packed)
 const VIZ_U32=`
+struct Vu{ch:u32,_p:u32,ox:i32,oy:i32}
 @group(0) @binding(0) var t:texture_2d<u32>;
-@group(0) @binding(1) var<uniform> ch:u32;
+@group(0) @binding(1) var<uniform> u:Vu;
 @vertex fn vs(@builtin(vertex_index) i:u32)->@builtin(position) vec4f{
   var p=array<vec2f,6>(vec2f(-1.,-1.),vec2f(1.,-1.),vec2f(-1.,1.),vec2f(-1.,1.),vec2f(1.,-1.),vec2f(1.,1.));
   return vec4f(p[i],0.,1.);
 }
 @fragment fn fs(@builtin(position) pos:vec4f)->@location(0) vec4f{
-  let t2=textureLoad(t,vec2i(pos.xy),0);
+  let dim=vec2i(textureDimensions(t));
+  let tc=clamp(vec2i(i32(pos.x)+u.ox,i32(pos.y)+u.oy),vec2i(0),dim-vec2i(1));
+  let t2=textureLoad(t,tc,0);
   let a=unpack2x16float(t2.x);let b=unpack2x16float(t2.y);
   let c=unpack2x16float(t2.z);let d=unpack2x16float(t2.w);
   var v=array<f32,8>(a.x,a.y,b.x,b.y,c.x,c.y,d.x,d.y);
-  let x=clamp(v[min(ch,7u)],0.,1.); return vec4f(x,x,x,1.);
+  let x=clamp(v[min(u.ch,7u)],0.,1.); return vec4f(x,x,x,1.);
 }`;
+
+// Full G-buffer pack: assembles feat_tex0/feat_tex1 from individual G-buffer images.
+// Bindings: albedo(0) normal(1) depth(2) matid(3) shadow(4) transp(5) f0(6) f1(7)
+// All source textures are rgba8unorm (browser-loaded images, R channel for depth/matid/shadow/transp).
+// Uses textureLoad() only (no sampler needed). Matches gbuf_pack.wgsl packing exactly.
+const FULL_PACK_SHADER=`
+@group(0) @binding(0) var albedo: texture_2d<f32>;
+@group(0) @binding(1) var normal: texture_2d<f32>;
+@group(0) @binding(2) var depth:  texture_2d<f32>;
+@group(0) @binding(3) var matid:  texture_2d<f32>;
+@group(0) @binding(4) var shadow: texture_2d<f32>;
+@group(0) @binding(5) var transp: texture_2d<f32>;
+@group(0) @binding(6) var f0:     texture_storage_2d<rgba32uint,write>;
+@group(0) @binding(7) var f1:     texture_storage_2d<rgba32uint,write>;
+fn ld(c:vec2i,d:vec2i)->f32{return textureLoad(depth,clamp(c,vec2i(0),min(d,vec2i(textureDimensions(depth)))-vec2i(1)),0).r;}
+fn b2(tl:vec2i,d:vec2i)->vec3f{
+  var s=vec3f(0.);
+  for(var y:i32=0;y<2;y++){for(var x:i32=0;x<2;x++){s+=textureLoad(albedo,clamp(tl+vec2i(x,y),vec2i(0),d-vec2i(1)),0).rgb;}}
+  return s*.25;}
+fn b4(tl:vec2i,d:vec2i)->vec3f{
+  var s=vec3f(0.);
+  for(var y:i32=0;y<4;y++){for(var x:i32=0;x<4;x++){s+=textureLoad(albedo,clamp(tl+vec2i(x,y),vec2i(0),d-vec2i(1)),0).rgb;}}
+  return s*(1./16.);}
+@compute @workgroup_size(8,8)
+fn main(@builtin(global_invocation_id) id:vec3u){
+  let c=vec2i(id.xy); let d=vec2i(textureDimensions(albedo));
+  if(c.x>=d.x||c.y>=d.y){return;}
+  let alb=textureLoad(albedo,c,0).rgb;
+  let nrm=textureLoad(normal,min(c,vec2i(textureDimensions(normal))-vec2i(1)),0).rg;  // clamp: fallback may be 1x1
+  let oct=nrm*2.-vec2f(1.);  // [0,1] -> [-1,1]
+  let dv=ld(c,d);
+  let dzdx=(ld(c+vec2i(1,0),d)-ld(c-vec2i(1,0),d))*.5;
+  let dzdy=(ld(c+vec2i(0,1),d)-ld(c-vec2i(0,1),d))*.5;
+  textureStore(f0,c,vec4u(
+    pack2x16float(alb.rg),
+    pack2x16float(vec2f(alb.b,oct.x)),
+    pack2x16float(vec2f(oct.y,dv)),
+    pack2x16float(vec2f(dzdx,dzdy))));
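+  let mid=textureLoad(matid,min(c,vec2i(textureDimensions(matid))-vec2i(1)),0).r;   // clamp: fallback may be 1x1
+  let shd=textureLoad(shadow,min(c,vec2i(textureDimensions(shadow))-vec2i(1)),0).r;
+  let trp=textureLoad(transp,min(c,vec2i(textureDimensions(transp))-vec2i(1)),0).r;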
+  let m1=b2(c-vec2i(0),d); let m2=b4(c-vec2i(1),d);
+  textureStore(f1,c,vec4u(
+    pack4x8unorm(vec4f(mid,0.,0.,0.)),
+    pack4x8unorm(vec4f(m1.r,m1.g,m1.b,m2.r)),
+    pack4x8unorm(vec4f(m2.g,m2.b,shd,trp)),
+    0u));}`;
diff --git a/cnn_v3/tools/tester.js b/cnn_v3/tools/tester.js
index aa765a1..0412cae 100644
--- a/cnn_v3/tools/tester.js
+++ b/cnn_v3/tools/tester.js
@@ -13,6 +13,7 @@ class CNNv3Tester {
     this.image   = null;
     this.isVideo = false;
     this.viewMode= 0;   // 0=cnn 1=orig 2=diff
+    this.targetBitmap = null;  // set when a sample dir with target.png is loaded
     this.blend   = 1.0;
     this.layerTextures = {};
     this.lastResult = null;
@@ -44,9 +45,43 @@ class CNNv3Tester {
       this.format  = navigator.gpu.getPreferredCanvasFormat();
       this.linearSampler = this.device.createSampler({magFilter:'linear',minFilter:'linear',mipmapFilter:'linear'});
       this.log('WebGPU ready');
+      this.preload();
     } catch(e) { this.setStatus(`GPU error: ${e.message}`,true); }
   }
 
+  async preload() {
+    const base = '../../workspaces/main/weights/';
+    const files = [
+      {url: base+'cnn_v3_weights.bin',  isFilm: false},
+      {url: base+'cnn_v3_film_mlp.bin', isFilm: true},
+    ];
+    for (const {url, isFilm} of files) {
+      try {
+        const r = await fetch(url);
+        if (!r.ok) { this.log(`preload skip: ${url.split('/').pop()} (${r.status})`); continue; }
+        const buf = await r.arrayBuffer();
+        const name = url.split('/').pop();
+        if (isFilm) {
+          this.filmMlp = this.parseFilm(buf);
+          const el = document.getElementById('fDrop');
+          el.textContent = `✓ ${name}`; el.classList.add('ok');
+          document.getElementById('fSt').textContent = 'FiLM MLP loaded';
+          document.getElementById('fSt').style.color = '#28a745';
+        } else {
+          this.weightsU32 = this.parseWeights(buf); this.weightsBuffer = buf;
+          if (this.weightsGPU) { this.weightsGPU.destroy(); this.weightsGPU = null; }
+          const el = document.getElementById('wDrop');
+          el.textContent = `✓ ${name}`; el.classList.add('ok');
+        }
+        this.log(`Preloaded: ${name}`);
+      } catch(e) { this.log(`preload error (${url.split('/').pop()}): ${e.message}`, 'err'); }
+    }
+    if (this.weightsU32) {
+      if (this.image || this.isVideo) this.run();
+      else this.setStatus('Weights loaded — drop image/video');
+    }
+  }
+
   getDims() {
     return this.isVideo
       ? {w:this.video.videoWidth, h:this.video.videoHeight}
@@ -106,7 +141,7 @@ class CNNv3Tester {
   filmParams() {
     const I4=[1,1,1,1],Z4=[0,0,0,0],I8=[1,1,1,1,1,1,1,1],Z8=[0,0,0,0,0,0,0,0];
     if (!this.filmMlp) return {ge0:I4,be0:Z4,ge1:I8,be1:Z8,gd1:I4,bd1:Z4,gd0:I4,bd0:Z4};
-    const v=document.getElementById;
+    const v=document.getElementById.bind(document);
     const cond=[v('sBP').value,v('sBN').value,v('sAI').value,v('sP0').value,v('sP1').value].map(Number);
     const f=this.filmFwd(cond);
     return {
@@ -350,6 +385,7 @@ class CNNv3Tester {
     this.layerTextures={feat0:f0,feat1:f1,enc0:e0,enc1:e1,bn,dec1:d1,output:ot};
     this.lastResult={ot,itex:this.inputTex,uDp,dispPL:this.getDisp(),w,h};
     this.updateVizPanel();
+    this.refreshZoom();
   }
 
   destroyLayerTex(){for(const t of Object.values(this.layerTextures||{}))try{t.destroy();}catch(_){} this.layerTextures={};}
@@ -422,10 +458,19 @@ class CNNv3Tester {
     document.getElementById(`vb_${id}`)?.classList.add('act');
     const def=this.vizDefs.find(d=>d.id===id); if(!def)return;
     const grid=document.getElementById('chgrid'); grid.innerHTML='';
+    const chName = (c) => `${def.lbl} → ${def.ch[c]||'c'+c}`;
     for(let c=0;c<def.nch;c++){
       const cell=document.createElement('div'); cell.className='chcell';
       const lbl=document.createElement('div'); lbl.className='chcell-lbl'; lbl.textContent=def.ch[c]||`c${c}`;
       const cvs=document.createElement('canvas');
+      const name=chName(c);
+      cvs.title=name;
+      cvs.onclick=(e)=>{
+        const r=cvs.getBoundingClientRect();
+        const tx=Math.round(e.offsetX/r.width*tex.width);
+        const ty=Math.round(e.offsetY/r.height*tex.height);
+        tester.zoomChannel(id,c,name,tx,ty);
+      };
       cell.appendChild(lbl); cell.appendChild(cvs); grid.appendChild(cell);
     }
     const pl=def.t==='f32'?this.getVizF32():this.getVizU32();
@@ -434,8 +479,8 @@ class CNNv3Tester {
       cvs.width=tex.width; cvs.height=tex.height;
       const ctx=cvs.getContext('webgpu'); if(!ctx)continue;
       try{ctx.configure({device:this.device,format:this.format});}catch(_){continue;}
-      const chBuf=this.device.createBuffer({size:4,usage:GPUBufferUsage.UNIFORM|GPUBufferUsage.COPY_DST});
-      this.device.queue.writeBuffer(chBuf,0,new Uint32Array([c]));
+      const chBuf=this.device.createBuffer({size:16,usage:GPUBufferUsage.UNIFORM|GPUBufferUsage.COPY_DST});
+      this.device.queue.writeBuffer(chBuf,0,new Int32Array([c,0,0,0]));
       const bg=this.device.createBindGroup({layout:pl.getBindGroupLayout(0),
         entries:[{binding:0,resource:tex.createView()},{binding:1,resource:{buffer:chBuf}}]});
       const enc=this.device.createCommandEncoder();
@@ -447,6 +492,54 @@ class CNNv3Tester {
     await this.device.queue.onSubmittedWorkDone();
   }
 
+  zoomChannel(layerId, ch, label, clickTx=0, clickTy=0) {
+    const def = this.vizDefs?.find(d => d.id === layerId);
+    const tex = this.layerTextures[layerId];
+    if (!def || !tex || !this.device) return;
+    const wrap = document.getElementById('chzoomWrap');
+    const lbl  = document.getElementById('chzoomLbl');
+    this.activeZoom    = {layerId, ch, label, clickTx, clickTy};
+    lbl.textContent    = label;
+    wrap.style.display = 'flex';
+    // Wait for layout so clientWidth/clientHeight reflect the flex-distributed size
+    requestAnimationFrame(() => {
+      const dst   = document.getElementById('chzoom');
+      const pad   = 12;
+      const lblH  = lbl.offsetHeight + 6;
+      const availW = wrap.clientWidth  - pad;
+      const availH = wrap.clientHeight - pad - lblH;
+      const scale  = Math.min(1, availW / tex.width, availH / tex.height);
+      dst.width  = Math.round(tex.width  * scale);
+      dst.height = Math.round(tex.height * scale);
+      // Re-render via WebGPU centered on the clicked texel
+      const ox = clickTx - Math.floor(dst.width  / 2);
+      const oy = clickTy - Math.floor(dst.height / 2);
+      const pl = def.t === 'f32' ? this.getVizF32() : this.getVizU32();
+      const ctx = dst.getContext('webgpu');
+      try { ctx.configure({device: this.device, format: this.format}); } catch(_) { return; }
+      const uData = new ArrayBuffer(16);
+      const dv = new DataView(uData);
+      dv.setUint32(0, ch, true); dv.setInt32(8, ox, true); dv.setInt32(12, oy, true);
+      const chBuf = this.device.createBuffer({size:16, usage:GPUBufferUsage.UNIFORM|GPUBufferUsage.COPY_DST});
+      this.device.queue.writeBuffer(chBuf, 0, uData);
+      const bg = this.device.createBindGroup({layout: pl.getBindGroupLayout(0),
+        entries:[{binding:0, resource:tex.createView()}, {binding:1, resource:{buffer:chBuf}}]});
+      const enc = this.device.createCommandEncoder();
+      const rp  = enc.beginRenderPass({colorAttachments:[{
+        view:ctx.getCurrentTexture().createView(), loadOp:'clear', storeOp:'store'}]});
+      rp.setPipeline(pl); rp.setBindGroup(0, bg); rp.draw(6); rp.end();
+      this.device.queue.submit([enc.finish()]);
+      chBuf.destroy();
+    });
+  }
+
+  refreshZoom() {
+    if (this.activeZoom) {
+      const {layerId, ch, label, clickTx, clickTy} = this.activeZoom;
+      this.zoomChannel(layerId, ch, label, clickTx, clickTy);
+    }
+  }
+
   // ── Save PNG ─────────────────────────────────────────────────────────────
 
   async savePNG() {
@@ -491,6 +584,281 @@ class CNNv3Tester {
       return(s?-1:1)*Math.pow(2,e-15)*(1+m/1024);};
     return [f(lo),f(hi)];
   }
+
+  // ── Full G-buffer pack pipeline ───────────────────────────────────────────
+
+  getFullPack() {
+    return this.pl('fullpack', () => this.computePL(FULL_PACK_SHADER, 'main'));
+  }
+
+  // Create a 1×1 rgba8unorm fallback texture with given RGBA bytes [0-255].
+  makeFallbackTex(r, g, b, a) {
+    const tex = this.device.createTexture({size:[1,1], format:'rgba8unorm',
+      usage: GPUTextureUsage.TEXTURE_BINDING|GPUTextureUsage.COPY_DST});
+    this.device.queue.writeTexture({texture:tex}, new Uint8Array([r,g,b,a]),
+      {bytesPerRow:4,rowsPerImage:1}, [1,1]);
+    return tex;
+  }
+
+  // Load an image File as a GPU rgba8unorm texture. Returns {tex, w, h}.
+  async loadGpuTex(file) {
+    const bmp = await createImageBitmap(file);
+    const w = bmp.width, h = bmp.height;
+    const tex = this.device.createTexture({size:[w,h], format:'rgba8unorm',
+      usage: GPUTextureUsage.TEXTURE_BINDING|GPUTextureUsage.COPY_DST|GPUTextureUsage.RENDER_ATTACHMENT});
+    this.device.queue.copyExternalImageToTexture({source:bmp}, {texture:tex}, [w,h]);
+    bmp.close();
+    return {tex, w, h};
+  }
+
+  // ── Load sample directory ─────────────────────────────────────────────────
+
+  async loadSampleDir(files) {
+    if (!files || files.length === 0) return;
+    if (!this.weightsU32) { this.setStatus('Load weights first', true); return; }
+
+    this.setMode('full');
+    const st = document.getElementById('sampleSt');
+    st.textContent = 'Loading…';
+
+    // Match files by name pattern
+    const match = (pat) => {
+      for (const f of files) {
+        const n = f.name.toLowerCase();
+        if (pat.some(p => n.includes(p))) return f;
+      }
+      return null;
+    };
+
+    const fAlbedo = match(['albedo', 'color']);
+    const fNormal = match(['normal', 'nrm']);
+    const fDepth  = match(['depth']);
+    const fMatid  = match(['matid', 'index', 'mat_id']);
+    const fShadow = match(['shadow']);
+    const fTransp = match(['transp', 'alpha']);
+    const fTarget = match(['target', 'output', 'ground_truth']);
+
+    if (!fAlbedo) {
+      st.textContent = '✗ No albedo.png found';
+      this.setStatus('No albedo.png in sample dir', true);
+      return;
+    }
+
+    try {
+      const t0 = performance.now();
+
+      // Load primary albedo to get dimensions
+      const {tex: albTex, w, h} = await this.loadGpuTex(fAlbedo);
+      this.canvas.width = w; this.canvas.height = h;
+      this.context.configure({device:this.device, format:this.format});
+
+      // Load optional channels — fall back to neutral 1×1 textures
+      const nrmTex = fNormal ? (await this.loadGpuTex(fNormal)).tex
+                              : this.makeFallbackTex(128, 128, 0, 255); // oct-encoded (0,0) normal
+      const dptTex = fDepth  ? (await this.loadGpuTex(fDepth)).tex
+                              : this.makeFallbackTex(0, 0, 0, 255);
+      const midTex = fMatid  ? (await this.loadGpuTex(fMatid)).tex
+                              : this.makeFallbackTex(0, 0, 0, 255);
+      const shdTex = fShadow ? (await this.loadGpuTex(fShadow)).tex
+                              : this.makeFallbackTex(255, 255, 255, 255); // fully lit
+      const trpTex = fTransp ? (await this.loadGpuTex(fTransp)).tex
+                              : this.makeFallbackTex(0, 0, 0, 255);      // fully opaque
+
+      // Load target if present
+      if (this.targetBitmap) { this.targetBitmap.close(); this.targetBitmap = null; }
+      if (fTarget) {
+        this.targetBitmap = await createImageBitmap(fTarget);
+        this.showTarget();
+      } else {
+        document.getElementById('targetPane').style.display = 'none';
+      }
+
+      // Pack G-buffer into feat0/feat1
+      const mk = (fmt, tw, th) => this.device.createTexture({size:[tw,th], format:fmt,
+        usage:GPUTextureUsage.STORAGE_BINDING|GPUTextureUsage.TEXTURE_BINDING|GPUTextureUsage.COPY_SRC});
+      const f0 = mk('rgba32uint', w, h);
+      const f1 = mk('rgba32uint', w, h);
+
+      const ceil8 = (n) => Math.ceil(n/8);
+      const pl = this.getFullPack();
+      const bg = this.device.createBindGroup({layout: pl.getBindGroupLayout(0),
+        entries: [
+          {binding:0, resource: albTex.createView()},
+          {binding:1, resource: nrmTex.createView()},
+          {binding:2, resource: dptTex.createView()},
+          {binding:3, resource: midTex.createView()},
+          {binding:4, resource: shdTex.createView()},
+          {binding:5, resource: trpTex.createView()},
+          {binding:6, resource: f0.createView()},
+          {binding:7, resource: f1.createView()},
+        ]});
+
+      const enc = this.device.createCommandEncoder();
+      const cp = enc.beginComputePass();
+      cp.setPipeline(pl); cp.setBindGroup(0, bg);
+      cp.dispatchWorkgroups(ceil8(w), ceil8(h));
+      cp.end();
+      this.device.queue.submit([enc.finish()]);
+      await this.device.queue.onSubmittedWorkDone();
+
+      // Cleanup source textures
+      [albTex, nrmTex, dptTex, midTex, shdTex, trpTex].forEach(t => t.destroy());
+
+      const found = [fAlbedo, fNormal, fDepth, fMatid, fShadow, fTransp]
+        .filter(Boolean).map(f => f.name).join(', ');
+      st.textContent = `✓ ${found}`;
+      this.log(`Sample packed: ${w}×${h}, ${(performance.now()-t0).toFixed(0)}ms`);
+
+      // Run inference — runFromFeat takes ownership of f0/f1 (stored in layerTextures)
+      await this.runFromFeat(f0, f1, w, h);
+
+    } catch(e) {
+      st.textContent = `✗ ${e.message}`;
+      this.setStatus(`Sample error: ${e.message}`, true);
+      this.log(`Sample error: ${e.message}`, 'err');
+    }
+  }
+
+  // Show target.png in the #targetPane alongside main canvas.
+  showTarget() {
+    if (!this.targetBitmap) return;
+    const tc = document.getElementById('targetCanvas');
+    tc.width  = this.targetBitmap.width;
+    tc.height = this.targetBitmap.height;
+    const ctx2d = tc.getContext('2d');
+    ctx2d.drawImage(this.targetBitmap, 0, 0);
+    document.getElementById('targetPane').style.display = 'flex';
+  }
+
+  // Run CNN inference starting from pre-packed feat_tex0 / feat_tex1.
+  // Used by loadSampleDir() to skip the photo-pack step.
+  async runFromFeat(f0, f1, w, h) {
+    if (!this.weightsU32 || !this.device) return;
+    const t0 = performance.now();
+    const W2=w>>1, H2=h>>1, W4=W2>>1, H4=H2>>1;
+
+    this.context.configure({device:this.device, format:this.format});
+
+    // Create a neutral "original" texture so the display shader can still
+    // render Orig/Diff modes (just black for sample mode).
+    if (this.inputTex) this.inputTex.destroy();
+    this.inputTex = this.device.createTexture({size:[w,h], format:'rgba8unorm',
+      usage:GPUTextureUsage.TEXTURE_BINDING|GPUTextureUsage.COPY_DST|GPUTextureUsage.RENDER_ATTACHMENT});
+    // Never written to: WebGPU zero-initializes textures, so it reads as black. Diff mode against target would need more work.
+
+    const mk = (fmt, tw, th) => this.device.createTexture({size:[tw,th], format:fmt,
+      usage:GPUTextureUsage.STORAGE_BINDING|GPUTextureUsage.TEXTURE_BINDING|GPUTextureUsage.COPY_SRC});
+    const e0=mk('rgba16float',w,h), e1=mk('rgba32uint',W2,H2);
+    const bn=mk('rgba32uint',W4,H4), d1=mk('rgba16float',W2,H2), ot=mk('rgba16float',w,h);
+
+    if (!this.weightsGPU) {
+      this.weightsGPU = this.device.createBuffer({size:this.weightsBuffer.byteLength,
+        usage:GPUBufferUsage.STORAGE|GPUBufferUsage.COPY_DST});
+      this.device.queue.writeBuffer(this.weightsGPU, 0, this.weightsBuffer);
+    }
+    const wg = this.weightsGPU;
+    const fp = this.filmParams();
+    const wu = (data) => {
+      const b = this.device.createBuffer({size:data.byteLength, usage:GPUBufferUsage.UNIFORM|GPUBufferUsage.COPY_DST});
+      this.device.queue.writeBuffer(b, 0, data); return b;
+    };
+    const uE0=wu(this.u4(ENC0_OFF,fp.ge0,fp.be0));
+    const uE1=wu(this.u8(ENC1_OFF,fp.ge1,fp.be1));
+    const uBN=wu(this.ubn(BN_OFF));
+    const uD1=wu(this.u4(DEC1_OFF,fp.gd1,fp.bd1));
+    const uD0=wu(this.u4(DEC0_OFF,fp.gd0,fp.bd0));
+    const dispData=new ArrayBuffer(16);
+    new DataView(dispData).setFloat32(4, this.blend, true);
+    const uDp=wu(dispData);
+
+    const enc = this.device.createCommandEncoder();
+    const bg  = (pl,...entries) => this.device.createBindGroup({layout:pl.getBindGroupLayout(0),
+      entries:entries.map((r,i)=>({binding:i,resource:r}))});
+    const rv  = (t) => t.createView();
+    const cp  = (pl,bgr,wx,wy) => {const p=enc.beginComputePass();p.setPipeline(pl);p.setBindGroup(0,bgr);p.dispatchWorkgroups(wx,wy);p.end();};
+    const ceil8 = (n) => Math.ceil(n/8);
+
+    cp(this.getEnc0(), bg(this.getEnc0(), rv(f0),rv(f1),{buffer:wg},{buffer:uE0},rv(e0)), ceil8(w), ceil8(h));
+    cp(this.getEnc1(), bg(this.getEnc1(), rv(e0),{buffer:wg},{buffer:uE1},rv(e1)), ceil8(W2), ceil8(H2));
+    cp(this.getBN(),   bg(this.getBN(),   rv(e1),{buffer:wg},{buffer:uBN},rv(bn)), ceil8(W4), ceil8(H4));
+    cp(this.getDec1(), bg(this.getDec1(), rv(bn),rv(e1),{buffer:wg},{buffer:uD1},rv(d1)), ceil8(W2), ceil8(H2));
+    cp(this.getDec0(), bg(this.getDec0(), rv(d1),rv(e0),{buffer:wg},{buffer:uD0},rv(ot)), ceil8(w), ceil8(h));
+
+    const dbg = bg(this.getDisp(), rv(ot), rv(this.inputTex), {buffer:uDp});
+    const rp = enc.beginRenderPass({colorAttachments:[{
+      view:this.context.getCurrentTexture().createView(), loadOp:'clear', storeOp:'store'}]});
+    rp.setPipeline(this.getDisp()); rp.setBindGroup(0, dbg); rp.draw(6); rp.end();
+
+    this.device.queue.submit([enc.finish()]);
+    await this.device.queue.onSubmittedWorkDone();
+
+    [uE0,uE1,uBN,uD1,uD0].forEach(b => b.destroy());
+
+    // Compute PSNR against target if available
+    let psnrStr = '';
+    if (this.targetBitmap) {
+      this.showTarget();
+      try { psnrStr = await this.computePSNR(ot, w, h); } catch(e) { this.log(`PSNR error: ${e.message}`, 'err'); }
+    }
+
+    this.destroyLayerTex();
+    this.layerTextures = {feat0:f0, feat1:f1, enc0:e0, enc1:e1, bn, dec1:d1, output:ot};
+    this.lastResult = {ot, itex:this.inputTex, uDp, dispPL:this.getDisp(), w, h};
+    this.updateVizPanel();
+    this.refreshZoom();
+
+    const ms = (performance.now()-t0).toFixed(1);
+    document.getElementById('cnnLabel').textContent = `CNN output  (${ms}ms)`;
+    if (psnrStr) document.getElementById('psnrSt').textContent = psnrStr;
+    this.setStatus(`Sample: ${ms}ms · ${w}×${h}`);
+    this.log(`runFromFeat: ${ms}ms`);
+  }
+
+  // Compute PSNR between CNN rgba16float output texture and target.png bitmap.
+  async computePSNR(outTex, w, h) {
+    const bpr = Math.ceil(w * 8 / 256) * 256;  // rgba16float is 8 bytes/px; bytesPerRow must be a multiple of 256
+    const stg = this.device.createBuffer({size:bpr*h,
+      usage:GPUBufferUsage.COPY_DST|GPUBufferUsage.MAP_READ});
+    const enc = this.device.createCommandEncoder();
+    enc.copyTextureToBuffer({texture:outTex}, {buffer:stg, bytesPerRow:bpr, rowsPerImage:h}, [w,h]);
+    this.device.queue.submit([enc.finish()]);
+    await stg.mapAsync(GPUMapMode.READ);
+    const raw = new DataView(stg.getMappedRange());
+
+    // Decode output pixels from f16, clamped to [0,1]; subnormals flush to 0, Inf/NaN map to 1 (0 if the sign bit is set)
+    const f16 = (bits) => {
+      const s=(bits>>15)&1, e=(bits>>10)&0x1F, m=bits&0x3FF;
+      if(e===0) return 0; if(e===31) return s?0:1;
+      return Math.max(0,Math.min(1,(s?-1:1)*Math.pow(2,e-15)*(1+m/1024)));
+    };
+    const cnnPx = new Float32Array(w*h*3);
+    for (let y=0;y<h;y++) for (let x=0;x<w;x++) {
+      const src=y*bpr+x*8, pi=(y*w+x)*3;
+      cnnPx[pi]  = f16(raw.getUint16(src,   true));
+      cnnPx[pi+1]= f16(raw.getUint16(src+2, true));
+      cnnPx[pi+2]= f16(raw.getUint16(src+4, true));
+    }
+    stg.unmap(); stg.destroy();
+
+    // Read target pixels via a detached 2D canvas (target is resampled to w×h)
+    const oc = document.createElement('canvas');
+    oc.width = w; oc.height = h;
+    const ctx2d = oc.getContext('2d');
+    ctx2d.drawImage(this.targetBitmap, 0, 0, w, h);
+    const tgtData = ctx2d.getImageData(0, 0, w, h).data;
+
+    let mse = 0;
+    const n = w * h * 3;
+    for (let i=0; i<w*h; i++) {
+      const dr = cnnPx[i*3]   - tgtData[i*4]  /255;
+      const dg = cnnPx[i*3+1] - tgtData[i*4+1]/255;
+      const db = cnnPx[i*3+2] - tgtData[i*4+2]/255;
+      mse += dr*dr + dg*dg + db*db;
+    }
+    mse /= n;
+    const psnr = mse > 0 ? (10 * Math.log10(1 / mse)).toFixed(2) : '∞';
+    return `MSE=${mse.toFixed(5)}  PSNR=${psnr}dB`;
+  }
 }
 
 // ── UI helpers ───────────────────────────────────────────────────────────────
diff --git a/cnn_v3/training/dataset/full/0001/albedo.png b/cnn_v3/training/dataset/full/0001/albedo.png
new file mode 100644
index 0000000..8f64b38
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0001/albedo.png
diff --git a/cnn_v3/training/dataset/full/0001/depth.png b/cnn_v3/training/dataset/full/0001/depth.png
new file mode 100644
index 0000000..c58fcd9
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0001/depth.png
diff --git a/cnn_v3/training/dataset/full/0001/matid.png b/cnn_v3/training/dataset/full/0001/matid.png
new file mode 100644
index 0000000..b4fa98f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0001/matid.png
diff --git a/cnn_v3/training/dataset/full/0001/normal.png b/cnn_v3/training/dataset/full/0001/normal.png
new file mode 100644
index 0000000..62f26e3
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0001/normal.png
diff --git a/cnn_v3/training/dataset/full/0001/shadow.png b/cnn_v3/training/dataset/full/0001/shadow.png
new file mode 100644
index 0000000..0471e7f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0001/shadow.png
diff --git a/cnn_v3/training/dataset/full/0001/target.png b/cnn_v3/training/dataset/full/0001/target.png
new file mode 100644
index 0000000..587d54a
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0001/target.png
diff --git a/cnn_v3/training/dataset/full/0001/transp.png b/cnn_v3/training/dataset/full/0001/transp.png
new file mode 100644
index 0000000..b4fa98f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0001/transp.png
diff --git a/cnn_v3/training/dataset/full/0002/albedo.png b/cnn_v3/training/dataset/full/0002/albedo.png
new file mode 100644
index 0000000..8f64b38
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0002/albedo.png
diff --git a/cnn_v3/training/dataset/full/0002/depth.png b/cnn_v3/training/dataset/full/0002/depth.png
new file mode 100644
index 0000000..c58fcd9
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0002/depth.png
diff --git a/cnn_v3/training/dataset/full/0002/matid.png b/cnn_v3/training/dataset/full/0002/matid.png
new file mode 100644
index 0000000..b4fa98f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0002/matid.png
diff --git a/cnn_v3/training/dataset/full/0002/normal.png b/cnn_v3/training/dataset/full/0002/normal.png
new file mode 100644
index 0000000..62f26e3
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0002/normal.png
diff --git a/cnn_v3/training/dataset/full/0002/shadow.png b/cnn_v3/training/dataset/full/0002/shadow.png
new file mode 100644
index 0000000..0471e7f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0002/shadow.png
diff --git a/cnn_v3/training/dataset/full/0002/target.png b/cnn_v3/training/dataset/full/0002/target.png
new file mode 100644
index 0000000..587d54a
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0002/target.png
diff --git a/cnn_v3/training/dataset/full/0002/transp.png b/cnn_v3/training/dataset/full/0002/transp.png
new file mode 100644
index 0000000..b4fa98f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0002/transp.png
diff --git a/cnn_v3/training/dataset/full/0003/albedo.png b/cnn_v3/training/dataset/full/0003/albedo.png
new file mode 100644
index 0000000..8f64b38
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0003/albedo.png
diff --git a/cnn_v3/training/dataset/full/0003/depth.png b/cnn_v3/training/dataset/full/0003/depth.png
new file mode 100644
index 0000000..c58fcd9
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0003/depth.png
diff --git a/cnn_v3/training/dataset/full/0003/matid.png b/cnn_v3/training/dataset/full/0003/matid.png
new file mode 100644
index 0000000..b4fa98f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0003/matid.png
diff --git a/cnn_v3/training/dataset/full/0003/normal.png b/cnn_v3/training/dataset/full/0003/normal.png
new file mode 100644
index 0000000..62f26e3
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0003/normal.png
diff --git a/cnn_v3/training/dataset/full/0003/shadow.png b/cnn_v3/training/dataset/full/0003/shadow.png
new file mode 100644
index 0000000..0471e7f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0003/shadow.png
diff --git a/cnn_v3/training/dataset/full/0003/target.png b/cnn_v3/training/dataset/full/0003/target.png
new file mode 100644
index 0000000..587d54a
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0003/target.png
diff --git a/cnn_v3/training/dataset/full/0003/transp.png b/cnn_v3/training/dataset/full/0003/transp.png
new file mode 100644
index 0000000..b4fa98f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0003/transp.png
diff --git a/cnn_v3/training/dataset/full/0004/albedo.png b/cnn_v3/training/dataset/full/0004/albedo.png
new file mode 100644
index 0000000..8f64b38
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0004/albedo.png
diff --git a/cnn_v3/training/dataset/full/0004/depth.png b/cnn_v3/training/dataset/full/0004/depth.png
new file mode 100644
index 0000000..c58fcd9
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0004/depth.png
diff --git a/cnn_v3/training/dataset/full/0004/matid.png b/cnn_v3/training/dataset/full/0004/matid.png
new file mode 100644
index 0000000..b4fa98f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0004/matid.png
diff --git a/cnn_v3/training/dataset/full/0004/normal.png b/cnn_v3/training/dataset/full/0004/normal.png
new file mode 100644
index 0000000..62f26e3
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0004/normal.png
diff --git a/cnn_v3/training/dataset/full/0004/shadow.png b/cnn_v3/training/dataset/full/0004/shadow.png
new file mode 100644
index 0000000..0471e7f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0004/shadow.png
diff --git a/cnn_v3/training/dataset/full/0004/target.png b/cnn_v3/training/dataset/full/0004/target.png
new file mode 100644
index 0000000..587d54a
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0004/target.png
diff --git a/cnn_v3/training/dataset/full/0004/transp.png b/cnn_v3/training/dataset/full/0004/transp.png
new file mode 100644
index 0000000..b4fa98f
--- /dev/null
+++ b/cnn_v3/training/dataset/full/0004/transp.png