feat(cnn_v3): HTML WebGPU tool (index.html + shaders.js + tester.js)

3-file tool, 939 lines total. Implements full U-Net+FiLM inference in the browser: Pack→Enc0→Enc1→Bottleneck→Dec1→Dec0 compute passes, layer visualisation (Feat/Enc0/Enc1/BN/Dec1/Output), FiLM MLP sliders, drag-drop weights + image/video, Save PNG, diff/blend view modes.

HOW_TO_CNN.md §7 updated to reflect tool is implemented.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
author: skal <pascal.massimino@gmail.com> 2026-03-21 10:50:02 +0100
committer: skal <pascal.massimino@gmail.com> 2026-03-21 10:50:02 +0100
commit: 35355b17576e93b035a2a78ecd05771e98f068ee (patch)
tree: a1c1a4563a62ad69c808383fcf0bce1ccf4c5765 /cnn_v3/docs/HOW_TO_CNN.md
parent: e343021ac007549c76e58b27a361b11dd3f6a136 (diff)
1 files changed, 50 insertions, 91 deletions
diff --git a/cnn_v3/docs/HOW_TO_CNN.md b/cnn_v3/docs/HOW_TO_CNN.md
index 8c41ab0..f325a38 100644
--- a/cnn_v3/docs/HOW_TO_CNN.md
+++ b/cnn_v3/docs/HOW_TO_CNN.md
@@ -646,120 +646,76 @@ If results drift after shader edits, verify these invariants match the Python re
 
 ## 7. HTML WebGPU Tool
 
-### Current state
+**Location:** `cnn_v3/tools/` — three files, no build step.
 
-There is no dedicated CNN v3 HTML tool yet.
-The CNN v2 tool (`cnn_v2/tools/cnn_v2_test/index.html`) is the reference pattern.
+| File | Lines | Contents |
+|------|-------|----------|
+| `index.html` | 147 | HTML + CSS |
+| `shaders.js` | 252 | WGSL shader constants, weight-offset constants |
+| `tester.js` | 540 | `CNNv3Tester` class, event wiring |
 
-### CNN v2 tool as reference
+### Usage
 
-The v2 tool is a single self-contained HTML file demonstrating:
-- Inline WGSL shaders (no build step)
-- Drag-and-drop `.bin` weight loading
-- Image/video file input
-- Intermediate layer visualisation
-- View modes: CNN output / original / diff×10
-- Side panel with per-layer weight statistics
-
-A v3 tool follows the same pattern with a more complex texture chain.
-
-### What a CNN v3 HTML tool requires
-
-**WGSL shaders to inline** (resolve `#include "cnn_v3/common"` via JS string substitution):
-
-```js
-const common = `/* contents of cnn_v3_common.wgsl */`;
-const enc0_src = enc0_template.replace('#include "cnn_v3/common"', common);
+```bash
+# Requires HTTP server (WebGPU blocked on file://)
+cd /path/to/demo
+python3 -m http.server 8080
+# Open: http://localhost:8080/cnn_v3/tools/
 ```
 
-**Texture chain:**
-
-| Texture | Format | Size |
-|---------|--------|------|
-| feat_tex0 (input) | rgba32uint | W × H |
-| feat_tex1 (input) | rgba32uint | W × H |
-| enc0_tex | rgba16float | W × H |
-| enc1_tex | rgba32uint | W/2 × H/2 |
-| bottleneck_tex | rgba32uint | W/4 × H/4 |
-| dec1_tex | rgba16float | W/2 × H/2 |
-| output_tex | rgba16float | W × H |
-
-`rgba32uint` textures cannot be sampled; use `textureLoad` — already done in the shaders.
-
-**Weight loading:**
-
-```js
-const resp = await fetch('cnn_v3_weights.bin');
-const buf  = await resp.arrayBuffer();
-const gpu_buf = device.createBuffer({
-    size: buf.byteLength,
-    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST
-});
-device.queue.writeBuffer(gpu_buf, 0, buf);
+Or on macOS with Chrome:
+```bash
+open -a "Google Chrome" --args --allow-file-access-from-files
+open cnn_v3/tools/index.html
 ```
 
-**FiLM MLP inference (JS-side):**
+### Workflow
 
-```js
-// Load cnn_v3_film_mlp.bin as Float32Array
-const mlp = new Float32Array(await (await fetch('cnn_v3_film_mlp.bin')).arrayBuffer());
-const L0_W = mlp.subarray(0,    80);   // (16×5) row-major
-const L0_b = mlp.subarray(80,   96);
-const L1_W = mlp.subarray(96,  736);   // (40×16) row-major
-const L1_b = mlp.subarray(736, 776);
+1. **Drop `cnn_v3_weights.bin`** onto the left "weights" drop zone.
+2. **Drop a PNG or video** onto the centre canvas → CNN runs immediately.
+3. _(Optional)_ **Drop `cnn_v3_film_mlp.bin`** → FiLM sliders become active.
+4. Adjust **beat_phase / beat_norm / audio_int / style_p0 / style_p1** sliders → reruns on change.
+5. Click layer buttons (**Feat · Enc0 · Enc1 · BN · Dec1 · Output**) in the right panel to inspect activations.
+6. **Save PNG** to export the current output.
 
-function mlp_forward(cond5) {
-    // h = relu(L0_W @ cond + L0_b)
-    const h = new Float32Array(16);
-    for (let o = 0; o < 16; o++) {
-        let s = L0_b[o];
-        for (let i = 0; i < 5; i++) s += L0_W[o * 5 + i] * cond5[i];
-        h[o] = Math.max(0, s);
-    }
-    // out = L1_W @ h + L1_b
-    const out = new Float32Array(40);
-    for (let o = 0; o < 40; o++) {
-        let s = L1_b[o];
-        for (let i = 0; i < 16; i++) s += L1_W[o * 16 + i] * h[i];
-        out[o] = s;
-    }
-    return out;  // [γenc0×4, βenc0×4, γenc1×8, βenc1×8, γdec1×4, βdec1×4, γdec0×4, βdec0×4]
-}
-```
+Keyboard: `[SPACE]` toggle original · `[D]` diff×10.
 
-The 40 outputs split into per-layer γ/β and uploaded to the 5 params uniform buffers
-before each compute dispatch.
+### Input files
 
-**Input feature assembly from a photo:**
+| File | Format | Notes |
+|------|--------|-------|
+| `cnn_v3_weights.bin` | raw u32 (no header) | 982 u32 = 1964 f16 = ~3.9 KB |
+| `cnn_v3_film_mlp.bin` | raw f32 | 776 f32 = 3.1 KB; optional — identity FiLM used if absent |
 
-For simple (photo-only) mode, build `feat_tex0` and `feat_tex1` from the image data:
-- `feat_tex0`: pack albedo RGB (f16×3), normal XY (128,128 neutral → 0.0 in oct), depth (0), depth_grad (0,0) as `pack2x16float` into rgba32uint
-- `feat_tex1`: pack mat_id (0), prev.rgb (0,0,0), mip1.rgb, mip2.rgb, shadow (1.0), transp (0) as `pack4x8unorm` into rgba32uint
+Both produced by `export_cnn_v3_weights.py` (§3).
 
-See `cnn_v3/shaders/gbuf_pack.wgsl` for the exact packing layout (mirrors `GBufferEffect`).
+### Texture chain
 
-### Serving locally
+| Texture | Format | Size |
+|---------|--------|------|
+| `feat_tex0` | rgba32uint | W × H (8 f16: albedo, normal, depth, depth_grad) |
+| `feat_tex1` | rgba32uint | W × H (12 u8: mat_id, prev, mip1, mip2, shadow, transp) |
+| `enc0_tex` | rgba16float | W × H |
+| `enc1_tex` | rgba32uint | W/2 × H/2 (8 f16 packed) |
+| `bn_tex` | rgba32uint | W/4 × H/4 |
+| `dec1_tex` | rgba16float | W/2 × H/2 |
+| `output_tex` | rgba16float | W × H → displayed on canvas |
 
-Chrome requires a real HTTP server for WebGPU (not `file://`):
+### Simple mode (photo input)
 
-```bash
-python3 -m http.server 8080
-# Open: http://localhost:8080/cnn_v3/tools/cnn_v3_test/index.html
-```
+Albedo = image RGB, mip1/mip2 from GPU mipmaps, shadow = 1.0, transp = 1 − alpha,
+all geometric channels (normal, depth, depth_grad, mat_id, prev) = 0.
 
 ### Browser requirements
 
-- Chrome 113+ with WebGPU enabled (default on desktop)
+- Chrome 113+ / Edge 113+ (WebGPU on by default)
 - Firefox Nightly with `dom.webgpu.enabled = true`
-- Required features: check `device.features.has('shader-f16')` for f16 support;
-  fall back to f32 accumulation if absent
 
 ### Pitfalls
 
-- `rgba32uint` requires `STORAGE` + `TEXTURE_BINDING` usage flags; missing either causes bind group creation failure
-- WGSL `#include "cnn_v3/common"` must be resolved via JS string replace before passing to `device.createShaderModule()`
-- Workgroup dispatch: `Math.ceil(W / 8)` × `Math.ceil(H / 8)` — same formula as C++
-- Cross-origin image loading requires CORS headers or same-origin hosting
+- `rgba32uint` and `rgba16float` textures both need `STORAGE_BINDING | TEXTURE_BINDING` usage.
+- Weight offsets are **f16 indices** (enc0=0, enc1=724, bn=1020, dec1=1092, dec0=1672).
+- Uniform buffer layouts must match WGSL `Params` structs exactly (padding included).
 
 ---
 
@@ -780,6 +736,9 @@ python3 -m http.server 8080
 | `cnn_v3/src/gbuffer_effect.h/.cc` | GBufferEffect: rasterise + pack G-buffer feature textures |
 | `src/tests/gpu/test_cnn_v3_parity.cc` | Per-pixel parity test (WGSL vs. Python reference) |
 | `cnn_v3/docs/CNN_V3.md` | Full architecture spec (U-Net, FiLM, WGSL uniform layouts) |
+| `cnn_v3/tools/index.html` | HTML tool — UI shell + CSS |
+| `cnn_v3/tools/shaders.js` | HTML tool — inline WGSL shaders + weight-offset constants |
+| `cnn_v3/tools/tester.js` | HTML tool — CNNv3Tester class, inference pipeline, layer viz |
 | `cnn_v2/tools/cnn_v2_test/index.html` | HTML tool reference pattern (v2) |
 
 ---
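
Note on the weight-offset pitfall in §7 of the patch: the offsets are f16 indices, while `cnn_v3_weights.bin` is described as raw u32 words holding two f16 values each. The sketch below shows one way to turn an f16 index into a word index plus a half selector, and derives per-layer sizes from the listed offsets. It is not taken from `shaders.js` or `tester.js`. The helper names `layerSizes`, `f16ToNumber` and `readWeight` are hypothetical, and the packing order (even f16 index in the low 16 bits, as WGSL `unpack2x16float` would read it) is an assumption that should be checked against the exporter.

```javascript
// Layer start offsets in f16 units, as listed under "Pitfalls" in the patch.
const F16_OFFSETS = { enc0: 0, enc1: 724, bn: 1020, dec1: 1092, dec0: 1672 };
const TOTAL_F16 = 1964; // 982 u32 words, two f16 values per word

// Hypothetical helper: f16 count per layer, derived from consecutive offsets.
function layerSizes(offsets, total) {
  const names = Object.keys(offsets).sort((a, b) => offsets[a] - offsets[b]);
  const sizes = {};
  names.forEach((name, i) => {
    const end = i + 1 < names.length ? offsets[names[i + 1]] : total;
    sizes[name] = end - offsets[name];
  });
  return sizes;
}

// Decode one IEEE 754 binary16 bit pattern into a JS number.
function f16ToNumber(bits) {
  const sign = bits & 0x8000 ? -1 : 1;
  const exp = (bits >> 10) & 0x1f;
  const frac = bits & 0x3ff;
  if (exp === 0) return sign * frac * 2 ** -24; // zero or subnormal
  if (exp === 31) return frac ? NaN : sign * Infinity;
  return sign * (1 + frac / 1024) * 2 ** (exp - 15);
}

// Hypothetical helper: read the weight at a given f16 index from the u32 words.
// Assumption: even f16 indices sit in the low 16 bits of each word.
function readWeight(words, f16Index) {
  const word = words[f16Index >>> 1];
  const bits = f16Index & 1 ? word >>> 16 : word & 0xffff;
  return f16ToNumber(bits);
}
```

With a dropped file loaded as `new Uint32Array(arrayBuffer)`, `readWeight(words, F16_OFFSETS.enc1)` would return the first enc1 weight under the packing assumption above. Comparing `words.length * 2` against `TOTAL_F16` is a cheap sanity check before uploading the buffer.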