path: root/tools
author    skal <pascal.massimino@gmail.com>    2026-02-13 09:33:12 +0100
committer skal <pascal.massimino@gmail.com>    2026-02-13 09:33:12 +0100
commit    2a793c23b582ed134b8294bfbbe3b6d7aaafe0c0 (patch)
tree      53073f5947515a3aad9afca7c24ff6b52ce5ef05 /tools
parent    65f6866b985fb3d0993fc2b6798c74015fb4fa6a (diff)
Add CNN v2 WebGPU testing tool
Implements single-file HTML tool for rapid CNN weight validation:

Features:
- Drag-drop PNG images (whole window) and .bin weights
- Real-time WebGPU compute pipeline (static features + N layers)
- Data-driven execution (reads layer count from binary)
- View modes: CNN output / Original / Diff (×10)
- Blend slider (0.0-1.0) for effect strength
- Console log with timestamps
- Keyboard shortcuts: SPACE (original), D (diff)

Architecture:
- Embedded WGSL shaders (static + compute + display)
- Binary parser for .bin format (header + layer info + f16 weights)
- Persistent textures for view mode switching
- Absolute weight offset calculation (header + layer info skip)

Implementation notes:
- Weight offsets in binary are relative to weights section
- JavaScript precalculates absolute offsets: headerOffsetU32 * 2 + offset
- Matches C++ shader behavior (simple get_weight without offset param)
- Ping-pong textures for multi-layer processing

TODO:
- Side panel: .bin metadata, weight statistics, validation
- Layer inspection: R/G/B/A plane split, intermediate outputs
- Activation heatmaps for debugging

Files:
- tools/cnn_v2_test/index.html (24 KB, 730 lines)
- tools/cnn_v2_test/README.md (usage guide, troubleshooting)

handoff(Claude): CNN v2 HTML testing tool complete, documented TODOs for future enhancements
Diffstat (limited to 'tools')
-rw-r--r-- tools/cnn_v2_test/README.md  | 231
-rw-r--r-- tools/cnn_v2_test/index.html | 809
2 files changed, 1040 insertions(+), 0 deletions(-)
diff --git a/tools/cnn_v2_test/README.md b/tools/cnn_v2_test/README.md
new file mode 100644
index 0000000..2a8e08d
--- /dev/null
+++ b/tools/cnn_v2_test/README.md
@@ -0,0 +1,231 @@
+# CNN v2 Testing Tool
+
+WebGPU-based browser tool for testing trained CNN v2 weights.
+
+---
+
+## Features
+
+- Drag-drop PNG images and `.bin` weights
+- Real-time CNN inference with WebGPU compute shaders
+- View modes: CNN output, original input, difference (×10)
+- Adjustable blend amount and depth
+- Data-driven pipeline (supports variable layer count)
+- GPU timing display
+
+---
+
+## Requirements
+
+- Browser with WebGPU support:
+ - Chrome/Edge 113+ (enable `chrome://flags/#enable-unsafe-webgpu` if needed)
+ - Safari 18+ (macOS Ventura+)
+- Trained CNN v2 weights in binary format (`.bin`)
+- Test images (PNG format)
+
+---
+
+## Usage
+
+### 1. Open Tool
+
+```bash
+open tools/cnn_v2_test/index.html
+```
+
+Or use a local server to avoid CORS:
+```bash
+python3 -m http.server 8000
+# Open http://localhost:8000/tools/cnn_v2_test/
+```
+
+### 2. Load Data
+
+1. **Drop PNG image** anywhere in window (shows preview immediately)
+2. **Drop `.bin` weights** into header drop zone
+3. CNN runs automatically once both are loaded
+
+### 3. Controls
+
+**Sliders:**
+- **Blend:** Mix between original (0.0) and CNN output (1.0)
+- **Depth:** Uniform depth value for all pixels (0.0–1.0)
+
+**Keyboard:**
+- `SPACE` - Toggle original input view
+- `D` - Toggle difference view (×10 amplification)
+
+**Status Bar:**
+- Shows GPU timing (ms), image dimensions, and current view mode
+- Red text indicates errors
+
+**Console Log:**
+- Timestamped event log at bottom
+- Tracks file loads, pipeline execution, errors
+- Auto-scrolls to latest messages
+
+---
+
+## Preparing Test Data
+
+### Export Weights
+
+```bash
+# From trained checkpoint
+./training/export_cnn_v2_weights.py \
+ checkpoints/checkpoint_epoch_100.pth \
+ --output-weights tools/cnn_v2_test/test_weights.bin
+```
+
+Binary format: 16-byte header + 20 bytes per layer + f16 weights (~3.2 KB for 3-layer model)
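+
+A minimal sketch of the parse, mirroring `parseWeights()` in `index.html`
+(all header fields are little-endian u32; `buffer` is the file's ArrayBuffer):
+
+```js
+const view = new DataView(buffer);
+if (view.getUint32(0, true) !== 0x324E4E43) throw new Error('bad magic'); // 'CNN2'
+const version = view.getUint32(4, true);
+const numLayers = view.getUint32(8, true);
+const totalWeights = view.getUint32(12, true);
+// Each layer record is 5 u32s: kernel_size, in_channels, out_channels,
+// weight_offset (relative to the weights section, in f16 units), weight_count.
+const weightsStart = 16 + numLayers * 20; // byte offset of the packed f16 weights
+```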
+
+### Test Images
+
+Use training images or any PNG:
+```bash
+# Copy test image
+cp training/input/test.png tools/cnn_v2_test/
+```
+
+**Note:** Grayscale images are automatically converted to RGB.
+
+---
+
+## Validation
+
+### Visual Comparison
+
+Compare browser output with C++ tool:
+
+```bash
+# Generate C++ output
+./build/cnn_test training/input/test.png /tmp/cpp_output.png
+
+# Load same image in browser tool
+# Visually compare outputs
+```
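+
+For a quick numeric check (assuming ImageMagick is installed; the browser
+output can be saved from the canvas via right-click):
+
+```bash
+compare -metric RMSE /tmp/cpp_output.png browser_output.png null:
+```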
+
+### GPU Timing
+
+Expected performance:
+- 512×512: ~1-2 ms (integrated GPU)
+- 1024×1024: ~3-5 ms
+- 1920×1080: ~5-8 ms
+
+Slower than expected? Check:
+- WebGPU enabled in browser
+- Dedicated GPU selected (if available)
+- No background tabs consuming GPU
+
+---
+
+## Troubleshooting
+
+### "WebGPU not supported"
+
+- Update browser to latest version
+- Enable WebGPU flag: `chrome://flags/#enable-unsafe-webgpu`
+- Try Safari 18+ (native WebGPU on macOS)
+
+### "Invalid .bin file"
+
+- Check magic number: `hexdump -C weights.bin | head`
+- Should start with: `43 4e 4e 32` ('CNN2')
+- Re-export weights: `./training/export_cnn_v2_weights.py`
+
+### Black output / incorrect colors
+
+- Check blend slider (set to 1.0 for full CNN output)
+- Verify training converged (loss < 0.01)
+- Compare with C++ tool output
+
+### Shader compilation errors
+
+Open browser console (F12) for detailed errors. Common issues:
+- Image too large (>4096×4096 not tested)
+- Unsupported texture format (rare on modern GPUs)
+
+---
+
+## Architecture
+
+**Pipeline:**
+1. **Static Features Pass** - Generate 8D features (RGBD, UV, sin, bias)
+2. **CNN Layer Passes** - Compute N layers with ping-pong textures
+3. **Display Pass** - Unpack and render with view mode
+
+**Textures:**
+- Input: RGBA8 (original image)
+- Depth: R32F (uniform depth)
+- Static features: RGBA32Uint (8×f16 packed)
+- Layer buffers: RGBA32Uint (ping-pong)
+
+**Data-Driven Execution:**
+- Layer count read from binary header
+- Per-layer params (kernel size, channels, offsets) from binary
+- Single CNN shader dispatched N times (see the sketch below)
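+
+Condensed sketch of the dispatch loop in `run()` (`layers` is the array
+returned by the binary parser); stored offsets are relative to the weights
+section, so the header span is added back in f16 units:
+
+```js
+const headerU32 = 4 + layers.length * 5;  // 16-byte header + 20 bytes per layer
+for (let i = 0; i < layers.length; i++) {
+  const absOffset = headerU32 * 2 + layers[i].weightOffset; // u32 -> f16 units
+  // Upload {kernelSize, inChannels, outChannels, absOffset, isOutput, blend}
+  // as this layer's uniforms, bind the ping-pong textures, then
+  // dispatchWorkgroups(Math.ceil(width / 8), Math.ceil(height / 8)).
+}
+```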
+
+---
+
+## TODO
+
+**Side Panel (Right):**
+- Display .bin content metadata:
+ - Layer descriptions (kernel size, channels, weight count)
+ - Weight statistics (min/max/mean per layer)
+ - Weight heatmap visualization
+ - Binary format validation status
+ - Memory usage breakdown
+
+**Layer Inspection Views:**
+- Split R/G/B/A plane visualization
+- Intermediate layer output display:
+ - View static features (8D packed as heatmaps)
+ - View layer 0 output (before activation)
+ - View layer 1 output
+ - Toggle between channels
+- Activation heatmaps (where neurons fire)
+
+---
+
+## Extensions (v2+)
+
+Planned enhancements:
+
+**Variable Feature Count:**
+- Binary v2: Add `num_features` to header
+- Shader: Dynamic feature array or multiple textures
+
+**Multi-Scale Input (Mip Levels):**
+- Uncomment mip bindings in static shader
+- No binary format change needed
+
+**8-bit Quantized Weights:**
+- Binary version bump (format field already present)
+- Add quantization codepath in `get_weight()` function
+- 2× size reduction (~1.6 KB)
+
+**Pre-defined Test Images:**
+- Dropdown menu with training/input/*.png
+- Requires local file server
+
+---
+
+## Size
+
+- HTML structure: ~1 KB
+- CSS styling: ~1 KB
+- JavaScript logic: ~5 KB
+- Static shader: ~1 KB
+- CNN shader: ~3 KB
+- Display shader: ~1 KB
+- **Total: ~12 KB** (single file, no dependencies)
+
+---
+
+## See Also
+
+- `doc/CNN_V2.md` - Architecture and design
+- `doc/HOWTO.md` - Training workflows
+- `training/export_cnn_v2_weights.py` - Binary format
+- `src/gpu/effects/cnn_v2_effect.cc` - C++ reference implementation
diff --git a/tools/cnn_v2_test/index.html b/tools/cnn_v2_test/index.html
new file mode 100644
index 0000000..9c28455
--- /dev/null
+++ b/tools/cnn_v2_test/index.html
@@ -0,0 +1,809 @@
+<!DOCTYPE html>
+<html lang="en">
+<!--
+ CNN v2 Testing Tool - WebGPU-based inference validator
+
+ TODO:
+ - Side panel: .bin metadata display, weight statistics, validation
+ - Layer inspection: R/G/B/A plane split, intermediate layer visualization
+ - Activation heatmaps for debugging
+-->
+<head>
+ <meta charset="UTF-8">
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
+ <title>CNN v2 Testing Tool</title>
+ <style>
+ * { margin: 0; padding: 0; box-sizing: border-box; }
+ body {
+ font-family: 'Courier New', monospace;
+ background: #1a1a1a;
+ color: #e0e0e0;
+ display: flex;
+ flex-direction: column;
+ height: 100vh;
+ overflow: hidden;
+ }
+ .header {
+ background: #2a2a2a;
+ padding: 16px;
+ border-bottom: 1px solid #404040;
+ }
+ h1 { font-size: 18px; margin-bottom: 12px; }
+ .controls {
+ display: flex;
+ gap: 16px;
+ align-items: center;
+ flex-wrap: wrap;
+ }
+ .control-group {
+ display: flex;
+ gap: 8px;
+ align-items: center;
+ }
+ .control-group label { font-size: 12px; }
+ input[type="range"] { width: 120px; }
+ input[type="number"] { width: 60px; background: #1a1a1a; color: #e0e0e0; border: 1px solid #404040; padding: 4px; }
+ .drop-zone {
+ border: 2px dashed #404040;
+ padding: 16px;
+ text-align: center;
+ cursor: pointer;
+ transition: all 0.2s;
+ font-size: 12px;
+ margin-top: 12px;
+ }
+ .drop-zone:hover { border-color: #606060; background: #252525; }
+ .drop-zone.active { border-color: #4a9eff; background: #1a2a3a; }
+ .drop-zone.error { border-color: #ff4a4a; background: #3a1a1a; }
+ .main {
+ flex: 1;
+ display: flex;
+ justify-content: center;
+ align-items: center;
+ padding: 24px;
+ overflow: auto;
+ position: relative;
+ }
+ .main.drop-active::after {
+ content: 'Drop PNG image here';
+ position: absolute;
+ inset: 24px;
+ display: flex;
+ align-items: center;
+ justify-content: center;
+ border: 3px dashed #4a9eff;
+ background: rgba(74, 158, 255, 0.1);
+ font-size: 24px;
+ color: #4a9eff;
+ pointer-events: none;
+ z-index: 10;
+ }
+ canvas {
+ max-width: 100%;
+ max-height: 100%;
+ image-rendering: pixelated;
+ box-shadow: 0 4px 12px rgba(0,0,0,0.5);
+ }
+ .footer {
+ background: #2a2a2a;
+ border-top: 1px solid #404040;
+ font-size: 11px;
+ display: flex;
+ flex-direction: column;
+ gap: 8px;
+ }
+ .footer-top {
+ padding: 12px 16px 0;
+ display: flex;
+ justify-content: space-between;
+ }
+ .status { color: #4a9eff; }
+ .shortcuts { color: #808080; }
+ .console {
+ background: #1a1a1a;
+ padding: 8px 16px;
+ font-family: 'Courier New', monospace;
+ font-size: 10px;
+ color: #808080;
+ max-height: 100px;
+ overflow-y: auto;
+ border-top: 1px solid #404040;
+ }
+ .console-line { margin: 2px 0; }
+ .console-line.error { color: #ff4a4a; }
+ .console-line.info { color: #4a9eff; }
+ </style>
+</head>
+<body>
+ <div class="header">
+ <h1>CNN v2 Testing Tool</h1>
+ <div class="controls">
+ <div class="control-group">
+ <label>Blend:</label>
+ <input type="range" id="blend" min="0" max="1" step="0.01" value="1.0">
+ <span id="blendValue">1.0</span>
+ </div>
+ <div class="control-group">
+ <label>Depth:</label>
+ <input type="number" id="depth" min="0" max="1" step="0.1" value="1.0">
+ </div>
+ <div class="control-group">
+ <label>View:</label>
+ <span id="viewMode">CNN Output</span>
+ </div>
+ </div>
+ <div class="drop-zone" id="weightsDrop">Drop .bin Weights</div>
+ </div>
+ <div class="main" id="mainDrop">
+ <canvas id="canvas"></canvas>
+ </div>
+ <div class="footer">
+ <div class="footer-top">
+ <span class="status" id="status">Drop PNG image anywhere to begin</span>
+ <span class="shortcuts">[SPACE] Original | [D] Diff (×10)</span>
+ </div>
+ <div class="console" id="console"></div>
+ </div>
+
+ <script>
+const STATIC_SHADER = `
+@group(0) @binding(0) var input_tex: texture_2d<f32>;
+@group(0) @binding(1) var depth_tex: texture_2d<f32>;
+@group(0) @binding(2) var output_tex: texture_storage_2d<rgba32uint, write>;
+
+@compute @workgroup_size(8, 8)
+fn main(@builtin(global_invocation_id) id: vec3<u32>) {
+ let coord = vec2<i32>(id.xy);
+ let dims = textureDimensions(input_tex);
+ if (coord.x >= i32(dims.x) || coord.y >= i32(dims.y)) { return; }
+
+ let rgba = textureLoad(input_tex, coord, 0);
+ let d = textureLoad(depth_tex, coord, 0).r;
+ let uv_x = f32(coord.x) / f32(dims.x);
+ let uv_y = 1.0 - (f32(coord.y) / f32(dims.y));
+ let sin10_x = sin(10.0 * uv_x);
+
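+ // Pack the 8 static features (r, g, b, depth, u, v, sin(10u), bias = 1.0)
+ // as four f16 pairs, one pair per u32 channel.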
+ let packed = vec4<u32>(
+ pack2x16float(vec2<f32>(rgba.r, rgba.g)),
+ pack2x16float(vec2<f32>(rgba.b, d)),
+ pack2x16float(vec2<f32>(uv_x, uv_y)),
+ pack2x16float(vec2<f32>(sin10_x, 1.0))
+ );
+ textureStore(output_tex, coord, packed);
+}`;
+
+const CNN_SHADER = `
+struct LayerParams {
+ kernel_size: u32,
+ in_channels: u32,
+ out_channels: u32,
+ weight_offset: u32,
+ is_output_layer: u32,
+ blend_amount: f32,
+}
+
+@group(0) @binding(0) var static_features: texture_2d<u32>;
+@group(0) @binding(1) var layer_input: texture_2d<u32>;
+@group(0) @binding(2) var output_tex: texture_storage_2d<rgba32uint, write>;
+@group(0) @binding(3) var<storage, read> weights_buffer: array<u32>;
+@group(0) @binding(4) var<uniform> params: LayerParams;
+@group(0) @binding(5) var original_input: texture_2d<f32>;
+
+fn unpack_static_features(coord: vec2<i32>) -> array<f32, 8> {
+ let packed = textureLoad(static_features, coord, 0);
+ let v0 = unpack2x16float(packed.x);
+ let v1 = unpack2x16float(packed.y);
+ let v2 = unpack2x16float(packed.z);
+ let v3 = unpack2x16float(packed.w);
+ return array<f32, 8>(v0.x, v0.y, v1.x, v1.y, v2.x, v2.y, v3.x, v3.y);
+}
+
+fn unpack_layer_channels(coord: vec2<i32>) -> array<f32, 8> {
+ let packed = textureLoad(layer_input, coord, 0);
+ let v0 = unpack2x16float(packed.x);
+ let v1 = unpack2x16float(packed.y);
+ let v2 = unpack2x16float(packed.z);
+ let v3 = unpack2x16float(packed.w);
+ return array<f32, 8>(v0.x, v0.y, v1.x, v1.y, v2.x, v2.y, v3.x, v3.y);
+}
+
+fn pack_channels(values: array<f32, 8>) -> vec4<u32> {
+ return vec4<u32>(
+ pack2x16float(vec2<f32>(values[0], values[1])),
+ pack2x16float(vec2<f32>(values[2], values[3])),
+ pack2x16float(vec2<f32>(values[4], values[5])),
+ pack2x16float(vec2<f32>(values[6], values[7]))
+ );
+}
+
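+// idx counts f16 values: each u32 in weights_buffer holds two f16 weights,
+// even index in the low half (.x), odd index in the high half (.y).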
+fn get_weight(idx: u32) -> f32 {
+ let pair_idx = idx / 2u;
+ let packed = weights_buffer[pair_idx];
+ let unpacked = unpack2x16float(packed);
+ return select(unpacked.y, unpacked.x, (idx & 1u) == 0u);
+}
+
+@compute @workgroup_size(8, 8)
+fn main(@builtin(global_invocation_id) id: vec3<u32>) {
+ let coord = vec2<i32>(id.xy);
+ let dims = textureDimensions(static_features);
+ if (coord.x >= i32(dims.x) || coord.y >= i32(dims.y)) { return; }
+
+ let kernel_size = params.kernel_size;
+ let in_channels = params.in_channels;
+ let out_channels = params.out_channels;
+ let weight_offset = params.weight_offset;
+ let is_output = params.is_output_layer != 0u;
+ let kernel_radius = i32(kernel_size / 2u);
+
+ let static_feat = unpack_static_features(coord);
+
+ var output: array<f32, 8>;
+ for (var c: u32 = 0u; c < out_channels && c < 8u; c++) {
+ var sum: f32 = 0.0;
+ for (var ky: i32 = -kernel_radius; ky <= kernel_radius; ky++) {
+ for (var kx: i32 = -kernel_radius; kx <= kernel_radius; kx++) {
+ let sample_coord = coord + vec2<i32>(kx, ky);
+ let clamped = vec2<i32>(
+ clamp(sample_coord.x, 0, i32(dims.x) - 1),
+ clamp(sample_coord.y, 0, i32(dims.y) - 1)
+ );
+ let static_local = unpack_static_features(clamped);
+ let layer_local = unpack_layer_channels(clamped);
+
+ let ky_idx = u32(ky + kernel_radius);
+ let kx_idx = u32(kx + kernel_radius);
+ let spatial_idx = ky_idx * kernel_size + kx_idx;
+
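+ // Input channels 0-7 are the static features; channels 8+ come from the
+ // previous layer's output (in_channels is assumed to be >= 8).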
+ for (var i: u32 = 0u; i < 8u; i++) {
+ let w_idx = weight_offset +
+ c * in_channels * kernel_size * kernel_size +
+ i * kernel_size * kernel_size + spatial_idx;
+ sum += get_weight(w_idx) * static_local[i];
+ }
+
+ let prev_channels = in_channels - 8u;
+ for (var i: u32 = 0u; i < prev_channels && i < 8u; i++) {
+ let w_idx = weight_offset +
+ c * in_channels * kernel_size * kernel_size +
+ (8u + i) * kernel_size * kernel_size + spatial_idx;
+ sum += get_weight(w_idx) * layer_local[i];
+ }
+ }
+ }
+
+ if (is_output) {
+ output[c] = clamp(sum, 0.0, 1.0);
+ } else {
+ output[c] = max(0.0, sum);
+ }
+ }
+
+ for (var c: u32 = out_channels; c < 8u; c++) {
+ output[c] = 0.0;
+ }
+
+ if (is_output) {
+ let original = textureLoad(original_input, coord, 0).rgb;
+ let result_rgb = vec3<f32>(output[0], output[1], output[2]);
+ let blended = mix(original, result_rgb, params.blend_amount);
+ output[0] = blended.r;
+ output[1] = blended.g;
+ output[2] = blended.b;
+ }
+
+ textureStore(output_tex, coord, pack_channels(output));
+}`;
+
+const DISPLAY_SHADER = `
+@group(0) @binding(0) var result_tex: texture_2d<u32>;
+@group(0) @binding(1) var original_tex: texture_2d<f32>;
+@group(0) @binding(2) var<uniform> mode: u32;
+
+@vertex
+fn vs_main(@builtin(vertex_index) idx: u32) -> @builtin(position) vec4<f32> {
+ var pos = array<vec2<f32>, 6>(
+ vec2<f32>(-1.0, -1.0), vec2<f32>(1.0, -1.0), vec2<f32>(-1.0, 1.0),
+ vec2<f32>(-1.0, 1.0), vec2<f32>(1.0, -1.0), vec2<f32>(1.0, 1.0)
+ );
+ return vec4<f32>(pos[idx], 0.0, 1.0);
+}
+
+@fragment
+fn fs_main(@builtin(position) pos: vec4<f32>) -> @location(0) vec4<f32> {
+ let coord = vec2<i32>(pos.xy);
+ let packed = textureLoad(result_tex, coord, 0);
+ let v0 = unpack2x16float(packed.x);
+ let v1 = unpack2x16float(packed.y);
+ let result = vec3<f32>(v0.x, v0.y, v1.x);
+
+ if (mode == 0u) {
+ return vec4<f32>(result, 1.0);
+ } else if (mode == 1u) {
+ let original = textureLoad(original_tex, coord, 0).rgb;
+ return vec4<f32>(original, 1.0);
+ } else {
+ let original = textureLoad(original_tex, coord, 0).rgb;
+ let diff = abs(result - original) * 10.0;
+ return vec4<f32>(diff, 1.0);
+ }
+}`;
+
+class CNNTester {
+ constructor() {
+ this.canvas = document.getElementById('canvas');
+ this.status = document.getElementById('status');
+ this.console = document.getElementById('console');
+ this.image = null;
+ this.weights = null;
+ this.viewMode = 0;
+ this.blendAmount = 1.0;
+ this.depth = 1.0;
+ this.init();
+ }
+
+ log(msg, type = 'info') {
+ const line = document.createElement('div');
+ line.className = `console-line ${type}`;
+ line.textContent = `[${new Date().toLocaleTimeString()}] ${msg}`;
+ this.console.appendChild(line);
+ this.console.scrollTop = this.console.scrollHeight;
+ }
+
+ async init() {
+ if (!navigator.gpu) {
+ this.setStatus('WebGPU not supported', true);
+ this.log('WebGPU not supported in this browser', 'error');
+ return;
+ }
+
+ try {
+ this.adapter = await navigator.gpu.requestAdapter();
+ this.device = await this.adapter.requestDevice();
+ this.context = this.canvas.getContext('webgpu');
+ this.format = navigator.gpu.getPreferredCanvasFormat();
+ this.log('WebGPU initialized successfully');
+ } catch (e) {
+ this.setStatus(`GPU init failed: ${e.message}`, true);
+ this.log(`GPU initialization failed: ${e.message}`, 'error');
+ }
+ }
+
+ setStatus(msg, isError = false) {
+ this.status.textContent = msg;
+ this.status.style.color = isError ? '#ff4a4a' : '#4a9eff';
+ }
+
+ parseWeights(buffer) {
+ const view = new DataView(buffer);
+ const magic = view.getUint32(0, true);
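+ // 'CNN2' = bytes 43 4E 4E 32, read as a little-endian u32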
+ if (magic !== 0x32_4E_4E_43) {
+ throw new Error('Invalid .bin file (bad magic)');
+ }
+
+ const version = view.getUint32(4, true);
+ const numLayers = view.getUint32(8, true);
+ const totalWeights = view.getUint32(12, true);
+
+ this.log(`Binary header: version=${version}, layers=${numLayers}, weights=${totalWeights}`);
+
+ const layers = [];
+ for (let i = 0; i < numLayers; i++) {
+ const offset = 16 + i * 20;
+ const layer = {
+ kernelSize: view.getUint32(offset, true),
+ inChannels: view.getUint32(offset + 4, true),
+ outChannels: view.getUint32(offset + 8, true),
+ weightOffset: view.getUint32(offset + 12, true),
+ weightCount: view.getUint32(offset + 16, true),
+ };
+ layers.push(layer);
+ this.log(` Layer ${i}: ${layer.inChannels}→${layer.outChannels}, kernel=${layer.kernelSize}×${layer.kernelSize}, weights=${layer.weightCount}`);
+ }
+
+ const weightsOffset = 16 + numLayers * 20;
+ const weights = new Uint32Array(buffer.slice(weightsOffset));
+
+ // Verify weights are non-zero
+ let nonZero = 0;
+ for (let i = 0; i < weights.length; i++) {
+ if (weights[i] !== 0) nonZero++;
+ }
+ this.log(` Weight buffer: ${weights.length} u32 (${nonZero} non-zero)`);
+
+ return { layers, weights };
+ }
+
+ async loadImage(file) {
+ const img = await createImageBitmap(file);
+ this.image = img;
+ this.canvas.width = img.width;
+ this.canvas.height = img.height;
+ this.log(`Loaded image: ${file.name} (${img.width}×${img.height})`);
+ if (this.weights) {
+ this.setStatus(`Ready: ${img.width}×${img.height}`);
+ this.run();
+ } else {
+ this.setStatus(`Image loaded (${img.width}×${img.height}) - drop .bin weights to process`);
+ this.displayOriginal();
+ }
+ }
+
+ async loadWeights(file) {
+ const buffer = await file.arrayBuffer();
+ this.weights = this.parseWeights(buffer);
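+ // Keep the whole file: run() uploads it verbatim, and the shader indexes
+ // it with absolute f16 offsets that include the header + layer-info span.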
+ this.weightsBuffer = buffer;
+ this.log(`Loaded weights: ${file.name} (${this.weights.layers.length} layers, ${(buffer.byteLength/1024).toFixed(1)} KB)`);
+ if (this.image) {
+ this.setStatus(`Ready: ${this.image.width}×${this.image.height}`);
+ this.run();
+ } else {
+ this.setStatus('Weights loaded - drop PNG image to process');
+ }
+ }
+
+ displayOriginal() {
+ if (!this.image || !this.device) return;
+
+ const { width, height } = this.image;
+ this.context.configure({ device: this.device, format: this.format });
+
+ const inputTex = this.device.createTexture({
+ size: [width, height],
+ format: 'rgba8unorm',
+ usage: GPUTextureUsage.TEXTURE_BINDING | GPUTextureUsage.COPY_DST | GPUTextureUsage.RENDER_ATTACHMENT
+ });
+
+ this.device.queue.copyExternalImageToTexture(
+ { source: this.image },
+ { texture: inputTex },
+ [width, height]
+ );
+
+ const simpleShader = `
+ @vertex
+ fn vs_main(@builtin(vertex_index) idx: u32) -> @builtin(position) vec4<f32> {
+ var pos = array<vec2<f32>, 6>(
+ vec2<f32>(-1.0, -1.0), vec2<f32>(1.0, -1.0), vec2<f32>(-1.0, 1.0),
+ vec2<f32>(-1.0, 1.0), vec2<f32>(1.0, -1.0), vec2<f32>(1.0, 1.0)
+ );
+ return vec4<f32>(pos[idx], 0.0, 1.0);
+ }
+
+ @group(0) @binding(0) var tex: texture_2d<f32>;
+
+ @fragment
+ fn fs_main(@builtin(position) pos: vec4<f32>) -> @location(0) vec4<f32> {
+ let coord = vec2<i32>(pos.xy);
+ return textureLoad(tex, coord, 0);
+ }
+ `;
+
+ const pipeline = this.device.createRenderPipeline({
+ layout: 'auto',
+ vertex: { module: this.device.createShaderModule({ code: simpleShader }), entryPoint: 'vs_main' },
+ fragment: {
+ module: this.device.createShaderModule({ code: simpleShader }),
+ entryPoint: 'fs_main',
+ targets: [{ format: this.format }]
+ }
+ });
+
+ const bindGroup = this.device.createBindGroup({
+ layout: pipeline.getBindGroupLayout(0),
+ entries: [{ binding: 0, resource: inputTex.createView() }]
+ });
+
+ const encoder = this.device.createCommandEncoder();
+ const renderPass = encoder.beginRenderPass({
+ colorAttachments: [{
+ view: this.context.getCurrentTexture().createView(),
+ loadOp: 'clear',
+ storeOp: 'store'
+ }]
+ });
+ renderPass.setPipeline(pipeline);
+ renderPass.setBindGroup(0, bindGroup);
+ renderPass.draw(6);
+ renderPass.end();
+
+ this.device.queue.submit([encoder.finish()]);
+ }
+
+ async run() {
+ const t0 = performance.now();
+ const { width, height } = this.image;
+ this.log(`Running CNN pipeline (${this.weights.layers.length} layers)...`);
+
+ this.context.configure({ device: this.device, format: this.format });
+
+ // Create persistent input texture for original view
+ if (this.inputTexture) this.inputTexture.destroy();
+ this.inputTexture = this.device.createTexture({
+ size: [width, height],
+ format: 'rgba8unorm',
+ usage: GPUTextureUsage.TEXTURE_BINDING | GPUTextureUsage.COPY_DST | GPUTextureUsage.RENDER_ATTACHMENT
+ });
+
+ this.device.queue.copyExternalImageToTexture(
+ { source: this.image },
+ { texture: this.inputTexture },
+ [width, height]
+ );
+
+ const depthTex = this.device.createTexture({
+ size: [width, height],
+ format: 'r32float',
+ usage: GPUTextureUsage.TEXTURE_BINDING | GPUTextureUsage.COPY_DST
+ });
+ const depthData = new Float32Array(width * height).fill(this.depth);
+ this.device.queue.writeTexture(
+ { texture: depthTex },
+ depthData,
+ { bytesPerRow: width * 4 },
+ [width, height]
+ );
+
+ const staticTex = this.device.createTexture({
+ size: [width, height],
+ format: 'rgba32uint',
+ usage: GPUTextureUsage.STORAGE_BINDING | GPUTextureUsage.TEXTURE_BINDING
+ });
+
+ const layerTextures = [
+ this.device.createTexture({
+ size: [width, height],
+ format: 'rgba32uint',
+ usage: GPUTextureUsage.STORAGE_BINDING | GPUTextureUsage.TEXTURE_BINDING
+ }),
+ this.device.createTexture({
+ size: [width, height],
+ format: 'rgba32uint',
+ usage: GPUTextureUsage.STORAGE_BINDING | GPUTextureUsage.TEXTURE_BINDING
+ })
+ ];
+
+ const weightsGPU = this.device.createBuffer({
+ size: this.weightsBuffer.byteLength,
+ usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST
+ });
+ this.device.queue.writeBuffer(weightsGPU, 0, this.weightsBuffer);
+
+ const staticPipeline = this.device.createComputePipeline({
+ layout: 'auto',
+ compute: { module: this.device.createShaderModule({ code: STATIC_SHADER }), entryPoint: 'main' }
+ });
+
+ const cnnPipeline = this.device.createComputePipeline({
+ layout: 'auto',
+ compute: { module: this.device.createShaderModule({ code: CNN_SHADER }), entryPoint: 'main' }
+ });
+
+ const displayPipeline = this.device.createRenderPipeline({
+ layout: 'auto',
+ vertex: { module: this.device.createShaderModule({ code: DISPLAY_SHADER }), entryPoint: 'vs_main' },
+ fragment: {
+ module: this.device.createShaderModule({ code: DISPLAY_SHADER }),
+ entryPoint: 'fs_main',
+ targets: [{ format: this.format }]
+ }
+ });
+
+ const encoder = this.device.createCommandEncoder();
+
+ const staticBG = this.device.createBindGroup({
+ layout: staticPipeline.getBindGroupLayout(0),
+ entries: [
+ { binding: 0, resource: this.inputTexture.createView() },
+ { binding: 1, resource: depthTex.createView() },
+ { binding: 2, resource: staticTex.createView() }
+ ]
+ });
+
+ const staticPass = encoder.beginComputePass();
+ staticPass.setPipeline(staticPipeline);
+ staticPass.setBindGroup(0, staticBG);
+ staticPass.dispatchWorkgroups(Math.ceil(width / 8), Math.ceil(height / 8));
+ staticPass.end();
+
+ let srcTex = staticTex;
+ let dstTex = layerTextures[0];
+
+ for (let i = 0; i < this.weights.layers.length; i++) {
+ const layer = this.weights.layers[i];
+ const isOutput = i === this.weights.layers.length - 1;
+
+ // Calculate absolute weight offset in f16 units (add header offset)
+ const headerOffsetU32 = 4 + this.weights.layers.length * 5; // Header + layer info in u32
+ const absoluteWeightOffset = headerOffsetU32 * 2 + layer.weightOffset; // Convert to f16 units
+
+ const paramsData = new Uint32Array(6);
+ paramsData[0] = layer.kernelSize;
+ paramsData[1] = layer.inChannels;
+ paramsData[2] = layer.outChannels;
+ paramsData[3] = absoluteWeightOffset; // Use absolute offset
+ paramsData[4] = isOutput ? 1 : 0;
+
+ const paramsView = new Float32Array(paramsData.buffer);
+ paramsView[5] = this.blendAmount;
+
+ const paramsBuffer = this.device.createBuffer({
+ size: 24,
+ usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST
+ });
+ this.device.queue.writeBuffer(paramsBuffer, 0, paramsData);
+
+ const cnnBG = this.device.createBindGroup({
+ layout: cnnPipeline.getBindGroupLayout(0),
+ entries: [
+ { binding: 0, resource: staticTex.createView() },
+ { binding: 1, resource: srcTex.createView() },
+ { binding: 2, resource: dstTex.createView() },
+ { binding: 3, resource: { buffer: weightsGPU } },
+ { binding: 4, resource: { buffer: paramsBuffer } },
+ { binding: 5, resource: this.inputTexture.createView() }
+ ]
+ });
+
+ const cnnPass = encoder.beginComputePass();
+ cnnPass.setPipeline(cnnPipeline);
+ cnnPass.setBindGroup(0, cnnBG);
+ cnnPass.dispatchWorkgroups(Math.ceil(width / 8), Math.ceil(height / 8));
+ cnnPass.end();
+
+ // Advance the ping-pong: the texture just written becomes the next input,
+ // and the other layer buffer becomes the next output. (A plain src/dst swap
+ // would make staticTex the write target on the second layer, conflicting
+ // with its read binding at binding 0.)
+ srcTex = dstTex;
+ dstTex = layerTextures[(i + 1) % 2];
+ }
+
+ const modeBuffer = this.device.createBuffer({
+ size: 4,
+ usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST
+ });
+ this.device.queue.writeBuffer(modeBuffer, 0, new Uint32Array([this.viewMode]));
+
+ // Store result texture and display pipeline for view mode switching
+ this.resultTexture = srcTex;
+ this.displayPipeline = displayPipeline;
+ this.modeBuffer = modeBuffer;
+
+ const displayBG = this.device.createBindGroup({
+ layout: displayPipeline.getBindGroupLayout(0),
+ entries: [
+ { binding: 0, resource: srcTex.createView() },
+ { binding: 1, resource: this.inputTexture.createView() },
+ { binding: 2, resource: { buffer: modeBuffer } }
+ ]
+ });
+ this.displayBindGroup = displayBG;
+
+ const renderPass = encoder.beginRenderPass({
+ colorAttachments: [{
+ view: this.context.getCurrentTexture().createView(),
+ loadOp: 'clear',
+ storeOp: 'store'
+ }]
+ });
+ renderPass.setPipeline(displayPipeline);
+ renderPass.setBindGroup(0, displayBG);
+ renderPass.draw(6);
+ renderPass.end();
+
+ this.device.queue.submit([encoder.finish()]);
+
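+ // Note: this measures CPU-side encode+submit time; queue.submit() returns
+ // before the GPU finishes, so treat the displayed figure as approximate.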
+ const t1 = performance.now();
+ const mode = ['CNN Output', 'Original', 'Diff (×10)'][this.viewMode];
+ this.setStatus(`GPU: ${(t1-t0).toFixed(1)}ms | ${width}×${height} | ${mode}`);
+ this.log(`Completed in ${(t1-t0).toFixed(1)}ms`);
+ }
+
+ updateDisplay() {
+ if (!this.displayPipeline || !this.displayBindGroup) return;
+
+ this.device.queue.writeBuffer(this.modeBuffer, 0, new Uint32Array([this.viewMode]));
+
+ const encoder = this.device.createCommandEncoder();
+ const renderPass = encoder.beginRenderPass({
+ colorAttachments: [{
+ view: this.context.getCurrentTexture().createView(),
+ loadOp: 'clear',
+ storeOp: 'store'
+ }]
+ });
+ renderPass.setPipeline(this.displayPipeline);
+ renderPass.setBindGroup(0, this.displayBindGroup);
+ renderPass.draw(6);
+ renderPass.end();
+
+ this.device.queue.submit([encoder.finish()]);
+ }
+}
+
+const tester = new CNNTester();
+
+function setupDropZone(id, callback) {
+ const zone = document.getElementById(id);
+ ['dragenter', 'dragover', 'dragleave', 'drop'].forEach(e => {
+ zone.addEventListener(e, ev => { ev.preventDefault(); ev.stopPropagation(); });
+ });
+ ['dragenter', 'dragover'].forEach(e => zone.addEventListener(e, () => zone.classList.add('active')));
+ ['dragleave', 'drop'].forEach(e => zone.addEventListener(e, () => zone.classList.remove('active')));
+ zone.addEventListener('drop', e => {
+ const file = e.dataTransfer.files[0];
+ if (file) callback(file).catch(err => {
+ zone.classList.add('error');
+ tester.setStatus(err.message, true);
+ tester.log(err.message, 'error');
+ setTimeout(() => zone.classList.remove('error'), 2000);
+ });
+ });
+}
+
+// Whole window drop for PNG images
+const mainArea = document.getElementById('mainDrop');
+['dragenter', 'dragover', 'dragleave', 'drop'].forEach(e => {
+ mainArea.addEventListener(e, ev => { ev.preventDefault(); ev.stopPropagation(); });
+});
+['dragenter', 'dragover'].forEach(e => mainArea.addEventListener(e, () => mainArea.classList.add('drop-active')));
+['dragleave', 'drop'].forEach(e => mainArea.addEventListener(e, () => mainArea.classList.remove('drop-active')));
+mainArea.addEventListener('drop', e => {
+ const file = e.dataTransfer.files[0];
+ if (file && file.type.startsWith('image/')) {
+ tester.loadImage(file).catch(err => {
+ tester.setStatus(err.message, true);
+ tester.log(err.message, 'error');
+ });
+ }
+});
+
+// Weights drop zone
+setupDropZone('weightsDrop', f => tester.loadWeights(f));
+
+document.getElementById('blend').addEventListener('input', e => {
+ tester.blendAmount = parseFloat(e.target.value);
+ document.getElementById('blendValue').textContent = e.target.value;
+ if (tester.image && tester.weights) {
+ tester.log(`Blend changed to ${e.target.value}`);
+ tester.run();
+ }
+});
+
+document.getElementById('depth').addEventListener('input', e => {
+ tester.depth = parseFloat(e.target.value);
+ if (tester.image && tester.weights) tester.run();
+});
+
+document.addEventListener('keydown', e => {
+  // SPACE toggles the original view (1), D toggles the diff view (2);
+  // pressing the same key again returns to the CNN output (0).
+  const toggleView = (mode) => {
+    tester.viewMode = (tester.viewMode === mode) ? 0 : mode;
+    const modeName = ['CNN Output', 'Original', 'Diff (×10)'][tester.viewMode];
+    document.getElementById('viewMode').textContent = modeName;
+    if (tester.image && tester.weights) {
+      tester.log(`View mode: ${modeName}`);
+      tester.updateDisplay();
+      const { width, height } = tester.image;
+      tester.setStatus(`${width}×${height} | ${modeName}`);
+    }
+  };
+  if (e.code === 'Space') {
+    e.preventDefault();
+    toggleView(1);
+  } else if (e.code === 'KeyD') {
+    e.preventDefault();
+    toggleView(2);
+  }
+});
+ </script>
+</body>
+</html>