path: root/tools
author    skal <pascal.massimino@gmail.com>    2026-02-13 09:33:12 +0100
committer skal <pascal.massimino@gmail.com>    2026-02-13 09:33:12 +0100
commit    2a793c23b582ed134b8294bfbbe3b6d7aaafe0c0 (patch)
tree      53073f5947515a3aad9afca7c24ff6b52ce5ef05 /tools
parent    65f6866b985fb3d0993fc2b6798c74015fb4fa6a (diff)
Add CNN v2 WebGPU testing tool
Implements single-file HTML tool for rapid CNN weight validation:

Features:
- Drag-drop PNG images (whole window) and .bin weights
- Real-time WebGPU compute pipeline (static features + N layers)
- Data-driven execution (reads layer count from binary)
- View modes: CNN output / Original / Diff (×10)
- Blend slider (0.0-1.0) for effect strength
- Console log with timestamps
- Keyboard shortcuts: SPACE (original), D (diff)

Architecture:
- Embedded WGSL shaders (static + compute + display)
- Binary parser for .bin format (header + layer info + f16 weights)
- Persistent textures for view mode switching
- Absolute weight offset calculation (header + layer info skip)

Implementation notes:
- Weight offsets in binary are relative to weights section
- JavaScript precalculates absolute offsets: headerOffsetU32 * 2 + offset
- Matches C++ shader behavior (simple get_weight without offset param)
- Ping-pong textures for multi-layer processing

TODO:
- Side panel: .bin metadata, weight statistics, validation
- Layer inspection: R/G/B/A plane split, intermediate outputs
- Activation heatmaps for debugging

Files:
- tools/cnn_v2_test/index.html (24 KB, 730 lines)
- tools/cnn_v2_test/README.md (usage guide, troubleshooting)

handoff(Claude): CNN v2 HTML testing tool complete, documented TODOs for future enhancements
Diffstat (limited to 'tools')
-rw-r--r-- tools/cnn_v2_test/README.md  | 231
-rw-r--r-- tools/cnn_v2_test/index.html | 809
2 files changed, 1040 insertions(+), 0 deletions(-)
diff --git a/tools/cnn_v2_test/README.md b/tools/cnn_v2_test/README.md
new file mode 100644
index 0000000..2a8e08d
--- /dev/null
+++ b/tools/cnn_v2_test/README.md
@@ -0,0 +1,231 @@
+# CNN v2 Testing Tool
+
+WebGPU-based browser tool for testing trained CNN v2 weights.
+
+---
+
+## Features
+
+- Drag-drop PNG images and `.bin` weights
+- Real-time CNN inference with WebGPU compute shaders
+- View modes: CNN output, original input, difference (×10)
+- Adjustable blend amount and depth
+- Data-driven pipeline (supports variable layer count)
+- GPU timing display
+
+---
+
+## Requirements
+
+- Browser with WebGPU support:
+ - Chrome/Edge 113+ (enable `chrome://flags/#enable-unsafe-webgpu` if needed)
+ - Safari 18+ (macOS Ventura+)
+- Trained CNN v2 weights in binary format (`.bin`)
+- Test images (PNG format)
+
+---
+
+## Usage
+
+### 1. Open Tool
+
+```bash
+open tools/cnn_v2_test/index.html
+```
+
+Or use a local server to avoid CORS:
+```bash
+python3 -m http.server 8000
+# Open http://localhost:8000/tools/cnn_v2_test/
+```
+
+### 2. Load Data
+
+1. **Drop PNG image** anywhere in window (shows preview immediately)
+2. **Drop `.bin` weights** into header drop zone
+3. CNN runs automatically once both are loaded
+
+### 3. Controls
+
+**Sliders:**
+- **Blend:** Mix between original (0.0) and CNN output (1.0)
+- **Depth:** Uniform depth value for all pixels (0.0–1.0)
+
+**Keyboard:**
+- `SPACE` - Toggle original input view
+- `D` - Toggle difference view (×10 amplification)
+
+**Status Bar:**
+- Shows GPU timing (ms), image dimensions, and current view mode
+- Red text indicates errors
+
+**Console Log:**
+- Timestamped event log at bottom
+- Tracks file loads, pipeline execution, errors
+- Auto-scrolls to latest messages
+
+---
+
+## Preparing Test Data
+
+### Export Weights
+
+```bash
+# From trained checkpoint
+./training/export_cnn_v2_weights.py \
+ checkpoints/checkpoint_epoch_100.pth \
+ --output-weights tools/cnn_v2_test/test_weights.bin
+```
+
+Binary format: 16-byte header + 20 bytes per layer + f16 weights (~3.2 KB for 3-layer model)
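+
+A minimal sketch of the parse, mirroring `parseWeights()` in `index.html`
+(all header fields are little-endian u32; `buffer` is the file's ArrayBuffer):
+
+```js
+const view = new DataView(buffer);
+if (view.getUint32(0, true) !== 0x324E4E43) throw new Error('bad magic'); // 'CNN2'
+const version = view.getUint32(4, true);
+const numLayers = view.getUint32(8, true);
+const totalWeights = view.getUint32(12, true);
+// Each layer record is 5 u32s: kernel_size, in_channels, out_channels,
+// weight_offset (relative to the weights section, in f16 units), weight_count.
+const weightsStart = 16 + numLayers * 20; // byte offset of the packed f16 weights
+```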
+
+### Test Images
+
+Use training images or any PNG:
+```bash
+# Copy test image
+cp training/input/test.png tools/cnn_v2_test/
+```
+
+**Note:** Grayscale images are automatically converted to RGB.
+
+---
+
+## Validation
+
+### Visual Comparison
+
+Compare browser output with C++ tool:
+
+```bash
+# Generate C++ output
+./build/cnn_test training/input/test.png /tmp/cpp_output.png
+
+# Load same image in browser tool
+# Visually compare outputs
+```
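+
+For a quick numeric check (assuming ImageMagick is installed; the browser
+output can be saved from the canvas via right-click):
+
+```bash
+compare -metric RMSE /tmp/cpp_output.png browser_output.png null:
+```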
+
+### GPU Timing
+
+Expected performance:
+- 512×512: ~1-2 ms (integrated GPU)
+- 1024×1024: ~3-5 ms
+- 1920×1080: ~5-8 ms
+
+Slower than expected? Check:
+- WebGPU enabled in browser
+- Dedicated GPU selected (if available)
+- No background tabs consuming GPU
+
+---
+
+## Troubleshooting
+
+### "WebGPU not supported"
+
+- Update browser to latest version
+- Enable WebGPU flag: `chrome://flags/#enable-unsafe-webgpu`
+- Try Safari 18+ (native WebGPU on macOS)
+
+### "Invalid .bin file"
+
+- Check magic number: `hexdump -C weights.bin | head`
+- Should start with: `43 4e 4e 32` ('CNN2')
+- Re-export weights: `./training/export_cnn_v2_weights.py`
+
+### Black output / incorrect colors
+
+- Check blend slider (set to 1.0 for full CNN output)
+- Verify training converged (loss < 0.01)
+- Compare with C++ tool output
+
+### Shader compilation errors
+
+Open browser console (F12) for detailed errors. Common issues:
+- Image too large (>4096×4096 not tested)
+- Unsupported texture format (rare on modern GPUs)
+
+---
+
+## Architecture
+
+**Pipeline:**
+1. **Static Features Pass** - Generate 8D features (RGBD, UV, sin, bias)
+2. **CNN Layer Passes** - Compute N layers with ping-pong textures
+3. **Display Pass** - Unpack and render with view mode
+
+**Textures:**
+- Input: RGBA8 (original image)
+- Depth: R32F (uniform depth)
+- Static features: RGBA32Uint (8×f16 packed)
+- Layer buffers: RGBA32Uint (ping-pong)
+
+**Data-Driven Execution:**
+- Layer count read from binary header
+- Per-layer params (kernel size, channels, offsets) from binary
+- Single CNN shader dispatched N times (see the sketch below)
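+
+Condensed sketch of the dispatch loop in `run()` (`layers` is the array
+returned by the binary parser); stored offsets are relative to the weights
+section, so the header span is added back in f16 units:
+
+```js
+const headerU32 = 4 + layers.length * 5;  // 16-byte header + 20 bytes per layer
+for (let i = 0; i < layers.length; i++) {
+  const absOffset = headerU32 * 2 + layers[i].weightOffset; // u32 -> f16 units
+  // Upload {kernelSize, inChannels, outChannels, absOffset, isOutput, blend}
+  // as this layer's uniforms, bind the ping-pong textures, then
+  // dispatchWorkgroups(Math.ceil(width / 8), Math.ceil(height / 8)).
+}
+```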
+
+---
+
+## TODO
+
+**Side Panel (Right):**
+- Display .bin content metadata:
+ - Layer descriptions (kernel size, channels, weight count)
+ - Weight statistics (min/max/mean per layer)
+ - Weight heatmap visualization
+ - Binary format validation status
+ - Memory usage breakdown
+
+**Layer Inspection Views:**
+- Split R/G/B/A plane visualization
+- Intermediate layer output display:
+ - View static features (8D packed as heatmaps)
+ - View layer 0 output (before activation)
+ - View layer 1 output
+ - Toggle between channels
+- Activation heatmaps (where neurons fire)
+
+---
+
+## Extensions (v2+)
+
+Planned enhancements:
+
+**Variable Feature Count:**
+- Binary v2: Add `num_features` to header
+- Shader: Dynamic feature array or multiple textures
+
+**Multi-Scale Input (Mip Levels):**
+- Uncomment mip bindings in static shader
+- No binary format change needed
+
+**8-bit Quantized Weights:**
+- Binary version bump (format field already present)
+- Add quantization codepath in `get_weight()` function
+- 2× size reduction (~1.6 KB)
+
+**Pre-defined Test Images:**
+- Dropdown menu with training/input/*.png
+- Requires local file server
+
+---
+
+## Size
+
+- HTML structure: ~1 KB
+- CSS styling: ~1 KB
+- JavaScript logic: ~5 KB
+- Static shader: ~1 KB
+- CNN shader: ~3 KB
+- Display shader: ~1 KB
+- **Total: ~12 KB** (single file, no dependencies)
+
+---
+
+## See Also
+
+- `doc/CNN_V2.md` - Architecture and design
+- `doc/HOWTO.md` - Training workflows
+- `training/export_cnn_v2_weights.py` - Binary format
+- `src/gpu/effects/cnn_v2_effect.cc` - C++ reference implementation
diff --git a/tools/cnn_v2_test/index.html b/tools/cnn_v2_test/index.html
new file mode 100644
index 0000000..9c28455
--- /dev/null
+++ b/tools/cnn_v2_test/index.html
@@ -0,0 +1,809 @@
+<!DOCTYPE html>
+<html lang="en">
+<!--
+ CNN v2 Testing Tool - WebGPU-based inference validator
+
+ TODO:
+ - Side panel: .bin metadata display, weight statistics, validation
+ - Layer inspection: R/G/B/A plane split, intermediate layer visualization
+ - Activation heatmaps for debugging
+-->
+<head>
+ <meta charset="UTF-8">
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
+ <title>CNN v2 Testing Tool</title>
+ <style>
+ * { margin: 0; padding: 0; box-sizing: border-box; }
+ body {
+ font-family: 'Courier New', monospace;
+ background: #1a1a1a;
+ color: #e0e0e0;
+ display: flex;
+ flex-direction: column;
+ height: 100vh;
+ overflow: hidden;
+ }
+ .header {
+ background: #2a2a2a;
+ padding: 16px;
+ border-bottom: 1px solid #404040;
+ }
+ h1 { font-size: 18px; margin-bottom: 12px; }
+ .controls {
+ display: flex;
+ gap: 16px;
+ align-items: center;
+ flex-wrap: wrap;
+ }
+ .control-group {
+ display: flex;
+ gap: 8px;
+ align-items: center;
+ }
+ .control-group label { font-size: 12px; }
+ input[type="range"] { width: 120px; }
+ input[type="number"] { width: 60px; background: #1a1a1a; color: #e0e0e0; border: 1px solid #404040; padding: 4px; }
+ .drop-zone {
+ border: 2px dashed #404040;
+ padding: 16px;
+ text-align: center;
+ cursor: pointer;
+ transition: all 0.2s;
+ font-size: 12px;
+ margin-top: 12px;
+ }
+ .drop-zone:hover { border-color: #606060; background: #252525; }
+ .drop-zone.active { border-color: #4a9eff; background: #1a2a3a; }
+ .drop-zone.error { border-color: #ff4a4a; background: #3a1a1a; }
+ .main {
+ flex: 1;
+ display: flex;
+ justify-content: center;
+ align-items: center;
+ padding: 24px;
+ overflow: auto;
+ position: relative;
+ }
+ .main.drop-active::after {
+ content: 'Drop PNG image here';
+ position: absolute;
+ inset: 24px;
+ display: flex;
+ align-items: center;
+ justify-content: center;
+ border: 3px dashed #4a9eff;
+ background: rgba(74, 158, 255, 0.1);
+ font-size: 24px;
+ color: #4a9eff;
+ pointer-events: none;
+ z-index: 10;
+ }
+ canvas {
+ max-width: 100%;
+ max-height: 100%;
+ image-rendering: pixelated;
+ box-shadow: 0 4px 12px rgba(0,0,0,0.5);
+ }
+ .footer {
+ background: #2a2a2a;
+ border-top: 1px solid #404040;
+ font-size: 11px;
+ display: flex;
+ flex-direction: column;
+ gap: 8px;
+ }
+ .footer-top {
+ padding: 12px 16px 0;
+ display: flex;
+ justify-content: space-between;
+ }
+ .status { color: #4a9eff; }
+ .shortcuts { color: #808080; }
+ .console {
+ background: #1a1a1a;
+ padding: 8px 16px;
+ font-family: 'Courier New', monospace;
+ font-size: 10px;
+ color: #808080;
+ max-height: 100px;
+ overflow-y: auto;
+ border-top: 1px solid #404040;
+ }
+ .console-line { margin: 2px 0; }
+ .console-line.error { color: #ff4a4a; }
+ .console-line.info { color: #4a9eff; }
+ </style>
+</head>
+<body>
+ <div class="header">
+ <h1>CNN v2 Testing Tool</h1>
+ <div class="controls">
+ <div class="control-group">
+ <label>Blend:</label>
+ <input type="range" id="blend" min="0" max="1" step="0.01" value="1.0">
+ <span id="blendValue">1.0</span>
+ </div>
+ <div class="control-group">
+ <label>Depth:</label>
+ <input type="number" id="depth" min="0" max="1" step="0.1" value="1.0">
+ </div>
+ <div class="control-group">
+ <label>View:</label>
+ <span id="viewMode">CNN Output</span>
+ </div>
+ </div>
+ <div class="drop-zone" id="weightsDrop">Drop .bin Weights</div>
+ </div>
+ <div class="main" id="mainDrop">
+ <canvas id="canvas"></canvas>
+ </div>
+ <div class="footer">
+ <div class="footer-top">
+ <span class="status" id="status">Drop PNG image anywhere to begin</span>
+ <span class="shortcuts">[SPACE] Original | [D] Diff (×10)</span>
+ </div>
+ <div class="console" id="console"></div>
+ </div>
+
+ <script>
+const STATIC_SHADER = `
+@group(0) @binding(0) var input_tex: texture_2d<f32>;
+@group(0) @binding(1) var depth_tex: texture_2d<f32>;
+@group(0) @binding(2) var output_tex: texture_storage_2d<rgba32uint, write>;
+
+@compute @workgroup_size(8, 8)
+fn main(@builtin(global_invocation_id) id: vec3<u32>) {
+ let coord = vec2<i32>(id.xy);
+ let dims = textureDimensions(input_tex);
+ if (coord.x >= i32(dims.x) || coord.y >= i32(dims.y)) { return; }
+
+ let rgba = textureLoad(input_tex, coord, 0);
+ let d = textureLoad(depth_tex, coord, 0).r;
+ let uv_x = f32(coord.x) / f32(dims.x);
+ let uv_y = 1.0 - (f32(coord.y) / f32(dims.y));
+ let sin10_x = sin(10.0 * uv_x);
+
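+ // Pack the 8 static features (r, g, b, depth, u, v, sin(10u), bias = 1.0)
+ // as four f16 pairs, one pair per u32 channel.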
+ let packed = vec4<u32>(
+ pack2x16float(vec2<f32>(rgba.r, rgba.g)),
+ pack2x16float(vec2<f32>(rgba.b, d)),
+ pack2x16float(vec2<f32>(uv_x, uv_y)),
+ pack2x16float(vec2<f32>(sin10_x, 1.0))
+ );
+ textureStore(output_tex, coord, packed);
+}`;
+
+const CNN_SHADER = `
+struct LayerParams {
+ kernel_size: u32,
+ in_channels: u32,
+ out_channels: u32,
+ weight_offset: u32,
+ is_output_layer: u32,
+ blend_amount: f32,
+}
+
+@group(0) @binding(0) var static_features: texture_2d<u32>;
+@group(0) @binding(1) var layer_input: texture_2d<u32>;
+@group(0) @binding(2) var output_tex: texture_storage_2d<rgba32uint, write>;
+@group(0) @binding(3) var<storage, read> weights_buffer: array<u32>;
+@group(0) @binding(4) var<uniform> params: LayerParams;
+@group(0) @binding(5) var original_input: texture_2d<f32>;
+
+fn unpack_static_features(coord: vec2<i32>) -> array<f32, 8> {
+ let packed = textureLoad(static_features, coord, 0);
+ let v0 = unpack2x16float(packed.x);
+ let v1 = unpack2x16float(packed.y);
+ let v2 = unpack2x16float(packed.z);
+ let v3 = unpack2x16float(packed.w);
+ return array<f32, 8>(v0.x, v0.y, v1.x, v1.y, v2.x, v2.y, v3.x, v3.y);
+}
+
+fn unpack_layer_channels(coord: vec2<i32>) -> array<f32, 8> {
+ let packed = textureLoad(layer_input, coord, 0);
+ let v0 = unpack2x16float(packed.x);
+ let v1 = unpack2x16float(packed.y);
+ let v2 = unpack2x16float(packed.z);
+ let v3 = unpack2x16float(packed.w);
+ return array<f32, 8>(v0.x, v0.y, v1.x, v1.y, v2.x, v2.y, v3.x, v3.y);
+}
+
+fn pack_channels(values: array<f32, 8>) -> vec4<u32> {
+ return vec4<u32>(
+ pack2x16float(vec2<f32>(values[0], values[1])),
+ pack2x16float(vec2<f32>(values[2], values[3])),
+ pack2x16float(vec2<f32>(values[4], values[5])),
+ pack2x16float(vec2<f32>(values[6], values[7]))
+ );
+}
+
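+// idx counts f16 values: each u32 in weights_buffer holds two f16 weights,
+// even index in the low half (.x), odd index in the high half (.y).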
+fn get_weight(idx: u32) -> f32 {
+ let pair_idx = idx / 2u;
+ let packed = weights_buffer[pair_idx];
+ let unpacked = unpack2x16float(packed);
+ return select(unpacked.y, unpacked.x, (idx & 1u) == 0u);
+}
+
+@compute @workgroup_size(8, 8)
+fn main(@builtin(global_invocation_id) id: vec3<u32>) {
+ let coord = vec2<i32>(id.xy);
+ let dims = textureDimensions(static_features);
+ if (coord.x >= i32(dims.x) || coord.y >= i32(dims.y)) { return; }
+
+ let kernel_size = params.kernel_size;
+ let in_channels = params.in_channels;
+ let out_channels = params.out_channels;
+ let weight_offset = params.weight_offset;
+ let is_output = params.is_output_layer != 0u;
+ let kernel_radius = i32(kernel_size / 2u);
+
+ let static_feat = unpack_static_features(coord);
+
+ var output: array<f32, 8>;
+ for (var c: u32 = 0u; c < out_channels && c < 8u; c++) {
+ var sum: f32 = 0.0;
+ for (var ky: i32 = -kernel_radius; ky <= kernel_radius; ky++) {
+ for (var kx: i32 = -kernel_radius; kx <= kernel_radius; kx++) {
+ let sample_coord = coord + vec2<i32>(kx, ky);
+ let clamped = vec2<i32>(
+ clamp(sample_coord.x, 0, i32(dims.x) - 1),
+ clamp(sample_coord.y, 0, i32(dims.y) - 1)
+ );
+ let static_local = unpack_static_features(clamped);
+ let layer_local = unpack_layer_channels(clamped);
+
+ let ky_idx = u32(ky + kernel_radius);
+ let kx_idx = u32(kx + kernel_radius);
+ let spatial_idx = ky_idx * kernel_size + kx_idx;
+
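+ // Input channels 0-7 are the static features; channels 8+ come from the
+ // previous layer's output (in_channels is assumed to be >= 8).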
+ for (var i: u32 = 0u; i < 8u; i++) {
+ let w_idx = weight_offset +
+ c * in_channels * kernel_size * kernel_size +
+ i * kernel_size * kernel_size + spatial_idx;
+ sum += get_weight(w_idx) * static_local[i];
+ }
+
+ let prev_channels = in_channels - 8u;
+ for (var i: u32 = 0u; i < prev_channels && i < 8u; i++) {
+ let w_idx = weight_offset +
+ c * in_channels * kernel_size * kernel_size +
+ (8u + i) * kernel_size * kernel_size + spatial_idx;
+ sum += get_weight(w_idx) * layer_local[i];
+ }
+ }
+ }
+
+ if (is_output) {
+ output[c] = clamp(sum, 0.0, 1.0);
+ } else {
+ output[c] = max(0.0, sum);
+ }
+ }
+
+ for (var c: u32 = out_channels; c < 8u; c++) {
+ output[c] = 0.0;
+ }
+
+ if (is_output) {
+ let original = textureLoad(original_input, coord, 0).rgb;
+ let result_rgb = vec3<f32>(output[0], output[1], output[2]);
+ let blended = mix(original, result_rgb, params.blend_amount);
+ output[0] = blended.r;
+ output[1] = blended.g;
+ output[2] = blended.b;
+ }
+
+ textureStore(output_tex, coord, pack_channels(output));
+}`;
+
+const DISPLAY_SHADER = `
+@group(0) @binding(0) var result_tex: texture_2d<u32>;
+@group(0) @binding(1) var original_tex: texture_2d<f32>;
+@group(0) @binding(2) var<uniform> mode: u32;
+
+@vertex
+fn vs_main(@builtin(vertex_index) idx: u32) -> @builtin(position) vec4<f32> {
+ var pos = array<vec2<f32>, 6>(
+ vec2<f32>(-1.0, -1.0), vec2<f32>(1.0, -1.0), vec2<f32>(-1.0, 1.0),
+ vec2<f32>(-1.0, 1.0), vec2<f32>(1.0, -1.0), vec2<f32>(1.0, 1.0)
+ );
+ return vec4<f32>(pos[idx], 0.0, 1.0);
+}
+
+@fragment
+fn fs_main(@builtin(position) pos: vec4<f32>) -> @location(0) vec4<f32> {
+ let coord = vec2<i32>(pos.xy);
+ let packed = textureLoad(result_tex, coord, 0);
+ let v0 = unpack2x16float(packed.x);
+ let v1 = unpack2x16float(packed.y);
+ let result = vec3<f32>(v0.x, v0.y, v1.x);
+
+ if (mode == 0u) {
+ return vec4<f32>(result, 1.0);
+ } else if (mode == 1u) {
+ let original = textureLoad(original_tex, coord, 0).rgb;
+ return vec4<f32>(original, 1.0);
+ } else {
+ let original = textureLoad(original_tex, coord, 0).rgb;
+ let diff = abs(result - original) * 10.0;
+ return vec4<f32>(diff, 1.0);
+ }
+}`;
+
+class CNNTester {
+ constructor() {
+ this.canvas = document.getElementById('canvas');
+ this.status = document.getElementById('status');
+ this.console = document.getElementById('console');
+ this.image = null;
+ this.weights = null;
+ this.viewMode = 0;
+ this.blendAmount = 1.0;
+ this.depth = 1.0;
+ this.init();
+ }
+
+ log(msg, type = 'info') {
+ const line = document.createElement('div');
+ line.className = `console-line ${type}`;
+ line.textContent = `[${new Date().toLocaleTimeString()}] ${msg}`;
+ this.console.appendChild(line);
+ this.console.scrollTop = this.console.scrollHeight;
+ }
+
+ async init() {
+ if (!navigator.gpu) {
+ this.setStatus('WebGPU not supported', true);
+ this.log('WebGPU not supported in this browser', 'error');
+ return;
+ }
+
+ try {
+ this.adapter = await navigator.gpu.requestAdapter();
+ this.device = await this.adapter.requestDevice();
+ this.context = this.canvas.getContext('webgpu');
+ this.format = navigator.gpu.getPreferredCanvasFormat();
+ this.log('WebGPU initialized successfully');
+ } catch (e) {
+ this.setStatus(`GPU init failed: ${e.message}`, true);
+ this.log(`GPU initialization failed: ${e.message}`, 'error');
+ }
+ }
+
+ setStatus(msg, isError = false) {
+ this.status.textContent = msg;
+ this.status.style.color = isError ? '#ff4a4a' : '#4a9eff';
+ }
+
+ parseWeights(buffer) {
+ const view = new DataView(buffer);
+ const magic = view.getUint32(0, true);
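+ // 'CNN2' = bytes 43 4E 4E 32, read as a little-endian u32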
+ if (magic !== 0x32_4E_4E_43) {
+ throw new Error('Invalid .bin file (bad magic)');
+ }
+
+ const version = view.getUint32(4, true);
+ const numLayers = view.getUint32(8, true);
+ const totalWeights = view.getUint32(12, true);
+
+ this.log(`Binary header: version=${version}, layers=${numLayers}, weights=${totalWeights}`);
+
+ const layers = [];
+ for (let i = 0; i < numLayers; i++) {
+ const offset = 16 + i * 20;
+ const layer = {
+ kernelSize: view.getUint32(offset, true),
+ inChannels: view.getUint32(offset + 4, true),
+ outChannels: view.getUint32(offset + 8, true),
+ weightOffset: view.getUint32(offset + 12, true),
+ weightCount: view.getUint32(offset + 16, true),
+ };
+ layers.push(layer);
+ this.log(` Layer ${i}: ${layer.inChannels}→${layer.outChannels}, kernel=${layer.kernelSize}×${layer.kernelSize}, weights=${layer.weightCount}`);
+ }
+
+ const weightsOffset = 16 + numLayers * 20;
+ const weights = new Uint32Array(buffer.slice(weightsOffset));
+
+ // Verify weights are non-zero
+ let nonZero = 0;
+ for (let i = 0; i < weights.length; i++) {
+ if (weights[i] !== 0) nonZero++;
+ }
+ this.log(` Weight buffer: ${weights.length} u32 (${nonZero} non-zero)`);
+
+ return { layers, weights };
+ }
+
+ async loadImage(file) {
+ const img = await createImageBitmap(file);
+ this.image = img;
+ this.canvas.width = img.width;
+ this.canvas.height = img.height;
+ this.log(`Loaded image: ${file.name} (${img.width}×${img.height})`);
+ if (this.weights) {
+ this.setStatus(`Ready: ${img.width}×${img.height}`);
+ this.run();
+ } else {
+ this.setStatus(`Image loaded (${img.width}×${img.height}) - drop .bin weights to process`);
+ this.displayOriginal();
+ }
+ }
+
+ async loadWeights(file) {
+ const buffer = await file.arrayBuffer();
+ this.weights = this.parseWeights(buffer);
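+ // Keep the whole file: run() uploads it verbatim, and the shader indexes
+ // it with absolute f16 offsets that include the header + layer-info span.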
+ this.weightsBuffer = buffer;
+ this.log(`Loaded weights: ${file.name} (${this.weights.layers.length} layers, ${(buffer.byteLength/1024).toFixed(1)} KB)`);
+ if (this.image) {
+ this.setStatus(`Ready: ${this.image.width}×${this.image.height}`);
+ this.run();
+ } else {
+ this.setStatus('Weights loaded - drop PNG image to process');
+ }
+ }
+
+ displayOriginal() {
+ if (!this.image || !this.device) return;
+
+ const { width, height } = this.image;
+ this.context.configure({ device: this.device, format: this.format });
+
+ const inputTex = this.device.createTexture({
+ size: [width, height],
+ format: 'rgba8unorm',
+ usage: GPUTextureUsage.TEXTURE_BINDING | GPUTextureUsage.COPY_DST | GPUTextureUsage.RENDER_ATTACHMENT
+ });
+
+ this.device.queue.copyExternalImageToTexture(
+ { source: this.image },
+ { texture: inputTex },
+ [width, height]
+ );
+
+ const simpleShader = `
+ @vertex
+ fn vs_main(@builtin(vertex_index) idx: u32) -> @builtin(position) vec4<f32> {
+ var pos = array<vec2<f32>, 6>(
+ vec2<f32>(-1.0, -1.0), vec2<f32>(1.0, -1.0), vec2<f32>(-1.0, 1.0),
+ vec2<f32>(-1.0, 1.0), vec2<f32>(1.0, -1.0), vec2<f32>(1.0, 1.0)
+ );
+ return vec4<f32>(pos[idx], 0.0, 1.0);
+ }
+
+ @group(0) @binding(0) var tex: texture_2d<f32>;
+
+ @fragment
+ fn fs_main(@builtin(position) pos: vec4<f32>) -> @location(0) vec4<f32> {
+ let coord = vec2<i32>(pos.xy);
+ return textureLoad(tex, coord, 0);
+ }
+ `;
+
+ const pipeline = this.device.createRenderPipeline({
+ layout: 'auto',
+ vertex: { module: this.device.createShaderModule({ code: simpleShader }), entryPoint: 'vs_main' },
+ fragment: {
+ module: this.device.createShaderModule({ code: simpleShader }),
+ entryPoint: 'fs_main',
+ targets: [{ format: this.format }]
+ }
+ });
+
+ const bindGroup = this.device.createBindGroup({
+ layout: pipeline.getBindGroupLayout(0),
+ entries: [{ binding: 0, resource: inputTex.createView() }]
+ });
+
+ const encoder = this.device.createCommandEncoder();
+ const renderPass = encoder.beginRenderPass({
+ colorAttachments: [{
+ view: this.context.getCurrentTexture().createView(),
+ loadOp: 'clear',
+ storeOp: 'store'
+ }]
+ });
+ renderPass.setPipeline(pipeline);
+ renderPass.setBindGroup(0, bindGroup);
+ renderPass.draw(6);
+ renderPass.end();
+
+ this.device.queue.submit([encoder.finish()]);
+ }
+
+ async run() {
+ const t0 = performance.now();
+ const { width, height } = this.image;
+ this.log(`Running CNN pipeline (${this.weights.layers.length} layers)...`);
+
+ this.context.configure({ device: this.device, format: this.format });
+
+ // Create persistent input texture for original view
+ if (this.inputTexture) this.inputTexture.destroy();
+ this.inputTexture = this.device.createTexture({
+ size: [width, height],
+ format: 'rgba8unorm',
+ usage: GPUTextureUsage.TEXTURE_BINDING | GPUTextureUsage.COPY_DST | GPUTextureUsage.RENDER_ATTACHMENT
+ });
+
+ this.device.queue.copyExternalImageToTexture(
+ { source: this.image },
+ { texture: this.inputTexture },
+ [width, height]
+ );
+
+ const depthTex = this.device.createTexture({
+ size: [width, height],
+ format: 'r32float',
+ usage: GPUTextureUsage.TEXTURE_BINDING | GPUTextureUsage.COPY_DST
+ });
+ const depthData = new Float32Array(width * height).fill(this.depth);
+ this.device.queue.writeTexture(
+ { texture: depthTex },
+ depthData,
+ { bytesPerRow: width * 4 },
+ [width, height]
+ );
+
+ const staticTex = this.device.createTexture({
+ size: [width, height],
+ format: 'rgba32uint',
+ usage: GPUTextureUsage.STORAGE_BINDING | GPUTextureUsage.TEXTURE_BINDING
+ });
+
+ const layerTextures = [
+ this.device.createTexture({
+ size: [width, height],
+ format: 'rgba32uint',
+ usage: GPUTextureUsage.STORAGE_BINDING | GPUTextureUsage.TEXTURE_BINDING
+ }),
+ this.device.createTexture({
+ size: [width, height],
+ format: 'rgba32uint',
+ usage: GPUTextureUsage.STORAGE_BINDING | GPUTextureUsage.TEXTURE_BINDING
+ })
+ ];
+
+ const weightsGPU = this.device.createBuffer({
+ size: this.weightsBuffer.byteLength,
+ usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST
+ });
+ this.device.queue.writeBuffer(weightsGPU, 0, this.weightsBuffer);
+
+ const staticPipeline = this.device.createComputePipeline({
+ layout: 'auto',
+ compute: { module: this.device.createShaderModule({ code: STATIC_SHADER }), entryPoint: 'main' }
+ });
+
+ const cnnPipeline = this.device.createComputePipeline({
+ layout: 'auto',
+ compute: { module: this.device.createShaderModule({ code: CNN_SHADER }), entryPoint: 'main' }
+ });
+
+ const displayPipeline = this.device.createRenderPipeline({
+ layout: 'auto',
+ vertex: { module: this.device.createShaderModule({ code: DISPLAY_SHADER }), entryPoint: 'vs_main' },
+ fragment: {
+ module: this.device.createShaderModule({ code: DISPLAY_SHADER }),
+ entryPoint: 'fs_main',
+ targets: [{ format: this.format }]
+ }
+ });
+
+ const encoder = this.device.createCommandEncoder();
+
+ const staticBG = this.device.createBindGroup({
+ layout: staticPipeline.getBindGroupLayout(0),
+ entries: [
+ { binding: 0, resource: this.inputTexture.createView() },
+ { binding: 1, resource: depthTex.createView() },
+ { binding: 2, resource: staticTex.createView() }
+ ]
+ });
+
+ const staticPass = encoder.beginComputePass();
+ staticPass.setPipeline(staticPipeline);
+ staticPass.setBindGroup(0, staticBG);
+ staticPass.dispatchWorkgroups(Math.ceil(width / 8), Math.ceil(height / 8));
+ staticPass.end();
+
+ let srcTex = staticTex;
+ let dstTex = layerTextures[0];
+
+ for (let i = 0; i < this.weights.layers.length; i++) {
+ const layer = this.weights.layers[i];
+ const isOutput = i === this.weights.layers.length - 1;
+
+ // Calculate absolute weight offset in f16 units (add header offset)
+ const headerOffsetU32 = 4 + this.weights.layers.length * 5; // Header + layer info in u32
+ const absoluteWeightOffset = headerOffsetU32 * 2 + layer.weightOffset; // Convert to f16 units
+
+ const paramsData = new Uint32Array(6);
+ paramsData[0] = layer.kernelSize;
+ paramsData[1] = layer.inChannels;
+ paramsData[2] = layer.outChannels;
+ paramsData[3] = absoluteWeightOffset; // Use absolute offset
+ paramsData[4] = isOutput ? 1 : 0;
+
+ const paramsView = new Float32Array(paramsData.buffer);
+ paramsView[5] = this.blendAmount;
+
+ const paramsBuffer = this.device.createBuffer({
+ size: 24,
+ usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST
+ });
+ this.device.queue.writeBuffer(paramsBuffer, 0, paramsData);
+
+ const cnnBG = this.device.createBindGroup({
+ layout: cnnPipeline.getBindGroupLayout(0),
+ entries: [
+ { binding: 0, resource: staticTex.createView() },
+ { binding: 1, resource: srcTex.createView() },
+ { binding: 2, resource: dstTex.createView() },
+ { binding: 3, resource: { buffer: weightsGPU } },
+ { binding: 4, resource: { buffer: paramsBuffer } },
+ { binding: 5, resource: this.inputTexture.createView() }
+ ]
+ });
+
+ const cnnPass = encoder.beginComputePass();
+ cnnPass.setPipeline(cnnPipeline);
+ cnnPass.setBindGroup(0, cnnBG);
+ cnnPass.dispatchWorkgroups(Math.ceil(width / 8), Math.ceil(height / 8));
+ cnnPass.end();
+
+ // Advance the ping-pong: the texture just written becomes the next input,
+ // and the other layer buffer becomes the next output. (A plain src/dst swap
+ // would make staticTex the write target on the second layer, conflicting
+ // with its read binding at binding 0.)
+ srcTex = dstTex;
+ dstTex = layerTextures[(i + 1) % 2];
+ }
+
+ const modeBuffer = this.device.createBuffer({
+ size: 4,
+ usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST
+ });
+ this.device.queue.writeBuffer(modeBuffer, 0, new Uint32Array([this.viewMode]));
+
+ // Store result texture and display pipeline for view mode switching
+ this.resultTexture = srcTex;
+ this.displayPipeline = displayPipeline;
+ this.modeBuffer = modeBuffer;
+
+ const displayBG = this.device.createBindGroup({
+ layout: displayPipeline.getBindGroupLayout(0),
+ entries: [
+ { binding: 0, resource: srcTex.createView() },
+ { binding: 1, resource: this.inputTexture.createView() },
+ { binding: 2, resource: { buffer: modeBuffer } }
+ ]
+ });
+ this.displayBindGroup = displayBG;
+
+ const renderPass = encoder.beginRenderPass({
+ colorAttachments: [{
+ view: this.context.getCurrentTexture().createView(),
+ loadOp: 'clear',
+ storeOp: 'store'
+ }]
+ });
+ renderPass.setPipeline(displayPipeline);
+ renderPass.setBindGroup(0, displayBG);
+ renderPass.draw(6);
+ renderPass.end();
+
+ this.device.queue.submit([encoder.finish()]);
+
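+ // Note: this measures CPU-side encode+submit time; queue.submit() returns
+ // before the GPU finishes, so treat the displayed figure as approximate.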
+ const t1 = performance.now();
+ const mode = ['CNN Output', 'Original', 'Diff (×10)'][this.viewMode];
+ this.setStatus(`GPU: ${(t1-t0).toFixed(1)}ms | ${width}×${height} | ${mode}`);
+ this.log(`Completed in ${(t1-t0).toFixed(1)}ms`);
+ }
+
+ updateDisplay() {
+ if (!this.displayPipeline || !this.displayBindGroup) return;
+
+ this.device.queue.writeBuffer(this.modeBuffer, 0, new Uint32Array([this.viewMode]));
+
+ const encoder = this.device.createCommandEncoder();
+ const renderPass = encoder.beginRenderPass({
+ colorAttachments: [{
+ view: this.context.getCurrentTexture().createView(),
+ loadOp: 'clear',
+ storeOp: 'store'
+ }]
+ });
+ renderPass.setPipeline(this.displayPipeline);
+ renderPass.setBindGroup(0, this.displayBindGroup);
+ renderPass.draw(6);
+ renderPass.end();
+
+ this.device.queue.submit([encoder.finish()]);
+ }
+}
+
+const tester = new CNNTester();
+
+function setupDropZone(id, callback) {
+ const zone = document.getElementById(id);
+ ['dragenter', 'dragover', 'dragleave', 'drop'].forEach(e => {
+ zone.addEventListener(e, ev => { ev.preventDefault(); ev.stopPropagation(); });
+ });
+ ['dragenter', 'dragover'].forEach(e => zone.addEventListener(e, () => zone.classList.add('active')));
+ ['dragleave', 'drop'].forEach(e => zone.addEventListener(e, () => zone.classList.remove('active')));
+ zone.addEventListener('drop', e => {
+ const file = e.dataTransfer.files[0];
+ if (file) callback(file).catch(err => {
+ zone.classList.add('error');
+ tester.setStatus(err.message, true);
+ tester.log(err.message, 'error');
+ setTimeout(() => zone.classList.remove('error'), 2000);
+ });
+ });
+}
+
+// Whole window drop for PNG images
+const mainArea = document.getElementById('mainDrop');
+['dragenter', 'dragover', 'dragleave', 'drop'].forEach(e => {
+ mainArea.addEventListener(e, ev => { ev.preventDefault(); ev.stopPropagation(); });
+});
+['dragenter', 'dragover'].forEach(e => mainArea.addEventListener(e, () => mainArea.classList.add('drop-active')));
+['dragleave', 'drop'].forEach(e => mainArea.addEventListener(e, () => mainArea.classList.remove('drop-active')));
+mainArea.addEventListener('drop', e => {
+ const file = e.dataTransfer.files[0];
+ if (file && file.type.startsWith('image/')) {
+ tester.loadImage(file).catch(err => {
+ tester.setStatus(err.message, true);
+ tester.log(err.message, 'error');
+ });
+ }
+});
+
+// Weights drop zone
+setupDropZone('weightsDrop', f => tester.loadWeights(f));
+
+document.getElementById('blend').addEventListener('input', e => {
+ tester.blendAmount = parseFloat(e.target.value);
+ document.getElementById('blendValue').textContent = e.target.value;
+ if (tester.image && tester.weights) {
+ tester.log(`Blend changed to ${e.target.value}`);
+ tester.run();
+ }
+});
+
+document.getElementById('depth').addEventListener('input', e => {
+ tester.depth = parseFloat(e.target.value);
+ if (tester.image && tester.weights) tester.run();
+});
+
+document.addEventListener('keydown', e => {
+  // SPACE toggles the original view (1), D toggles the diff view (2);
+  // pressing the same key again returns to the CNN output (0).
+  const toggleView = (mode) => {
+    tester.viewMode = (tester.viewMode === mode) ? 0 : mode;
+    const modeName = ['CNN Output', 'Original', 'Diff (×10)'][tester.viewMode];
+    document.getElementById('viewMode').textContent = modeName;
+    if (tester.image && tester.weights) {
+      tester.log(`View mode: ${modeName}`);
+      tester.updateDisplay();
+      const { width, height } = tester.image;
+      tester.setStatus(`${width}×${height} | ${modeName}`);
+    }
+  };
+  if (e.code === 'Space') {
+    e.preventDefault();
+    toggleView(1);
+  } else if (e.code === 'KeyD') {
+    e.preventDefault();
+    toggleView(2);
+  }
+});
+ </script>
+</body>
+</html>