| author | skal <pascal.massimino@gmail.com> | 2026-02-13 09:33:12 +0100 |
|---|---|---|
| committer | skal <pascal.massimino@gmail.com> | 2026-02-13 09:33:12 +0100 |
| commit | 2a793c23b582ed134b8294bfbbe3b6d7aaafe0c0 | (patch) |
| tree | 53073f5947515a3aad9afca7c24ff6b52ce5ef05 | /tools |
| parent | 65f6866b985fb3d0993fc2b6798c74015fb4fa6a | (diff) |
Add CNN v2 WebGPU testing tool
Implements a single-file HTML tool for rapid CNN weight validation:
Features:
- Drag-and-drop loading: PNG images anywhere in the window, .bin weights in the header drop zone
- Real-time WebGPU compute pipeline (static features + N layers)
- Data-driven execution (reads the layer count from the binary; see the dispatch sketch after this list)
- View modes: CNN output / Original / Diff (×10)
- Blend slider (0.0-1.0) for effect strength
- Console log with timestamps
- Keyboard shortcuts: SPACE (original), D (diff)
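For orientation, a condensed sketch of the data-driven dispatch, adapted from the run() method in index.html below. `dispatchCnnLayer` is an illustrative stand-in for the per-layer uniform/bind-group setup and compute pass that the real code inlines:

```js
// Data-driven layer dispatch (condensed from run() in index.html).
// One compute pass per layer; the layer count comes from the parsed binary.
function runLayers(encoder, weights, staticTex, layerTextures, dispatchCnnLayer) {
  let srcTex = staticTex;          // pass 0 reads the static-feature texture
  let dstTex = layerTextures[0];   // then ping-pong between two rgba32uint buffers
  for (let i = 0; i < weights.layers.length; i++) {
    const isOutput = (i === weights.layers.length - 1);  // last layer clamps + blends
    dispatchCnnLayer(encoder, weights.layers[i], srcTex, dstTex, isOutput);
    [srcTex, dstTex] = [dstTex, srcTex];                 // previous output becomes next input
  }
  return srcTex;                   // holds the final layer's output after the loop
}
```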
Architecture:
- Embedded WGSL shaders (static + compute + display)
- Binary parser for the .bin format (header + layer info + f16 weights; parsing sketch after this list)
- Persistent textures for view mode switching
- Absolute weight offset calculation (header + layer info skip)
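The layout the parser expects, per parseWeights() in index.html below: a 16-byte little-endian header (magic 'CNN2', version, layer count, total weight count), one 20-byte info record per layer, then f16 weights packed two per u32. A minimal standalone sketch (readCnn2Header is a hypothetical name; validation and logging are omitted):

```js
// Minimal .bin header / layer-table reader (mirrors parseWeights() in index.html).
function readCnn2Header(buffer) {
  const view = new DataView(buffer);
  if (view.getUint32(0, true) !== 0x324E4E43) {          // bytes 43 4e 4e 32 = 'CNN2'
    throw new Error('Invalid .bin file (bad magic)');
  }
  const numLayers = view.getUint32(8, true);
  const layers = [];
  for (let i = 0; i < numLayers; i++) {
    const off = 16 + i * 20;                             // 16-byte header + 20 bytes per layer
    layers.push({
      kernelSize:   view.getUint32(off, true),
      inChannels:   view.getUint32(off + 4, true),
      outChannels:  view.getUint32(off + 8, true),
      weightOffset: view.getUint32(off + 12, true),      // relative to the weights section (f16 units)
      weightCount:  view.getUint32(off + 16, true),
    });
  }
  // f16 weights start right after the layer table, packed two per u32.
  const weights = new Uint32Array(buffer.slice(16 + numLayers * 20));
  return { layers, weights };
}
```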
Implementation notes:
- Weight offsets in the binary are relative to the weights section
- JavaScript precalculates absolute offsets: headerOffsetU32 * 2 + offset (worked example after this list)
- Matches the C++ shader behavior (a simple get_weight() with no offset parameter)
- Ping-pong textures for multi-layer processing
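A worked version of the offset math from the layer loop in run() (absoluteF16Offset is a hypothetical helper; the real code inlines these two lines). The tool uploads the whole .bin, header included, as the weights storage buffer, so the shader indexes from the start of the file:

```js
// Convert a layer's relative weight offset (f16 units, relative to the weights
// section) into an absolute f16 index into the uploaded .bin buffer.
function absoluteF16Offset(numLayers, relativeOffset) {
  const headerOffsetU32 = 4 + numLayers * 5;     // 16-byte header + 20 bytes/layer, in u32 words
  return headerOffsetU32 * 2 + relativeOffset;   // two f16 values per u32 word
}

// e.g. a 3-layer file: headerOffsetU32 = 19, so a layer whose stored
// weightOffset is 0 begins at f16 index 38; the WGSL get_weight(idx) then
// reads weights_buffer[idx / 2] and unpacks the low or high 16 bits.
```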
TODO:
- Side panel: .bin metadata, weight statistics, validation
- Layer inspection: R/G/B/A plane split, intermediate outputs
- Activation heatmaps for debugging
Files:
- tools/cnn_v2_test/index.html (24 KB, 730 lines)
- tools/cnn_v2_test/README.md (usage guide, troubleshooting)
handoff(Claude): CNN v2 HTML testing tool complete, documented TODOs for future enhancements
Diffstat (limited to 'tools')
| -rw-r--r-- | tools/cnn_v2_test/README.md | 231 |
| -rw-r--r-- | tools/cnn_v2_test/index.html | 809 |
2 files changed, 1040 insertions, 0 deletions
diff --git a/tools/cnn_v2_test/README.md b/tools/cnn_v2_test/README.md new file mode 100644 index 0000000..2a8e08d --- /dev/null +++ b/tools/cnn_v2_test/README.md @@ -0,0 +1,231 @@ +# CNN v2 Testing Tool + +WebGPU-based browser tool for testing trained CNN v2 weights. + +--- + +## Features + +- Drag-drop PNG images and `.bin` weights +- Real-time CNN inference with WebGPU compute shaders +- View modes: CNN output, original input, difference (×10) +- Adjustable blend amount and depth +- Data-driven pipeline (supports variable layer count) +- GPU timing display + +--- + +## Requirements + +- Browser with WebGPU support: + - Chrome/Edge 113+ (enable `chrome://flags/#enable-unsafe-webgpu` if needed) + - Safari 18+ (macOS Ventura+) +- Trained CNN v2 weights in binary format (`.bin`) +- Test images (PNG format) + +--- + +## Usage + +### 1. Open Tool + +```bash +open tools/cnn_v2_test/index.html +``` + +Or use a local server to avoid CORS: +```bash +python3 -m http.server 8000 +# Open http://localhost:8000/tools/cnn_v2_test/ +``` + +### 2. Load Data + +1. **Drop PNG image** anywhere in window (shows preview immediately) +2. **Drop `.bin` weights** into header drop zone +3. CNN runs automatically when both loaded + +### 3. Controls + +**Sliders:** +- **Blend:** Mix between original (0.0) and CNN output (1.0) +- **Depth:** Uniform depth value for all pixels (0.0–1.0) + +**Keyboard:** +- `SPACE` - Toggle original input view +- `D` - Toggle difference view (×10 amplification) + +**Status Bar:** +- Shows GPU timing (ms), image dimensions, and current view mode +- Red text indicates errors + +**Console Log:** +- Timestamped event log at bottom +- Tracks file loads, pipeline execution, errors +- Auto-scrolls to latest messages + +--- + +## Preparing Test Data + +### Export Weights + +```bash +# From trained checkpoint +./training/export_cnn_v2_weights.py \ + checkpoints/checkpoint_epoch_100.pth \ + --output-weights tools/cnn_v2_test/test_weights.bin +``` + +Binary format: 16-byte header + 20 bytes per layer + f16 weights (~3.2 KB for 3-layer model) + +### Test Images + +Use training images or any PNG: +```bash +# Copy test image +cp training/input/test.png tools/cnn_v2_test/ +``` + +**Note:** Grayscale images automatically converted to RGB. + +--- + +## Validation + +### Visual Comparison + +Compare browser output with C++ tool: + +```bash +# Generate C++ output +./build/cnn_test training/input/test.png /tmp/cpp_output.png + +# Load same image in browser tool +# Visually compare outputs +``` + +### GPU Timing + +Expected performance: +- 512×512: ~1-2 ms (integrated GPU) +- 1024×1024: ~3-5 ms +- 1920×1080: ~5-8 ms + +Slower than expected? Check: +- WebGPU enabled in browser +- Dedicated GPU selected (if available) +- No background tabs consuming GPU + +--- + +## Troubleshooting + +### "WebGPU not supported" + +- Update browser to latest version +- Enable WebGPU flag: `chrome://flags/#enable-unsafe-webgpu` +- Try Safari 18+ (native WebGPU on macOS) + +### "Invalid .bin file" + +- Check magic number: `hexdump -C weights.bin | head` +- Should start with: `43 4e 4e 32` ('CNN2') +- Re-export weights: `./training/export_cnn_v2_weights.py` + +### Black output / incorrect colors + +- Check blend slider (set to 1.0 for full CNN output) +- Verify training converged (loss < 0.01) +- Compare with C++ tool output + +### Shader compilation errors + +Open browser console (F12) for detailed errors. 
Common issues: +- Image too large (>4096×4096 not tested) +- Unsupported texture format (rare on modern GPUs) + +--- + +## Architecture + +**Pipeline:** +1. **Static Features Pass** - Generate 8D features (RGBD, UV, sin, bias) +2. **CNN Layer Passes** - Compute N layers with ping-pong textures +3. **Display Pass** - Unpack and render with view mode + +**Textures:** +- Input: RGBA8 (original image) +- Depth: R32F (uniform depth) +- Static features: RGBA32Uint (8×f16 packed) +- Layer buffers: RGBA32Uint (ping-pong) + +**Data-Driven Execution:** +- Layer count read from binary header +- Per-layer params (kernel size, channels, offsets) from binary +- Single CNN shader dispatched N times + +--- + +## TODO + +**Side Panel (Right):** +- Display .bin content metadata: + - Layer descriptions (kernel size, channels, weight count) + - Weight statistics (min/max/mean per layer) + - Weight heatmap visualization + - Binary format validation status + - Memory usage breakdown + +**Layer Inspection Views:** +- Split R/G/B/A plane visualization +- Intermediate layer output display: + - View static features (8D packed as heatmaps) + - View layer 0 output (before activation) + - View layer 1 output + - Toggle between channels +- Activation heatmaps (where neurons fire) + +--- + +## Extensions (v2+) + +Planned enhancements: + +**Variable Feature Count:** +- Binary v2: Add `num_features` to header +- Shader: Dynamic feature array or multiple textures + +**Multi-Scale Input (Mip Levels):** +- Uncomment mip bindings in static shader +- No binary format change needed + +**8-bit Quantized Weights:** +- Binary version bump (format field already present) +- Add quantization codepath in `get_weight()` function +- 2× size reduction (~1.6 KB) + +**Pre-defined Test Images:** +- Dropdown menu with training/input/*.png +- Requires local file server + +--- + +## Size + +- HTML structure: ~1 KB +- CSS styling: ~1 KB +- JavaScript logic: ~5 KB +- Static shader: ~1 KB +- CNN shader: ~3 KB +- Display shader: ~1 KB +- **Total: ~12 KB** (single file, no dependencies) + +--- + +## See Also + +- `doc/CNN_V2.md` - Architecture and design +- `doc/HOWTO.md` - Training workflows +- `training/export_cnn_v2_weights.py` - Binary format +- `src/gpu/effects/cnn_v2_effect.cc` - C++ reference implementation diff --git a/tools/cnn_v2_test/index.html b/tools/cnn_v2_test/index.html new file mode 100644 index 0000000..9c28455 --- /dev/null +++ b/tools/cnn_v2_test/index.html @@ -0,0 +1,809 @@ +<!DOCTYPE html> +<html lang="en"> +<!-- + CNN v2 Testing Tool - WebGPU-based inference validator + + TODO: + - Side panel: .bin metadata display, weight statistics, validation + - Layer inspection: R/G/B/A plane split, intermediate layer visualization + - Activation heatmaps for debugging +--> +<head> + <meta charset="UTF-8"> + <meta name="viewport" content="width=device-width, initial-scale=1.0"> + <title>CNN v2 Testing Tool</title> + <style> + * { margin: 0; padding: 0; box-sizing: border-box; } + body { + font-family: 'Courier New', monospace; + background: #1a1a1a; + color: #e0e0e0; + display: flex; + flex-direction: column; + height: 100vh; + overflow: hidden; + } + .header { + background: #2a2a2a; + padding: 16px; + border-bottom: 1px solid #404040; + } + h1 { font-size: 18px; margin-bottom: 12px; } + .controls { + display: flex; + gap: 16px; + align-items: center; + flex-wrap: wrap; + } + .control-group { + display: flex; + gap: 8px; + align-items: center; + } + .control-group label { font-size: 12px; } + input[type="range"] { width: 120px; } + 
input[type="number"] { width: 60px; background: #1a1a1a; color: #e0e0e0; border: 1px solid #404040; padding: 4px; } + .drop-zone { + border: 2px dashed #404040; + padding: 16px; + text-align: center; + cursor: pointer; + transition: all 0.2s; + font-size: 12px; + margin-top: 12px; + } + .drop-zone:hover { border-color: #606060; background: #252525; } + .drop-zone.active { border-color: #4a9eff; background: #1a2a3a; } + .drop-zone.error { border-color: #ff4a4a; background: #3a1a1a; } + .main { + flex: 1; + display: flex; + justify-content: center; + align-items: center; + padding: 24px; + overflow: auto; + position: relative; + } + .main.drop-active::after { + content: 'Drop PNG image here'; + position: absolute; + inset: 24px; + display: flex; + align-items: center; + justify-content: center; + border: 3px dashed #4a9eff; + background: rgba(74, 158, 255, 0.1); + font-size: 24px; + color: #4a9eff; + pointer-events: none; + z-index: 10; + } + canvas { + max-width: 100%; + max-height: 100%; + image-rendering: pixelated; + box-shadow: 0 4px 12px rgba(0,0,0,0.5); + } + .footer { + background: #2a2a2a; + border-top: 1px solid #404040; + font-size: 11px; + display: flex; + flex-direction: column; + gap: 8px; + } + .footer-top { + padding: 12px 16px 0; + display: flex; + justify-content: space-between; + } + .status { color: #4a9eff; } + .shortcuts { color: #808080; } + .console { + background: #1a1a1a; + padding: 8px 16px; + font-family: 'Courier New', monospace; + font-size: 10px; + color: #808080; + max-height: 100px; + overflow-y: auto; + border-top: 1px solid #404040; + } + .console-line { margin: 2px 0; } + .console-line.error { color: #ff4a4a; } + .console-line.info { color: #4a9eff; } + </style> +</head> +<body> + <div class="header"> + <h1>CNN v2 Testing Tool</h1> + <div class="controls"> + <div class="control-group"> + <label>Blend:</label> + <input type="range" id="blend" min="0" max="1" step="0.01" value="1.0"> + <span id="blendValue">1.0</span> + </div> + <div class="control-group"> + <label>Depth:</label> + <input type="number" id="depth" min="0" max="1" step="0.1" value="1.0"> + </div> + <div class="control-group"> + <label>View:</label> + <span id="viewMode">CNN Output</span> + </div> + </div> + <div class="drop-zone" id="weightsDrop">Drop .bin Weights</div> + </div> + <div class="main" id="mainDrop"> + <canvas id="canvas"></canvas> + </div> + <div class="footer"> + <div class="footer-top"> + <span class="status" id="status">Drop PNG image anywhere to begin</span> + <span class="shortcuts">[SPACE] Original | [D] Diff (×10)</span> + </div> + <div class="console" id="console"></div> + </div> + + <script> +const STATIC_SHADER = ` +@group(0) @binding(0) var input_tex: texture_2d<f32>; +@group(0) @binding(1) var depth_tex: texture_2d<f32>; +@group(0) @binding(2) var output_tex: texture_storage_2d<rgba32uint, write>; + +@compute @workgroup_size(8, 8) +fn main(@builtin(global_invocation_id) id: vec3<u32>) { + let coord = vec2<i32>(id.xy); + let dims = textureDimensions(input_tex); + if (coord.x >= i32(dims.x) || coord.y >= i32(dims.y)) { return; } + + let rgba = textureLoad(input_tex, coord, 0); + let d = textureLoad(depth_tex, coord, 0).r; + let uv_x = f32(coord.x) / f32(dims.x); + let uv_y = 1.0 - (f32(coord.y) / f32(dims.y)); + let sin10_x = sin(10.0 * uv_x); + + let packed = vec4<u32>( + pack2x16float(vec2<f32>(rgba.r, rgba.g)), + pack2x16float(vec2<f32>(rgba.b, d)), + pack2x16float(vec2<f32>(uv_x, uv_y)), + pack2x16float(vec2<f32>(sin10_x, 1.0)) + ); + textureStore(output_tex, coord, 
packed); +}`; + +const CNN_SHADER = ` +struct LayerParams { + kernel_size: u32, + in_channels: u32, + out_channels: u32, + weight_offset: u32, + is_output_layer: u32, + blend_amount: f32, +} + +@group(0) @binding(0) var static_features: texture_2d<u32>; +@group(0) @binding(1) var layer_input: texture_2d<u32>; +@group(0) @binding(2) var output_tex: texture_storage_2d<rgba32uint, write>; +@group(0) @binding(3) var<storage, read> weights_buffer: array<u32>; +@group(0) @binding(4) var<uniform> params: LayerParams; +@group(0) @binding(5) var original_input: texture_2d<f32>; + +fn unpack_static_features(coord: vec2<i32>) -> array<f32, 8> { + let packed = textureLoad(static_features, coord, 0); + let v0 = unpack2x16float(packed.x); + let v1 = unpack2x16float(packed.y); + let v2 = unpack2x16float(packed.z); + let v3 = unpack2x16float(packed.w); + return array<f32, 8>(v0.x, v0.y, v1.x, v1.y, v2.x, v2.y, v3.x, v3.y); +} + +fn unpack_layer_channels(coord: vec2<i32>) -> array<f32, 8> { + let packed = textureLoad(layer_input, coord, 0); + let v0 = unpack2x16float(packed.x); + let v1 = unpack2x16float(packed.y); + let v2 = unpack2x16float(packed.z); + let v3 = unpack2x16float(packed.w); + return array<f32, 8>(v0.x, v0.y, v1.x, v1.y, v2.x, v2.y, v3.x, v3.y); +} + +fn pack_channels(values: array<f32, 8>) -> vec4<u32> { + return vec4<u32>( + pack2x16float(vec2<f32>(values[0], values[1])), + pack2x16float(vec2<f32>(values[2], values[3])), + pack2x16float(vec2<f32>(values[4], values[5])), + pack2x16float(vec2<f32>(values[6], values[7])) + ); +} + +fn get_weight(idx: u32) -> f32 { + let pair_idx = idx / 2u; + let packed = weights_buffer[pair_idx]; + let unpacked = unpack2x16float(packed); + return select(unpacked.y, unpacked.x, (idx & 1u) == 0u); +} + +@compute @workgroup_size(8, 8) +fn main(@builtin(global_invocation_id) id: vec3<u32>) { + let coord = vec2<i32>(id.xy); + let dims = textureDimensions(static_features); + if (coord.x >= i32(dims.x) || coord.y >= i32(dims.y)) { return; } + + let kernel_size = params.kernel_size; + let in_channels = params.in_channels; + let out_channels = params.out_channels; + let weight_offset = params.weight_offset; + let is_output = params.is_output_layer != 0u; + let kernel_radius = i32(kernel_size / 2u); + + let static_feat = unpack_static_features(coord); + + var output: array<f32, 8>; + for (var c: u32 = 0u; c < out_channels && c < 8u; c++) { + var sum: f32 = 0.0; + for (var ky: i32 = -kernel_radius; ky <= kernel_radius; ky++) { + for (var kx: i32 = -kernel_radius; kx <= kernel_radius; kx++) { + let sample_coord = coord + vec2<i32>(kx, ky); + let clamped = vec2<i32>( + clamp(sample_coord.x, 0, i32(dims.x) - 1), + clamp(sample_coord.y, 0, i32(dims.y) - 1) + ); + let static_local = unpack_static_features(clamped); + let layer_local = unpack_layer_channels(clamped); + + let ky_idx = u32(ky + kernel_radius); + let kx_idx = u32(kx + kernel_radius); + let spatial_idx = ky_idx * kernel_size + kx_idx; + + for (var i: u32 = 0u; i < 8u; i++) { + let w_idx = weight_offset + + c * in_channels * kernel_size * kernel_size + + i * kernel_size * kernel_size + spatial_idx; + sum += get_weight(w_idx) * static_local[i]; + } + + let prev_channels = in_channels - 8u; + for (var i: u32 = 0u; i < prev_channels && i < 8u; i++) { + let w_idx = weight_offset + + c * in_channels * kernel_size * kernel_size + + (8u + i) * kernel_size * kernel_size + spatial_idx; + sum += get_weight(w_idx) * layer_local[i]; + } + } + } + + if (is_output) { + output[c] = clamp(sum, 0.0, 1.0); + } else { + output[c] = 
max(0.0, sum); + } + } + + for (var c: u32 = out_channels; c < 8u; c++) { + output[c] = 0.0; + } + + if (is_output) { + let original = textureLoad(original_input, coord, 0).rgb; + let result_rgb = vec3<f32>(output[0], output[1], output[2]); + let blended = mix(original, result_rgb, params.blend_amount); + output[0] = blended.r; + output[1] = blended.g; + output[2] = blended.b; + } + + textureStore(output_tex, coord, pack_channels(output)); +}`; + +const DISPLAY_SHADER = ` +@group(0) @binding(0) var result_tex: texture_2d<u32>; +@group(0) @binding(1) var original_tex: texture_2d<f32>; +@group(0) @binding(2) var<uniform> mode: u32; + +@vertex +fn vs_main(@builtin(vertex_index) idx: u32) -> @builtin(position) vec4<f32> { + var pos = array<vec2<f32>, 6>( + vec2<f32>(-1.0, -1.0), vec2<f32>(1.0, -1.0), vec2<f32>(-1.0, 1.0), + vec2<f32>(-1.0, 1.0), vec2<f32>(1.0, -1.0), vec2<f32>(1.0, 1.0) + ); + return vec4<f32>(pos[idx], 0.0, 1.0); +} + +@fragment +fn fs_main(@builtin(position) pos: vec4<f32>) -> @location(0) vec4<f32> { + let coord = vec2<i32>(pos.xy); + let packed = textureLoad(result_tex, coord, 0); + let v0 = unpack2x16float(packed.x); + let v1 = unpack2x16float(packed.y); + let result = vec3<f32>(v0.x, v0.y, v1.x); + + if (mode == 0u) { + return vec4<f32>(result, 1.0); + } else if (mode == 1u) { + let original = textureLoad(original_tex, coord, 0).rgb; + return vec4<f32>(original, 1.0); + } else { + let original = textureLoad(original_tex, coord, 0).rgb; + let diff = abs(result - original) * 10.0; + return vec4<f32>(diff, 1.0); + } +}`; + +class CNNTester { + constructor() { + this.canvas = document.getElementById('canvas'); + this.status = document.getElementById('status'); + this.console = document.getElementById('console'); + this.image = null; + this.weights = null; + this.viewMode = 0; + this.blendAmount = 1.0; + this.depth = 1.0; + this.init(); + } + + log(msg, type = 'info') { + const line = document.createElement('div'); + line.className = `console-line ${type}`; + line.textContent = `[${new Date().toLocaleTimeString()}] ${msg}`; + this.console.appendChild(line); + this.console.scrollTop = this.console.scrollHeight; + } + + async init() { + if (!navigator.gpu) { + this.setStatus('WebGPU not supported', true); + this.log('WebGPU not supported in this browser', 'error'); + return; + } + + try { + this.adapter = await navigator.gpu.requestAdapter(); + this.device = await this.adapter.requestDevice(); + this.context = this.canvas.getContext('webgpu'); + this.format = navigator.gpu.getPreferredCanvasFormat(); + this.log('WebGPU initialized successfully'); + } catch (e) { + this.setStatus(`GPU init failed: ${e.message}`, true); + this.log(`GPU initialization failed: ${e.message}`, 'error'); + } + } + + setStatus(msg, isError = false) { + this.status.textContent = msg; + this.status.style.color = isError ? 
'#ff4a4a' : '#4a9eff'; + } + + parseWeights(buffer) { + const view = new DataView(buffer); + const magic = view.getUint32(0, true); + if (magic !== 0x32_4E_4E_43) { + throw new Error('Invalid .bin file (bad magic)'); + } + + const version = view.getUint32(4, true); + const numLayers = view.getUint32(8, true); + const totalWeights = view.getUint32(12, true); + + this.log(`Binary header: version=${version}, layers=${numLayers}, weights=${totalWeights}`); + + const layers = []; + for (let i = 0; i < numLayers; i++) { + const offset = 16 + i * 20; + const layer = { + kernelSize: view.getUint32(offset, true), + inChannels: view.getUint32(offset + 4, true), + outChannels: view.getUint32(offset + 8, true), + weightOffset: view.getUint32(offset + 12, true), + weightCount: view.getUint32(offset + 16, true), + }; + layers.push(layer); + this.log(` Layer ${i}: ${layer.inChannels}→${layer.outChannels}, kernel=${layer.kernelSize}×${layer.kernelSize}, weights=${layer.weightCount}`); + } + + const weightsOffset = 16 + numLayers * 20; + const weights = new Uint32Array(buffer.slice(weightsOffset)); + + // Verify weights are non-zero + let nonZero = 0; + for (let i = 0; i < weights.length; i++) { + if (weights[i] !== 0) nonZero++; + } + this.log(` Weight buffer: ${weights.length} u32 (${nonZero} non-zero)`); + + return { layers, weights }; + } + + async loadImage(file) { + const img = await createImageBitmap(file); + this.image = img; + this.canvas.width = img.width; + this.canvas.height = img.height; + this.log(`Loaded image: ${file.name} (${img.width}×${img.height})`); + if (this.weights) { + this.setStatus(`Ready: ${img.width}×${img.height}`); + this.run(); + } else { + this.setStatus(`Image loaded (${img.width}×${img.height}) - drop .bin weights to process`); + this.displayOriginal(); + } + } + + async loadWeights(file) { + const buffer = await file.arrayBuffer(); + this.weights = this.parseWeights(buffer); + this.weightsBuffer = buffer; + this.log(`Loaded weights: ${file.name} (${this.weights.layers.length} layers, ${(buffer.byteLength/1024).toFixed(1)} KB)`); + if (this.image) { + this.setStatus(`Ready: ${this.image.width}×${this.image.height}`); + this.run(); + } else { + this.setStatus('Weights loaded - drop PNG image to process'); + } + } + + displayOriginal() { + if (!this.image || !this.device) return; + + const { width, height } = this.image; + this.context.configure({ device: this.device, format: this.format }); + + const inputTex = this.device.createTexture({ + size: [width, height], + format: 'rgba8unorm', + usage: GPUTextureUsage.TEXTURE_BINDING | GPUTextureUsage.COPY_DST | GPUTextureUsage.RENDER_ATTACHMENT + }); + + this.device.queue.copyExternalImageToTexture( + { source: this.image }, + { texture: inputTex }, + [width, height] + ); + + const simpleShader = ` + @vertex + fn vs_main(@builtin(vertex_index) idx: u32) -> @builtin(position) vec4<f32> { + var pos = array<vec2<f32>, 6>( + vec2<f32>(-1.0, -1.0), vec2<f32>(1.0, -1.0), vec2<f32>(-1.0, 1.0), + vec2<f32>(-1.0, 1.0), vec2<f32>(1.0, -1.0), vec2<f32>(1.0, 1.0) + ); + return vec4<f32>(pos[idx], 0.0, 1.0); + } + + @group(0) @binding(0) var tex: texture_2d<f32>; + + @fragment + fn fs_main(@builtin(position) pos: vec4<f32>) -> @location(0) vec4<f32> { + let coord = vec2<i32>(pos.xy); + return textureLoad(tex, coord, 0); + } + `; + + const pipeline = this.device.createRenderPipeline({ + layout: 'auto', + vertex: { module: this.device.createShaderModule({ code: simpleShader }), entryPoint: 'vs_main' }, + fragment: { + module: 
this.device.createShaderModule({ code: simpleShader }), + entryPoint: 'fs_main', + targets: [{ format: this.format }] + } + }); + + const bindGroup = this.device.createBindGroup({ + layout: pipeline.getBindGroupLayout(0), + entries: [{ binding: 0, resource: inputTex.createView() }] + }); + + const encoder = this.device.createCommandEncoder(); + const renderPass = encoder.beginRenderPass({ + colorAttachments: [{ + view: this.context.getCurrentTexture().createView(), + loadOp: 'clear', + storeOp: 'store' + }] + }); + renderPass.setPipeline(pipeline); + renderPass.setBindGroup(0, bindGroup); + renderPass.draw(6); + renderPass.end(); + + this.device.queue.submit([encoder.finish()]); + } + + async run() { + const t0 = performance.now(); + const { width, height } = this.image; + this.log(`Running CNN pipeline (${this.weights.layers.length} layers)...`); + + this.context.configure({ device: this.device, format: this.format }); + + // Create persistent input texture for original view + if (this.inputTexture) this.inputTexture.destroy(); + this.inputTexture = this.device.createTexture({ + size: [width, height], + format: 'rgba8unorm', + usage: GPUTextureUsage.TEXTURE_BINDING | GPUTextureUsage.COPY_DST | GPUTextureUsage.RENDER_ATTACHMENT + }); + + this.device.queue.copyExternalImageToTexture( + { source: this.image }, + { texture: this.inputTexture }, + [width, height] + ); + + const depthTex = this.device.createTexture({ + size: [width, height], + format: 'r32float', + usage: GPUTextureUsage.TEXTURE_BINDING | GPUTextureUsage.COPY_DST + }); + const depthData = new Float32Array(width * height).fill(this.depth); + this.device.queue.writeTexture( + { texture: depthTex }, + depthData, + { bytesPerRow: width * 4 }, + [width, height] + ); + + const staticTex = this.device.createTexture({ + size: [width, height], + format: 'rgba32uint', + usage: GPUTextureUsage.STORAGE_BINDING | GPUTextureUsage.TEXTURE_BINDING + }); + + const layerTextures = [ + this.device.createTexture({ + size: [width, height], + format: 'rgba32uint', + usage: GPUTextureUsage.STORAGE_BINDING | GPUTextureUsage.TEXTURE_BINDING + }), + this.device.createTexture({ + size: [width, height], + format: 'rgba32uint', + usage: GPUTextureUsage.STORAGE_BINDING | GPUTextureUsage.TEXTURE_BINDING + }) + ]; + + const weightsGPU = this.device.createBuffer({ + size: this.weightsBuffer.byteLength, + usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST + }); + this.device.queue.writeBuffer(weightsGPU, 0, this.weightsBuffer); + + const staticPipeline = this.device.createComputePipeline({ + layout: 'auto', + compute: { module: this.device.createShaderModule({ code: STATIC_SHADER }), entryPoint: 'main' } + }); + + const cnnPipeline = this.device.createComputePipeline({ + layout: 'auto', + compute: { module: this.device.createShaderModule({ code: CNN_SHADER }), entryPoint: 'main' } + }); + + const displayPipeline = this.device.createRenderPipeline({ + layout: 'auto', + vertex: { module: this.device.createShaderModule({ code: DISPLAY_SHADER }), entryPoint: 'vs_main' }, + fragment: { + module: this.device.createShaderModule({ code: DISPLAY_SHADER }), + entryPoint: 'fs_main', + targets: [{ format: this.format }] + } + }); + + const encoder = this.device.createCommandEncoder(); + + const staticBG = this.device.createBindGroup({ + layout: staticPipeline.getBindGroupLayout(0), + entries: [ + { binding: 0, resource: this.inputTexture.createView() }, + { binding: 1, resource: depthTex.createView() }, + { binding: 2, resource: staticTex.createView() } + ] + }); + + 
const staticPass = encoder.beginComputePass(); + staticPass.setPipeline(staticPipeline); + staticPass.setBindGroup(0, staticBG); + staticPass.dispatchWorkgroups(Math.ceil(width / 8), Math.ceil(height / 8)); + staticPass.end(); + + let srcTex = staticTex; + let dstTex = layerTextures[0]; + + for (let i = 0; i < this.weights.layers.length; i++) { + const layer = this.weights.layers[i]; + const isOutput = i === this.weights.layers.length - 1; + + // Calculate absolute weight offset in f16 units (add header offset) + const headerOffsetU32 = 4 + this.weights.layers.length * 5; // Header + layer info in u32 + const absoluteWeightOffset = headerOffsetU32 * 2 + layer.weightOffset; // Convert to f16 units + + const paramsData = new Uint32Array(6); + paramsData[0] = layer.kernelSize; + paramsData[1] = layer.inChannels; + paramsData[2] = layer.outChannels; + paramsData[3] = absoluteWeightOffset; // Use absolute offset + paramsData[4] = isOutput ? 1 : 0; + + const paramsView = new Float32Array(paramsData.buffer); + paramsView[5] = this.blendAmount; + + const paramsBuffer = this.device.createBuffer({ + size: 24, + usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST + }); + this.device.queue.writeBuffer(paramsBuffer, 0, paramsData); + + const cnnBG = this.device.createBindGroup({ + layout: cnnPipeline.getBindGroupLayout(0), + entries: [ + { binding: 0, resource: staticTex.createView() }, + { binding: 1, resource: srcTex.createView() }, + { binding: 2, resource: dstTex.createView() }, + { binding: 3, resource: { buffer: weightsGPU } }, + { binding: 4, resource: { buffer: paramsBuffer } }, + { binding: 5, resource: this.inputTexture.createView() } + ] + }); + + const cnnPass = encoder.beginComputePass(); + cnnPass.setPipeline(cnnPipeline); + cnnPass.setBindGroup(0, cnnBG); + cnnPass.dispatchWorkgroups(Math.ceil(width / 8), Math.ceil(height / 8)); + cnnPass.end(); + + [srcTex, dstTex] = [dstTex, srcTex]; + } + + const modeBuffer = this.device.createBuffer({ + size: 4, + usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST + }); + this.device.queue.writeBuffer(modeBuffer, 0, new Uint32Array([this.viewMode])); + + // Store result texture and display pipeline for view mode switching + this.resultTexture = srcTex; + this.displayPipeline = displayPipeline; + this.modeBuffer = modeBuffer; + + const displayBG = this.device.createBindGroup({ + layout: displayPipeline.getBindGroupLayout(0), + entries: [ + { binding: 0, resource: srcTex.createView() }, + { binding: 1, resource: this.inputTexture.createView() }, + { binding: 2, resource: { buffer: modeBuffer } } + ] + }); + this.displayBindGroup = displayBG; + + const renderPass = encoder.beginRenderPass({ + colorAttachments: [{ + view: this.context.getCurrentTexture().createView(), + loadOp: 'clear', + storeOp: 'store' + }] + }); + renderPass.setPipeline(displayPipeline); + renderPass.setBindGroup(0, displayBG); + renderPass.draw(6); + renderPass.end(); + + this.device.queue.submit([encoder.finish()]); + + const t1 = performance.now(); + const mode = ['CNN Output', 'Original', 'Diff (×10)'][this.viewMode]; + this.setStatus(`GPU: ${(t1-t0).toFixed(1)}ms | ${width}×${height} | ${mode}`); + this.log(`Completed in ${(t1-t0).toFixed(1)}ms`); + } + + updateDisplay() { + if (!this.displayPipeline || !this.displayBindGroup) return; + + this.device.queue.writeBuffer(this.modeBuffer, 0, new Uint32Array([this.viewMode])); + + const encoder = this.device.createCommandEncoder(); + const renderPass = encoder.beginRenderPass({ + colorAttachments: [{ + view: 
this.context.getCurrentTexture().createView(), + loadOp: 'clear', + storeOp: 'store' + }] + }); + renderPass.setPipeline(this.displayPipeline); + renderPass.setBindGroup(0, this.displayBindGroup); + renderPass.draw(6); + renderPass.end(); + + this.device.queue.submit([encoder.finish()]); + } +} + +const tester = new CNNTester(); + +function setupDropZone(id, callback) { + const zone = document.getElementById(id); + ['dragenter', 'dragover', 'dragleave', 'drop'].forEach(e => { + zone.addEventListener(e, ev => { ev.preventDefault(); ev.stopPropagation(); }); + }); + ['dragenter', 'dragover'].forEach(e => zone.addEventListener(e, () => zone.classList.add('active'))); + ['dragleave', 'drop'].forEach(e => zone.addEventListener(e, () => zone.classList.remove('active'))); + zone.addEventListener('drop', e => { + const file = e.dataTransfer.files[0]; + if (file) callback(file).catch(err => { + zone.classList.add('error'); + tester.setStatus(err.message, true); + tester.log(err.message, 'error'); + setTimeout(() => zone.classList.remove('error'), 2000); + }); + }); +} + +// Whole window drop for PNG images +const mainArea = document.getElementById('mainDrop'); +['dragenter', 'dragover', 'dragleave', 'drop'].forEach(e => { + mainArea.addEventListener(e, ev => { ev.preventDefault(); ev.stopPropagation(); }); +}); +['dragenter', 'dragover'].forEach(e => mainArea.addEventListener(e, () => mainArea.classList.add('drop-active'))); +['dragleave', 'drop'].forEach(e => mainArea.addEventListener(e, () => mainArea.classList.remove('drop-active'))); +mainArea.addEventListener('drop', e => { + const file = e.dataTransfer.files[0]; + if (file && file.type.startsWith('image/')) { + tester.loadImage(file).catch(err => { + tester.setStatus(err.message, true); + tester.log(err.message, 'error'); + }); + } +}); + +// Weights drop zone +setupDropZone('weightsDrop', f => tester.loadWeights(f)); + +document.getElementById('blend').addEventListener('input', e => { + tester.blendAmount = parseFloat(e.target.value); + document.getElementById('blendValue').textContent = e.target.value; + if (tester.image && tester.weights) { + tester.log(`Blend changed to ${e.target.value}`); + tester.run(); + } +}); + +document.getElementById('depth').addEventListener('input', e => { + tester.depth = parseFloat(e.target.value); + if (tester.image && tester.weights) tester.run(); +}); + +document.addEventListener('keydown', e => { + if (e.code === 'Space') { + e.preventDefault(); + if (tester.viewMode === 1) { + tester.viewMode = 0; + } else { + tester.viewMode = 1; + } + const modeName = ['CNN Output', 'Original', 'Diff (×10)'][tester.viewMode]; + document.getElementById('viewMode').textContent = modeName; + if (tester.image && tester.weights) { + tester.log(`View mode: ${modeName}`); + tester.updateDisplay(); + const { width, height } = tester.image; + tester.setStatus(`${width}×${height} | ${modeName}`); + } + } else if (e.code === 'KeyD') { + e.preventDefault(); + if (tester.viewMode === 2) { + tester.viewMode = 0; + } else { + tester.viewMode = 2; + } + const modeName = ['CNN Output', 'Original', 'Diff (×10)'][tester.viewMode]; + document.getElementById('viewMode').textContent = modeName; + if (tester.image && tester.weights) { + tester.log(`View mode: ${modeName}`); + tester.updateDisplay(); + const { width, height } = tester.image; + tester.setStatus(`${width}×${height} | ${modeName}`); + } + } +}); + </script> +</body> +</html> |
