Diffstat (limited to 'tools/cnn_v2_test/README.md')
-rw-r--r-- tools/cnn_v2_test/README.md 231
1 files changed, 231 insertions, 0 deletions
diff --git a/tools/cnn_v2_test/README.md b/tools/cnn_v2_test/README.md
new file mode 100644
index 0000000..2a8e08d
--- /dev/null
+++ b/tools/cnn_v2_test/README.md
@@ -0,0 +1,231 @@
+# CNN v2 Testing Tool
+
+WebGPU-based browser tool for testing trained CNN v2 weights.
+
+---
+
+## Features
+
+- Drag-drop PNG images and `.bin` weights
+- Real-time CNN inference with WebGPU compute shaders
+- View modes: CNN output, original input, difference (×10)
+- Adjustable blend amount and depth
+- Data-driven pipeline (supports variable layer count)
+- GPU timing display
+
+---
+
+## Requirements
+
+- Browser with WebGPU support:
+  - Chrome/Edge 113+ (enable `chrome://flags/#enable-unsafe-webgpu` if needed)
+  - Safari 18+ (macOS Ventura+)
+- Trained CNN v2 weights in binary format (`.bin`)
+- Test images (PNG format)
+
+---
+
+## Usage
+
+### 1. Open Tool
+
+```bash
+# macOS; on Linux use xdg-open instead of open
+open tools/cnn_v2_test/index.html
+```
+
+Or serve the repository root from a local server to avoid `file://` CORS restrictions:
+```bash
+python3 -m http.server 8000
+# Open http://localhost:8000/tools/cnn_v2_test/
+```
+
+### 2. Load Data
+
+1. **Drop PNG image** anywhere in the window (shows preview immediately)
+2. **Drop `.bin` weights** into the header drop zone
+3. CNN runs automatically once both are loaded
+
+### 3. Controls
+
+**Sliders:**
+- **Blend:** Mix between original (0.0) and CNN output (1.0)
+- **Depth:** Uniform depth value for all pixels (0.0–1.0)
+
+**Keyboard:**
+- `SPACE` - Toggle original input view
+- `D` - Toggle difference view (×10 amplification)
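
The blend slider and the difference view described above come down to per-channel arithmetic. A minimal Python sketch, assuming a linear mix for the blend and an absolute difference clamped to [0, 1] for the ×10 view. The tool's shader may compute these differently, and the names `blend` and `difference_x10` are illustrative, not taken from the tool:

```python
def blend(original, cnn, amount):
    """Linear mix per channel: amount=0.0 gives original, 1.0 gives CNN output."""
    return tuple(o + (c - o) * amount for o, c in zip(original, cnn))


def difference_x10(original, cnn):
    """Absolute per-channel difference, amplified 10x and clamped to 1.0.

    Assumption: the difference view uses |cnn - original|; a signed
    difference would need a different mapping to displayable colors.
    """
    return tuple(min(1.0, abs(c - o) * 10.0) for o, c in zip(original, cnn))
```

Under these assumptions, `blend(orig, cnn, 0.0)` returns the original pixel and `blend(orig, cnn, 1.0)` returns the CNN output, matching the slider endpoints.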
+
+**Status Bar:**
+- Shows GPU timing (ms), image dimensions, and current view mode
+- Red text indicates errors
+
+**Console Log:**
+- Timestamped event log at bottom
+- Tracks file loads, pipeline execution, errors
+- Auto-scrolls to latest messages
+
+---
+
+## Preparing Test Data
+
+### Export Weights
+
+```bash
+# From trained checkpoint
+./training/export_cnn_v2_weights.py \
+    checkpoints/checkpoint_epoch_100.pth \
+    --output-weights tools/cnn_v2_test/test_weights.bin
+```
+
+Binary format: 16-byte header + 20 bytes of parameters per layer + f16 weights at 2 bytes each (~3.2 KB for a 3-layer model)
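
The size of an exported file follows directly from that layout. A small Python sketch of the arithmetic, assuming the three sections are stored back to back with no padding. The constant and function names are illustrative, and the individual header fields are not described here, so they are not modeled:

```python
HEADER_BYTES = 16      # fixed header, per the format description above
LAYER_INFO_BYTES = 20  # per-layer parameter record
F16_BYTES = 2          # one half-precision weight


def expected_bin_size(num_layers, total_weights):
    """Expected .bin size in bytes, assuming no padding between sections."""
    return (HEADER_BYTES
            + LAYER_INFO_BYTES * num_layers
            + F16_BYTES * total_weights)
```

Working backwards from the quoted ~3.2 KB for three layers, the weight count would be roughly 1,560 (`expected_bin_size(3, 1562)` gives 3200). That figure is inferred from the sizes above, not read from a real export.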
+
+### Test Images
+
+Use training images or any PNG:
+```bash
+# Copy test image
+cp training/input/test.png tools/cnn_v2_test/
+```
+
+**Note:** Grayscale images are automatically converted to RGB.
+
+---
+
+## Validation
+
+### Visual Comparison
+
+Compare browser output with C++ tool:
+
+```bash
+# Generate C++ output
+./build/cnn_test training/input/test.png /tmp/cpp_output.png
+
+# Load same image in browser tool
+# Visually compare outputs
+```
+
+### GPU Timing
+
+Expected performance:
+- 512×512: ~1–2 ms (integrated GPU)
+- 1024×1024: ~3–5 ms
+- 1920×1080: ~5–8 ms
+
+Slower than expected? Check:
+- WebGPU enabled in browser
+- Dedicated GPU selected (if available)
+- No background tabs consuming GPU
+
+---
+
+## Troubleshooting
+
+### "WebGPU not supported"
+
+- Update browser to latest version
+- Enable WebGPU flag: `chrome://flags/#enable-unsafe-webgpu`
+- Try Safari 18+ (native WebGPU on macOS)
+
+### "Invalid .bin file"
+
+- Check magic number: `hexdump -C weights.bin | head`
+- Should start with: `43 4e 4e 32` ('CNN2')
+- Re-export weights: `./training/export_cnn_v2_weights.py`
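
The magic-number check can also be scripted instead of reading `hexdump` output by eye. A minimal Python sketch that tests only the first four bytes; the function names are illustrative, and the rest of the header is not validated because its fields are not described here:

```python
CNN2_MAGIC = bytes([0x43, 0x4E, 0x4E, 0x32])  # ASCII "CNN2"


def has_cnn2_magic(data):
    """Return True if the byte string starts with the CNN2 magic number."""
    return data[:4] == CNN2_MAGIC


def file_has_cnn2_magic(path):
    """Read the first four bytes of a file and check them against the magic."""
    with open(path, "rb") as f:
        return has_cnn2_magic(f.read(4))
```

For example, `file_has_cnn2_magic("weights.bin")` should return `True` for a valid export; a `False` result points to a truncated or unrelated file.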
+
+### Black output / incorrect colors
+
+- Check blend slider (set to 1.0 for full CNN output)
+- Verify training converged (loss < 0.01)
+- Compare with C++ tool output
+
+### Shader compilation errors
+
+Open browser console (F12) for detailed errors. Common issues:
+- Image too large (>4096×4096 not tested)
+- Unsupported texture format (rare on modern GPUs)
+
+---
+
+## Architecture
+
+**Pipeline:**
+1. **Static Features Pass** - Generate 8D features (RGBD, UV, sin, bias)
+2. **CNN Layer Passes** - Compute N layers with ping-pong textures
+3. **Display Pass** - Unpack and render with view mode
+
+**Textures:**
+- Input: RGBA8 (original image)
+- Depth: R32F (uniform depth)
+- Static features: RGBA32Uint (8×f16 packed)
+- Layer buffers: RGBA32Uint (ping-pong)
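
The `RGBA32Uint (8×f16 packed)` layout stores two half-precision floats in each of the four 32-bit channels. A Python sketch of that packing, assuming the first value of each pair occupies the low 16 bits (the convention of WGSL's `pack2x16float`). Whether the tool's shaders use this exact ordering is an assumption, and the function names are illustrative:

```python
import struct


def pack_features(values):
    """Pack 8 floats into 4 uint32 words, two f16 halves per word."""
    if len(values) != 8:
        raise ValueError("expected exactly 8 feature values")
    # Convert each float to its 16-bit half-precision bit pattern.
    halves = struct.unpack("<8H", struct.pack("<8e", *values))
    # Low 16 bits hold the first value of each pair, high 16 bits the second.
    return [halves[i] | (halves[i + 1] << 16) for i in range(0, 8, 2)]


def unpack_features(words):
    """Inverse of pack_features: 4 uint32 words back to 8 floats."""
    halves = []
    for w in words:
        halves += [w & 0xFFFF, (w >> 16) & 0xFFFF]
    return list(struct.unpack("<8e", struct.pack("<8H", *halves)))
```

Values that are not exactly representable in f16 come back rounded, which is one source of small differences when comparing against a full-precision reference.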
+
+**Data-Driven Execution:**
+- Layer count read from binary header
+- Per-layer params (kernel size, channels, offsets) from binary
+- Single CNN shader dispatched N times
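
The ping-pong scheme can be illustrated as a schedule of (layer, source, destination) triples. This Python sketch assumes layer 0 reads the static features and each later layer reads the previous layer's output while writing to the other buffer. The buffer names are hypothetical, and the actual tool may also bind the static features to every layer:

```python
def ping_pong_schedule(num_layers):
    """Return (layer_index, source, destination) for each CNN pass.

    Two layer buffers alternate roles so that a pass never reads
    from the texture it is writing to.
    """
    schedule = []
    src = "static_features"
    for i in range(num_layers):
        dst = f"layer_buffer_{i % 2}"
        schedule.append((i, src, dst))
        src = dst  # the next pass reads what this pass wrote
    return schedule
```

With `num_layers` read from the binary header, the same loop covers any layer count, which is the point of the data-driven design.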
+
+---
+
+## TODO
+
+**Side Panel (Right):**
+- Display `.bin` content metadata:
+  - Layer descriptions (kernel size, channels, weight count)
+  - Weight statistics (min/max/mean per layer)
+  - Weight heatmap visualization
+  - Binary format validation status
+  - Memory usage breakdown
+
+**Layer Inspection Views:**
+- Split R/G/B/A plane visualization
+- Intermediate layer output display:
+  - View static features (8D packed as heatmaps)
+  - View layer 0 output (before activation)
+  - View layer 1 output
+  - Toggle between channels
+- Activation heatmaps (where neurons fire)
+
+---
+
+## Extensions (v2+)
+
+Planned enhancements:
+
+**Variable Feature Count:**
+- Binary v2: Add `num_features` to header
+- Shader: Dynamic feature array or multiple textures
+
+**Multi-Scale Input (Mip Levels):**
+- Uncomment mip bindings in static shader
+- No binary format change needed
+
+**8-bit Quantized Weights:**
+- Binary version bump (format field already present)
+- Add quantization codepath in `get_weight()` function
+- 2× size reduction (~1.6 KB)
+
+**Pre-defined Test Images:**
+- Dropdown menu with `training/input/*.png`
+- Requires local file server
+
+---
+
+## Size
+
+- HTML structure: ~1 KB
+- CSS styling: ~1 KB
+- JavaScript logic: ~5 KB
+- Static shader: ~1 KB
+- CNN shader: ~3 KB
+- Display shader: ~1 KB
+- **Total: ~12 KB** (single file, no dependencies)
+
+---
+
+## See Also
+
+- `doc/CNN_V2.md` - Architecture and design
+- `doc/HOWTO.md` - Training workflows
+- `training/export_cnn_v2_weights.py` - Binary format
+- `src/gpu/effects/cnn_v2_effect.cc` - C++ reference implementation