Diffstat (limited to 'cnn_v2/docs/CNN_V2_WEB_TOOL.md')
-rw-r--r-- cnn_v2/docs/CNN_V2_WEB_TOOL.md | 348
1 file changed, 348 insertions, 0 deletions
diff --git a/cnn_v2/docs/CNN_V2_WEB_TOOL.md b/cnn_v2/docs/CNN_V2_WEB_TOOL.md
new file mode 100644
index 0000000..b6f5b0b
--- /dev/null
+++ b/cnn_v2/docs/CNN_V2_WEB_TOOL.md
@@ -0,0 +1,348 @@
+# CNN v2 Web Testing Tool
+
+Browser-based WebGPU tool for validating CNN v2 inference with layer visualization and weight inspection.
+
+**Location:** `tools/cnn_v2_test/index.html`
+
+---
+
+## Status (2026-02-13)
+
+**Working:**
+- ✅ WebGPU initialization and device setup
+- ✅ Binary weight file parsing (v1 and v2 formats)
+- ✅ Automatic mip-level detection from binary format v2
+- ✅ Weight statistics (min/max per layer)
+- ✅ UI layout with collapsible panels
+- ✅ Mode switching (Activations/Weights tabs)
+- ✅ Canvas context management (2D for weights, WebGPU for activations)
+- ✅ Weight visualization infrastructure (layer selection, grid layout)
+- ✅ Layer naming matches codebase convention (Layer 0, Layer 1, Layer 2)
+- ✅ Static features split visualization (Static 0-3, Static 4-7)
+- ✅ All layers visible including output layer (Layer 2)
+- ✅ Video playback support (MP4, WebM) with frame-by-frame controls
+- ✅ Video looping (automatic continuous playback)
+- ✅ Mip level selection (p0-p3 features at different resolutions)
+
+**Recent Changes (Latest):**
+- Binary format v2 support: Reads mip_level from 20-byte header
+- Backward compatible: v1 (16-byte header) → mip_level=0
+- Auto-update UI dropdown when loading weights with mip_level
+- Display mip_level in metadata panel
+- Code refactoring: Extracted FULLSCREEN_QUAD_VS shader (reused 3× across pipelines)
+- Added helper methods: `getDimensions()`, `setVideoControlsEnabled()`
+- Improved code organization with section headers and comments
+- Moved Mip Level selector to bottom of left sidebar (removed "Features (p0-p3)" label)
+- Added `loop` attribute to video element for automatic continuous playback
+
+**Previous Fixes:**
+- Fixed Layer 2 not appearing (was excluded from layerOutputs due to isOutput check)
+- Fixed canvas context switching (force clear before recreation)
+- Added Static 0-3 / Static 4-7 buttons to view all 8 static feature channels
+- Aligned naming with train_cnn_v2.py/.wgsl: Layer 0, Layer 1, Layer 2 (not Layer 1, 2, 3)
+- Disabled Static buttons in weights mode (no learnable weights)
+
+**Known Issues:**
+- Layer activation visualization may render black; current evidence points to empty (all-zero) layer textures rather than an f16 unpacking fault (see Issue #1)
+- Weight kernel display depends on correct 2D context creation after canvas recreation
+
+---
+
+## Architecture
+
+### File Structure
+- Single-file HTML tool (~1100 lines)
+- Embedded shaders: STATIC_SHADER, CNN_SHADER, DISPLAY_SHADER, LAYER_VIZ_SHADER
+- Shared WGSL component: FULLSCREEN_QUAD_VS (reused across render pipelines)
+- **Embedded default weights:** DEFAULT_WEIGHTS_B64 (base64-encoded binary v2)
+ - Current: 4 layers (3×3, 5×5, 3×3, 3×3), 2496 f16 weights, mip_level=2
+ - Source: `workspaces/main/weights/cnn_v2_weights.bin`
+ - Updates: Re-encode binary with `base64 -i <file>` and update constant
+- Pure WebGPU (no external dependencies)
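
The embedded default weights have to be turned back into raw bytes before they can be parsed. A minimal sketch of that step, assuming the constant is plain standard base64; `base64ToArrayBuffer` is a hypothetical helper name and is not taken from the tool's source:

```javascript
// Decode a base64 string (for example the DEFAULT_WEIGHTS_B64 constant)
// into an ArrayBuffer that a binary parser can read.
// NOTE: helper name is hypothetical; the tool may structure this differently.
function base64ToArrayBuffer(b64) {
  const binary = atob(b64);                 // one character per decoded byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);        // char code is the byte value (0-255)
  }
  return bytes.buffer;
}
```

`atob` is available in browsers and as a global in recent Node.js versions, so the same helper can be exercised outside the page.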
+
+### Code Organization
+
+**Recent Refactoring (2026-02-13):**
+- Extracted `FULLSCREEN_QUAD_VS` constant: Reused fullscreen quad vertex shader (2 triangles covering NDC)
+- Added helper methods to CNNTester class:
+ - `getDimensions()`: Returns current source dimensions (video or image)
+ - `setVideoControlsEnabled(enabled)`: Centralized video control enable/disable
+- Consolidated duplicate vertex shader code (used in mipmap generation, display, layer visualization)
+- Added section headers in JavaScript for better navigation
+- Improved inline comments explaining shader architecture
+
+**Benefits:**
+- Reduced code duplication (~40 lines saved)
+- Easier maintenance (single source of truth for fullscreen quad)
+- Clearer separation of concerns
+
+### Key Components
+
+**1. Weight Parsing**
+- Reads binary format v2: header (20B) + layer info (20B×N) + f16 weights
+- Backward compatible with v1: header (16B), mip_level defaults to 0
+- Computes min/max per layer via f16 unpacking
+- Stores `{ layers[], weights[], mipLevel, fileSize }`
+- Auto-sets UI mip-level dropdown from loaded weights
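
The v1/v2 header handling described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual parser: it assumes every header field is a little-endian u32, that the fields appear in the order magic, version, layer count, total weights, and that v2 appends `mip_level` as a fifth u32. The name `parseHeader` and the returned field names are hypothetical.

```javascript
// Sketch of header parsing for the binary weight format.
// ASSUMPTIONS (not confirmed by this document): little-endian u32 fields,
// order = magic, version, layer count, total weights, then mip_level (v2 only).
function parseHeader(buffer) {
  const view = new DataView(buffer);
  const magic = view.getUint32(0, true);
  const version = view.getUint32(4, true);
  const numLayers = view.getUint32(8, true);
  const totalWeights = view.getUint32(12, true);

  // v1 has a 16-byte header and no mip level; v2 has a 20-byte header.
  const headerSize = version >= 2 ? 20 : 16;
  const mipLevel = version >= 2 ? view.getUint32(16, true) : 0;

  // Layer info records are 20 bytes each and follow the header directly.
  const weightsOffset = headerSize + 20 * numLayers;
  return { magic, version, numLayers, totalWeights, mipLevel, headerSize, weightsOffset };
}
```

Deriving `headerSize` from the version field keeps the backward-compatible path to a single branch, and the v1 default of `mipLevel = 0` matches the behaviour listed above.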
+
+**2. CNN Pipeline**
+- Static features computation (RGBD + UV + sin + bias → 8 channels, packed as f16)
+- Layer-by-layer convolution with storage buffer weights
+- Ping-pong buffers for intermediate results
+- Copy to persistent textures for visualization
+
+**3. Visualization Modes**
+
+**Activations Mode:**
+- 4 grayscale views per layer (channels 0-3 of up to 8 total)
+- WebGPU compute → unpack f16 → scale → grayscale
+- Fixed display scale per source: Static features = 1.0, CNN layers = 0.2
+- Static features: Shows R,G,B,D (first 4 of 8: RGBD+UV+sin+bias)
+- CNN layers: Shows first 4 output channels
+
+**Weights Mode:**
+- 2D canvas rendering per output channel
+- Shows all input kernels horizontally
+- Normalized by layer min/max → [0, 1] → grayscale
+- 20px cells, 2px padding between kernels
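
The min/max normalization step can be illustrated with a small sketch. The function name `weightToGray` and the mid-gray fallback for a flat range are assumptions made for this example; the tool may handle a degenerate range differently.

```javascript
// Map one weight to an 8-bit gray level using the layer's min/max.
// ASSUMPTION: a flat range (max <= min, or NaN) falls back to mid-gray
// instead of dividing by zero. The tool's actual behaviour may differ.
function weightToGray(w, min, max) {
  const range = max - min;
  if (!(range > 0)) return 128;
  const t = Math.min(1, Math.max(0, (w - min) / range)); // normalize to [0, 1]
  return Math.round(t * 255);                            // gray level 0-255
}
```

An explicit fallback makes the "all values equal" case visible as uniform gray rather than black, which helps distinguish it from a rendering failure.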
+
+### Texture Management
+
+**Persistent Storage (layerTextures[]):**
+- One texture per layer output (static + all CNN layers)
+- `rgba32uint` format (packed f16 data)
+- `COPY_DST` usage for storing results
+
+**Compute Buffers (computeTextures[]):**
+- 2 textures for ping-pong computation
+- Reused across all layers
+- `COPY_SRC` usage for copying to persistent storage
+
+**Pipeline:**
+```
+Static pass → copy to layerTextures[0]
+For each CNN layer i:
+ Compute (ping-pong) → copy to layerTextures[i+1]
+```
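
The copy pattern above can be written out as an explicit plan, which is convenient for logging and for checking that every layer output is copied to its own persistent slot. This is a sketch only: `buildCopyPlan` is a hypothetical name, and alternating compute textures by even/odd layer index is an assumption about how the ping-pong is assigned.

```javascript
// List each pass with the persistent texture slot its result is copied to.
// ASSUMPTION: CNN layer i writes computeTextures[i % 2] (simple alternation).
function buildCopyPlan(numLayers) {
  const plan = [{ pass: "static", copyTo: 0 }];   // static pass → layerTextures[0]
  for (let i = 0; i < numLayers; i++) {
    plan.push({
      pass: `layer ${i}`,
      writeTo: i % 2,     // index into computeTextures[]
      copyTo: i + 1,      // index into layerTextures[]
    });
  }
  return plan;
}
```

Logging such a plan next to the commands actually recorded is one way to check whether any copy is missing from the encoder.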
+
+### Layer Indexing
+
+**UI Layer Buttons:**
+- "Static 0-3" / "Static 4-7" → layerOutputs[0] (8 static input features, shown four channels at a time)
+- "Layer 0" → layerOutputs[1] (first CNN layer output, uses weights.layers[0])
+- "Layer 1" → layerOutputs[2] (second CNN layer output, uses weights.layers[1])
+- "Layer N" → layerOutputs[N+1] (output of CNN layer N, uses weights.layers[N])
+
+**Weights Table:**
+- "Layer 0" → weights.layers[0] (first CNN layer weights)
+- "Layer 1" → weights.layers[1] (second CNN layer weights)
+- "Layer N" → weights.layers[N]
+
+**Consistency:** Both the UI and the weights table use the same 0-based numbering (0, 1, 2...) for CNN layers, matching `train_cnn_v2.py` and the WGSL shaders.
+
+---
+
+## Known Issues
+
+### Issue #1: Layer Activations Show Black
+
+**Symptom:**
+- All 4 channel canvases render black
+- UV gradient test (debug mode 10) works
+- Raw packed data test (mode 11) shows black
+- Unpacked f16 test (mode 12) shows black
+
+**Diagnosis:**
+- Texture access works (UV gradient visible)
+- Texture data is all zeros (packed.x = 0)
+- Textures being read are empty
+
+**Suspected Root Cause (unconfirmed):**
+- `copyTextureToTexture` operations may not be executing
+- Possible ordering issue (copies not submitted before visualization)
+- Alternative: textures created with wrong usage flags
+
+**Investigation Steps Taken:**
+1. Added `onSubmittedWorkDone()` wait before visualization
+2. Verified texture creation with `COPY_SRC` and `COPY_DST` flags
+3. Confirmed separate texture allocation per layer (no aliasing)
+4. Added debug shader modes to isolate issue
+
+**Next Steps:**
+- Verify encoder contains copy commands (add debug logging)
+- Check if compute passes actually write data (add known-value test)
+- Test copyTextureToTexture in isolation
+- Consider CPU readback to verify texture contents
+
+### Issue #2: Weight Visualization Empty
+
+**Symptom:**
+- Canvases created with correct dimensions (logged)
+- No visual output (black canvases)
+- Console logs show method execution
+
+**Potential Causes:**
+1. Weight indexing calculation incorrect
+2. Canvas not properly attached to DOM when rendering
+3. 2D context operations not flushing
+4. Min/max normalization producing black (all values equal?)
+
+**Debug Added:**
+- Comprehensive logging of dimensions, indices, ranges
+- Canvas context check before rendering
+
+**Next Steps:**
+- Add test rendering (fixed gradient) to verify 2D context works
+- Log sample weight values to verify data access
+- Check if canvas is visible in DOM inspector
+- Verify min/max calculation produces valid range
+
+---
+
+## UI Layout
+
+### Header
+- Controls: Blend slider, Depth input, View mode display
+- Drop zone for .bin weight files
+
+### Content Area
+
+**Left Sidebar (300px):**
+1. Drop zone for .bin weight files
+2. Weights Info panel (file size, layer table with min/max)
+3. Weights Visualization panel (per-layer kernel display)
+4. **Mip Level selector** (bottom) - Select p0/p1/p2 for static features
+
+**Main Canvas (center):**
+- CNN output display with video controls (Play/Pause, Frame ◄/►)
+- Supports both PNG images and video files (MP4, WebM)
+- Video loops automatically for continuous playback
+
+**Right Sidebar (panels):**
+1. **Layer Visualization Panel** (top, flex: 1)
+ - Layer selection buttons (Static 0-3, Static 4-7, Layer 0, Layer 1, ...)
+ - 2×2 grid of channel views (grayscale activations)
+ - 4× zoom view at bottom
+
+### Footer
+- Status line (GPU timing, dimensions, mode)
+- Console log (scrollable, color-coded)
+
+---
+
+## Shader Details
+
+### LAYER_VIZ_SHADER
+
+**Purpose:** Display single channel from packed layer texture
+
+**Inputs:**
+- `@binding(0) layer_tex: texture_2d<u32>` - Packed f16 layer data
+- `@binding(1) viz_params: vec2<f32>` - (channel_idx, scale)
+
+**Debug Modes:**
+- Channel 10: UV gradient (texture coordinate test)
+- Channel 11: Raw packed u32 data
+- Channel 12: First unpacked f16 value
+
+**Normal Operation:**
+- Unpack all 8 f16 channels from rgba32uint
+- Select channel by index (0-7)
+- Apply scale factor (1.0 for static, 0.2 for CNN)
+- Clamp to [0, 1] and output grayscale
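
A CPU-side reference for the normal-operation path can help when checking readback data against what the shader should display. The sketch below assumes channel `c` lives in texel component `c >> 1`, with even channels in the low 16 bits, which mirrors how WGSL `unpack2x16float` orders its two results. The actual channel layout used by the tool is not stated here, and the function names are hypothetical.

```javascript
// Convert one IEEE 754 half-precision bit pattern (0..0xffff) to a number.
function halfToFloat(h) {
  const sign = (h & 0x8000) ? -1 : 1;
  const exp = (h >> 10) & 0x1f;
  const frac = h & 0x3ff;
  if (exp === 0) return sign * frac * 2 ** -24;         // zero or subnormal
  if (exp === 31) return frac ? NaN : sign * Infinity;  // NaN or infinity
  return sign * (1 + frac / 1024) * 2 ** (exp - 15);    // normal number
}

// texel: four u32 values (r, g, b, a of one rgba32uint texel), 8 f16 channels.
// ASSUMPTION: channel c is in component c >> 1; even channels use the low half.
function channelToGray(texel, channel, scale) {
  const word = texel[channel >> 1] >>> 0;
  const bits = (channel & 1) ? (word >>> 16) : (word & 0xffff);
  const v = halfToFloat(bits) * scale;                  // 1.0 static, 0.2 CNN
  return Math.min(1, Math.max(0, v));                   // clamp to [0, 1]
}
```

For example, a texel whose first word is `0x41003C00` holds 1.0 in channel 0 and 2.5 in channel 1 under this layout, so channel 1 at scale 0.2 comes out as mid-gray.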
+
+**Scale Rationale:**
+- Static features (RGBD, UV): already in [0, 1] range
+- CNN activations: post-ReLU [0, ~5], need scaling for visibility
+
+---
+
+## Binary Weight Format
+
+See `doc/CNN_V2_BINARY_FORMAT.md` for complete specification.
+
+**Quick Summary:**
+- Header: 20 bytes in v2, 16 bytes in v1 (magic, version, layer count, total weights; v2 adds mip_level)
+- Layer info: 20 bytes × N (kernel size, channels, offsets)
+- Weights: Packed f16 pairs as u32
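
As a quick sanity check when loading a file, the expected size can be computed from the header values. This sketch assumes there is no padding between sections and that an odd weight count is rounded up to a whole u32; the header sizes (16 bytes for v1, 20 bytes for v2) are the ones given under Weight Parsing. `expectedFileSize` is a hypothetical helper name.

```javascript
// Expected file size in bytes for a given version, layer count and weight count.
// ASSUMPTIONS: sections are contiguous (no padding) and an odd number of
// f16 weights is padded up to a full u32.
function expectedFileSize(version, numLayers, totalWeights) {
  const headerBytes = version >= 2 ? 20 : 16;
  const layerInfoBytes = 20 * numLayers;                 // 20 bytes per layer record
  const weightBytes = Math.ceil(totalWeights / 2) * 4;   // two f16 values per u32
  return headerBytes + layerInfoBytes + weightBytes;
}
```

Under these assumptions, the embedded default weights (v2, 4 layers, 2496 f16 weights) come to 20 + 80 + 4992 = 5092 bytes, which can be compared with the file size shown in the Weights Info panel.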
+
+---
+
+## Testing Workflow
+
+### Load & Parse
+1. Drop PNG image → displays original
+2. Drop .bin weights → parses and shows info table
+3. Auto-runs CNN pipeline
+
+### Verify Pipeline
+1. Check console for "Running CNN pipeline"
+2. Verify "Completed in Xms"
+3. Check "Layer visualization ready: N layers"
+
+### Debug Activations
+1. Select "Activations" tab
+2. Click layer buttons to switch
+3. Check console for texture/canvas logs
+4. If black: note which debug modes work (UV vs data)
+
+### Debug Weights
+1. Select "Weights" tab
+2. Click a CNN layer button (Layer 0, Layer 1, ...); the Static buttons are disabled in this mode because static features have no learnable weights
+3. Check console for "Visualizing Layer N weights"
+4. Check canvas dimensions logged
+5. Verify weight range is non-trivial (not [0, 0])
+
+---
+
+## Integration with Main Project
+
+**Training Pipeline:**
+```bash
+# Generate weights
+./training/train_cnn_v2.py --export-binary
+
+# Test in browser
+open tools/cnn_v2_test/index.html
+# Drop: workspaces/main/cnn_v2_weights.bin
+# Drop: training/input/test.png
+```
+
+**Validation:**
+- Compare against demo CNNv2Effect (visual check)
+- Verify layer count matches binary file
+- Check weight ranges match training logs
+
+---
+
+## Future Enhancements
+
+- [ ] Fix layer activation visualization (black texture issue)
+- [ ] Fix weight kernel display (empty canvas issue)
+- [ ] Add per-channel auto-scaling (compute min/max from visible data)
+- [ ] Export rendered outputs (download PNG)
+- [ ] Side-by-side comparison with original
+- [ ] Heatmap mode (color-coded activations)
+- [ ] Weight statistics overlay (mean, std, sparsity)
+- [ ] Batch processing (multiple images in sequence)
+- [ ] Integration with Python training (live reload)
+
+---
+
+## Code Metrics
+
+- Total lines: ~1100
+- JavaScript: ~700 lines
+- WGSL shaders: ~300 lines
+- HTML/CSS: ~100 lines
+
+**Dependencies:** None (pure WebGPU + HTML5)
+
+---
+
+## Related Files
+
+- `doc/CNN_V2.md` - CNN v2 architecture and design
+- `doc/CNN_TEST_TOOL.md` - C++ offline testing tool (deprecated)
+- `training/train_cnn_v2.py` - Training script with binary export
+- `workspaces/main/cnn_v2_weights.bin` - Trained weights