Diffstat (limited to 'doc/CNN_V2_WEB_TOOL.md')
-rw-r--r-- doc/CNN_V2_WEB_TOOL.md | 348
1 files changed, 0 insertions, 348 deletions
diff --git a/doc/CNN_V2_WEB_TOOL.md b/doc/CNN_V2_WEB_TOOL.md
deleted file mode 100644
index b6f5b0b..0000000
--- a/doc/CNN_V2_WEB_TOOL.md
+++ /dev/null
@@ -1,348 +0,0 @@
-# CNN v2 Web Testing Tool
-
-Browser-based WebGPU tool for validating CNN v2 inference with layer visualization and weight inspection.
-
-**Location:** `tools/cnn_v2_test/index.html`
-
----
-
-## Status (2026-02-13)
-
-**Working:**
-- ✅ WebGPU initialization and device setup
-- ✅ Binary weight file parsing (v1 and v2 formats)
-- ✅ Automatic mip-level detection from binary format v2
-- ✅ Weight statistics (min/max per layer)
-- ✅ UI layout with collapsible panels
-- ✅ Mode switching (Activations/Weights tabs)
-- ✅ Canvas context management (2D for weights, WebGPU for activations)
-- ✅ Weight visualization infrastructure (layer selection, grid layout)
-- ✅ Layer naming matches codebase convention (Layer 0, Layer 1, Layer 2)
-- ✅ Static features split visualization (Static 0-3, Static 4-7)
-- ✅ All layers visible including output layer (Layer 2)
-- ✅ Video playback support (MP4, WebM) with frame-by-frame controls
-- ✅ Video looping (automatic continuous playback)
-- ✅ Mip level selection (p0-p3 features at different resolutions)
-
-**Recent Changes (Latest):**
-- Binary format v2 support: Reads mip_level from 20-byte header
-- Backward compatible: v1 (16-byte header) → mip_level=0
-- Auto-update UI dropdown when loading weights with mip_level
-- Display mip_level in metadata panel
-- Code refactoring: Extracted FULLSCREEN_QUAD_VS shader (reused 3× across pipelines)
-- Added helper methods: `getDimensions()`, `setVideoControlsEnabled()`
-- Improved code organization with section headers and comments
-- Moved Mip Level selector to bottom of left sidebar (removed "Features (p0-p3)" label)
-- Added `loop` attribute to video element for automatic continuous playback
-
-**Previous Fixes:**
-- Fixed Layer 2 not appearing (was excluded from layerOutputs due to isOutput check)
-- Fixed canvas context switching (force clear before recreation)
-- Added Static 0-3 / Static 4-7 buttons to view all 8 static feature channels
-- Aligned naming with train_cnn_v2.py/.wgsl: Layer 0, Layer 1, Layer 2 (not Layer 1, 2, 3)
-- Disabled Static buttons in weights mode (no learnable weights)
-
-**Known Issues:**
-- Layer activation visualization may render black if texture data is not unpacked correctly
-- Weight kernel display depends on the 2D context being created correctly after canvas recreation
-
----
-
-## Architecture
-
-### File Structure
-- Single-file HTML tool (~1100 lines)
-- Embedded shaders: STATIC_SHADER, CNN_SHADER, DISPLAY_SHADER, LAYER_VIZ_SHADER
-- Shared WGSL component: FULLSCREEN_QUAD_VS (reused across render pipelines)
-- **Embedded default weights:** DEFAULT_WEIGHTS_B64 (base64-encoded binary v2)
-  - Current: 4 layers (3×3, 5×5, 3×3, 3×3), 2496 f16 weights, mip_level=2
-  - Source: `workspaces/main/weights/cnn_v2_weights.bin`
-  - Updates: Re-encode binary with `base64 -i <file>` and update constant
-- Pure WebGPU (no external dependencies)
-
-### Code Organization
-
-**Recent Refactoring (2026-02-13):**
-- Extracted `FULLSCREEN_QUAD_VS` constant: Reused fullscreen quad vertex shader (2 triangles covering NDC)
-- Added helper methods to CNNTester class:
-  - `getDimensions()`: Returns current source dimensions (video or image)
-  - `setVideoControlsEnabled(enabled)`: Centralized video control enable/disable
-- Consolidated duplicate vertex shader code (used in mipmap generation, display, layer visualization)
-- Added section headers in JavaScript for better navigation
-- Improved inline comments explaining shader architecture
-
-**Benefits:**
-- Reduced code duplication (~40 lines saved)
-- Easier maintenance (single source of truth for fullscreen quad)
-- Clearer separation of concerns
-
-### Key Components
-
-**1. Weight Parsing**
-- Reads binary format v2: header (20B) + layer info (20B×N) + f16 weights
-- Backward compatible with v1: header (16B), mip_level defaults to 0
-- Computes min/max per layer via f16 unpacking
-- Stores `{ layers[], weights[], mipLevel, fileSize }`
-- Auto-sets UI mip-level dropdown from loaded weights
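-
-The header handling above could look roughly like the sketch below. It is not taken from `index.html`: the function name `parseWeightsHeader`, the little-endian u32 encoding, and the field order (magic, version, layer count, total weights, then mip_level) are assumptions. See `doc/CNN_V2_BINARY_FORMAT.md` for the authoritative layout.
-
-```javascript
-// Hypothetical sketch: field order and little-endian u32 fields are assumed.
-function parseWeightsHeader(buffer) {
-  if (buffer.byteLength < 16) throw new Error('file too small for a header');
-  const view = new DataView(buffer);
-  const magic = view.getUint32(0, true);
-  const version = view.getUint32(4, true);
-  const numLayers = view.getUint32(8, true);
-  const totalWeights = view.getUint32(12, true);
-  // v2 appends mip_level as a fifth u32; v1 has no such field, so default to 0.
-  const isV2 = version >= 2;
-  const headerSize = isV2 ? 20 : 16;
-  if (buffer.byteLength < headerSize) throw new Error('truncated v2 header');
-  const mipLevel = isV2 ? view.getUint32(16, true) : 0;
-  // headerSize tells the caller where the per-layer info records start.
-  return { magic, version, numLayers, totalWeights, mipLevel, headerSize };
-}
-```
-
-Returning `headerSize` keeps the v1/v2 difference in one place: the layer-info table starts at that offset in either format.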
-
-**2. CNN Pipeline**
-- Static features computation (RGBD + UV + sin + bias → 8 packed channels: 7 features plus bias)
-- Layer-by-layer convolution with storage buffer weights
-- Ping-pong buffers for intermediate results
-- Copy to persistent textures for visualization
-
-**3. Visualization Modes**
-
-**Activations Mode:**
-- 4 grayscale views per layer (channels 0-3 of up to 8 total)
-- WebGPU compute → unpack f16 → scale → grayscale
-- Auto-scale: Static features = 1.0, CNN layers = 0.2
-- Static features: Shows R,G,B,D (first 4 of 8: RGBD+UV+sin+bias)
-- CNN layers: Shows first 4 output channels
-
-**Weights Mode:**
-- 2D canvas rendering per output channel
-- Shows all input kernels horizontally
-- Normalized by layer min/max → [0, 1] → grayscale
-- 20px cells, 2px padding between kernels
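-
-A minimal sketch of the min/max normalization step, assuming a hypothetical helper `weightToGray` (not a name from the tool). It also shows one way to handle a layer whose weights are all equal, where a naive division would produce NaN or a uniformly black canvas; mapping that case to mid-gray is a choice made for this example.
-
-```javascript
-// Map a weight to an 8-bit gray level using the layer's min/max.
-function weightToGray(w, min, max) {
-  const range = max - min;
-  // Flat layer (max === min): no contrast available, so use mid-gray
-  // instead of dividing by zero.
-  const t = range > 0 ? (w - min) / range : 0.5;
-  // Clamp to [0, 1] before scaling in case w lies outside the stated range.
-  return Math.round(Math.min(1, Math.max(0, t)) * 255);
-}
-```
-
-The returned value can be used for all three RGB components of a cell's fill color on the 2D canvas.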
-
-### Texture Management
-
-**Persistent Storage (layerTextures[]):**
-- One texture per layer output (static + all CNN layers)
-- `rgba32uint` format (packed f16 data)
-- `COPY_DST` usage for storing results
-
-**Compute Buffers (computeTextures[]):**
-- 2 textures for ping-pong computation
-- Reused across all layers
-- `COPY_SRC` usage for copying to persistent storage
-
-**Pipeline:**
-```
-Static pass → copy to layerTextures[0]
-For each CNN layer i:
- Compute (ping-pong) → copy to layerTextures[i+1]
-```
-
-### Layer Indexing
-
-**UI Layer Buttons:**
-- "Static 0-3" / "Static 4-7" → layerOutputs[0] (8 static feature channels, shown 4 at a time)
-- "Layer 0" → layerOutputs[1] (first CNN layer output, uses weights.layers[0])
-- "Layer 1" → layerOutputs[2] (second CNN layer output, uses weights.layers[1])
-- "Layer N" → layerOutputs[N+1] (CNN layer N output, uses weights.layers[N])
-
-**Weights Table:**
-- "Layer 0" → weights.layers[0] (first CNN layer weights)
-- "Layer 1" → weights.layers[1] (second CNN layer weights)
-- "Layer N" → weights.layers[N]
-
-**Consistency:** Both the UI and the weights table use the same zero-based numbering (0, 1, 2, ...) for CNN layers, matching `train_cnn_v2.py` and the WGSL shaders.
-
----
-
-## Known Issues
-
-### Issue #1: Layer Activations Show Black
-
-**Symptom:**
-- All 4 channel canvases render black
-- UV gradient test (debug mode 10) works
-- Raw packed data test (mode 11) shows black
-- Unpacked f16 test (mode 12) shows black
-
-**Diagnosis:**
-- Texture access works (UV gradient visible)
-- Texture data is all zeros (packed.x = 0)
-- Textures being read are empty
-
-**Suspected Root Cause (unconfirmed):**
-- `copyTextureToTexture` operations may not be executing
-- Possible ordering issue (copies not submitted before visualization)
-- Alternative: textures created with wrong usage flags
-
-**Investigation Steps Taken:**
-1. Added `onSubmittedWorkDone()` wait before visualization
-2. Verified texture creation with `COPY_SRC` and `COPY_DST` flags
-3. Confirmed separate texture allocation per layer (no aliasing)
-4. Added debug shader modes to isolate issue
-
-**Next Steps:**
-- Verify encoder contains copy commands (add debug logging)
-- Check if compute passes actually write data (add known-value test)
-- Test copyTextureToTexture in isolation
-- Consider CPU readback to verify texture contents
-
-### Issue #2: Weight Visualization Empty
-
-**Symptom:**
-- Canvases created with correct dimensions (logged)
-- No visual output (black canvases)
-- Console logs show method execution
-
-**Potential Causes:**
-1. Weight indexing calculation incorrect
-2. Canvas not properly attached to DOM when rendering
-3. 2D context operations not flushing
-4. Min/max normalization producing black (all values equal?)
-
-**Debug Added:**
-- Comprehensive logging of dimensions, indices, ranges
-- Canvas context check before rendering
-
-**Next Steps:**
-- Add test rendering (fixed gradient) to verify 2D context works
-- Log sample weight values to verify data access
-- Check if canvas is visible in DOM inspector
-- Verify min/max calculation produces valid range
-
----
-
-## UI Layout
-
-### Header
-- Controls: Blend slider, Depth input, View mode display
-- Drop zone for .bin weight files
-
-### Content Area
-
-**Left Sidebar (300px):**
-1. Drop zone for .bin weight files
-2. Weights Info panel (file size, layer table with min/max)
-3. Weights Visualization panel (per-layer kernel display)
-4. **Mip Level selector** (bottom) - Select p0/p1/p2 for static features
-
-**Main Canvas (center):**
-- CNN output display with video controls (Play/Pause, Frame ◄/►)
-- Supports both PNG images and video files (MP4, WebM)
-- Video loops automatically for continuous playback
-
-**Right Sidebar (panels):**
-1. **Layer Visualization Panel** (top, flex: 1)
-   - Layer selection buttons (Static 0-3, Static 4-7, Layer 0, Layer 1, ...)
-   - 2×2 grid of channel views (grayscale activations)
-   - 4× zoom view at bottom
-
-### Footer
-- Status line (GPU timing, dimensions, mode)
-- Console log (scrollable, color-coded)
-
----
-
-## Shader Details
-
-### LAYER_VIZ_SHADER
-
-**Purpose:** Display single channel from packed layer texture
-
-**Inputs:**
-- `@binding(0) layer_tex: texture_2d<u32>` - Packed f16 layer data
-- `@binding(1) viz_params: vec2<f32>` - (channel_idx, scale)
-
-**Debug Modes:**
-- Channel 10: UV gradient (texture coordinate test)
-- Channel 11: Raw packed u32 data
-- Channel 12: First unpacked f16 value
-
-**Normal Operation:**
-- Unpack all 8 f16 channels from rgba32uint
-- Select channel by index (0-7)
-- Apply scale factor (1.0 for static, 0.2 for CNN)
-- Clamp to [0, 1] and output grayscale
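-
-For checking shader output against known values, the normal-operation path can be mirrored on the CPU. The sketch below is an illustration, not code from the tool: `f16ToNumber` and `vizChannel` are hypothetical names, and the packing order (two f16 values per u32, lower-indexed channel in the low 16 bits, as with WGSL `unpack2x16float`) is an assumption about how the layer textures are written.
-
-```javascript
-// Decode one IEEE 754 half-precision value from its 16-bit pattern.
-function f16ToNumber(bits) {
-  const sign = (bits & 0x8000) ? -1 : 1;
-  const exp = (bits >> 10) & 0x1f;
-  const frac = bits & 0x3ff;
-  if (exp === 0) return sign * frac * 2 ** -24;        // zero and subnormals
-  if (exp === 31) return frac ? NaN : sign * Infinity; // NaN and infinities
-  return sign * (1 + frac / 1024) * 2 ** (exp - 15);   // normal numbers
-}
-
-// packed: the four u32 components of one rgba32uint texel (8 f16 channels).
-// Assumed layout: channel 2k in the low half of packed[k], 2k+1 in the high half.
-function vizChannel(packed, channelIdx, scale) {
-  const word = packed[channelIdx >> 1];
-  const bits = (channelIdx & 1) ? (word >>> 16) : (word & 0xffff);
-  const v = f16ToNumber(bits) * scale;
-  return Math.min(1, Math.max(0, v)); // clamp to [0, 1] for grayscale output
-}
-```
-
-Feeding a CPU readback of a single texel through `vizChannel` gives the gray value the shader should produce for that pixel, which helps separate "texture is empty" from "unpacking is wrong".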
-
-**Scale Rationale:**
-- Static features (RGBD, UV): already in [0, 1] range
-- CNN activations: post-ReLU [0, ~5], need scaling for visibility
-
----
-
-## Binary Weight Format
-
-See `doc/CNN_V2_BINARY_FORMAT.md` for complete specification.
-
-**Quick Summary:**
-- Header: 20 bytes in v2 (magic, version, layer count, total weights, mip_level); 16 bytes in v1 (no mip_level)
-- Layer info: 20 bytes × N (kernel size, channels, offsets)
-- Weights: Packed f16 pairs as u32
-
----
-
-## Testing Workflow
-
-### Load & Parse
-1. Drop PNG image → displays original
-2. Drop .bin weights → parses and shows info table
-3. Auto-runs CNN pipeline
-
-### Verify Pipeline
-1. Check console for "Running CNN pipeline"
-2. Verify "Completed in Xms"
-3. Check "Layer visualization ready: N layers"
-
-### Debug Activations
-1. Select "Activations" tab
-2. Click layer buttons to switch
-3. Check console for texture/canvas logs
-4. If black: note which debug modes work (UV vs data)
-
-### Debug Weights
-1. Select "Weights" tab
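-2. Click a CNN layer button (Layer 0, Layer 1, or Layer 2); the Static buttons are disabled in this mode because static features have no learnable weights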
-3. Check console for "Visualizing Layer N weights"
-4. Check canvas dimensions logged
-5. Verify weight range is non-trivial (not [0, 0])
-
----
-
-## Integration with Main Project
-
-**Training Pipeline:**
-```bash
-# Generate weights
-./training/train_cnn_v2.py --export-binary
-
-# Test in browser
-open tools/cnn_v2_test/index.html
-# Drop: workspaces/main/cnn_v2_weights.bin
-# Drop: training/input/test.png
-```
-
-**Validation:**
-- Compare against demo CNNv2Effect (visual check)
-- Verify layer count matches binary file
-- Check weight ranges match training logs
-
----
-
-## Future Enhancements
-
-- [ ] Fix layer activation visualization (black texture issue)
-- [ ] Fix weight kernel display (empty canvas issue)
-- [ ] Add per-channel auto-scaling (compute min/max from visible data)
-- [ ] Export rendered outputs (download PNG)
-- [ ] Side-by-side comparison with original
-- [ ] Heatmap mode (color-coded activations)
-- [ ] Weight statistics overlay (mean, std, sparsity)
-- [ ] Batch processing (multiple images in sequence)
-- [ ] Integration with Python training (live reload)
-
----
-
-## Code Metrics
-
-- Total lines: ~1100
-- JavaScript: ~700 lines
-- WGSL shaders: ~300 lines
-- HTML/CSS: ~100 lines
-
-**Dependencies:** None (pure WebGPU + HTML5)
-
----
-
-## Related Files
-
-- `doc/CNN_V2.md` - CNN v2 architecture and design
-- `doc/CNN_TEST_TOOL.md` - C++ offline testing tool (deprecated)
-- `training/train_cnn_v2.py` - Training script with binary export
-- `workspaces/main/cnn_v2_weights.bin` - Trained weights