path: root/tools/cnn_v2_test
Age | Commit message | Author
21 hours | CNN v2 web tool: Update embedded weights to current checkpoint | skal
Replaces v1 weights (3 layers) with v2 weights from workspaces/main/weights/cnn_v2_weights.bin:
- 4 layers: 3×3, 5×5, 3×3, 3×3
- 2496 f16 weights
- mip_level=2
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
22 hours | CNN v2 web tool: Multiple fixes for feature parity with cnn_test | skal
Changes:
- Static shader: Point sampler (nearest filter) instead of linear
- Mip handling: Use textureSampleLevel with point sampler (fixes coordinate scaling)
- Save PNG: GPU readback via staging buffer (WebGPU canvas lacks toBlob support)
- Depth binding: Use input texture as depth (matches C++ simplification)
- Header offset: Version-aware calculation (v1=4, v2=5 u32)
Known issue: Output still differs from cnn_test (color tones). Root cause TBD.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
22 hours | CNN v2 web tool: Fix static features shader sampling and header offset | skal
Root cause: HTML tool was producing incorrect output vs cnn_test due to:
1. Linear filtering: textureSampleLevel() with sampler blurred p0-p3 features
2. Header offset bug: Used 4 u32 instead of 5 u32 for version 2 binary format
Changes:
- Static shader: Replace textureSampleLevel (linear) with textureLoad (point)
- Bind group: Use 3 separate mip views instead of sampler
- Header offset: Account for version-specific header size (v1=4, v2=5 u32)
- Add version field to weights object for correct offset calculation
- Add savePNG button for convenience
Result: HTML output now matches cnn_test output exactly.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
22 hours | CNN v2 web tool: Enhance UI visibility and layer preview interaction | skal
Improve drop zone visibility with larger borders, bold blue text, and brighter hover states for better user guidance.
Replace hover-based zoom with click-to-preview: clicking any of the 4 small channel views displays it large below. Active channel highlighted with white border for clear visual feedback.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
26 hours | CNN v2: Change feature #6 from sin(10*x) to sin(20*y) | skal
Update positional encoding to use vertical coordinate at higher frequency.
Changes:
- train_cnn_v2.py: sin10_x → sin20_y (computed from uv_y)
- cnn_v2_static.wgsl: sin10_x → sin20_y (computed from uv_y)
- index.html: sin10_x → sin20_y (STATIC_SHADER)
- CNN_V2.md: Update feature descriptions and examples
- CNN_V2_BINARY_FORMAT.md: Update static features documentation
Feature vector: [p0, p1, p2, p3, uv_x, uv_y, sin20_y, bias]
Rationale: Higher frequency (20 vs 10) + vertical axis provides better spatial discrimination for position encoding.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
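The feature vector above can be sketched for a single pixel as follows. This is an illustration only, not the code in train_cnn_v2.py or cnn_v2_static.wgsl: the function name `static_features` is hypothetical, and the constant bias of 1.0 is taken from the architecture refactor entry further down the log.

```python
import math

def static_features(p, uv_x, uv_y):
    """Return [p0, p1, p2, p3, uv_x, uv_y, sin20_y, bias] for one pixel.

    Hypothetical helper: p holds the four parametric features p0-p3,
    uv_x and uv_y are normalized coordinates.
    """
    if len(p) != 4:
        raise ValueError("expected four parametric features p0-p3")
    sin20_y = math.sin(20.0 * uv_y)  # feature #6: vertical axis, frequency 20
    bias = 1.0                       # bias carried as a constant feature
    return [*p, uv_x, uv_y, sin20_y, bias]
```

Calling `static_features([0.1, 0.2, 0.3, 0.4], 0.25, 0.5)` yields an 8-element list whose slot 6 depends only on `uv_y`.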
26 hours | CNN v2 HTML tool: Support binary format v2 with mip_level | skal
Parse v2 header (20 bytes) and read mip_level field. Display mip_level in metadata panel, set UI dropdown on load.
Changes:
- parseWeights(): Handle v1 (16-byte) and v2 (20-byte) headers
- Read mip_level from header[4] for version 2
- Return mipLevel in parsed weights object
- updateWeightsPanel(): Display mip level in metadata
- loadWeights(): Set this.mipLevel and update UI dropdown
Backward compatible: v1 weights → mipLevel=0
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
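A minimal Python sketch of the version-aware header handling described above (the tool itself does this in JavaScript in parseWeights()). Only the header sizes (16 and 20 bytes), the mip_level position at header[4], and the v1 fallback to 0 come from the log. The little-endian byte order and the version field sitting at header[1] are assumptions made for illustration, and `parse_header` is a hypothetical name.

```python
import struct

# Header length in u32 words per binary format version (from the log).
HEADER_U32 = {1: 4, 2: 5}

def parse_header(data: bytes) -> dict:
    """Read version and mip_level from the start of a weights blob."""
    if len(data) < 16:
        raise ValueError("blob shorter than the smallest (v1) header")
    words = struct.unpack_from("<4I", data, 0)  # ASSUMPTION: little-endian
    version = words[1]                          # ASSUMPTION: version at header[1]
    if version not in HEADER_U32:
        raise ValueError(f"unsupported version {version}")
    mip_level = 0                               # v1 weights default to mip level 0
    if version == 2:
        if len(data) < 20:
            raise ValueError("truncated v2 header")
        mip_level = struct.unpack_from("<I", data, 16)[0]  # header[4]
    header_u32 = HEADER_U32[version]
    return {
        "version": version,
        "header_u32": header_u32,
        "header_bytes": header_u32 * 4,
        "mip_level": mip_level,
    }
```

Returning `header_u32` alongside `mip_level` keeps the later weight-offset calculation independent of the version check.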
26 hours | CNN v2 test tool: Refactoring and video loop support | skal
Refactoring:
- Extract FULLSCREEN_QUAD_VS shader (reused in mipmap, display, layer viz)
- Add helper methods: getDimensions(), setVideoControlsEnabled()
- Add section headers and improve code organization (~40 lines saved)
- Move Mip Level selector to bottom of left sidebar
- Remove "Features (p0-p3)" panel header
Features:
- Add video loop support (continuous playback)
Documentation:
- Update CNN_V2_WEB_TOOL.md with latest changes
- Document refactoring benefits and code organization
- Update UI layout section with current structure
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
26 hours | CNN v2 test tool: Add mip level selector for p0-p3 features | skal
Add dropdown menu in left panel to select mip levels 0-2 for parametric features (p0-p3/RGBD). Uses trilinear filtering for smooth downsampling at higher mip levels.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
26 hours | CNN v2 test tool: Embed default weights for instant startup | skal
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
26 hours | CNN v2: Fix activation function mismatch between training and inference | skal
Layer 0 now uses clamp [0,1] in both training and inference (was using ReLU in shaders).
- index.html: Add is_layer_0 flag to LayerParams, handle Layer 0 separately
- export_cnn_v2_shader.py: Generate correct activation for Layer 0
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
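To show why the mismatch mattered, here is a scalar sketch of the two activations. The clamp to [0,1] for Layer 0 is from the entry above. That the remaining layers use plain ReLU is an assumption, since the log only says Layer 0 previously used ReLU in the shaders. `activate` is a hypothetical name.

```python
def activate(x: float, is_layer_0: bool) -> float:
    """Apply the per-layer activation to one value."""
    if is_layer_0:
        # Layer 0: clamp to [0, 1], matching training.
        return min(max(x, 0.0), 1.0)
    # ASSUMPTION: other layers keep ReLU (not stated in the log).
    return max(x, 0.0)
```

The two branches agree for inputs at or below 1 and diverge above it, so any Layer 0 pre-activation greater than 1 would have produced different outputs in training and inference before the fix.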
26 hours | CNN v2 test tool: UI improvements and video playback fixes | skal
- Change Depth control from number input to slider (0-1 range)
- Move video controls to floating overlay at top of canvas
- Remove View mode indicator from header (shortcuts still work)
- Remove scrollbar from Layer Visualization panel
- Fix layer viz flickering during video playback
- Fix video controls responsiveness during playback
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
27 hours | CNN v2 test tool: Add video playback support | skal
Features:
- Video file support (MP4, WebM, etc.) via drag-and-drop
- Play/Pause button with non-realtime playback (drops frames if CNN slow)
- Frame-by-frame navigation (◄/► step buttons)
- Unified image/video processing through same CNN pipeline
- Audio muted (video frames only)
Optimizations:
- Layer visualization updates only on pause/seek (~5-10ms saved per frame)
Architecture:
- copyExternalImageToTexture() works with both ImageBitmap and HTMLVideoElement
- Video loading: wait for metadata → seek to frame 0 → wait for readyState≥2 (decoded)
- Playback loop: requestAnimationFrame with isProcessing guard prevents overlapping inference
- Controls always visible, disabled for images
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
27 hours | CNN v2 web tool: Major UI redesign with three-panel layout | skal
UI Changes:
- Three-panel layout: left (weights), center (canvas), right (activations)
- Left sidebar: clickable weights drop zone, weights info, kernel visualization
- Right sidebar: 4 small activation views + large 4× zoom view
- Controls moved to header (inline with title)
Weights Visualization:
- Dedicated panel in left sidebar with layer buttons
- 1 pixel per weight (was 20px)
- All input channels horizontal, output channels stacked vertically
- Renders to separate canvas (not in activation grid)
Activation Viewer:
- 4 channels in horizontal row (was 2×2 grid)
- Mouse-driven zoom view below (32×32 area at 4× magnification)
- Zoom shows all 4 channels in 2×2 quadrant layout
- Removed activations/weights mode toggle
State Preservation:
- Blend changes preserve selected layer/channel
- Fixed activation view reset bug
Documentation:
- Updated README with new layout and feature descriptions
- Marked implemented features (weights viz, layer viewer)
- Updated size estimates (~22 KB total)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
27 hours | CNN v2 web tool: Fix WebGPU texture synchronization error | skal
Fixed validation error where staticTex was used for both storage write (in static compute pass) and texture read (in CNN bind group) within same command encoder. Now uses layerTextures[0] for reading, which is the copy destination and safe for read-only access.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
28 hours | CNN v2 web tool: Fix layer naming and visualization bugs | skal
- Align layer naming with codebase: Layer 0/1/2 (not Layer 1/2/3)
- Split static features: Static 0-3 (p0-p3) and Static 4-7 (uv,sin,bias)
- Fix Layer 2 not appearing: removed isOutput filter from layerOutputs
- Fix canvas context switching: force clear before recreation
- Disable static buttons in weights mode
- Add ASCII pipeline diagram to CNN_V2.md
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
30 hours | CNN v2: Refactor to uniform 12D→4D architecture | skal
**Architecture changes:**
- Static features (8D): p0-p3 (parametric) + uv_x, uv_y, sin(10×uv_x), bias
- Input RGBD (4D): fed separately to all layers
- All layers: uniform 12D→4D (4 prev/input + 8 static → 4 output)
- Bias integrated in static features (bias=False in PyTorch)
**Weight calculations:**
- 3 layers × (12 × 3×3 × 4) = 1296 weights
- f16: 2.6 KB (vs old variable arch: ~6.4 KB)
**Updated files:**
*Training (Python):*
- train_cnn_v2.py: Uniform model, takes input_rgbd + static_features
- export_cnn_v2_weights.py: Binary export for storage buffers
- export_cnn_v2_shader.py: Per-layer shader export (debugging)
*Shaders (WGSL):*
- cnn_v2_static.wgsl: p0-p3 parametric features (mips/gradients)
- cnn_v2_compute.wgsl: 12D input, 4D output, vec4 packing
*Tools:*
- HTML tool (cnn_v2_test): Updated for 12D→4D, layer visualization
*Docs:*
- CNN_V2.md: Updated architecture, training, validation sections
- HOWTO.md: Reference HTML tool for validation
*Removed:*
- validate_cnn_v2.sh: Obsolete (used CNN v1 tool)
All code consistent with bias=False (bias in static features as 1.0).
handoff(Claude): CNN v2 architecture finalized and documented
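The weight arithmetic above can be checked with a few lines of Python. The helper name `count_weights` is hypothetical. The 12 input and 4 output channels and the absence of separate bias terms are from this entry. The same formula also reproduces the 2496-weight figure reported for the later four-layer checkpoint (3×3, 5×5, 3×3, 3×3) at the top of the log.

```python
def count_weights(kernel_sizes, in_channels=12, out_channels=4):
    """Total weights for uniform 12D→4D layers with square kernels.

    No separate bias terms: the bias lives in the static features.
    """
    return sum(in_channels * k * k * out_channels for k in kernel_sizes)

three_layer = count_weights([3, 3, 3])      # 3 × (12 × 3×3 × 4)
three_layer_bytes = three_layer * 2         # f16 = 2 bytes per weight
four_layer = count_weights([3, 5, 3, 3])    # later checkpoint layout
```

Here `three_layer` is 1296, `three_layer_bytes` is 2592 (about 2.6 KB), and `four_layer` is 2496.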
31 hours | CNN v2 Web Tool: Unify layer terminology and add binary format spec | skal
- Rename 'Static (L0)' → 'Static' (clearer, less confusing)
- Update channel labels: 'R/G/B/D' → 'Ch0 (R)/Ch1 (G)/Ch2 (B)/Ch3 (D)'
- Add 'Layer' prefix in weights table for consistency
- Document layer indexing: Static + Layer 1,2,3... (UI) ↔ weights.layers[0,1,2...]
- Add explanatory notes about 7D input and 4-of-8 channel display
- Create doc/CNN_V2_BINARY_FORMAT.md with complete .bin specification
- Cross-reference spec in CNN_V2.md and CNN_V2_WEB_TOOL.md
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
31 hours | CNN v2 Web Tool: Add layer/weight visualization with debug infrastructure | skal
Features:
- Right sidebar with Layer Visualization (top) and Weights Info (collapsible, bottom)
- Activations mode: 4-channel grayscale views per layer (Static L0 + CNN layers)
- Weights mode: Kernel visualization with 2D canvas rendering
- Mode tabs to switch between activation and weight inspection
- Per-layer texture storage (separate from ping-pong compute buffers)
- Debug shader modes (UV gradient, raw packed data, unpacked f16)
- Comprehensive logging for diagnostics
Architecture:
- Persistent layerTextures[] for visualization (one per layer)
- Separate computeTextures[] for CNN ping-pong
- copyTextureToTexture after each layer pass
- Canvas recreation on mode switch (2D vs WebGPU context)
- Weight parsing with f16 unpacking and min/max calculation
Known Issues:
- Layer activations show black (texture data empty despite copies)
- Weight kernels not displaying (2D canvas renders not visible)
- Debug mode 10 (UV gradient) works, confirming texture access OK
- Root cause: likely GPU command ordering or texture usage flags
Documentation:
- Added doc/CNN_V2_WEB_TOOL.md with full status, architecture, debug steps
- Detailed issue tracking with investigation notes and next steps
Status: Infrastructure complete, debugging data flow issues.
handoff(Claude): Layer viz black due to empty textures despite copyTextureToTexture. Weight viz black despite correct canvas setup. Both issues need GPU pipeline audit.
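The "f16 unpacking and min/max calculation" step can be sketched in Python with the standard `struct` module, which supports half-precision floats through the `e` format. This is not the tool's JavaScript. The function names are hypothetical, and placing the first weight in the low 16 bits of each u32 is an assumption (it matches the convention of WGSL's `unpack2x16float`).

```python
import struct

def unpack_f16_pairs(words):
    """Expand packed u32 words into a flat list of f16 values."""
    values = []
    for w in words:
        # ASSUMPTION: low 16 bits hold the first weight, high 16 bits the second.
        lo, hi = struct.unpack("<2e", struct.pack("<I", w & 0xFFFFFFFF))
        values.extend((lo, hi))
    return values

def weight_range(words):
    """Return (min, max) over all unpacked weights, e.g. to scale a kernel view."""
    vals = unpack_f16_pairs(words)
    return min(vals), max(vals)
```

For example, the word `0xC0003C00` unpacks to `1.0` (low half, `0x3C00`) followed by `-2.0` (high half, `0xC000`).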
33 hours | Add CNN v2 WebGPU testing tool | skal
Implements single-file HTML tool for rapid CNN weight validation:
Features:
- Drag-drop PNG images (whole window) and .bin weights
- Real-time WebGPU compute pipeline (static features + N layers)
- Data-driven execution (reads layer count from binary)
- View modes: CNN output / Original / Diff (×10)
- Blend slider (0.0-1.0) for effect strength
- Console log with timestamps
- Keyboard shortcuts: SPACE (original), D (diff)
Architecture:
- Embedded WGSL shaders (static + compute + display)
- Binary parser for .bin format (header + layer info + f16 weights)
- Persistent textures for view mode switching
- Absolute weight offset calculation (header + layer info skip)
Implementation notes:
- Weight offsets in binary are relative to weights section
- JavaScript precalculates absolute offsets: headerOffsetU32 * 2 + offset
- Matches C++ shader behavior (simple get_weight without offset param)
- Ping-pong textures for multi-layer processing
TODO:
- Side panel: .bin metadata, weight statistics, validation
- Layer inspection: R/G/B/A plane split, intermediate outputs
- Activation heatmaps for debugging
Files:
- tools/cnn_v2_test/index.html (24 KB, 730 lines)
- tools/cnn_v2_test/README.md (usage guide, troubleshooting)
handoff(Claude): CNN v2 HTML testing tool complete, documented TODOs for future enhancements
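The offset rule in the implementation notes can be illustrated as below. The formula `headerOffsetU32 * 2 + offset` is from the entry. The factor of 2 converts a u32 index into an f16 index, since two f16 weights share one u32. The per-layer info size is not given in the log, so `layer_info_u32` is a hypothetical parameter, as are both function names.

```python
def header_offset_u32(header_u32: int, num_layers: int, layer_info_u32: int) -> int:
    """u32 words to skip before the weights section (header + layer info).

    layer_info_u32 is a HYPOTHETICAL per-layer record size; the log does
    not state it.
    """
    return header_u32 + num_layers * layer_info_u32

def absolute_weight_offset(header_off_u32: int, relative_offset: int) -> int:
    """Convert a weights-section-relative f16 offset to an absolute f16 index."""
    return header_off_u32 * 2 + relative_offset
```

With a 4-word v1 header, 3 layers, and an illustrative 5-word layer record, the skip is 19 words, so a layer whose relative offset is 432 starts at f16 index 470. Precomputing this on the CPU lets the shader index weights directly without an extra offset parameter.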