summaryrefslogtreecommitdiff
path: root/src/gpu/effects/cnn_v2_effect.cc
AgeCommit message (Collapse)Author
30 hoursCNN v2: Fix WebGPU validation error in uniform buffer alignmentskal
Fix two issues causing validation errors in test_demo: 1. Remove redundant pipeline creation without layout (static_pipeline_) 2. Change vec3<u32> to 3× u32 fields in StaticFeatureParams struct WGSL vec3<u32> aligns to 16 bytes (std140), making struct 32 bytes, while C++ struct was 16 bytes. Explicit fields ensure consistent layout. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
30 hoursCNN v2: Add TODO for flexible feature layout in binary format v3skal
Document future enhancement for arbitrary feature vector layouts. Proposed feature descriptor in binary format v3: - Specify feature types, sources, and ordering - Enable runtime experimentation without shader recompilation - Examples: [R,G,B,dx,dy,uv_x,bias] or [mip1.r,mip2.g,laplacian,uv_x,sin20_x,bias] Added TODOs in: - CNN_V2_BINARY_FORMAT.md: Detailed proposal with struct layout - CNN_V2.md: Future extensions section - train_cnn_v2.py: compute_static_features() docstring - cnn_v2_static.wgsl: Shader header comment - cnn_v2_effect.cc: Version check comment Current limitation: Hardcoded [p0,p1,p2,p3,uv_x,uv_y,sin10_x,bias] layout. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
30 hoursCNN v2: Add mip-level support to runtime effectskal
Binary format v2 includes mip_level in header (20 bytes, was 16). Effect reads mip_level and passes to static features shader via uniform. Shader samples from correct mip texture based on mip_level. Changes: - export_cnn_v2_weights.py: Header v2 with mip_level field - cnn_v2_effect.h: Add StaticFeatureParams, mip_level member, params buffer - cnn_v2_effect.cc: Read mip_level from weights, create/bind params buffer, update per-frame - cnn_v2_static.wgsl: Accept params uniform, sample from selected mip level Binary format v2: - Header: 20 bytes (magic, version=2, num_layers, total_weights, mip_level) - Backward compatible: v1 weights load with mip_level=0 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2 daystest_demo: Add beat-synchronized CNN post-processing with version selectionskal
- Add --cnn-version <1|2> flag to select between CNN v1 and v2 - Implement beat_phase modulation for dynamic blend in both CNN effects - Fix CNN v2 per-layer uniform buffer sharing (each layer needs own buffer) - Fix CNN v2 y-axis orientation to match render pass convention - Add Scene1Effect as base visual layer to test_demo timeline - Reorganize CNN v2 shaders into cnn_v2/ subdirectory - Update asset paths and documentation for new shader organization Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2 daysFix CNN v2 weights validation logicskal
FATAL_CHECK triggers when condition is TRUE (error case). Inverted equality checks: magic/version == correct_value would fatal when weights were valid. Changed to != checks to fail on invalid data.
2 daysCNN v2: Complete multi-layer compute executionskal
- Create bind groups per layer with ping-pong buffers - Update layer params uniform per dispatch - Execute all layers in sequence with proper input/output swapping - Ready for weight export and end-to-end testing
2 daysCNN v2: storage buffer architecture foundationskal
- Add binary weight format (header + layer info + packed f16) - New export_cnn_v2_weights.py for binary weight export - Single cnn_v2_compute.wgsl shader with storage buffer - Load weights in CNNv2Effect::load_weights() - Create layer compute pipeline with 5 bindings - Fast training config: 100 epochs, 3×3 kernels, 8→4→4 channels Next: Complete bind group creation and multi-layer compute execution
2 daysCNN v2 Phase 5: render pipeline implementationskal
Complete multi-pass compute execution for CNNv2Effect. Implementation: - Layer texture creation (ping-pong buffers for intermediate results) - Static features compute pipeline with bind group layout - Bind group creation with 5 bindings (input mips + depth + output) - compute() override for multi-pass execution - Static features pass with proper workgroup dispatch Architecture: - Static features: 8×f16 packed as 4×u32 (RGBD + UV + sin + bias) - Layer buffers: 2×RGBA32Uint textures (8 channels f16 each) - Input mips: 3 levels (0, 1, 2) for multi-scale features - Workgroup size: 8×8 threads Status: - Static features compute pass functional - Layer pipeline infrastructure ready - All 36/36 tests passing Next: Layer shader integration, multi-layer execution Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2 daysCNN v2: parametric static features - Phases 1-4skal
Infrastructure for enhanced CNN post-processing with 7D feature input. Phase 1: Shaders - Static features compute (RGBD + UV + sin10_x + bias → 8×f16) - Layer template (convolution skeleton, packing/unpacking) - 3 mip level support for multi-scale features Phase 2: C++ Effect - CNNv2Effect class (multi-pass architecture) - Texture management (static features, layer buffers) - Build integration (CMakeLists, assets, tests) Phase 3: Training Pipeline - train_cnn_v2.py: PyTorch model with static feature concatenation - export_cnn_v2_shader.py: f32→f16 quantization, WGSL generation - Configurable architecture (kernels, channels) Phase 4: Validation - validate_cnn_v2.sh: End-to-end pipeline - Checkpoint → shaders → build → test images Tests: 36/36 passing Next: Complete render pipeline implementation (bind groups, multi-pass) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>