path: root/src/gpu/effects/cnn_v2_effect.cc
Age | Commit message | Author
28 hours | CNN v2: storage buffer architecture foundation | skal

- Add binary weight format (header + layer info + packed f16)
- New export_cnn_v2_weights.py for binary weight export
- Single cnn_v2_compute.wgsl shader with storage buffer
- Load weights in CNNv2Effect::load_weights()
- Create layer compute pipeline with 5 bindings
- Fast training config: 100 epochs, 3×3 kernels, 8→4→4 channels

Next: Complete bind group creation and multi-layer compute execution
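The "header + layer info + packed f16" layout named in this commit could look roughly like the sketch below. The commit does not give field names, field order, or a magic tag, so `MAGIC`, the version number, and the per-layer record (kernel, in_ch, out_ch, offset, count) are all hypothetical. Only the three-part structure and the f16 packing come from the message.

```python
import struct

MAGIC = b"CNN2"  # hypothetical tag; the commit does not specify one


def pack_f16_pairs(values):
    """Encode floats as little-endian f16, zero-padded to an even count.

    Two consecutive f16 values occupy one u32, first value in the low 16 bits.
    """
    vals = list(values)
    if len(vals) % 2:
        vals.append(0.0)
    return struct.pack(f"<{len(vals)}e", *vals)


def export_weights(layers):
    """Serialize layers as: header, then layer info records, then f16 payload.

    Each layer is a dict with keys kernel, in_ch, out_ch, weights (flat floats).
    """
    # Assumed header: magic, format version, layer count.
    header = struct.pack("<4sII", MAGIC, 1, len(layers))
    info = b""
    payload = b""
    offset = 0  # start of this layer in the payload, counted in f16 elements
    for layer in layers:
        count = len(layer["weights"])
        info += struct.pack(
            "<IIIII", layer["kernel"], layer["in_ch"], layer["out_ch"], offset, count
        )
        blob = pack_f16_pairs(layer["weights"])
        payload += blob
        offset += len(blob) // 2
    return header + info + payload
```

Padding each layer to an even f16 count keeps every layer aligned to a u32 boundary, which is convenient if the shader reads the storage buffer as an array of u32.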
28 hours | CNN v2 Phase 5: render pipeline implementation | skal

Complete multi-pass compute execution for CNNv2Effect.

Implementation:
- Layer texture creation (ping-pong buffers for intermediate results)
- Static features compute pipeline with bind group layout
- Bind group creation with 5 bindings (input mips + depth + output)
- compute() override for multi-pass execution
- Static features pass with proper workgroup dispatch

Architecture:
- Static features: 8×f16 packed as 4×u32 (RGBD + UV + sin + bias)
- Layer buffers: 2×RGBA32Uint textures (8 channels f16 each)
- Input mips: 3 levels (0, 1, 2) for multi-scale features
- Workgroup size: 8×8 threads

Status:
- Static features compute pass functional
- Layer pipeline infrastructure ready
- All 36/36 tests passing

Next: Layer shader integration, multi-layer execution

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
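The workgroup dispatch and ping-pong buffering described in this commit can be illustrated with a small sketch. Only the 8×8 workgroup size and the two layer textures come from the message. The function names are hypothetical, as is the choice to have layer 0 read from the static features texture.

```python
WORKGROUP_SIZE = 8  # from the commit: 8×8 threads per workgroup


def dispatch_counts(width, height, wg=WORKGROUP_SIZE):
    """Workgroups needed per axis so every pixel is covered (ceiling division)."""
    return ((width + wg - 1) // wg, (height + wg - 1) // wg)


def ping_pong_schedule(num_layers):
    """Return a (source, destination) pair for each layer pass.

    Destinations alternate between layer textures 0 and 1. Each layer after
    the first reads the texture the previous layer wrote. Layer 0 is assumed
    to read the static features texture, marked here as "static".
    """
    schedule = []
    for i in range(num_layers):
        dst = i % 2
        src = "static" if i == 0 else 1 - dst
        schedule.append((src, dst))
    return schedule
```

For a 1280×720 target, `dispatch_counts(1280, 720)` gives `(160, 90)`. Sizes that are not multiples of 8 round up, so the shader would need a bounds check for the extra threads.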
29 hours | CNN v2: parametric static features - Phases 1-4 | skal

Infrastructure for enhanced CNN post-processing with 7D feature input.

Phase 1: Shaders
- Static features compute (RGBD + UV + sin10_x + bias → 8×f16)
- Layer template (convolution skeleton, packing/unpacking)
- 3 mip level support for multi-scale features

Phase 2: C++ Effect
- CNNv2Effect class (multi-pass architecture)
- Texture management (static features, layer buffers)
- Build integration (CMakeLists, assets, tests)

Phase 3: Training Pipeline
- train_cnn_v2.py: PyTorch model with static feature concatenation
- export_cnn_v2_shader.py: f32→f16 quantization, WGSL generation
- Configurable architecture (kernels, channels)

Phase 4: Validation
- validate_cnn_v2.sh: End-to-end pipeline
- Checkpoint → shaders → build → test images

Tests: 36/36 passing

Next: Complete render pipeline implementation (bind groups, multi-pass)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
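As a rough CPU-side illustration of the static feature vector (RGBD + UV + sin10_x + bias → 8×f16), the sketch below builds the eight channels for one pixel and packs them into four u32 words. It assumes that `sin10_x` means sin(10 · u) on the horizontal UV coordinate and that the bias channel is a constant 1.0. Neither is stated in the commit message, and the function names are hypothetical.

```python
import math
import struct


def static_features(r, g, b, d, u, v):
    """Seven input features plus a bias channel, eight values in total.

    Assumptions: sin10_x is sin(10 * u), and the bias channel is 1.0.
    """
    return [r, g, b, d, u, v, math.sin(10.0 * u), 1.0]


def pack_features(features):
    """Quantize 8 floats to f16 and pack them pairwise into 4 u32 words.

    The first value of each pair lands in the low 16 bits of its word.
    """
    if len(features) != 8:
        raise ValueError("expected exactly 8 feature channels")
    return struct.unpack("<4I", struct.pack("<8e", *features))
```

With this pair ordering, each word could be decoded on the GPU by WGSL's built-in `unpack2x16float`, which returns the low half as the first component.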