From 6944733a6a2f05c18e7e0b73f847a4c9144801fd Mon Sep 17 00:00:00 2001
From: skal
Date: Tue, 10 Feb 2026 12:48:43 +0100
Subject: feat: Add multi-layer CNN support with framebuffer capture and blend control

Implements automatic layer chaining and generic framebuffer capture API for
multi-layer neural network effects with proper original input preservation.

Key changes:
- Effect::needs_framebuffer_capture() - generic API for pre-render capture
- MainSequence: auto-capture to "captured_frame" auxiliary texture
- CNNEffect: multi-layer support via layer_index/total_layers params
- seq_compiler: expands "layers=N" to N chained effect instances
- Shader: @binding(4) original_input available to all layers
- Training: generates layer switches and original input binding
- Blend: mix(original, result, blend_amount) uses layer 0 input

Timeline syntax: CNNEffect layers=3 blend=0.7

Co-Authored-By: Claude Sonnet 4.5
---
 doc/CNN_EFFECT.md | 150 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 138 insertions(+), 12 deletions(-)

diff --git a/doc/CNN_EFFECT.md b/doc/CNN_EFFECT.md
index ec70b13..ae0f38a 100644
--- a/doc/CNN_EFFECT.md
+++ b/doc/CNN_EFFECT.md
@@ -10,10 +10,11 @@ Trainable convolutional neural network layers for artistic stylization (painterl
 
 **Key Features:**
 - Position-aware layer 0 (coordinate input for vignetting, edge effects)
-- Multi-layer convolutions (3×3, 5×5, 7×7 kernels)
+- Multi-layer convolutions (3×3, 5×5, 7×7 kernels) with automatic chaining
+- Original input available to all layers via framebuffer capture
+- Configurable final blend with original scene
 - Modular WGSL shader architecture
 - Hardcoded weights (trained offline via PyTorch)
-- Residual connections for stable learning
 - ~5-8 KB binary footprint
 
 ---
@@ -42,19 +43,34 @@ fn cnn_conv3x3_with_coord(
 
 **Use cases:** Position-dependent stylization (vignettes, corner darkening, radial gradients)
 
+### Multi-Layer Architecture
+
+CNNEffect supports multi-layer networks via automatic effect chaining:
+
+1. **Timeline specifies total layers**: `CNNEffect layers=3 blend=0.7`
+2. **Compiler expands to chain**: 3 separate CNNEffect instances (layer 0→1→2)
+3. **Framebuffer capture**: Layer 0 captures original input to `"captured_frame"`
+4. **Original input binding**: All layers access original via `@binding(4)`
+5. **Final blend**: Last layer blends result with original: `mix(original, result, 0.7)`
+
+**Framebuffer Capture API:**
+- `Effect::needs_framebuffer_capture()` - effect requests pre-capture
+- MainSequence automatically blits input → `"captured_frame"` auxiliary texture
+- Generic mechanism usable by any effect
+
 ### File Structure
 
 ```
 src/gpu/effects/
-  cnn_effect.h/cc             # CNNEffect class
+  cnn_effect.h/cc             # CNNEffect class + framebuffer capture
 
 workspaces/main/shaders/cnn/
   cnn_activation.wgsl         # tanh, ReLU, sigmoid, leaky_relu
   cnn_conv3x3.wgsl            # 3×3 convolution (standard + coord-aware)
   cnn_conv5x5.wgsl            # 5×5 convolution (standard + coord-aware)
   cnn_conv7x7.wgsl            # 7×7 convolution (standard + coord-aware)
-  cnn_weights_generated.wgsl  # Weight arrays (auto-generated)
-  cnn_layer.wgsl              # Main shader (composes above snippets)
+  cnn_weights_generated.wgsl  # Weight arrays (auto-generated by train_cnn.py)
+  cnn_layer.wgsl              # Main shader with layer switches (auto-generated by train_cnn.py)
 ```
 
 ---
@@ -89,7 +105,7 @@ python3 training/train_cnn.py \
   --checkpoint-every 50
 ```
 
-**Multi-layer example:**
+**Multi-layer example (3 layers with varying kernel sizes):**
 ```bash
 python3 training/train_cnn.py \
   --input training/input \
@@ -100,6 +116,10 @@ python3 training/train_cnn.py \
   --checkpoint-every 100
 ```
 
+**Note:** Training script auto-generates:
+- `cnn_weights_generated.wgsl` - weight arrays for all layers
+- `cnn_layer.wgsl` - shader with layer switches and original input binding
+
 **Resume from checkpoint:**
 ```bash
 python3 training/train_cnn.py \
@@ -108,9 +128,16 @@ python3 training/train_cnn.py \
 --resume training/checkpoints/checkpoint_epoch_200.pth
 ```
 
+**Export WGSL from checkpoint (no training):**
+```bash
+python3 training/train_cnn.py \
+  --export-only training/checkpoints/checkpoint_epoch_200.pth \
+  --output workspaces/main/shaders/cnn/cnn_weights_generated.wgsl
+```
+
 ### 3. Rebuild Demo
 
-Training script auto-generates `cnn_weights_generated.wgsl`:
+Training script auto-generates both `cnn_weights_generated.wgsl` and `cnn_layer.wgsl`:
 ```bash
 cmake --build build -j4
 ./build/demo64k
@@ -122,23 +149,101 @@ cmake --build build -j4
 
 ### C++ Integration
 
+**Single layer (manual):**
 ```cpp
 #include "gpu/effects/cnn_effect.h"
 
-auto cnn = std::make_shared<CNNEffect>(ctx, /*num_layers=*/1);
+CNNEffectParams p;
+p.layer_index = 0;
+p.total_layers = 1;
+p.blend_amount = 1.0f;
+auto cnn = std::make_shared<CNNEffect>(ctx, p);
 timeline.add_effect(cnn, start_time, end_time);
 ```
 
-### Timeline Example
+**Multi-layer (automatic via timeline compiler):**
+
+Use the timeline syntax; `seq_compiler` expands it to multiple chained effect instances.
+
+### Timeline Examples
+
+**Single-layer CNN (full stylization):**
+```
+SEQUENCE 10.0 0
+ EFFECT
+ Hybrid3DEffect 0.00 5.00
+ EFFECT
+ CNNEffect 0.50 5.00 layers=1
+```
+**Multi-layer CNN with blend:**
 ```
 SEQUENCE 10.0 0
- EFFECT CNNEffect 10.0 15.0 0
+ EFFECT
+ Hybrid3DEffect 0.00 5.00
+ EFFECT
+ CNNEffect 0.50 5.00 layers=3 blend=0.7
+```
+
+Expands to:
+```cpp
+// Layer 0 (captures original, blend=1.0)
+{
+  CNNEffectParams p;
+  p.layer_index = 0;
+  p.total_layers = 3;
+  p.blend_amount = 1.0f;
+  seq->add_effect(std::make_shared<CNNEffect>(ctx, p), 0.50f, 5.00f, 1);
+}
+// Layer 1 (blend=1.0)
+{
+  CNNEffectParams p;
+  p.layer_index = 1;
+  p.total_layers = 3;
+  p.blend_amount = 1.0f;
+  seq->add_effect(std::make_shared<CNNEffect>(ctx, p), 0.50f, 5.00f, 2);
+}
+// Layer 2 (final blend=0.7)
+{
+  CNNEffectParams p;
+  p.layer_index = 2;
+  p.total_layers = 3;
+  p.blend_amount = 0.7f;
+  seq->add_effect(std::make_shared<CNNEffect>(ctx, p), 0.50f, 5.00f, 3);
+}
 ```
 
 ---
 
-## Weight Storage
+## Shader Structure
+
+**Bindings:**
+```wgsl
+@group(0) @binding(0) var smplr: sampler;
+@group(0) @binding(1) var txt: texture_2d<f32>;              // Current layer input
+@group(0) @binding(2) var<uniform> uniforms: CommonUniforms;
+@group(0) @binding(3) var<uniform> params: CNNLayerParams;
+@group(0) @binding(4) var original_input: texture_2d<f32>;   // Layer 0 input (captured)
+```
+
+**Fragment shader logic:**
+```wgsl
+@fragment fn fs_main(@builtin(position) p: vec4<f32>) -> @location(0) vec4<f32> {
+  let uv = p.xy / uniforms.resolution;
+  let input = textureSample(txt, smplr, uv);                // Layer N-1 output
+  let original = textureSample(original_input, smplr, uv);  // Layer 0 input
+
+  var result = vec4<f32>(0.0);
+
+  if (params.layer_index == 0) {
+    result = cnn_conv3x3_with_coord(txt, smplr, uv, uniforms.resolution,
+                                    rgba_weights_layer0, coord_weights_layer0, bias_layer0);
+    result = cnn_tanh(result);
+  }
+  // ... other layers
+
+  // Blend with ORIGINAL input (not previous layer)
+  return mix(original, result, params.blend_amount);
+}
+```
+
+**Weight Storage:**
 
 **Layer 0 (coordinate-aware):**
 ```wgsl
@@ -188,15 +293,36 @@ const bias_layer1 = vec4(0.0, 0.0, 0.0, 0.0);
 ```
 
 ---
 
+## Blend Parameter Behavior
+
+**blend_amount** controls final compositing with original:
+- `blend=0.0`: Pure original (no CNN effect)
+- `blend=0.5`: 50% original + 50% CNN
+- `blend=1.0`: Pure CNN output (full stylization)
+
+**Important:** Blend uses captured layer 0 input, not previous layer output.
+
+**Example use cases:**
+- `blend=1.0`: Full stylization (default)
+- `blend=0.7`: Subtle effect preserving original details
+- `blend=0.3`: Light artistic touch
+
 ## Troubleshooting
 
 **Shader compilation fails:**
 - Check `cnn_weights_generated.wgsl` syntax
 - Verify snippets registered in `shaders.cc::InitShaderComposer()`
+- Ensure `cnn_layer.wgsl` has 5 bindings (including `original_input`)
 
 **Black/corrupted output:**
 - Weights untrained (identity placeholder)
-- Check residual blending (0.3 default)
+- Check `captured_frame` auxiliary texture is registered
+- Verify layer priorities in timeline are sequential
+
+**Wrong blend result:**
+- Ensure layer 0 has `needs_framebuffer_capture() == true`
+- Check MainSequence framebuffer capture logic
+- Verify `original_input` binding is populated
 
 **Training loss not decreasing:**
 - Lower learning rate (`--learning-rate 0.0001`)
-- 
cgit v1.2.3