author    skal <pascal.massimino@gmail.com>  2026-02-10 12:48:43 +0100
committer skal <pascal.massimino@gmail.com>  2026-02-10 12:48:43 +0100
commit    6944733a6a2f05c18e7e0b73f847a4c9144801fd (patch)
tree      10713cd41a0e038a016a2e6b357471690f232834 /doc/CNN_EFFECT.md
parent    cc9cbeb75353181193e3afb880dc890aa8bf8985 (diff)
feat: Add multi-layer CNN support with framebuffer capture and blend control
Implements automatic layer chaining and generic framebuffer capture API
for multi-layer neural network effects with proper original input
preservation.

Key changes:
- Effect::needs_framebuffer_capture() - generic API for pre-render capture
- MainSequence: auto-capture to "captured_frame" auxiliary texture
- CNNEffect: multi-layer support via layer_index/total_layers params
- seq_compiler: expands "layers=N" to N chained effect instances
- Shader: @binding(4) original_input available to all layers
- Training: generates layer switches and original input binding
- Blend: mix(original, result, blend_amount) uses layer 0 input

Timeline syntax: CNNEffect layers=3 blend=0.7

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Diffstat (limited to 'doc/CNN_EFFECT.md')
-rw-r--r--  doc/CNN_EFFECT.md  |  150
1 file changed, 138 insertions(+), 12 deletions(-)
diff --git a/doc/CNN_EFFECT.md b/doc/CNN_EFFECT.md
index ec70b13..ae0f38a 100644
--- a/doc/CNN_EFFECT.md
+++ b/doc/CNN_EFFECT.md
@@ -10,10 +10,11 @@ Trainable convolutional neural network layers for artistic stylization (painterl
**Key Features:**
- Position-aware layer 0 (coordinate input for vignetting, edge effects)
-- Multi-layer convolutions (3×3, 5×5, 7×7 kernels)
+- Multi-layer convolutions (3×3, 5×5, 7×7 kernels) with automatic chaining
+- Original input available to all layers via framebuffer capture
+- Configurable final blend with original scene
- Modular WGSL shader architecture
- Hardcoded weights (trained offline via PyTorch)
-- Residual connections for stable learning
- ~5-8 KB binary footprint
---
@@ -42,19 +43,34 @@ fn cnn_conv3x3_with_coord(
**Use cases:** Position-dependent stylization (vignettes, corner darkening, radial gradients)
+### Multi-Layer Architecture
+
+CNNEffect supports multi-layer networks via automatic effect chaining:
+
+1. **Timeline specifies total layers**: `CNNEffect layers=3 blend=0.7`
+2. **Compiler expands to chain**: 3 separate CNNEffect instances (layer 0→1→2)
+3. **Framebuffer capture**: Layer 0 captures original input to `"captured_frame"`
+4. **Original input binding**: All layers access original via `@binding(4)`
+5. **Final blend**: Last layer blends result with original: `mix(original, result, 0.7)`
+
+**Framebuffer Capture API:**
+- `Effect::needs_framebuffer_capture()` - effect requests pre-capture
+- MainSequence automatically blits input → `"captured_frame"` auxiliary texture
+- Generic mechanism usable by any effect
+
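+A minimal sketch of how the capture hook fits together, assuming a virtual
+opt-in method on the `Effect` base class; the struct layout is simplified and
+the real constructor also takes a GPU context, omitted here for illustration:
+
+```cpp
+struct CNNEffectParams {
+  int layer_index = 0;
+  int total_layers = 1;
+  float blend_amount = 1.0f;
+};
+
+class Effect {
+ public:
+  virtual ~Effect() = default;
+  // Effects override this to request that MainSequence blit their input
+  // into the "captured_frame" auxiliary texture before they render.
+  virtual bool needs_framebuffer_capture() const { return false; }
+};
+
+class CNNEffect : public Effect {
+ public:
+  explicit CNNEffect(const CNNEffectParams& p) : params_(p) {}
+  // Only layer 0 captures; later layers sample the captured texture
+  // through @binding(4) original_input.
+  bool needs_framebuffer_capture() const override {
+    return params_.layer_index == 0;
+  }
+ private:
+  CNNEffectParams params_;
+};
+```
+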
### File Structure
```
src/gpu/effects/
- cnn_effect.h/cc # CNNEffect class
+ cnn_effect.h/cc # CNNEffect class + framebuffer capture
workspaces/main/shaders/cnn/
cnn_activation.wgsl # tanh, ReLU, sigmoid, leaky_relu
cnn_conv3x3.wgsl # 3×3 convolution (standard + coord-aware)
cnn_conv5x5.wgsl # 5×5 convolution (standard + coord-aware)
cnn_conv7x7.wgsl # 7×7 convolution (standard + coord-aware)
- cnn_weights_generated.wgsl # Weight arrays (auto-generated)
- cnn_layer.wgsl # Main shader (composes above snippets)
+ cnn_weights_generated.wgsl # Weight arrays (auto-generated by train_cnn.py)
+ cnn_layer.wgsl # Main shader with layer switches (auto-generated by train_cnn.py)
```
---
@@ -89,7 +105,7 @@ python3 training/train_cnn.py \
--checkpoint-every 50
```
-**Multi-layer example:**
+**Multi-layer example (3 layers with varying kernel sizes):**
```bash
python3 training/train_cnn.py \
--input training/input \
@@ -100,6 +116,10 @@ python3 training/train_cnn.py \
--checkpoint-every 100
```
+**Note:** The training script auto-generates two files:
+- `cnn_weights_generated.wgsl` - weight arrays for all layers
+- `cnn_layer.wgsl` - shader with layer switches and original input binding
+
**Resume from checkpoint:**
```bash
python3 training/train_cnn.py \
@@ -108,9 +128,16 @@ python3 training/train_cnn.py \
--resume training/checkpoints/checkpoint_epoch_200.pth
```
+**Export WGSL from checkpoint (no training):**
+```bash
+python3 training/train_cnn.py \
+ --export-only training/checkpoints/checkpoint_epoch_200.pth \
+ --output workspaces/main/shaders/cnn/cnn_weights_generated.wgsl
+```
+
### 3. Rebuild Demo
-Training script auto-generates `cnn_weights_generated.wgsl`:
+The training script auto-generates both `cnn_weights_generated.wgsl` and `cnn_layer.wgsl`:
```bash
cmake --build build -j4
./build/demo64k
@@ -122,23 +149,101 @@ cmake --build build -j4
### C++ Integration
+**Single layer (manual):**
```cpp
#include "gpu/effects/cnn_effect.h"
-auto cnn = std::make_shared<CNNEffect>(ctx, /*num_layers=*/1);
+CNNEffectParams p;
+p.layer_index = 0;
+p.total_layers = 1;
+p.blend_amount = 1.0f;
+auto cnn = std::make_shared<CNNEffect>(ctx, p);
timeline.add_effect(cnn, start_time, end_time);
```
-### Timeline Example
+**Multi-layer (automatic via timeline compiler):**
+Use the timeline syntax below; `seq_compiler` expands it into one chained `CNNEffect` instance per layer (see the expansion example that follows).
+
+### Timeline Examples
+
+**Single-layer CNN (full stylization):**
+```
+SEQUENCE 10.0 0
+ EFFECT + Hybrid3DEffect 0.00 5.00
+ EFFECT + CNNEffect 0.50 5.00 layers=1
+```
+
+**Multi-layer CNN with blend:**
```
SEQUENCE 10.0 0
- EFFECT CNNEffect 10.0 15.0 0
+ EFFECT + Hybrid3DEffect 0.00 5.00
+ EFFECT + CNNEffect 0.50 5.00 layers=3 blend=0.7
+```
+
+Expands to:
+```cpp
+// Layer 0 (captures original, blend=1.0)
+{
+ CNNEffectParams p;
+ p.layer_index = 0;
+ p.total_layers = 3;
+ p.blend_amount = 1.0f;
+ seq->add_effect(std::make_shared<CNNEffect>(ctx, p), 0.50f, 5.00f, 1);
+}
+// Layer 1 (blend=1.0)
+{
+ CNNEffectParams p;
+ p.layer_index = 1;
+ p.total_layers = 3;
+ p.blend_amount = 1.0f;
+ seq->add_effect(std::make_shared<CNNEffect>(ctx, p), 0.50f, 5.00f, 2);
+}
+// Layer 2 (final blend=0.7)
+{
+ CNNEffectParams p;
+ p.layer_index = 2;
+ p.total_layers = 3;
+ p.blend_amount = 0.7f;
+ seq->add_effect(std::make_shared<CNNEffect>(ctx, p), 0.50f, 5.00f, 3);
+}
```
---
-## Weight Storage
+## Shader Structure
+
+**Bindings:**
+```wgsl
+@group(0) @binding(0) var smplr: sampler;
+@group(0) @binding(1) var txt: texture_2d<f32>; // Current layer input
+@group(0) @binding(2) var<uniform> uniforms: CommonUniforms;
+@group(0) @binding(3) var<uniform> params: CNNLayerParams;
+@group(0) @binding(4) var original_input: texture_2d<f32>; // Layer 0 input (captured)
+```
+
+**Fragment shader logic:**
+```wgsl
+@fragment fn fs_main(@builtin(position) p: vec4<f32>) -> @location(0) vec4<f32> {
+ let uv = p.xy / uniforms.resolution;
+ let input = textureSample(txt, smplr, uv); // Layer N-1 output
+ let original = textureSample(original_input, smplr, uv); // Layer 0 input
+
+ var result = vec4<f32>(0.0);
+
+ if (params.layer_index == 0) {
+ result = cnn_conv3x3_with_coord(txt, smplr, uv, uniforms.resolution,
+ rgba_weights_layer0, coord_weights_layer0, bias_layer0);
+ result = cnn_tanh(result);
+ }
+ // ... other layers
+
+ // Blend with ORIGINAL input (not previous layer)
+ return mix(original, result, params.blend_amount);
+}
+```
+
+**Weight Storage:**
**Layer 0 (coordinate-aware):**
```wgsl
@@ -188,15 +293,36 @@ const bias_layer1 = vec4<f32>(0.0, 0.0, 0.0, 0.0);
---
+## Blend Parameter Behavior
+
+**blend_amount** controls final compositing with original:
+- `blend=0.0`: Pure original (no CNN effect)
+- `blend=0.5`: 50% original + 50% CNN
+- `blend=1.0`: Pure CNN output (full stylization)
+
+**Important:** The blend uses the captured layer 0 input, not the previous layer's output.
+
+**Example use cases:**
+- `blend=1.0`: Full stylization (default)
+- `blend=0.7`: Subtle effect preserving original details
+- `blend=0.3`: Light artistic touch
+
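+Internally the composite is plain linear interpolation: `mix(original, result, t)` evaluates to `original * (1.0 - t) + result * t`, so `blend=0.7` keeps 30% of the captured original frame.
+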
## Troubleshooting
**Shader compilation fails:**
- Check `cnn_weights_generated.wgsl` syntax
- Verify snippets registered in `shaders.cc::InitShaderComposer()`
+- Ensure `cnn_layer.wgsl` has 5 bindings (including `original_input`)
**Black/corrupted output:**
- Weights untrained (identity placeholder)
-- Check residual blending (0.3 default)
+- Check `captured_frame` auxiliary texture is registered
+- Verify layer priorities in timeline are sequential
+
+**Wrong blend result:**
+- Ensure layer 0 has `needs_framebuffer_capture() == true`
+- Check MainSequence framebuffer capture logic
+- Verify `original_input` binding is populated
**Training loss not decreasing:**
- Lower learning rate (`--learning-rate 0.0001`)