# CNN Post-Processing Effect
Neural network-based stylization for rendered scenes.
---
## Overview
The CNN effect applies trainable convolutional neural network layers to post-process 3D rendered output, enabling artistic stylization (e.g., painterly, sketch, cel-shaded effects) with minimal runtime overhead.
**Key Features:**
- Multi-layer convolutions (3×3, 5×5, 7×7 kernels)
- Modular WGSL shader architecture
- Hardcoded weights (trained offline)
- Residual connections for stable learning
- ~5-9 KB binary footprint
---
## Architecture
### File Structure
```
src/gpu/effects/
  cnn_effect.h                 # CNNEffect class
  cnn_effect.cc                # Implementation

workspaces/main/shaders/cnn/
  cnn_activation.wgsl          # Activation functions (tanh, ReLU, sigmoid, leaky_relu)
  cnn_conv3x3.wgsl             # 3×3 convolution
  cnn_conv5x5.wgsl             # 5×5 convolution
  cnn_conv7x7.wgsl             # 7×7 convolution
  cnn_weights_generated.wgsl   # Weight arrays (generated by training script)
  cnn_layer.wgsl               # Main shader (composes above snippets)
```
### Shader Composition
`cnn_layer.wgsl` uses `#include` directives (resolved by `ShaderComposer`):
```wgsl
#include "common_uniforms"
#include "cnn_activation"
#include "cnn_conv3x3"
#include "cnn_weights_generated"
```
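Conceptually, composition is textual substitution: each `#include` line is replaced by the registered snippet's source before the shader is compiled. A minimal Python sketch of that idea (the registry contents and function names here are hypothetical, not the engine's actual `ShaderComposer` API):

```python
import re

# Hypothetical snippet registry; keys match the #include directive names.
SNIPPETS = {
    "common_uniforms": "struct Params { use_residual: u32 };",
    "cnn_activation": "fn cnn_tanh(x: vec4<f32>) -> vec4<f32> { return tanh(x); }",
}

def compose(source: str) -> str:
    """Replace each `#include "name"` directive with its registered snippet."""
    def substitute(match: re.Match) -> str:
        return SNIPPETS[match.group(1)]  # KeyError = unregistered snippet
    return re.sub(r'#include\s+"([^"]+)"', substitute, source)

shader = compose('#include "common_uniforms"\n#include "cnn_activation"\nfn main() {}')
```

A real composer would also need to deduplicate snippets included from multiple places and report unregistered names with a useful error.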
---
## Usage
### C++ Integration
```cpp
#include "gpu/effects/cnn_effect.h"
// Create effect (1 layer for now, expandable to 4)
auto cnn = std::make_shared<CNNEffect>(ctx, /*num_layers=*/1);
// Add to timeline
timeline.add_effect(cnn, start_time, end_time);
```
### Timeline Example
```
SEQUENCE 10.0 0
EFFECT CNNEffect 10.0 15.0 0 # Apply CNN stylization for 5 seconds
```
---
## Training Workflow (Planned)
**Step 1: Prepare Training Data**
```bash
# Collect before/after image pairs
# - Before: Raw 3D render
# - After: Target artistic style (hand-painted, filtered, etc.)
```
**Step 2: Train Network**
```bash
python scripts/train_cnn.py \
    --input rendered_scene.png \
    --target stylized_scene.png \
    --layers 3 \
    --kernel_sizes 3,5,3 \
    --epochs 100
```
**Step 3: Export Weights**
```python
# scripts/train_cnn.py automatically generates:
# workspaces/main/shaders/cnn/cnn_weights_generated.wgsl
```
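The export step boils down to formatting trained weight tensors as WGSL `const` declarations. A sketch of what that generation could look like, matching the weight-storage layout shown under Implementation Details (the function name is illustrative, not the script's actual API):

```python
def export_weights_wgsl(weights, bias, layer=0):
    """Format 9 flat 4x4 matrices (row-major lists of 16 floats) plus a bias
    as WGSL constants in the cnn_weights_generated.wgsl layout."""
    mats = ",\n    ".join(
        "mat4x4<f32>(" + ", ".join(f"{v:.6f}" for v in m) + ")" for m in weights
    )
    bias_str = ", ".join(f"{v:.6f}" for v in bias)
    return (
        f"const weights_layer{layer}: array<mat4x4<f32>, 9> = array(\n"
        f"    {mats}\n);\n"
        f"const bias_layer{layer} = vec4<f32>({bias_str});\n"
    )

# Identity on one tap, zeros elsewhere: a pass-through placeholder network.
identity = [1.0 if i % 5 == 0 else 0.0 for i in range(16)]  # 4x4 identity, row-major
zero = [0.0] * 16
wgsl = export_weights_wgsl([identity] + [zero] * 8, [0.0] * 4)
```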
**Step 4: Rebuild**
```bash
cmake --build build -j4
```
---
## Implementation Details
### Convolution Function Signature
```wgsl
fn cnn_conv3x3(
    tex: texture_2d<f32>,
    samp: sampler,
    uv: vec2<f32>,
    resolution: vec2<f32>,
    weights: array<mat4x4<f32>, 9>,  // 9 samples × 4×4 matrix
    bias: vec4<f32>
) -> vec4<f32>
```
- Samples 9 pixels (3×3 neighborhood)
- Applies 4×4 weight matrix per sample (RGBA channels)
- Returns weighted sum + bias (pre-activation)
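The same weighted sum, written out as a pure-Python reference (tap ordering is assumed row-major here for illustration; the generated weight file may order taps differently):

```python
def mat4_mul_vec4(m, v):
    """Row-major 4x4 matrix times an RGBA vector."""
    return [sum(m[r * 4 + c] * v[c] for c in range(4)) for r in range(4)]

def conv3x3(sample, weights, bias):
    """Reference of cnn_conv3x3: `sample(dx, dy)` returns the RGBA pixel at
    that offset; `weights` is 9 row-major 4x4 matrices, one per tap."""
    offsets = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    acc = list(bias)
    for w, (dx, dy) in zip(weights, offsets):
        contrib = mat4_mul_vec4(w, sample(dx, dy))
        acc = [a + c for a, c in zip(acc, contrib)]
    return acc  # pre-activation; the activation is applied by the caller

# Identity on the center tap only: the convolution passes pixels through.
identity = [1.0 if i % 5 == 0 else 0.0 for i in range(16)]
zero = [0.0] * 16
weights = [zero] * 4 + [identity] + [zero] * 4  # center tap = index 4 (row-major)
out = conv3x3(lambda dx, dy: [0.5, 0.25, 0.125, 1.0], weights, [0.0] * 4)
```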
### Weight Storage
Weights are stored as WGSL constants:
```wgsl
const weights_layer0: array<mat4x4<f32>, 9> = array(
    mat4x4<f32>(1.0, 0.0, 0.0, 0.0, ...),  // Center pixel
    mat4x4<f32>(0.0, 0.0, 0.0, 0.0, ...),  // Neighbor 1
    // ... 7 more matrices
);
const bias_layer0 = vec4<f32>(0.0, 0.0, 0.0, 0.0);
```
### Residual Connection
Final layer adds original input:
```wgsl
if (params.use_residual != 0) {
    let input = textureSample(txt, smplr, uv);
    result = input + result * 0.3;  // Blend 30% stylization
}
```
---
## Multi-Layer Rendering (Future)
For N layers, use ping-pong textures:
```
Pass 0: input → temp_a (conv + activate)
Pass 1: temp_a → temp_b (conv + activate)
Pass 2: temp_b → temp_a (conv + activate)
Pass 3: temp_a → screen (conv + activate + residual)
```
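The pass scheduling above can be sketched as a small driver loop, alternating between the two intermediate textures (a sketch of the plan, not the engine's render-pass API):

```python
def schedule_passes(num_layers):
    """Build the (source, destination, stage) list for N layers, ping-ponging
    between two intermediate textures; the final pass writes to screen."""
    buffers = ["temp_a", "temp_b"]
    passes = []
    src = "input"
    for i in range(num_layers):
        last = i == num_layers - 1
        dst = "screen" if last else buffers[i % 2]
        stage = "conv + activate" + (" + residual" if last else "")
        passes.append((src, dst, stage))
        src = dst  # output of this pass feeds the next
    return passes

passes = schedule_passes(4)
```

Two buffers suffice for any depth because each pass only reads the previous pass's output.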
**Current Status:** Single-layer implementation. Multi-pass infrastructure ready but not exposed.
---
## Size Budget
| Component | Size | Notes |
|-----------|------|-------|
| `cnn_activation.wgsl` | ~200 B | 4 activation functions |
| `cnn_conv3x3.wgsl` | ~400 B | 3×3 convolution logic |
| `cnn_conv5x5.wgsl` | ~600 B | 5×5 convolution logic |
| `cnn_conv7x7.wgsl` | ~800 B | 7×7 convolution logic |
| `cnn_layer.wgsl` | ~800 B | Main shader |
| `cnn_effect.cc` | ~300 B | C++ implementation |
| **Weights (variable)** | **2-6 KB** | Depends on network depth/width |
| **Total** | **5-9 KB** | Acceptable for 64k demo |
**Optimization Strategies:**
- Quantize weights (float32 → int8)
- Prune near-zero weights
- Share weights across layers
- Use separable convolutions (not yet implemented)
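Quantization is the largest single win, since weights dominate the budget. A sketch of symmetric int8 quantization (one f32 scale per layer; roughly 4x smaller than raw f32 at a bounded precision cost):

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map weights to [-127, 127] with a single
    per-layer scale. Max round-off error is scale / 2."""
    peak = max(abs(w) for w in weights) or 1.0
    scale = peak / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate f32 weights at load or codegen time."""
    return [v * scale for v in q]

w = [0.91, -0.2, 0.003, 0.0]
q, s = quantize_int8(w)
restored = dequantize(q, s)
```

The dequantized values could be baked back into `cnn_weights_generated.wgsl`, so only the stored binary shrinks and the shader is unchanged.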
---
## Testing
```bash
# Run effect test
./build/test_demo_effects
# Visual test in demo
./build/demo64k # CNN appears in timeline if added
```
**Test Coverage:**
- Construction/initialization
- Shader compilation
- Bind group creation
- Render pass execution
---
## Troubleshooting
**Shader compilation fails:**
- Check `cnn_weights_generated.wgsl` syntax
- Verify all snippets registered in `shaders.cc::InitShaderComposer()`
**Black/corrupted output:**
- Weights likely untrained (using placeholder identity)
- Check residual blending factor (0.3 default)
**Performance issues:**
- Reduce kernel sizes (7×7 → 3×3)
- Decrease layer count
- Profile with `--hot-reload` to measure frame time
---
## References
- **Shader Composition:** `doc/SEQUENCE.md` (shader parameters)
- **Effect System:** `src/gpu/effect.h` (Effect base class)
- **Training (external):** TensorFlow/PyTorch CNN tutorials