 PROJECT_CONTEXT.md |   3
 doc/CNN.md         |  40
 doc/CNN_EFFECT.md  | 223
 3 files changed, 259 insertions(+), 7 deletions(-)
diff --git a/PROJECT_CONTEXT.md b/PROJECT_CONTEXT.md
index ff6bc48..fb876e5 100644
--- a/PROJECT_CONTEXT.md
+++ b/PROJECT_CONTEXT.md
@@ -35,6 +35,7 @@
 - **Audio:** Sample-accurate sync. Zero heap allocations per frame. Variable tempo. Comprehensive tests.
 - **Shaders:** Parameterized effects (UniformHelper, .seq syntax). Modular WGSL composition.
 - **3D:** Hybrid SDF/rasterization with BVH. Binary scene loader. Blender pipeline.
+- **Effects:** CNN post-processing foundation (single-layer, modular snippets, ready for training integration).
 - **Build:** Asset dependency tracking. Size measurement. Hot-reload (debug-only).
 - **Testing:** **36/36 passing (100%)**
@@ -54,7 +55,7 @@ See `TODO.md` for current priorities and active tasks.
 - `doc/CONTRIBUTING.md` - Development protocols
 
 **Technical Reference:**
-- Core: `ASSET_SYSTEM.md`, `SEQUENCE.md`, `TRACKER.md`, `3D.md`
+- Core: `ASSET_SYSTEM.md`, `SEQUENCE.md`, `TRACKER.md`, `3D.md`, `CNN_EFFECT.md`
 - Formats: `SCENE_FORMAT.md`, `MASKING_SYSTEM.md`
 - Tools: `BUILD.md`, `WORKSPACE_SYSTEM.md`, `SIZE_MEASUREMENT.md`
 
diff --git a/doc/CNN.md b/doc/CNN.md
--- a/doc/CNN.md
+++ b/doc/CNN.md
@@ -1,11 +1,15 @@
 # Convolutional Neural Net Shader (CNN) post-processing
 
+**Status:** ✅ Foundation implemented (single-layer, expandable to multi-pass)
+
 ## Idea
 
 Have the input 3d scene be processed by a multi-layer CNN trained on the side.
 Input: some rendered scene.
 Output: 'stylized' scene with CNN post-processing.
 
+**See `doc/CNN_EFFECT.md` for implementation details, usage, and API reference.**
+
 ## Shader implementation
 
 ### input / output
@@ -36,16 +40,40 @@
 we need 3 or 4 layer ?
 
 Several different shaders for each layer.
 Ping-pong for input/output texture buffer between each layers?
 
-## Training
+## Implementation Status
+
+**Completed:**
+- ✅ Modular WGSL shader architecture (6 snippet files)
+- ✅ CNNEffect C++ class (single-layer rendering)
+- ✅ ShaderComposer integration (#include resolution)
+- ✅ Asset registration (7 new shader assets)
+- ✅ Test coverage (test_demo_effects.cc)
+- ✅ Placeholder identity weights for testing
+
+**Size:** ~3-4 KB shader code + ~2-4 KB weights = **5-8 KB total**
+
+**Pending:**
+- ⏳ Training script (`scripts/train_cnn.py`) to generate real weights
+- ⏳ Multi-layer rendering with ping-pong textures
+- ⏳ Weight quantization for size optimization
+
+---
+
+## Training (To Be Implemented)
 
 The layer weight/bias data are hard-coded in the shaders.
-Need training with external python script.
-File: CNN.py contains an example of what the training script could be.
-Just an example, doesn't match our requirement yet.
+Training workflow:
+
+1. Prepare image pairs (before: raw render, after: target style)
+2. Run `python scripts/train_cnn.py --input scene.png --target stylized.png`
+3. Script generates `cnn_weights_generated.wgsl`
+4. Rebuild: `cmake --build build -j4`
+
+**Reference:** File `CNN.py` contains training example (needs adaptation).
 
 Need a repository of reference image pairs (before/after) for training and validation.
-Each input image is randomly sampled into 3x3 patch of (r,g,b,1/z) input samples.
+Each input image is randomly sampled into 3×3 patch of (r,g,b,1/z) input samples.
 And trained to match the (r,g,b,a) output.
-Training generates the .wgsl code for layers' shaders, and the c++ code for the post-processing 'Effect'.
+Training generates the .wgsl code for layers' shaders.
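The training workflow added to `doc/CNN.md` above is described only in prose. As a minimal sketch of what `scripts/train_cnn.py` could look like, assuming PyTorch, a single 3×3 layer, and a tanh activation, the snippet below fits a convolution from (r, g, b, 1/z) patches to (r, g, b, a) targets and formats the result in the `weights_layer0` / `bias_layer0` layout described in `doc/CNN_EFFECT.md`. The function names and the column ordering of the emitted matrices are assumptions, not the project's actual script.

```python
# Hypothetical sketch only -- scripts/train_cnn.py does not exist yet.
# Fits one 3x3 convolution from (r, g, b, 1/z) patches to (r, g, b, a) targets
# and formats the weights in the layout expected by cnn_weights_generated.wgsl.
import torch
import torch.nn as nn


def train_single_layer(inputs, targets, epochs=100, lr=1e-3):
    """inputs/targets: float32 tensors of shape (N, 4, H, W) with values in [0, 1]."""
    conv = nn.Conv2d(4, 4, kernel_size=3, padding=1)  # one 4x4 matrix per 3x3 tap, plus bias
    opt = torch.optim.Adam(conv.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(torch.tanh(conv(inputs)), targets)  # tanh activation is an assumption
        loss.backward()
        opt.step()
    return conv


def emit_wgsl(conv, layer=0):
    """Dump trained weights as WGSL constants (matrix column order is an assumption)."""
    w, b = conv.weight.detach(), conv.bias.detach()  # shapes (4, 4, 3, 3) and (4,)
    taps = []
    for ky in range(3):
        for kx in range(3):
            # 16 scalars per tap; must match how cnn_conv3x3.wgsl multiplies weights and samples.
            vals = ", ".join(f"{w[o, i, ky, kx].item():.6f}" for i in range(4) for o in range(4))
            taps.append(f"    mat4x4<f32>({vals})")
    bias = ", ".join(f"{v.item():.6f}" for v in b)
    return (f"const weights_layer{layer}: array<mat4x4<f32>, 9> = array(\n"
            + ",\n".join(taps)
            + f"\n);\nconst bias_layer{layer} = vec4<f32>({bias});\n")
```

The emitted text would replace the placeholder identity weights in `cnn_weights_generated.wgsl`; whatever ordering is chosen has to agree with how the shader applies each per-tap matrix.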
diff --git a/doc/CNN_EFFECT.md b/doc/CNN_EFFECT.md
new file mode 100644
index 0000000..9045739
--- /dev/null
+++ b/doc/CNN_EFFECT.md
@@ -0,0 +1,223 @@
+# CNN Post-Processing Effect
+
+Neural network-based stylization for rendered scenes.
+
+---
+
+## Overview
+
+The CNN effect applies trainable convolutional neural network layers to post-process 3D rendered output, enabling artistic stylization (e.g., painterly, sketch, cel-shaded effects) with minimal runtime overhead.
+
+**Key Features:**
+- Multi-layer convolutions (3×3, 5×5, 7×7 kernels)
+- Modular WGSL shader architecture
+- Hardcoded weights (trained offline)
+- Residual connections for stable learning
+- ~5-8 KB binary footprint
+
+---
+
+## Architecture
+
+### File Structure
+
+```
+src/gpu/effects/
+  cnn_effect.h                 # CNNEffect class
+  cnn_effect.cc                # Implementation
+
+workspaces/main/shaders/cnn/
+  cnn_activation.wgsl          # Activation functions (tanh, ReLU, sigmoid, leaky_relu)
+  cnn_conv3x3.wgsl             # 3×3 convolution
+  cnn_conv5x5.wgsl             # 5×5 convolution
+  cnn_conv7x7.wgsl             # 7×7 convolution
+  cnn_weights_generated.wgsl   # Weight arrays (generated by training script)
+  cnn_layer.wgsl               # Main shader (composes above snippets)
+```
+
+### Shader Composition
+
+`cnn_layer.wgsl` uses `#include` directives (resolved by `ShaderComposer`):
+```wgsl
+#include "common_uniforms"
+#include "cnn_activation"
+#include "cnn_conv3x3"
+#include "cnn_weights_generated"
+```
+
+---
+
+## Usage
+
+### C++ Integration
+
+```cpp
+#include "gpu/effects/cnn_effect.h"
+
+// Create effect (1 layer for now, expandable to 4)
+auto cnn = std::make_shared<CNNEffect>(ctx, /*num_layers=*/1);
+
+// Add to timeline
+timeline.add_effect(cnn, start_time, end_time);
+```
+
+### Timeline Example
+
+```
+SEQUENCE 10.0 0
+  EFFECT CNNEffect 10.0 15.0 0   # Apply CNN stylization for 5 seconds
+```
+
+---
+
+## Training Workflow (Planned)
+
+**Step 1: Prepare Training Data**
+```bash
+# Collect before/after image pairs
+# - Before: Raw 3D render
+# - After: Target artistic style (hand-painted, filtered, etc.)
+```
+
+**Step 2: Train Network**
+```bash
+python scripts/train_cnn.py \
+  --input rendered_scene.png \
+  --target stylized_scene.png \
+  --layers 3 \
+  --kernel_sizes 3,5,3 \
+  --epochs 100
+```
+
+**Step 3: Export Weights**
+```python
+# scripts/train_cnn.py automatically generates:
+# workspaces/main/shaders/cnn/cnn_weights_generated.wgsl
+```
+
+**Step 4: Rebuild**
+```bash
+cmake --build build -j4
+```
+
+---
+
+## Implementation Details
+
+### Convolution Function Signature
+
+```wgsl
+fn cnn_conv3x3(
+  tex: texture_2d<f32>,
+  samp: sampler,
+  uv: vec2<f32>,
+  resolution: vec2<f32>,
+  weights: array<mat4x4<f32>, 9>,  // 9 samples × 4×4 matrix
+  bias: vec4<f32>
+) -> vec4<f32>
+```
+
+- Samples 9 pixels (3×3 neighborhood)
+- Applies 4×4 weight matrix per sample (RGBA channels)
+- Returns weighted sum + bias (pre-activation)
+
+### Weight Storage
+
+Weights are stored as WGSL constants:
+```wgsl
+const weights_layer0: array<mat4x4<f32>, 9> = array(
+  mat4x4<f32>(1.0, 0.0, 0.0, 0.0, ...),  // Center pixel
+  mat4x4<f32>(0.0, 0.0, 0.0, 0.0, ...),  // Neighbor 1
+  // ... 7 more matrices
+);
+const bias_layer0 = vec4<f32>(0.0, 0.0, 0.0, 0.0);
+```
+
+### Residual Connection
+
+Final layer adds original input:
+```wgsl
+if (params.use_residual != 0) {
+  let input = textureSample(txt, smplr, uv);
+  result = input + result * 0.3;  // Blend 30% stylization
+}
+```
+
+---
+
+## Multi-Layer Rendering (Future)
+
+For N layers, use ping-pong textures:
+
+```
+Pass 0: input  → temp_a  (conv + activate)
+Pass 1: temp_a → temp_b  (conv + activate)
+Pass 2: temp_b → temp_a  (conv + activate)
+Pass 3: temp_a → screen  (conv + activate + residual)
+```
+
+**Current Status:** Single-layer implementation. Multi-pass infrastructure ready but not exposed.
+
+---
+
+## Size Budget
+
+| Component | Size | Notes |
+|-----------|------|-------|
+| `cnn_activation.wgsl` | ~200 B | 4 activation functions |
+| `cnn_conv3x3.wgsl` | ~400 B | 3×3 convolution logic |
+| `cnn_conv5x5.wgsl` | ~600 B | 5×5 convolution logic |
+| `cnn_conv7x7.wgsl` | ~800 B | 7×7 convolution logic |
+| `cnn_layer.wgsl` | ~800 B | Main shader |
+| `cnn_effect.cc` | ~300 B | C++ implementation |
+| **Weights (variable)** | **2-6 KB** | Depends on network depth/width |
+| **Total** | **5-9 KB** | Acceptable for 64k demo |
+
+**Optimization Strategies:**
+- Quantize weights (float32 → int8)
+- Prune near-zero weights
+- Share weights across layers
+- Use separable convolutions (not yet implemented)
+
+---
+
+## Testing
+
+```bash
+# Run effect test
+./build/test_demo_effects
+
+# Visual test in demo
+./build/demo64k   # CNN appears in timeline if added
+```
+
+**Test Coverage:**
+- Construction/initialization
+- Shader compilation
+- Bind group creation
+- Render pass execution
+
+---
+
+## Troubleshooting
+
+**Shader compilation fails:**
+- Check `cnn_weights_generated.wgsl` syntax
+- Verify all snippets registered in `shaders.cc::InitShaderComposer()`
+
+**Black/corrupted output:**
+- Weights likely untrained (using placeholder identity)
+- Check residual blending factor (0.3 default)
+
+**Performance issues:**
+- Reduce kernel sizes (7×7 → 3×3)
+- Decrease layer count
+- Profile with `--hot-reload` to measure frame time
+
+---
+
+## References
+
+- **Shader Composition:** `doc/SEQUENCE.md` (shader parameters)
+- **Effect System:** `src/gpu/effect.h` (Effect base class)
+- **Training (external):** TensorFlow/PyTorch CNN tutorials
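The size budget in `doc/CNN_EFFECT.md` lists "Quantize weights (float32 → int8)" as an optimization strategy without spelling it out. One possible shape for that step, offered as an assumption rather than anything implemented, is symmetric per-layer quantization with a single scale factor:

```python
# Illustrative sketch only -- no quantization code exists in the repo yet.
# Symmetric int8 quantization of a layer's weights with one float scale,
# plus a round-trip check of the reconstruction error.
import numpy as np


def quantize_int8(weights: np.ndarray):
    """Return (int8 weights, scale) such that weights ~= int8_values * scale."""
    scale = max(float(np.abs(weights).max()), 1e-8) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale


if __name__ == "__main__":
    w = np.random.uniform(-0.5, 0.5, size=(9, 4, 4)).astype(np.float32)  # 9 taps of 4x4 weights
    q, scale = quantize_int8(w)
    err = float(np.abs(dequantize(q, scale) - w).max())
    print(f"scale={scale:.6f}  max round-trip error={err:.6f}")  # bounded by roughly scale/2
```

The size win only materializes if the generated `cnn_weights_generated.wgsl` (or a weight buffer) actually stores the int8 values and rescales them at load time or in the shader; re-emitting them as float32 literals would save nothing.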
