# Convolutional Neural Net Shader (CNN) post-processing

**Status:** ✅ Foundation implemented (single-layer, expandable to multi-pass)

## Idea

Process the rendered 3D scene with a multi-layer CNN trained offline. Input: a rendered scene. Output: the 'stylized' scene after CNN post-processing.

**See `doc/CNN_EFFECT.md` for implementation details, usage, and API reference.**

## Shader implementation

### Input / output

One texture buffer is needed per CNN layer. Layer 0 takes the rendered 3D scene as (r, g, b, 1/z); layer N takes the output of layer N-1. Output: (r, g, b, alpha). The 1/z channel does not need to be carried through the layers, since it can always be fetched from the layer-0 input.

### Size of one layer

Notation: S is the number of input samples taken from layer N-1. Example: a 3×3 input gives S = 3×3 = 9. Each of the S samples holds 4 values (r, g, b, w = 1/z), and each sample is processed by a mat4 matrix: 4 inputs → 4 outputs. The weight matrix is therefore S × mat4, plus a final 4-value bias. (A NumPy sketch at the end of this document illustrates this math.)

WGSL code example: see file `CNN.shader`.

### Layers

3 or 4 layers? Several different shaders, one per layer. Ping-pong the input/output texture buffers between layers?

## Implementation status

**Completed:**

- ✅ Modular WGSL shader architecture (6 snippet files)
- ✅ CNNEffect C++ class (single-layer rendering)
- ✅ ShaderComposer integration (#include resolution)
- ✅ Asset registration (7 new shader assets)
- ✅ Test coverage (test_demo_effects.cc)
- ✅ Placeholder identity weights for testing

**Size:** ~3-4 KB shader code + ~2-4 KB weights = **5-8 KB total**

**Pending:**

- ⏳ Training script (`scripts/train_cnn.py`) to generate real weights
- ⏳ Multi-layer rendering with ping-pong textures
- ⏳ Weight quantization for size optimization

---

## Training (to be implemented)

The layer weight/bias data are hard-coded in the shaders. Training workflow:

1. Prepare image pairs (before: raw render, after: target style)
2. Run `python scripts/train_cnn.py --input scene.png --target stylized.png`
3. The script generates `cnn_weights_generated.wgsl`
4. Rebuild: `cmake --build build -j4`

**Reference:** file `CNN.py` contains a training example (needs adaptation).

A repository of reference image pairs (before/after) is needed for training and validation. Each input image is randomly sampled into 3×3 patches of (r, g, b, 1/z) input samples and trained to match the (r, g, b, a) output. Training generates the `.wgsl` code for the layers' shaders. A hypothetical sketch of this workflow follows below.
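As an illustration only, here is a minimal sketch of what `scripts/train_cnn.py` could look like, assuming PyTorch and Pillow. A single `nn.Conv2d(4, 4, kernel_size=3)` is exactly the S × mat4 + bias layer described above. The WGSL constant names, the weight-export layout, and training on the full image pair instead of random 3×3 patches are all assumptions for the sketch; `CNN.py` remains the actual reference to adapt.

```python
# Hypothetical sketch of scripts/train_cnn.py (assumes PyTorch + Pillow).
# A Conv2d(4, 4, kernel_size=3) is exactly the "S x mat4 + bias" layer above.
import argparse
import numpy as np
import torch
import torch.nn as nn
from PIL import Image

def load_rgba(path: str) -> torch.Tensor:
    """Load an image as a (1, 4, H, W) float tensor in [0, 1].
    The real pipeline would supply (r, g, b, 1/z) from the renderer;
    here the alpha channel stands in for 1/z."""
    img = np.asarray(Image.open(path).convert("RGBA"), dtype=np.float32) / 255.0
    return torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0)

def export_wgsl(conv: nn.Conv2d, path: str = "cnn_weights_generated.wgsl") -> None:
    """Emit trained weights as WGSL constants (names and layout are assumptions)."""
    w = conv.weight.detach().numpy()  # (4 out, 4 in, 3, 3)
    b = conv.bias.detach().numpy()    # (4,)
    mats = []
    # Sample order: row-major over the 3x3 kernel (ky*3 + kx); this must
    # match the shader's gather order (an assumption of this sketch).
    for ky in range(3):
        for kx in range(3):
            cols = ", ".join(
                "vec4f(" + ", ".join(f"{w[o, i, ky, kx]:.6f}" for o in range(4)) + ")"
                for i in range(4)  # column i of the mat4 = outputs for input i
            )
            mats.append(f"mat4x4f({cols})")
    body = ",\n  ".join(mats)
    bias = ", ".join(f"{v:.6f}" for v in b)
    with open(path, "w") as f:
        f.write(f"const LAYER0_WEIGHTS = array<mat4x4f, 9>(\n  {body});\n")
        f.write(f"const LAYER0_BIAS = vec4f({bias});\n")

def main() -> None:
    ap = argparse.ArgumentParser()
    ap.add_argument("--input", required=True)   # before: raw render
    ap.add_argument("--target", required=True)  # after: target style
    ap.add_argument("--steps", type=int, default=2000)
    args = ap.parse_args()

    x, y = load_rgba(args.input), load_rgba(args.target)
    conv = nn.Conv2d(4, 4, kernel_size=3, padding=1)  # single layer for now
    opt = torch.optim.Adam(conv.parameters(), lr=1e-3)
    for _ in range(args.steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(conv(x), y)
        loss.backward()
        opt.step()
    export_wgsl(conv)

if __name__ == "__main__":
    main()
```

Extending this to the planned 3-4 layers would mean stacking `Conv2d` modules with a nonlinearity in between and emitting one weight/bias block per layer; random 3×3 patch sampling would replace the full-image loss once the before/after image repository exists.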
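For reference, the per-pixel math these weights implement (see "Size of one layer" above) can be sanity-checked in a few lines of NumPy. Shapes and names here are illustrative, not the shader's actual data layout.

```python
import numpy as np

S = 9  # 3x3 neighborhood -> S = 9 samples per pixel

# Illustrative shapes only; real weights are hard-coded in the generated WGSL.
weights = np.random.randn(S, 4, 4).astype(np.float32)  # S x mat4
bias = np.random.randn(4).astype(np.float32)           # final 4-value bias

def layer_at_pixel(samples: np.ndarray) -> np.ndarray:
    """Evaluate one CNN layer at one pixel.

    samples: (S, 4) array of (r, g, b, w=1/z) values gathered from the
             3x3 neighborhood in layer N-1.
    returns: (4,) output (r, g, b, a) for layer N.
    """
    out = bias.copy()
    for i in range(S):
        out += weights[i] @ samples[i]  # mat4 * vec4, accumulated
    # A nonlinearity between layers is typical but not specified in this note.
    return out

print(layer_at_pixel(np.random.rand(S, 4).astype(np.float32)))  # -> 4 values
```

At f32 precision this is 9 × 16 + 4 = 148 values ≈ 0.6 KB per layer, roughly consistent with the ~2-4 KB weight budget for 3-4 layers quoted above.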