| author | skal <pascal.massimino@gmail.com> | 2026-02-10 07:36:32 +0100 |
|---|---|---|
| committer | skal <pascal.massimino@gmail.com> | 2026-02-10 07:36:32 +0100 |
| commit | c51c146da9590845b864cbba3a7317c5b5bed56a (patch) | |
| tree | 80fda2cad06622f367ae004527e4bea21d687e68 /doc/CNN.md | |
| parent | dcd52c3c595c1f37229b880fad11248b98bbced1 (diff) | |
initial doc for the CNN project
Diffstat (limited to 'doc/CNN.md')
| -rw-r--r-- | doc/CNN.md | 51 |
1 files changed, 51 insertions, 0 deletions
diff --git a/doc/CNN.md b/doc/CNN.md
new file mode 100644
index 0000000..8bf2860
--- /dev/null
+++ b/doc/CNN.md
@@ -0,0 +1,51 @@
+# Convolutional Neural Net Shader (CNN) post-processing
+
+## Idea
+
+Have the input 3d scene be processed by a multi-layer CNN trained on the side.
+Input: some rendered scene.
+Output: 'stylized' scene with CNN post-processing.
+
+## Shader implementation
+
+### input / output
+
+Need 1 texture buffer per CNN layer.
+Input (r,g,b,1/z) for layer 0 (render 3d scene), or output from layer N-1 for layer N.
+output: (r,g,b, alpha). Don't need the 1/z information (can be fetched from input)
+
+### size of one layer
+
+Notation:
+S: the number of input samples from layer N-1.
+Example: 3x3 input -> S = 3x3 = 9.
+
+Each S samples is 4 values (r,g,b, w=1/z).
+
+Each sample is processed by a mat4 matrix. 4 input => 4 output.
+
+Weight matrix = S x mat4
+
+Final bias: 4 values.
+
+WGSL code example: See file CNN.shader
+
+### Layers
+
+we need 3 or 4 layer ?
+Several different shaders for each layer.
+Ping-pong for input/output texture buffer between each layers?
+
+## Training
+
+The layer weight/bias data are hard-coded in the shaders.
+Need training with external python script.
+File: CNN.py contains an example of what the training script could be.
+Just an example, doesn't match our requirement yet.
+
+Need a repository of reference image pairs (before/after) for training and validation.
+Each input image is randomly sampled into 3x3 patch of (r,g,b,1/z) input samples.
+And trained to match the (r,g,b,a) output.
+
+Training generates the .wgsl code for layers' shaders, and the c++ code for the post-processing 'Effect'.
+
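
The per-layer arithmetic described under "size of one layer" in the added doc/CNN.md (S input samples of 4 values each, one mat4 per sample, plus a 4-value bias) can be sketched in plain Python. This is only an illustration of that description, not the contents of CNN.shader or CNN.py: the function names `layer_param_count` and `apply_layer`, the row-major matrix layout, and the absence of any activation function are assumptions, since the patch does not specify them.

```python
def layer_param_count(s):
    """Scalars stored for one layer: S mat4 weights (16 each) plus a 4-value bias."""
    return s * 16 + 4


def apply_layer(samples, weights, bias):
    """Compute one output texel as bias + sum over samples of (mat4 * sample).

    samples: S sequences of 4 floats, (r, g, b, w) with w = 1/z
    weights: S 4x4 matrices, stored row-major here (assumed layout)
    bias:    4 floats
    """
    if len(samples) != len(weights):
        raise ValueError("need exactly one mat4 per input sample")
    out = list(bias)
    for sample, mat in zip(samples, weights):
        for row in range(4):
            out[row] += sum(mat[row][col] * sample[col] for col in range(4))
    return out


# Usage: a 3x3 neighbourhood (S = 9) where every mat4 is identity / 9.
# With a zero bias this averages the nine samples, i.e. a box blur.
S = 9
IDENTITY_OVER_S = [[(1.0 / S if r == c else 0.0) for c in range(4)] for r in range(4)]
samples = [[0.5, 0.25, 1.0, 0.1]] * S
result = apply_layer(samples, [IDENTITY_OVER_S] * S, [0.0, 0.0, 0.0, 0.0])
```

For the 3x3 case in the document, `layer_param_count(9)` gives 9 x 16 + 4 = 148 scalars per layer, which is the amount of weight/bias data the training step would have to emit into each layer's shader.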
