path: root/doc/CNN.md
author: skal <pascal.massimino@gmail.com> 2026-02-10 07:36:32 +0100
committer: skal <pascal.massimino@gmail.com> 2026-02-10 07:36:32 +0100
commit: c51c146da9590845b864cbba3a7317c5b5bed56a
tree: 80fda2cad06622f367ae004527e4bea21d687e68 /doc/CNN.md
parent: dcd52c3c595c1f37229b880fad11248b98bbced1
initial doc for the CNN project
Diffstat (limited to 'doc/CNN.md')
-rw-r--r-- doc/CNN.md 51
1 files changed, 51 insertions, 0 deletions
diff --git a/doc/CNN.md b/doc/CNN.md
new file mode 100644
index 0000000..8bf2860
--- /dev/null
+++ b/doc/CNN.md
@@ -0,0 +1,51 @@
+# Convolutional Neural Net Shader (CNN) post-processing
+
+## Idea
+
+Process the rendered 3D scene with a multi-layer CNN whose weights are trained offline.
+Input: the rendered scene.
+Output: 'stylized' scene after CNN post-processing.
+
+## Shader implementation
+
+### Input / output
+
+Need 1 texture buffer per CNN layer.
+Input: (r,g,b,1/z) for layer 0 (the rendered 3D scene), or the output of layer N-1 for layer N.
+Output: (r,g,b,alpha). The 1/z information does not need to be written out (it can be fetched from the input).
+
+### size of one layer
+
+Notation:
+S: the number of input samples from layer N-1.
+Example: 3x3 input -> S = 3x3 = 9.
+
+Each of the S samples is 4 values (r,g,b, w=1/z).
+
+Each sample is processed by its own mat4 matrix: 4 inputs => 4 outputs.
+
+Weight matrix = S x mat4 (S x 16 values).
+
+Final bias: 4 values.
+
+WGSL code example: See file CNN.shader
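To make the sizes above concrete, here is a minimal Python sketch of one layer evaluated at a single pixel. It is not taken from CNN.shader: the function names are hypothetical, and it assumes the S per-sample mat4 results are summed before the bias is added. The note does not mention an activation function, so none is applied here.

```python
# Hypothetical sketch: evaluate one CNN layer at a single pixel.
# Names are illustrative and do not come from CNN.shader.

def mat4_mul_vec4(m, v):
    """Multiply a 4x4 matrix (list of 4 rows) by a 4-vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def eval_layer(samples, weights, bias):
    """samples: S vectors of 4 values (r, g, b, w=1/z).
    weights: S matrices, one mat4 per sample.
    bias:    4 values, added once.
    Assumption: the S per-sample results are summed."""
    assert len(samples) == len(weights)
    out = list(bias)
    for sample, mat in zip(samples, weights):
        contrib = mat4_mul_vec4(mat, sample)
        out = [o + c for o, c in zip(out, contrib)]
    return out

def layer_param_count(s):
    """S mat4 matrices (16 values each) plus a 4-value bias."""
    return s * 16 + 4

# 3x3 neighbourhood -> S = 9
S = 9
identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
samples = [[0.5, 0.25, 0.125, 1.0]] * S
weights = [identity] * S
bias = [0.0, 0.0, 0.0, 1.0]
result = eval_layer(samples, weights, bias)
```

With S = 9, `layer_param_count(S)` gives 148 values per layer, which is small enough to hard-code in a shader.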
+
+### Layers
+
+Do we need 3 or 4 layers?
+A separate shader for each layer.
+Ping-pong the input/output texture buffers between layers?
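One way to picture the ping-pong question is as a read/write schedule over two buffers. The sketch below is only an illustration in Python; `ping_pong_schedule` is a hypothetical helper, and it assumes buffer 0 holds the rendered scene before layer 0 runs.

```python
# Hypothetical sketch: which of two texture buffers each layer reads and writes.

def ping_pong_schedule(num_layers):
    """Return one (read_index, write_index) pair per layer.
    Assumption: buffer 0 holds the rendered scene before layer 0."""
    schedule = []
    for layer in range(num_layers):
        src = layer % 2   # even layers read buffer 0, odd layers read buffer 1
        dst = 1 - src     # always write to the other buffer
        schedule.append((src, dst))
    return schedule

schedule = ping_pong_schedule(4)
final_buffer = schedule[-1][1]  # buffer holding the last layer's output
```

Note that with only two buffers, layer 1 overwrites the original scene in buffer 0, so fetching 1/z from the layer-0 input in later layers would require keeping the scene in a separate texture.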
+
+## Training
+
+The layer weight/bias data are hard-coded in the shaders.
+Training is done with an external Python script.
+File: CNN.py contains an example of what the training script could be.
+It is only an example and doesn't match our requirements yet.
+
+Need a repository of reference image pairs (before/after) for training and validation.
+Each input image is randomly sampled into 3x3 patches of (r,g,b,1/z) input samples,
+and the network is trained to match the corresponding (r,g,b,a) output.
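The patch sampling described above could look roughly like the following Python sketch. It is not the code in CNN.py: the helper names are hypothetical, images are plain nested lists, border pixels are skipped, and the training target is assumed to be the 'after' pixel at the patch centre.

```python
import random

# Hypothetical sketch of drawing one training pair from a before/after image pair.

def sample_patch(image, rng, size=3):
    """Pick a random size x size patch (size assumed odd).
    image: rows of pixels, each pixel a 4-tuple (r, g, b, 1/z).
    Returns (cx, cy, patch), where patch is a flat row-major list of pixels."""
    height, width = len(image), len(image[0])
    half = size // 2
    # Assumption: skip borders so the whole patch lies inside the image.
    cx = rng.randrange(half, width - half)
    cy = rng.randrange(half, height - half)
    patch = [image[y][x]
             for y in range(cy - half, cy + half + 1)
             for x in range(cx - half, cx + half + 1)]
    return cx, cy, patch

def make_training_pair(before, after, rng):
    """Assumption: the target is the 'after' pixel at the patch centre."""
    cx, cy, patch = sample_patch(before, rng)
    return patch, after[cy][cx]

# Tiny synthetic 5x5 image pair whose pixels encode their own coordinates.
before = [[(x / 4, y / 4, 0.5, 1.0) for x in range(5)] for y in range(5)]
after = [[(float(x), float(y), 0.0, 1.0) for x in range(5)] for y in range(5)]
patch, target = make_training_pair(before, after, random.Random(0))
```

Each call yields S = 9 input samples and one 4-value target, matching the per-layer input size used in the shader.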
+
+Training generates the .wgsl code for the layers' shaders, and the C++ code for the post-processing 'Effect'.
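As an illustration of the code-generation step, the sketch below formats trained weights as WGSL constants from Python. The identifiers `kWeights0` and `kBias0` and the output layout are assumptions, not the project's actual generated code. Each matrix is passed as 4 columns of 4 values, since the WGSL `mat4x4<f32>` constructor takes its 16 scalars in column-major order.

```python
# Hypothetical sketch: emit WGSL constants for one layer's weights and bias.

def format_mat4(mat):
    """mat: 4 columns of 4 floats (column-major, as WGSL expects)."""
    values = ", ".join(f"{v:.6f}" for col in mat for v in col)
    return f"mat4x4<f32>({values})"

def emit_layer_wgsl(layer_index, weights, bias):
    """Return WGSL source declaring the layer's S matrices and its bias.
    The constant names are illustrative only."""
    s = len(weights)
    mats = ",\n  ".join(format_mat4(m) for m in weights)
    b = ", ".join(f"{v:.6f}" for v in bias)
    return (
        f"// Layer {layer_index}: generated by the training script.\n"
        f"const kWeights{layer_index} = array<mat4x4<f32>, {s}>(\n  {mats}\n);\n"
        f"const kBias{layer_index} = vec4<f32>({b});\n"
    )

identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
wgsl_source = emit_layer_wgsl(0, [identity] * 9, [0.0, 0.0, 0.0, 1.0])
```

Writing one such string per layer into its own .wgsl file would fit the one-shader-per-layer layout discussed under Layers.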
+