Diffstat (limited to 'doc/CNN_RGBD_GRAYSCALE_SUMMARY.md')
-rw-r--r-- doc/CNN_RGBD_GRAYSCALE_SUMMARY.md | 10
1 files changed, 6 insertions, 4 deletions
diff --git a/doc/CNN_RGBD_GRAYSCALE_SUMMARY.md b/doc/CNN_RGBD_GRAYSCALE_SUMMARY.md
index 4c13693..3439f2c 100644
--- a/doc/CNN_RGBD_GRAYSCALE_SUMMARY.md
+++ b/doc/CNN_RGBD_GRAYSCALE_SUMMARY.md
@@ -20,7 +20,7 @@ Implemented CNN architecture upgrade: RGBD input → grayscale output with 7-cha
- **RGBD:** `(rgbd - 0.5) * 2`
- **UV coords:** `(uv - 0.5) * 2`
-- **Grayscale:** `(0.2126*R + 0.7152*G + 0.0722*B - 0.5) * 2`
+- **Grayscale:** `dot(original.rgb, vec3<f32>(0.2126, 0.7152, 0.0722))` (computed once, passed as parameter)
**Rationale:** Zero-centered inputs for tanh activation, better gradient flow.
@@ -48,13 +48,14 @@ Implemented CNN architecture upgrade: RGBD input → grayscale output with 7-cha
**Shaders (`/Users/skal/demo/workspaces/main/shaders/cnn/cnn_conv3x3.wgsl`):**
1. Added `cnn_conv3x3_7to4()`:
- - 7-channel input: [RGBD, uv_x, uv_y, gray]
+ - 7-channel input: [RGBD, uv_x, uv_y, gray] (gray passed as parameter)
- 4-channel output: RGBD
- Weights: `array<array<f32, 8>, 36>`
2. Added `cnn_conv3x3_7to1()`:
- - 7-channel input: [RGBD, uv_x, uv_y, gray]
+ - 7-channel input: [RGBD, uv_x, uv_y, gray] (gray passed as parameter)
- 1-channel output: grayscale
- Weights: `array<array<f32, 8>, 9>`
+3. Optimized: gray computed once in caller using `dot()`, not per-function
**Documentation (`/Users/skal/demo/doc/CNN_EFFECT.md`):**
1. Updated architecture section with RGBD→grayscale pipeline
@@ -71,7 +72,8 @@ CNNLayerParams and bind groups remain unchanged.
2. Each layer:
- Samples previous layer output (RGBD in [0,1])
- Normalizes RGBD to [-1,1]
- - Computes UV coords and grayscale, normalizes to [-1,1]
+ - Computes gray once using `dot()` (fs_main level)
+ - Normalizes UV coords to [-1,1] (inside conv functions)
- Concatenates 7-channel input
- Applies convolution with layer-specific weights
- Outputs RGBD (inner) or grayscale (final) in [-1,1]
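
The per-layer input assembly described in this diff can be sketched on the CPU as a reference. This is a minimal Python illustration, not the WGSL from `cnn_conv3x3.wgsl`: the names `to_signed`, `luma`, and `build_input` are hypothetical, and it assumes gray is passed through as computed by the dot product, since the diff does not say whether it is remapped afterwards.

```python
# Rec. 709 luma coefficients, the same constants used in the diff's dot() call.
LUMA_WEIGHTS = (0.2126, 0.7152, 0.0722)


def to_signed(x):
    """Map a value from [0, 1] to [-1, 1], i.e. (x - 0.5) * 2."""
    return (x - 0.5) * 2.0


def luma(rgb):
    """Weighted sum of R, G, B: the equivalent of dot(rgb, weights)."""
    return sum(c * w for c, w in zip(rgb, LUMA_WEIGHTS))


def build_input(rgbd, uv, gray):
    """Assemble the 7-channel input [R, G, B, D, uv_x, uv_y, gray].

    RGBD and UV are remapped to [-1, 1] here; gray is taken as a
    parameter so the caller computes it once (assumption: it is not
    remapped again at this point).
    """
    return [to_signed(c) for c in rgbd] + [to_signed(u) for u in uv] + [gray]


# Usage: compute gray once in the caller, then reuse it for every conv call.
original_rgb = (1.0, 0.5, 0.0)
gray = luma(original_rgb)
channels = build_input((1.0, 0.5, 0.0, 0.25), (0.0, 1.0), gray)
```

Passing `gray` in as an argument mirrors the change in this commit: the luma value depends only on the original pixel, so computing it once in the caller avoids repeating the same dot product inside each convolution function.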