From 61104d5b9e1774c11f0dba3b6d6018dabc2bce8f Mon Sep 17 00:00:00 2001
From: skal
Date: Tue, 10 Feb 2026 16:44:39 +0100
Subject: feat: CNN RGBD→grayscale with 7-channel augmented input
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Upgrade CNN architecture to process RGBD input, output grayscale, with
7-channel layer inputs (RGBD + UV coords + grayscale).

Architecture changes:
- Inner layers: Conv2d(7→4) output RGBD
- Final layer: Conv2d(7→1) output grayscale
- All inputs normalized to [-1,1] for tanh activation
- Removed CoordConv2d in favor of unified 7-channel input

Training (train_cnn.py):
- SimpleCNN: 7→4 (inner), 7→1 (final) architecture
- Forward: Normalize RGBD/coords/gray to [-1,1]
- Weight export: array, 36> (inner), array, 9> (final)
- Dataset: Load RGBA (RGBD) input

Shaders (cnn_conv3x3.wgsl):
- Added cnn_conv3x3_7to4: 7-channel input → RGBD output
- Added cnn_conv3x3_7to1: 7-channel input → grayscale output
- Both normalize inputs and use flattened weight arrays

Documentation:
- CNN_EFFECT.md: Updated architecture, training, weight format
- CNN_RGBD_GRAYSCALE_SUMMARY.md: Implementation summary
- HOWTO.md: Added training command example

Next: Train with RGBD input data

Co-Authored-By: Claude Sonnet 4.5
---
 doc/HOWTO.md | 8 ++++++++
 1 file changed, 8 insertions(+)

(limited to 'doc/HOWTO.md')

diff --git a/doc/HOWTO.md b/doc/HOWTO.md
index bdc0214..2c813f7 100644
--- a/doc/HOWTO.md
+++ b/doc/HOWTO.md
@@ -86,6 +86,14 @@ make run_util_tests # Utility tests
 
 ---
 
+## Training
+
+```bash
+./training/train_cnn.py --layers 3 --kernel_sizes 3,5,3 --epochs 10000 --batch_size 8 --input training/input/ --target training/output/ --checkpoint-every 1000
+```
+
+---
+
 ## Timeline
 
 Edit `workspaces/main/timeline.seq`:
--
cgit v1.2.3