From aed21707f9ca43b70e7fbdae4144f9d64bd70d00 Mon Sep 17 00:00:00 2001
From: skal
Date: Fri, 13 Feb 2026 12:49:05 +0100
Subject: Doc: Clarify CNN v2 training uses RGBA targets

Updated CNN_V2.md to document that:
- Model outputs 4 channels (RGBA)
- Training targets preserve alpha from target images
- Loss function compares all 4 channels

Co-Authored-By: Claude Sonnet 4.5
---
 doc/CNN_V2.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/doc/CNN_V2.md b/doc/CNN_V2.md
index 6242747..b0aa24c 100644
--- a/doc/CNN_V2.md
+++ b/doc/CNN_V2.md
@@ -119,9 +119,11 @@ Requires quantization-aware training.
 ```
 Layer 0: input RGBD (4D) + static (8D) = 12D → 4 channels (3×3 kernel)
 Layer 1: previous (4D) + static (8D) = 12D → 4 channels (3×3 kernel)
-Layer 2: previous (4D) + static (8D) = 12D → 4 channels (3×3 kernel, output)
+Layer 2: previous (4D) + static (8D) = 12D → 4 channels (3×3 kernel, output RGBA)
 ```
 
+**Output:** 4 channels (RGBA). Training targets preserve alpha from target images.
+
 ### Weight Calculations
 
 **Per-layer weights (uniform 12D→4D, 3×3 kernels):**
@@ -256,6 +258,9 @@ learning_rate = 1e-3
 batch_size = 16
 epochs = 5000
 
+# Dataset: Input RGB, Target RGBA (preserves alpha channel from image)
+# Model outputs RGBA, loss compares all 4 channels
+
 # Training loop (standard PyTorch f32)
 for epoch in range(epochs):
     for rgb_batch, depth_batch, target_batch in dataloader:
--
cgit v1.2.3
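The documented change — a 12D→4-channel model whose loss compares all 4 RGBA channels of prediction and target — can be sketched as follows. This is a minimal illustration, not code from the patched repository: the layer shape and loss names are assumptions based only on what the patch text states (12D input, 4-channel RGBA output, 3×3 kernels, standard PyTorch f32 training, loss over all 4 channels).

```python
# Hypothetical sketch of the setup described in the patch; names here are
# illustrative, not taken from the actual CNN_V2 training code.
import torch
import torch.nn as nn

# One 12D -> 4-channel layer with a 3x3 kernel, per the layer table in the doc.
model = nn.Conv2d(12, 4, kernel_size=3, padding=1)
criterion = nn.MSELoss()  # assumed loss; the patch only says it covers 4 channels

x = torch.randn(16, 12, 32, 32)       # batch of 12D inputs (RGBD + static features)
target = torch.rand(16, 4, 32, 32)    # RGBA targets: alpha preserved from the image

pred = model(x)                       # shape (16, 4, 32, 32): RGBA prediction
loss = criterion(pred, target)        # compares all 4 channels, alpha included
loss.backward()
```

The point the patch makes is visible in the tensor shapes: because the target keeps its alpha channel, the loss penalizes alpha mismatches the same way it penalizes RGB mismatches.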