author     skal <pascal.massimino@gmail.com>  2026-02-13 12:49:05 +0100
committer  skal <pascal.massimino@gmail.com>  2026-02-13 12:49:05 +0100
commit     aed21707f9ca43b70e7fbdae4144f9d64bd70d00 (patch)
tree       e89836fc4e7aa5da84caa35c945d13ba8512b1a9 /doc
parent     5074e4caec017d6607de5806858d0271a554d77c (diff)
Doc: Clarify CNN v2 training uses RGBA targets
Updated CNN_V2.md to document that:

- Model outputs 4 channels (RGBA)
- Training targets preserve alpha from target images
- Loss function compares all 4 channels

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
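The alpha-preservation point is easy to get wrong when loading images. Below is a minimal sketch, assuming PIL and torchvision (not the project's actual loader; the file path is hypothetical), of reading a target image without dropping its alpha channel:

```python
# Sketch only: load a training target with alpha intact.
# "target.png" is a hypothetical path; the real dataset code may differ.
from PIL import Image
from torchvision.transforms.functional import to_tensor

img = Image.open("target.png").convert("RGBA")  # .convert("RGB") would drop alpha
target = to_tensor(img)  # float32 tensor, shape (4, H, W), values in [0, 1]
assert target.shape[0] == 4  # R, G, B, A -- the loss compares all four
```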
Diffstat (limited to 'doc')
-rw-r--r--  doc/CNN_V2.md | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/doc/CNN_V2.md b/doc/CNN_V2.md
index 6242747..b0aa24c 100644
--- a/doc/CNN_V2.md
+++ b/doc/CNN_V2.md
@@ -119,9 +119,11 @@ Requires quantization-aware training.
```
Layer 0: input RGBD (4D) + static (8D) = 12D → 4 channels (3×3 kernel)
Layer 1: previous (4D) + static (8D) = 12D → 4 channels (3×3 kernel)
-Layer 2: previous (4D) + static (8D) = 12D → 4 channels (3×3 kernel, output)
+Layer 2: previous (4D) + static (8D) = 12D → 4 channels (3×3 kernel, output RGBA)
```
+**Output:** 4 channels (RGBA). Training targets preserve alpha from target images.
+
### Weight Calculations
**Per-layer weights (uniform 12D→4D, 3×3 kernels):**
@@ -256,6 +258,9 @@ learning_rate = 1e-3
batch_size = 16
epochs = 5000
+# Dataset: Input RGB, Target RGBA (preserves alpha channel from image)
+# Model outputs RGBA, loss compares all 4 channels
+
# Training loop (standard PyTorch f32)
for epoch in range(epochs):
for rgb_batch, depth_batch, target_batch in dataloader:
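Putting the two hunks together, here is a minimal end-to-end sketch of the architecture and loss described above. The layer stack (three 3×3 convolutions, 12 channels in, 4 out, with the 8 static channels re-concatenated at each layer) follows the doc; the ReLU activations, MSE loss, tensor shapes, and all names are assumptions for illustration, not the project's actual training code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CnnV2Sketch(nn.Module):
    """Illustrative stand-in for the 3-layer CNN v2 stack."""
    def __init__(self):
        super().__init__()
        # Each layer: 4 dynamic + 8 static = 12 input channels -> 4 channels, 3x3 kernel.
        self.layers = nn.ModuleList(
            [nn.Conv2d(12, 4, kernel_size=3, padding=1) for _ in range(3)]
        )

    def forward(self, rgbd, static):
        x = rgbd  # layer 0 input: RGB + depth (4 channels)
        for i, layer in enumerate(self.layers):
            x = layer(torch.cat([x, static], dim=1))
            if i < len(self.layers) - 1:
                x = F.relu(x)  # activation between layers is an assumption
        return x  # final layer outputs 4 channels: RGBA

# Synthetic batch (shapes assumed): RGBD input, 8 static planes, RGBA target.
rgbd = torch.rand(16, 4, 64, 64)
static = torch.rand(16, 8, 64, 64)
target_rgba = torch.rand(16, 4, 64, 64)  # alpha preserved from the image

model = CnnV2Sketch()
pred = model(rgbd, static)
loss = F.mse_loss(pred, target_rgba)  # compares all 4 channels, including alpha
loss.backward()
```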