author     skal <pascal.massimino@gmail.com>  2026-02-14 02:12:12 +0100
committer  skal <pascal.massimino@gmail.com>  2026-02-14 02:12:12 +0100
commit     043044ae7563c2f92760c428765e35b411da82ea (patch)
tree       0d640fec1517169d195747707b6c589c92fe7161 /training/train_cnn_v2.py
parent     4d119a1b6a6f460ca6d5a8ef85176c45663fd40a (diff)
Replace hard clamp with sigmoid activation in CNN v2
Fixes training collapse where the p1/p2 channels saturate because the hard clamp blocks gradients at its boundaries. Sigmoid provides a smooth [0,1] mapping with continuous gradients.

Changes:
- Layer 0: clamp(x, 0, 1) → sigmoid(x)
- Final layer: clamp(x, 0, 1) → sigmoid(x)
- Middle layers: ReLU unchanged (already stable)

Updated files:
- training/train_cnn_v2.py: PyTorch model activations
- workspaces/main/shaders/cnn_v2/cnn_v2_compute.wgsl: WGSL shader
- tools/cnn_v2_test/index.html: HTML validation tool
- doc/CNN_V2.md: Documentation

Validation:
- Build clean (no shader errors)
- 34/36 tests pass (2 unrelated script tests fail)
- 10-epoch training: loss 0.153 → 0.088 (good convergence)
- cnn_test processes images successfully

Breaking change: old checkpoints trained with clamp() are incompatible; retraining from scratch is required.

handoff(Claude): CNN v2 sigmoid activation implemented and validated.
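The gradient-blocking failure described above can be reproduced in isolation. The snippet below is an illustrative sketch (not part of this commit) comparing the backward pass of torch.clamp and torch.sigmoid for a pre-activation that has drifted past the [0,1] boundary:

import torch

# Pre-activation value outside the [0, 1] clamp range.
x = torch.tensor([2.5], requires_grad=True)

# Hard clamp: output is pinned to 1.0 and the gradient is exactly zero,
# so upstream weights stop receiving any learning signal (saturation).
torch.clamp(x, 0, 1).backward()
print(x.grad)  # tensor([0.])

# Sigmoid: output approaches 1.0 smoothly but the gradient stays nonzero,
# so training can still pull the pre-activation back into a useful range.
x.grad = None
torch.sigmoid(x).backward()
print(x.grad)  # approximately tensor([0.0701])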
Diffstat (limited to 'training/train_cnn_v2.py')
-rwxr-xr-x  training/train_cnn_v2.py | 4
1 file changed, 2 insertions, 2 deletions
diff --git a/training/train_cnn_v2.py b/training/train_cnn_v2.py
index d80e3a5..9e5df2f 100755
--- a/training/train_cnn_v2.py
+++ b/training/train_cnn_v2.py
@@ -121,7 +121,7 @@ class CNNv2(nn.Module):
         # Layer 0: input RGBD (4D) + static (8D) = 12D
         x = torch.cat([input_rgbd, static_features], dim=1)
         x = self.layers[0](x)
-        x = torch.clamp(x, 0, 1)  # Output [0,1] for layer 0
+        x = torch.sigmoid(x)  # Soft [0,1] for layer 0

         # Layer 1+: previous (4D) + static (8D) = 12D
         for i in range(1, self.num_layers):
@@ -130,7 +130,7 @@ class CNNv2(nn.Module):
             if i < self.num_layers - 1:
                 x = F.relu(x)
             else:
-                x = torch.clamp(x, 0, 1)  # Final output [0,1]
+                x = torch.sigmoid(x)  # Soft [0,1] for final layer

         return x