diff options
| author | skal <pascal.massimino@gmail.com> | 2026-02-13 22:42:45 +0100 |
|---|---|---|
| committer | skal <pascal.massimino@gmail.com> | 2026-02-13 22:42:45 +0100 |
| commit | f81a30d15e1e7db0492f45a0b9bec6aaa20ae5c2 (patch) | |
| tree | deb202a7d995895ec90e8ddc8c3fbf92082ea434 /tools/cnn_v2_test/index.html | |
| parent | 7c1f937222d0e36294ebd25db949c6227aed6985 (diff) | |
CNN v2: Use alpha channel for p3 depth feature + layer visualization
Training changes (train_cnn_v2.py):
- p3 now uses target image alpha channel (depth proxy for 2D images)
- Default p3 value changed from 0.0 → 1.0 (far-plane semantics)
- Both PatchDataset and ImagePairDataset updated
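The convention above can be sketched as a small standalone helper (a hypothetical illustration, not the actual training code, which lives in train_cnn_v2.py): the RGBA alpha byte maps to a depth value in [0, 1], and images without an alpha channel fall back to the new default of 1.0 (far plane).

```javascript
// Hypothetical sketch of the p3 depth convention: alpha byte → [0, 1] depth,
// with a far-plane default of 1.0 when no alpha channel is available.
function alphaToDepth(rgba, width, height, hasAlpha) {
  const depth = new Float32Array(width * height);
  if (!hasAlpha) {
    depth.fill(1.0); // far-plane default (was 0.0 before this change)
    return depth;
  }
  for (let i = 0; i < width * height; i++) {
    depth[i] = rgba[i * 4 + 3] / 255.0; // alpha byte [0, 255] → depth [0, 1]
  }
  return depth;
}
```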
Test tools (cnn_test.cc):
- New load_depth_from_alpha() extracts PNG alpha → p3 texture
- Fixed bind group layout: use UnfilterableFloat for R32Float depth
- Added --save-intermediates support for CNN v2:
* Each layer_N.png shows 4 channels horizontally (1812×345 grayscale)
* layers_composite.png stacks all layers vertically (1812×1380)
* static_features.png shows 4 feature channels horizontally
- Per-channel visualization enables debugging layer-by-layer differences
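The visualization geometry implied by the dimensions above can be sketched as follows (a hypothetical helper, not code from cnn_test.cc): 1812×345 per layer with 4 channels side by side gives 453×345 per channel, and stacking 4 layers vertically yields the 1812×1380 composite.

```javascript
// Hypothetical sketch of the layer-visualization layout: N channels tiled
// horizontally per layer image, layers stacked vertically in the composite.
function visualizationLayout(channelW, channelH, numChannels, numLayers) {
  return {
    layer: { width: channelW * numChannels, height: channelH },
    composite: { width: channelW * numChannels, height: channelH * numLayers },
    // pixel x-offset of channel c within a layer image
    channelOffsetX: (c) => c * channelW,
  };
}

// Dimensions from the commit message: 453×345 channels, 4 channels, 4 layers.
const layout = visualizationLayout(453, 345, 4, 4);
```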
HTML tool (index.html):
- Extract alpha channel from input image → depth texture
- Matches training data distribution for validation
Note: the current weights were trained with p3=0 and are now mismatched. Both
tools use p3=alpha consistently, so their outputs remain comparable for
debugging. Retraining is required for optimal quality.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Diffstat (limited to 'tools/cnn_v2_test/index.html')
| -rw-r--r-- | tools/cnn_v2_test/index.html | 31 |
1 file changed, 30 insertions, 1 deletion
diff --git a/tools/cnn_v2_test/index.html b/tools/cnn_v2_test/index.html
index 9636ecf..ca89fb4 100644
--- a/tools/cnn_v2_test/index.html
+++ b/tools/cnn_v2_test/index.html
@@ -1211,12 +1211,41 @@ class CNNTester {
       });
     }
 
+    // Extract depth from alpha channel (or 1.0 if no alpha)
+    const depthTex = this.device.createTexture({
+      size: [width, height, 1],
+      format: 'r32float',
+      usage: GPUTextureUsage.TEXTURE_BINDING | GPUTextureUsage.COPY_DST
+    });
+
+    // Read image data to extract alpha channel
+    const tempCanvas = document.createElement('canvas');
+    tempCanvas.width = width;
+    tempCanvas.height = height;
+    const tempCtx = tempCanvas.getContext('2d');
+    tempCtx.drawImage(source, 0, 0, width, height);
+    const imageData = tempCtx.getImageData(0, 0, width, height);
+    const pixels = imageData.data;
+
+    // Extract alpha channel (RGBA format: every 4th byte)
+    const depthData = new Float32Array(width * height);
+    for (let i = 0; i < width * height; i++) {
+      depthData[i] = pixels[i * 4 + 3] / 255.0; // Alpha channel [0, 255] → [0, 1]
+    }
+
+    this.device.queue.writeTexture(
+      { texture: depthTex },
+      depthData,
+      { bytesPerRow: width * 4 },
+      [width, height, 1]
+    );
+
     const staticBG = this.device.createBindGroup({
       layout: staticPipeline.getBindGroupLayout(0),
       entries: [
         { binding: 0, resource: this.inputTexture.createView() },
         { binding: 1, resource: this.pointSampler },
-        { binding: 2, resource: this.inputTexture.createView() }, // Use input as depth (matches C++)
+        { binding: 2, resource: depthTex.createView() }, // Depth from alpha (matches training)
         { binding: 3, resource: staticTex.createView() },
         { binding: 4, resource: { buffer: mipLevelBuffer } }
       ]
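For context on the UnfilterableFloat fix mentioned in the commit message: in WebGPU, 32-bit float formats such as r32float are not filterable unless the optional 'float32-filterable' feature is enabled, so an explicit bind group layout for the depth texture must declare its sample type as 'unfilterable-float'. A minimal sketch of such a layout entry (a hypothetical fragment, not the actual cnn_test.cc code):

```javascript
// Hypothetical bind group layout entry for the r32float depth texture.
// r32float is not filterable without the 'float32-filterable' feature,
// so sampleType must be 'unfilterable-float' (and the paired sampler
// must be non-filtering).
const depthLayoutEntry = {
  binding: 2,
  visibility: 4, // GPUShaderStage.COMPUTE
  texture: { sampleType: 'unfilterable-float', viewDimension: '2d' },
};
```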
