| author | skal <pascal.massimino@gmail.com> | 2026-02-13 23:40:30 +0100 |
|---|---|---|
| committer | skal <pascal.massimino@gmail.com> | 2026-02-13 23:40:30 +0100 |
| commit | 25044d63057cdb134cc3930bb67b178cff1aebb4 (patch) | |
| tree | bf00411b0af159ef090dc9cffbd8c6dd793f1cff /training | |
| parent | 87a27bf022d7fba68e3a945ee29c854c6e1ae2d7 (diff) | |
CNN v2: Fix Layer 0 visualization scale (was 0.5, now 1.0)
Layer 0 output is clamped to [0,1], so it does not need the 0.5 dimming.
Middle layers (ReLU) keep the 0.5 scale because their values can exceed 1.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
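For intuition, here is a minimal Python sketch of the mapping this fix is about (the `rendered_pixel` helper and its exact formula are illustrative assumptions, not the viewer's actual code): a 0.5 scale maps clamped Layer 0 whites to 128, while 1.0 renders them at 255; ReLU layers keep 0.5 so values above 1 remain distinguishable.

```python
import numpy as np

def rendered_pixel(value, viz_scale):
    """Illustrative display mapping: scale, clamp to [0, 1], quantize to 8 bits."""
    return int(np.round(np.clip(value * viz_scale, 0.0, 1.0) * 255))

# Layer 0 output is already clamped to [0, 1]:
print(rendered_pixel(1.0, 0.5))  # 128 -> white looks dimmed (old behavior)
print(rendered_pixel(1.0, 1.0))  # 255 -> correct (this fix)

# Middle (ReLU) layers can exceed 1, so they keep the 0.5 scale:
print(rendered_pixel(1.8, 0.5))  # 230 -> still below saturation
print(rendered_pixel(1.8, 1.0))  # 255 -> would clip
```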
Diffstat (limited to 'training')
| -rw-r--r-- | training/diagnose_255_to_253.md | 69 |
| -rwxr-xr-x | training/test_viz_precision.py | 38 |
2 files changed, 107 insertions, 0 deletions
diff --git a/training/diagnose_255_to_253.md b/training/diagnose_255_to_253.md
new file mode 100644
index 0000000..764d328
--- /dev/null
+++ b/training/diagnose_255_to_253.md
@@ -0,0 +1,69 @@
+# Diagnosis: 255 → 253 Loss (-2 LSBs)
+
+## Findings
+
+### F16 Precision
+✅ **No loss:** 1.0 → f16(0x3c00) → 1.0 (exact round-trip)
+
+### Visualization Scale
+⚠️ **Inconsistent:**
+- Layer 1 uses `vizScale = 0.5` (line 1530)
+- Should render as 128, not 253
+- **User seeing 253 suggests viewing Static Features (scale=1.0), not CNN output**
+
+### Suspected Issue: Input Alpha Channel
+
+**Code:** `tools/cnn_v2_test/index.html` line 1233
+```javascript
+depthData[i] = pixels[i * 4 + 3] / 255.0; // Alpha from canvas
+```
+
+**Hypothesis:** Input PNG alpha channel = 253 (not 255)
+- Browsers may set alpha < 255 for certain images
+- Pre-multiplied alpha corrections
+- PNG encoder compression artifacts
+
+### Test
+
+**Check input alpha:**
+```javascript
+// In HTML tool console:
+const canvas = document.createElement('canvas');
+canvas.width = tester.image.width;
+canvas.height = tester.image.height;
+const ctx = canvas.getContext('2d');
+ctx.drawImage(tester.image, 0, 0);
+const imgData = ctx.getImageData(0, 0, canvas.width, canvas.height);
+const alpha = imgData.data[3]; // First pixel alpha
+console.log('First pixel alpha:', alpha);
+```
+
+### Alternative: C++ Reference
+
+Check if `cnn_test` tool produces same -2 loss:
+```bash
+# Generate solid white 8×8 test image with alpha=255
+python3 -c "
+from PIL import Image
+import numpy as np
+img = np.ones((8, 8, 4), dtype=np.uint8) * 255
+Image.fromarray(img, 'RGBA').save('test_white_255.png')
+print('Created test_white_255.png: all pixels RGBA=(255,255,255,255)')
+"
+
+# Test with HTML tool → check if p3 = 1.0 or 0.9921875
+# Test with cnn_test → compare output
+./build/cnn_test test_white_255.png output.png --cnn-version 2 --debug-hex
+```
+
+### Next Steps
+
+1. **Verify input:** Check alpha channel of user's input image
+2. **Add debug:** Log first pixel RGBA values in HTML tool
+3. **Compare:** Run same image through C++ cnn_test
+4. **Isolate:** Test with synthetic 255 alpha image
+
+## Conclusion
+
+**Most likely:** Input image alpha ≠ 255, already 253 before CNN processing.
+**Verify:** User should check input PNG metadata and alpha channel values.
diff --git a/training/test_viz_precision.py b/training/test_viz_precision.py
new file mode 100755
index 0000000..143f4ea
--- /dev/null
+++ b/training/test_viz_precision.py
@@ -0,0 +1,38 @@
+#!/usr/bin/env python3
+"""Test WebGPU → Canvas → PNG precision loss
+
+Check if bgra8unorm → 2D canvas → PNG loses 2 LSBs.
+"""
+
+import numpy as np
+
+# Simulate WebGPU bgra8unorm conversion
+# Float [0, 1] → uint8 [0, 255]
+
+test_values = [
+    1.0,        # Perfect white
+    0.9999,     # Near-white
+    254.5/255,  # Exactly 254.5
+    253.5/255,  # Exactly 253.5
+]
+
+for val in test_values:
+    # WebGPU bgra8unorm: round(val * 255)
+    gpu_u8 = int(np.round(val * 255))
+
+    # Convert back to normalized
+    gpu_f32 = gpu_u8 / 255.0
+
+    # JavaScript canvas getImageData: uint8
+    canvas_u8 = int(np.round(gpu_f32 * 255))
+
+    print(f"Input: {val:.6f} → GPU u8: {gpu_u8} → Canvas: {canvas_u8}")
+    if canvas_u8 != 255:
+        print(f"  ⚠️ Lost {255 - canvas_u8} LSBs")
+
+print("\nConclusion:")
+print("If WebGPU stores 1.0 as 255, canvas should read 255.")
+print("If user sees 253, likely:")
+print("  a) Not viewing CNN layer (viewing static features at scale=1.0)")
+print("  b) Value in texture is already 253/255 = 0.9921875")
+print("  c) F16 storage or unpacking issue")
