CNN v2 Web Tool: Unify layer terminology and add binary format spec

- Rename 'Static (L0)' → 'Static' (clearer, less confusing)
- Update channel labels: 'R/G/B/D' → 'Ch0 (R)/Ch1 (G)/Ch2 (B)/Ch3 (D)'
- Add 'Layer' prefix in weights table for consistency
- Document layer indexing: Static + Layer 1,2,3... (UI) ↔ weights.layers[0,1,2...]
- Add explanatory notes about 7D input and 4-of-8 channel display
- Create doc/CNN_V2_BINARY_FORMAT.md with complete .bin specification
- Cross-reference spec in CNN_V2.md and CNN_V2_WEB_TOOL.md

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
author: skal <pascal.massimino@gmail.com> 2026-02-13 11:44:41 +0100
committer: skal <pascal.massimino@gmail.com> 2026-02-13 11:44:41 +0100
commit: c27b34279c0d1c2a8f1dbceb0e154b585b5c6916 (patch)
tree: 5918fbaadad369ec8213df1682919ebaf9f57b56 /doc
parent: 6ca832296a74b3a3342320cf4edaa368ebc56afe (diff)
3 files changed, 188 insertions, 21 deletions
diff --git a/doc/CNN_V2.md b/doc/CNN_V2.md
index 09d0841..588c3db 100644
--- a/doc/CNN_V2.md
+++ b/doc/CNN_V2.md
@@ -669,6 +669,15 @@ workspaces/main/shaders/cnn_*.wgsl               # Original v1 shaders
 
 ---
 
+## Related Documentation
+
+- `doc/CNN_V2_BINARY_FORMAT.md` - Binary weight file specification (.bin format)
+- `doc/CNN_V2_WEB_TOOL.md` - WebGPU testing tool with layer visualization
+- `doc/CNN_TEST_TOOL.md` - C++ offline validation tool (deprecated)
+- `doc/HOWTO.md` - Training and validation workflows
+
+---
+
 **Document Version:** 1.0
 **Last Updated:** 2026-02-12
 **Status:** Design approved, ready for implementation
diff --git a/doc/CNN_V2_BINARY_FORMAT.md b/doc/CNN_V2_BINARY_FORMAT.md
new file mode 100644
index 0000000..650177f
--- /dev/null
+++ b/doc/CNN_V2_BINARY_FORMAT.md
@@ -0,0 +1,155 @@
+# CNN v2 Binary Weight Format Specification
+
+Binary format for storing trained CNN v2 weights with static feature architecture.
+
+**File Extension:** `.bin`
+**Byte Order:** Little-endian
+**Version:** 1.0
+
+---
+
+## File Structure
+
+```
+┌─────────────────────┐
+│  Header (16 bytes)  │
+├─────────────────────┤
+│  Layer Info         │
+│  (20 bytes × N)     │
+├─────────────────────┤
+│  Weight Data        │
+│  (variable size)    │
+└─────────────────────┘
+```
+
+---
+
+## Header (16 bytes)
+
+| Offset | Type | Field          | Description                          |
+|--------|------|----------------|--------------------------------------|
+| 0x00   | u32  | magic          | Magic number: `0x324E4E43` (bytes "CNN2" on disk) |
+| 0x04   | u32  | version        | Format version (currently 1)         |
+| 0x08   | u32  | num_layers     | Number of CNN layers (excludes static features) |
+| 0x0C   | u32  | total_weights  | Total f16 weight count across all layers |
+
+---
+
+## Layer Info (20 bytes per layer)
+
+Repeated `num_layers` times, starting at offset 0x10.
+
+| Offset      | Type | Field          | Description                          |
+|-------------|------|----------------|--------------------------------------|
+| 0x00        | u32  | kernel_size    | Convolution kernel dimension (3, 5, 7, etc.) |
+| 0x04        | u32  | in_channels    | Input channel count (includes 8 static features for Layer 1) |
+| 0x08        | u32  | out_channels   | Output channel count (max 8)         |
+| 0x0C        | u32  | weight_offset  | Weight array start index (f16 units, relative to weight data section) |
+| 0x10        | u32  | weight_count   | Number of f16 weights for this layer |
+
+**Layer Order:** Sequential (Layer 1, Layer 2, Layer 3, ...)
+
+---
+
+## Weight Data (variable size)
+
+Starts at offset: `16 + (num_layers × 20)`
+
+**Format:** Packed f16 pairs stored as u32
+**Packing:** `u32 = (f16_hi << 16) | f16_lo`
+**Storage:** Sequential by layer, then by output channel, input channel, spatial position
+
+**Weight Indexing:**
+```
+weight_idx = output_ch × (in_channels × kernel_size²) +
+             input_ch × kernel_size² +
+             (ky × kernel_size + kx)
+```
+
+Where:
+- `output_ch` ∈ [0, out_channels)
+- `input_ch` ∈ [0, in_channels)
+- `ky`, `kx` ∈ [0, kernel_size)
+
+**Unpacking f16 from u32:**
+```c
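+// weight_idx counts f16 values from the start of the weight data section
+// (layer weight_offset + per-layer index). Even indices sit in the low half.
+uint32_t packed = weights_buffer[weight_idx / 2];
+uint16_t f16_bits = (weight_idx % 2 == 0) ? (packed & 0xFFFF) : (packed >> 16);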
+```
+
+---
+
+## Example: 3-Layer Network
+
+**Configuration:**
+- Layer 1: 15→8, kernel 3×3 (1,080 weights)
+- Layer 2: 8→4, kernel 3×3 (288 weights)
+- Layer 3: 4→3, kernel 3×3 (108 weights)
+
+**File Layout:**
+```
+Offset   Size   Content
+------   ----   -------
+0x00     16     Header (magic, version=1, layers=3, weights=1476)
+0x10     20     Layer 1 info (kernel=3, in=15, out=8, offset=0, count=1080)
+0x24     20     Layer 2 info (kernel=3, in=8, out=4, offset=1080, count=288)
+0x38     20     Layer 3 info (kernel=3, in=4, out=3, offset=1368, count=108)
+0x4C     2952   Weight data (1476 f16 weights as 738 packed u32)
+         ----
+Total:   3028 bytes (~3.0 KB)
+```
+
+---
+
+## Static Features
+
+Not stored in .bin file (computed at runtime):
+
+**7D Input Features (packed as 8 channels):**
+1. R (red channel)
+2. G (green channel)
+3. B (blue channel)
+4. D (depth value)
+5. UV_X (normalized x coordinate)
+6. UV_Y (normalized y coordinate)
+7. sin(10 × UV_X) (spatial frequency encoding)
+8. 1.0 (bias term)
+
+**First CNN layer** receives all 8 static features + 0-7 previous layer outputs (total 8-15 input channels).
+
+---
+
+## Validation
+
+**Magic Check:**
+```c
+uint32_t magic;
+fread(&magic, 4, 1, fp);
+if (magic != 0x324E4E43) { error("Invalid CNN v2 file"); }
+```
+
+**Size Check:**
+```c
+// 16-byte header + 20 bytes per layer + 2 bytes per f16 weight
+size_t expected_size = 16 + ((size_t)num_layers * 20) + ((size_t)total_weights * 2);
+if (file_size != expected_size) { error("Size mismatch"); }
+```
+
+**Weight Offset Sanity:**
+```c
+// Each layer's offset should match cumulative count
+uint32_t cumulative = 0;
+for (int i = 0; i < num_layers; i++) {
+    if (layers[i].weight_offset != cumulative) { error("Invalid offset"); }
+    cumulative += layers[i].weight_count;
+}
+if (cumulative != total_weights) { error("Total mismatch"); }
+```
+
+---
+
+## Related Files
+
+- `training/export_cnn_v2_weights.py` - Binary export tool
+- `src/gpu/effects/cnn_v2_effect.cc` - C++ loader
+- `tools/cnn_v2_test/index.html` - WebGPU validator
+- `doc/CNN_V2.md` - Architecture design
diff --git a/doc/CNN_V2_WEB_TOOL.md b/doc/CNN_V2_WEB_TOOL.md
index 2fbc70e..81549ab 100644
--- a/doc/CNN_V2_WEB_TOOL.md
+++ b/doc/CNN_V2_WEB_TOOL.md
@@ -49,9 +49,11 @@ Browser-based WebGPU tool for validating CNN v2 inference with layer visualizati
 **3. Visualization Modes**
 
 **Activations Mode:**
-- 4 grayscale views per layer (channels 0-3)
+- 4 grayscale views per layer (channels 0-3 of up to 8 total)
 - WebGPU compute → unpack f16 → scale → grayscale
-- Auto-scale: Layer 0 (static) = 1.0, CNN layers = 0.2
+- Auto-scale: Static features = 1.0, CNN layers = 0.2
+- Static features: Shows R,G,B,D (first 4 of 8: RGBD+UV+sin+bias)
+- CNN layers: Shows first 4 output channels
 
 **Weights Mode:**
 - 2D canvas rendering per output channel
@@ -78,6 +80,21 @@ For each CNN layer i:
   Compute (ping-pong) → copy to layerTextures[i+1]
 ```
 
+### Layer Indexing
+
+**UI Layer Buttons:**
+- "Static" → layerOutputs[0] (7D input features)
+- "Layer 1" → layerOutputs[1] (CNN layer 1 output, uses weights.layers[0])
+- "Layer 2" → layerOutputs[2] (CNN layer 2 output, uses weights.layers[1])
+- "Layer N" → layerOutputs[N] (CNN layer N output, uses weights.layers[N-1])
+
+**Weights Table:**
+- "Layer 1" → weights.layers[0] (first CNN layer weights)
+- "Layer 2" → weights.layers[1] (second CNN layer weights)
+- "Layer N" → weights.layers[N-1]
+
+**Consistency:** Both UI and weights table use same numbering (1, 2, 3...) for CNN layers.
+
 ---
 
 ## Known Issues
@@ -192,26 +209,12 @@ For each CNN layer i:
 
 ## Binary Weight Format
 
-**Header (16 bytes):**
-```
-u32 magic;        // 0x32_4E_4E_43 ("CNN2")
-u32 version;      // Format version
-u32 num_layers;   // Layer count
-u32 total_weights;// Total f16 weight count
-```
-
-**Layer Info (20 bytes × N):**
-```
-u32 kernel_size;   // 3, 5, 7, etc.
-u32 in_channels;   // Input channel count
-u32 out_channels;  // Output channel count
-u32 weight_offset; // Offset in f16 units
-u32 weight_count;  // Number of f16 weights
-```
+See `doc/CNN_V2_BINARY_FORMAT.md` for complete specification.
 
-**Weights (variable):**
-- Packed f16 pairs as u32 (lo 16 bits, hi 16 bits)
-- Sequential storage: [layer0_weights][layer1_weights]...
+**Quick Summary:**
+- Header: 16 bytes (magic, version, layer count, total weights)
+- Layer info: 20 bytes × N (kernel size, channels, offsets)
+- Weights: Packed f16 pairs as u32
 
 ---
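
The layout described in `doc/CNN_V2_BINARY_FORMAT.md` can be exercised end to end with a short script. The following is a minimal sketch, not the project's loader: `parse_cnn_v2` and `build_example` are hypothetical names, the weights are filled with a placeholder value of 0.5, and `total_weights` is assumed to be even so the f16 data fills whole u32 words. It builds the 3-layer example in memory and runs the magic, offset, and size checks from the spec's Validation section.

```python
import struct

MAGIC = 0x324E4E43  # the bytes "CNN2" read as a little-endian u32


def build_example() -> bytes:
    """Build the spec's 3-layer example file in memory (placeholder weights)."""
    shapes = [(3, 15, 8), (3, 8, 4), (3, 4, 3)]  # (kernel, in, out)
    counts = [k * k * cin * cout for k, cin, cout in shapes]
    blob = struct.pack("<4I", MAGIC, 1, len(shapes), sum(counts))
    offset = 0
    for (k, cin, cout), count in zip(shapes, counts):
        blob += struct.pack("<5I", k, cin, cout, offset, count)
        offset += count
    # 0.5 is exactly representable as f16; real files hold trained weights.
    blob += struct.pack(f"<{offset}e", *([0.5] * offset))
    return blob


def parse_cnn_v2(blob: bytes):
    """Parse header, layer table, and f16 weights, applying the spec's checks."""
    magic, version, num_layers, total_weights = struct.unpack_from("<4I", blob, 0)
    if magic != MAGIC:
        raise ValueError("Invalid CNN v2 file")

    layers, cumulative = [], 0
    for i in range(num_layers):
        k, cin, cout, offset, count = struct.unpack_from("<5I", blob, 16 + 20 * i)
        if offset != cumulative:
            raise ValueError("Invalid offset")
        cumulative += count
        layers.append({"kernel_size": k, "in_channels": cin, "out_channels": cout,
                       "weight_offset": offset, "weight_count": count})
    if cumulative != total_weights:
        raise ValueError("Total mismatch")

    data_start = 16 + 20 * num_layers
    if len(blob) != data_start + 2 * total_weights:
        raise ValueError("Size mismatch")

    # Reading little-endian f16 values in sequence yields the low half of each
    # packed u32 first, then the high half, matching (f16_hi << 16) | f16_lo.
    weights = struct.unpack_from(f"<{total_weights}e", blob, data_start)
    return version, layers, weights


blob = build_example()
version, layers, weights = parse_cnn_v2(blob)
print(len(blob), version, [layer["weight_offset"] for layer in layers])
```

Reading the weight section as a flat run of little-endian f16 values avoids explicit bit shifts; a loader that keeps the data as u32 words (as a GPU buffer would) needs the lo/hi unpacking shown in the spec instead.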