Diffstat (limited to 'cnn_v3/docs/HOW_TO_CNN.md')
-rw-r--r--  cnn_v3/docs/HOW_TO_CNN.md  14
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/cnn_v3/docs/HOW_TO_CNN.md b/cnn_v3/docs/HOW_TO_CNN.md
index 458b68f..4966a61 100644
--- a/cnn_v3/docs/HOW_TO_CNN.md
+++ b/cnn_v3/docs/HOW_TO_CNN.md
@@ -97,7 +97,7 @@ It calls `pack_photo_sample.py` with both `--photo` and `--target` in a single s
| `normal.png` | (128, 128, 0) uint8 | Neutral "no normal" → reconstructed (0,0,1) |
| `depth.png` | All zeros uint16 | No depth data |
| `matid.png` | All zeros uint8 | No material IDs |
-| `shadow.png` | 255 everywhere uint8 | Assume fully lit |
+| `shadow.png` | 255 everywhere uint8 | Assume fully lit (used to compute dif) |
| `transp.png` | 1 − alpha uint8 | 0 = opaque |
| `target.png` | Stylized target RGBA | Ground truth for training |
@@ -134,7 +134,7 @@ done
### 1b. From Blender (Full G-Buffer)
-Produces all 20 feature channels including normals, depth, mat IDs, and shadow.
+Produces all 20 feature channels including normals, depth, mat IDs, and dif (diffuse×shadow).
#### Blender requirements
@@ -420,7 +420,7 @@ Applied per-sample to make the model robust to missing channels:
| Channel group | Channels | Drop probability |
|---------------|----------|-----------------|
| Geometric | normal.xy, depth, depth_grad.xy [3,4,5,6,7] | `channel_dropout_p` (default 0.3) |
-| Context | mat_id, shadow, transp [8,18,19] | `channel_dropout_p × 0.67` (~0.2) |
+| Context | mat_id, dif, transp [8,18,19] | `channel_dropout_p × 0.67` (~0.2) |
| Temporal | prev.rgb [9,10,11] | 0.5 (always) |
This is why a model trained on Blender data also works on photos (geometry zeroed).
@@ -781,7 +781,7 @@ Both produced by `export_cnn_v3_weights.py` (§3).
| Texture | Format | Size |
|---------|--------|------|
| `feat_tex0` | rgba32uint | W × H (8 f16: albedo, normal, depth, depth_grad) |
-| `feat_tex1` | rgba32uint | W × H (12 u8: mat_id, prev, mip1, mip2, shadow, transp) |
+| `feat_tex1` | rgba32uint | W × H (12 u8: mat_id, prev, mip1, mip2, dif, transp) |
| `enc0_tex` | rgba16float | W × H |
| `enc1_tex` | rgba32uint | W/2 × H/2 (8 f16 packed) |
| `bn_tex` | rgba32uint | W/4 × H/4 |
@@ -790,7 +790,7 @@ Both produced by `export_cnn_v3_weights.py` (§3).
### Simple mode (photo input)
-Albedo = image RGB, mip1/mip2 from GPU mipmaps, shadow = 1.0, transp = 1 − alpha,
+Albedo = image RGB, mip1/mip2 from GPU mipmaps, dif = 1.0 (fully lit assumed), transp = 1 − alpha,
all geometric channels (normal, depth, depth_grad, mat_id, prev) = 0.
### Browser requirements
@@ -843,7 +843,7 @@ all geometric channels (normal, depth, depth_grad, mat_id, prev) = 0.
| 9–11 | prev.rgb | previous frame output | zero during training |
| 12–14 | mip1.rgb | pyrdown(albedo) | f32 [0,1] |
| 15–17 | mip2.rgb | pyrdown(mip1) | f32 [0,1] |
-| 18 | shadow | `shadow.png` | f32 [0,1] (1=lit) |
+| 18 | dif | computed | f32 [0,1] max(0,dot(normal,KEY_LIGHT))×shadow |
| 19 | transp | `transp.png` | f32 [0,1] (0=opaque) |
**Feature texture packing** (`feat_tex0` / `feat_tex1`, both `rgba32uint`):
@@ -858,6 +858,6 @@ feat_tex0 (4×u32 = 8 f16 channels via pack2x16float):
feat_tex1 (4×u32 = 12 u8 channels + padding via pack4x8unorm):
.x = pack4x8unorm(mat_id, prev.r, prev.g, prev.b)
.y = pack4x8unorm(mip1.r, mip1.g, mip1.b, mip2.r)
- .z = pack4x8unorm(mip2.g, mip2.b, shadow, transp)
+ .z = pack4x8unorm(mip2.g, mip2.b, dif, transp)
.w = 0 (unused, 8 reserved channels)
```
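
For context on the `dif` channel this diff introduces: per the hunk at §843, it replaces the raw `shadow.png` value with `max(0, dot(normal, KEY_LIGHT)) × shadow`, which is then quantized to a u8 in `feat_tex1.z` via WGSL's `pack4x8unorm`. A minimal Python sketch of both steps, assuming the function names and the `key_light` argument are illustrative (only `pack4x8unorm` is a real WGSL built-in, whose rounding mode may differ slightly from `round` here):

```python
def dif_channel(normal, key_light, shadow):
    """Lambert term times shadow occlusion, clamped at zero,
    matching the diff's description: max(0, dot(normal, KEY_LIGHT)) * shadow."""
    lambert = sum(n * l for n, l in zip(normal, key_light))
    return max(0.0, lambert) * shadow

def pack4x8unorm(a, b, c, d):
    """Emulate WGSL pack4x8unorm: clamp four floats to [0, 1],
    quantize each to 8 bits, pack into one u32 (first arg in the low byte)."""
    word = 0
    for i, v in enumerate((a, b, c, d)):
        v = max(0.0, min(1.0, v))
        word |= round(v * 255.0) << (8 * i)
    return word

# A surface facing the key light, fully lit: dif = 1.0.
d = dif_channel((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), shadow=1.0)
# feat_tex1.z layout from the diff: (mip2.g, mip2.b, dif, transp).
word = pack4x8unorm(0.0, 0.0, d, 0.0)
```

This also illustrates why photo ("simple mode") input sets `dif = 1.0`: with no normals or shadow map, the fully-lit Lambert assumption is the neutral value, mirroring the old `shadow = 255` convention.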