<feed xmlns='http://www.w3.org/2005/Atom'>
<title>demo.git/cnn_v3/shaders, branch main</title>
<subtitle>Vibe-coded 64k demo system</subtitle>
<id>https://git.taar-o.com/demo.git/atom?h=main</id>
<link rel='self' href='https://git.taar-o.com/demo.git/atom?h=main'/>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/'/>
<updated>2026-03-27T07:41:05Z</updated>
<entry>
<title>fix(cnn_v3): L1 loss + depth-grad tanh normalization to reduce flat convergence</title>
<updated>2026-03-27T07:41:05Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-27T07:41:05Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=37df61d1a0dbd5e253f9db778c17c4187e453b8d'/>
<id>urn:sha1:37df61d1a0dbd5e253f9db778c17c4187e453b8d</id>
<content type='text'>
- Switch MSELoss → L1Loss in train_cnn_v3.py (median-seeking, avoids gray-blob)
- Normalize depth_grad channels with tanh(10x) in cnn_v3_utils.py (maps unbounded signed values into (-1, 1))
- Match normalization in gbuf_pack.wgsl: tanh((right-left)*5.0) == tanh(10*central_diff)

handoff(Gemini): training pipeline only; no C++ or test changes needed.
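
The equivalence claimed for gbuf_pack.wgsl can be checked with a small sketch. It assumes central_diff means (right - left) / 2, which the commit does not state; the function names are illustrative and are not the ones in cnn_v3_utils.py.

```python
import math

def depth_grad_training(left, right):
    # Assumed definition: central difference over a 2-pixel span,
    # followed by the tanh(10x) normalization named in the commit.
    central_diff = (right - left) / 2.0
    return math.tanh(10.0 * central_diff)

def depth_grad_shader(left, right):
    # Shader-side form quoted in the commit: tanh((right-left)*5.0).
    return math.tanh((right - left) * 5.0)

# Hypothetical neighbouring depth samples, including one extreme jump.
samples = [(0.0, 0.0), (0.10, 0.12), (0.9, 0.1), (0.0, 50.0)]
for left, right in samples:
    print(left, right, depth_grad_training(left, right), depth_grad_shader(left, right))
```

Both forms agree because 10 * (d / 2) equals 5 * d, and tanh keeps even the extreme sample inside [-1, 1].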
</content>
</entry>
<entry>
<title>fix(cnn_v3): remove dec0 ReLU, load FiLM MLP at runtime</title>
<updated>2026-03-27T06:59:00Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-27T06:59:00Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=fb13e67acbc7d7dd2974a456fcb134966c47cee0'/>
<id>urn:sha1:fb13e67acbc7d7dd2974a456fcb134966c47cee0</id>
<content type='text'>
Two bugs blocking training convergence:

1. dec0 ReLU before sigmoid constrained output to [0.5, 1.0), so the network
   could never produce dark pixels. Removed F.relu in train_cnn_v3.py
   and max(0,…) in cnn_v3_dec0.wgsl. Test vectors regenerated.

2. set_film_params() used hardcoded heuristics instead of the trained MLP.
   Added CNNv3FilmMlp struct + load_film_mlp() to cnn_v3_effect.h/.cc.
   MLP auto-loaded from ASSET_WEIGHTS_CNN_V3_FILM_MLP at construction;
   Linear(5→16)→ReLU→Linear(16→72) runs CPU-side each frame.
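
The range problem in item 1 can be reproduced with scalar stand-ins for the dec0 output. This is a sketch of the arithmetic only, not the code in train_cnn_v3.py.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

# Hypothetical pre-activation values spanning negative and positive inputs.
inputs = [-8.0, -1.0, 0.0, 1.0, 8.0]
with_relu = [sigmoid(relu(x)) for x in inputs]
without_relu = [sigmoid(x) for x in inputs]

print(min(with_relu))     # 0.5
print(min(without_relu))  # close to 0
```

ReLU clamps every negative pre-activation to 0 and sigmoid(0) is 0.5, so the output floor is mid-gray until the ReLU is removed.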

36/36 tests pass. Parity max_err=4.88e-4 unchanged.

handoff(Gemini): retrain from scratch — needs ≥50 samples (currently 11).
See cnn_v3/docs/HOWTO.md §2-3.
</content>
</entry>
<entry>
<title>feat(cnn_v3): upgrade architecture to enc_channels=[8,16]</title>
<updated>2026-03-26T06:03:01Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-26T06:03:01Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=8f14bdd66cb002b2f89265b2a578ad93249089c9'/>
<id>urn:sha1:8f14bdd66cb002b2f89265b2a578ad93249089c9</id>
<content type='text'>
Double encoder capacity: enc0 4→8ch, enc1 8→16ch, bottleneck 16→16ch,
dec1 32→8ch, dec0 16→4ch. Total weights 2476→7828 f16 (~15.3 KB).
FiLM MLP output 40→72 params (L1: 16×40→16×72).
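
The size figures above follow from 2 bytes per f16 value. A short check, using only the counts quoted in this commit (the helper name is illustrative):

```python
F16_BYTES = 2  # an IEEE half-precision float occupies two bytes

def f16_bytes(count):
    return count * F16_BYTES

old_total, new_total = 2476, 7828
print(f16_bytes(new_total), round(f16_bytes(new_total) / 1024, 1))  # 15656 15.3

# Second FiLM layer weight matrix: 16 hidden units times the output count.
print(16 * 40, 16 * 72)  # 640 1152
```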

16-ch textures split into _lo/_hi rgba32uint pairs (enc1, bottleneck).
enc0 and dec1 textures changed from rgba16float to rgba32uint (8ch).
GBUF_RGBA32UINT node gains CopySrc for parity test readback.

- WGSL shaders: all 5 passes rewritten for new channel counts
- C++ CNNv3Effect: new weight offsets/sizes, 8ch uniform structs
- Web tool (shaders.js + tester.js): matching texture formats and bindings
- Parity test: readback_rgba32uint_8ch helper, updated vector counts
- Training scripts: default enc_channels=[8,16], updated docstrings
- Docs + architecture PNG regenerated

handoff(Gemini): CNN v3 [8,16] upgrade complete. All code, tests, web
tool, training scripts, and docs updated. Next: run training pass.
</content>
</entry>
<entry>
<title>feat(cnn_v3): 3×3 dilated bottleneck + Sobel loss + FiLM warmup + architecture PNG</title>
<updated>2026-03-25T09:05:42Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-25T09:05:42Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=ce6e5b99f26e4e7c69a3cacf360bd0d492de928c'/>
<id>urn:sha1:ce6e5b99f26e4e7c69a3cacf360bd0d492de928c</id>
<content type='text'>
- Replace 1×1 pointwise bottleneck with Conv(8→8, 3×3, dilation=2):
  effective RF grows from ~13px to ~29px at ¼res (~+1 KB weights)
- Add Sobel edge loss in training (--edge-loss-weight, default 0.1)
- Add FiLM 2-phase training: freeze MLP for warmup epochs then
  unfreeze at lr×0.1 (--film-warmup-epochs, default 50)
- Update weight layout: BN 72→584 f16, total 1964→2476 f16 (4952 B)
- Cascade offsets in C++ effect, JS tool, export/gen_test_vectors scripts
- Regenerate test_vectors.h (1238 u32); parity max_err=9.77e-04
- Generate dark-theme U-Net+FiLM architecture PNG (gen_architecture_png.py)
- Replace ASCII art in CNN_V3.md and HOW_TO_CNN.md with PNG embed
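
The weight and receptive-field figures above can be reproduced with a short sketch. It assumes one bias per output channel (which matches the 72 and 584 counts) and that one texel at ¼res covers 4 full-resolution pixels; the helper names are hypothetical.

```python
def conv_weight_count(c_in, c_out, k):
    # k*k*c_in*c_out kernel weights plus one bias per output channel (assumed).
    return k * k * c_in * c_out + c_out

def rf_growth_px(k, dilation, px_per_texel):
    # A dilated kernel spans dilation*(k-1)+1 texels; every texel beyond the
    # centre one widens the receptive field by px_per_texel pixels.
    extent = dilation * (k - 1) + 1
    return (extent - 1) * px_per_texel

old_bn = conv_weight_count(8, 8, 1)
new_bn = conv_weight_count(8, 8, 3)
print(old_bn, new_bn, 1964 - old_bn + new_bn)  # 72 584 2476
print((new_bn - old_bn) * 2)                   # 1024 bytes of extra f16 weights
print(13 + rf_growth_px(3, 2, 4))              # 29
```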

handoff(Gemini): bottleneck dilation + Sobel loss + FiLM warmup landed.
Next: run first real training pass (see cnn_v3/docs/HOWTO.md §3).
</content>
</entry>
<entry>
<title>feat(cnn_v3): shadow→dif migration complete (ch18)</title>
<updated>2026-03-22T23:43:20Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-22T23:43:20Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=13cf1438caa56b34529d4031ddf73d38286b70e5'/>
<id>urn:sha1:13cf1438caa56b34529d4031ddf73d38286b70e5</id>
<content type='text'>
Replace raw shadow (ch18) with dif = max(0,dot(normal,KEY_LIGHT))*shadow
across all layers. Channel count stays 20, weight shapes unchanged.
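
A minimal sketch of the dif term, assuming a unit-length normal and a normalized light direction. The KEY_LIGHT value here is hypothetical; the real constant lives in the shaders and is not shown in this log.

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

# Hypothetical key light direction, for illustration only.
KEY_LIGHT = normalize((1.0, 1.0, 1.0))

def dif(normal, shadow):
    # dif = max(0, dot(normal, KEY_LIGHT)) * shadow, as in the commit.
    n_dot_l = sum(a * b for a, b in zip(normal, KEY_LIGHT))
    return max(0.0, n_dot_l) * shadow

print(dif(KEY_LIGHT, 1.0))         # facing the light, unshadowed
print(dif((0.0, -1.0, 0.0), 1.0))  # facing away: 0.0
print(dif(KEY_LIGHT, 0.25))        # facing the light, mostly shadowed
```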

- gbuf_pack.wgsl: t1.z = pack4x8unorm(mip2.g, mip2.b, dif, transp); t1.w = 0u
- gbuf_deferred.wgsl: read dif from unpack4x8unorm(t1.z).z
- gbuf_view.wgsl: revert to 4×5 grid, ch18=dif label, ch19=trns label
- tools/shaders.js: FULL_PACK_SHADER adds oct_decode + computes dif
- cnn_v3_utils.py: assemble_features() computes dif on-the-fly via oct_decode
- docs: CNN_V3.md, HOWTO.md, HOW_TO_CNN.md, GBUF_DIF_MIGRATION.md updated
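
The t1.z packing can be emulated on the CPU. The sketch below follows the WGSL definition of pack4x8unorm and unpack4x8unorm (clamp to [0, 1], round to 8 bits, component i in bits 8i to 8i+7), using arithmetic instead of bit operators; the input values are made up.

```python
import math

def pack4x8unorm(x, y, z, w):
    # Each component is clamped, scaled to 0..255 with rounding, and
    # stored in byte i of the 32-bit word (x is the lowest byte).
    word = 0
    for i, e in enumerate((x, y, z, w)):
        byte = int(math.floor(0.5 + 255.0 * min(1.0, max(0.0, e))))
        word += byte * 256 ** i
    return word

def unpack4x8unorm(word):
    # Inverse: extract byte i and rescale to [0, 1].
    return tuple(((word // 256 ** i) % 256) / 255.0 for i in range(4))

# Hypothetical channel values: mip2.g, mip2.b, dif, transp.
t1_z = pack4x8unorm(0.2, 0.4, 0.6, 0.8)
dif_readback = unpack4x8unorm(t1_z)[2]  # the .z component, as in gbuf_deferred.wgsl
print(t1_z, dif_readback)
```

Storing dif as an 8-bit unorm limits the round-trip error to at most 1/510.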

handoff(Gemini): shadow→dif migration done, ready for first training pass
</content>
</entry>
<entry>
<title>wip(cnn_v3): shadow→dif intermediate + scene tweaks + migration plan</title>
<updated>2026-03-22T23:26:52Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-22T23:26:52Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=1470dd240f48652d1fe97957fe44a49b0e1ee9a6'/>
<id>urn:sha1:1470dd240f48652d1fe97957fe44a49b0e1ee9a6</id>
<content type='text'>
- gbuf_shadow.wgsl: normal bias 0.05→0.02
- gbuf_pack.wgsl: compute dif=diffuse*shadow, drop shadow from t1.z,
  store dif in t1.w (INTERMEDIATE — incorrect packing, see migration plan)
- gbuf_deferred.wgsl: read dif from t1.w.x (matches intermediate packing)
- gbuf_view.wgsl: expand to 4×6 grid, show dif.r/g/b in row 5
  (INTERMEDIATE — to be reverted to 4×5 with ch18=dif)
- gbuffer_effect.cc: add small hovering sphere (r=0.6) above scene;
  swap cube/sphere positions; both spheres pulsate
- docs/GBUF_DIF_MIGRATION.md: full migration plan with checklist

handoff(Claude): intermediate commit — GBUF_DIF_MIGRATION.md §Current State
describes what is wrong and the full implementation checklist (5 steps).
</content>
</entry>
<entry>
<title>refactor(cnn_v3): simplify sphere SDF in shadow pass, remove per-frame alloc</title>
<updated>2026-03-22T22:51:40Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-22T22:51:40Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=12d5d5f1762a0c00405950b6ff5e564880f0ff36'/>
<id>urn:sha1:12d5d5f1762a0c00405950b6ff5e564880f0ff36</id>
<content type='text'>
gbuf_shadow.wgsl — dfWithID():
- Sphere: replace inv_model local-space transform with direct world-space
  formula (length(p - center) - radius). Exact, no matrix multiply, no
  floating-point error from matrix inversion that can corrupt soft-shadow
  penumbra over 64 march steps.
- lp/scale now computed only inside the cases that need them (box/torus/plane)
  instead of eagerly for every object.
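
The two sphere formulations can be compared in a small sketch. It assumes a uniformly scaled, untranslated-in-local-space sphere; the function names are illustrative and do not mirror dfWithID().

```python
import math

def sd_sphere_world(p, center, radius):
    # Direct world-space form from the commit: length(p - center) - radius.
    return math.dist(p, center) - radius

def sd_sphere_local(p, center, scale, local_radius):
    # Local-space form: move the point into object space, evaluate the SDF,
    # then rescale the distance back to world units.
    lp = tuple((a - c) / scale for a, c in zip(p, center))
    return (math.hypot(*lp) - local_radius) * scale

p, center = (3.0, 4.0, 0.0), (0.0, 0.0, 0.0)
print(sd_sphere_world(p, center, 2.0))  # 3.0
print(sd_sphere_local(p, center, 2.0, 1.0))
```

For a uniformly scaled sphere both forms give the same distance, so the world-space version only removes the matrix work.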

gbuffer_effect.cc — upload_scene_data():
- Replace per-frame std::vector&lt;GBufObjectData&gt; heap allocation with a
  file-static staging buffer s_obj_staging[256]: zero alloc per frame.

handoff(Gemini): sphere SDF now exact; shadow march should be cleaner.
</content>
</entry>
<entry>
<title>fix(cnn_v3): shadow pass — 5 bugs fixed, labels in gbuf_view</title>
<updated>2026-03-22T22:17:50Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-22T22:17:50Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=8fd3eda0ed069b1a817261f8f4d6a35c565b3fe4'/>
<id>urn:sha1:8fd3eda0ed069b1a817261f8f4d6a35c565b3fe4</id>
<content type='text'>
1. Camera Y-inversion: proj.m[5] = -proj.m[5] in upload_scene_data
   + WGPUFrontFace_CCW on raster pipeline.
2. Shadow formula: replace shadowWithStoredDistance with 64-step
   IQ soft shadow (8*d/t, unbounded).
3. Local→world SDF scale: d *= length(obj.model[0].xyz).
4. Shadow bias: use rasterized normal from normal_mat_tex (binding 4)
   instead of light direction — fixes terminator self-shadow on spheres.
5. ShaderComposer: GBufViewEffect now resolves #include via
   ShaderComposer::Get().Compose().
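
Item 2 names the soft-shadow estimate popularized by Inigo Quilez: march toward the light and keep the minimum of k*d/t with k = 8. The sketch below is a CPU illustration under assumptions: the one-sphere scene, the start offset, the minimum step, and the final clamp to [0, 1] are choices made here, not taken from gbuf_shadow.wgsl.

```python
import math

def scene_sdf(p):
    # Hypothetical scene: one sphere of radius 0.5 centred at (0, 2, 0).
    return math.dist(p, (0.0, 2.0, 0.0)) - 0.5

def soft_shadow(origin, light_dir, sdf, steps=64, k=8.0, t_start=0.02):
    res = 1.0
    t = t_start
    for _ in range(steps):
        p = tuple(o + d * t for o, d in zip(origin, light_dir))
        h = sdf(p)
        # Penumbra estimate: near misses (small h at large t) darken less.
        res = min(res, k * h / t)
        # Advance by the safe distance, with a floor so the march never stalls.
        t += max(h, 0.01)
    return max(0.0, min(1.0, res))

up = (0.0, 1.0, 0.0)
print(soft_shadow((0.0, 0.0, 0.0), up, scene_sdf))   # directly under the sphere: 0.0
print(soft_shadow((10.0, 0.0, 0.0), up, scene_sdf))  # far from the sphere: 1.0
```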

Also: per-tile channel labels in gbuf_view.wgsl via debug_str.
Scene simplified to 1 cube + 1 sphere for debugging (restore TODO).
Scale propagation for pulsating sphere confirmed correct end-to-end.

handoff(Gemini): shadow validated. Next: restore full scene in
GBufferEffect::set_scene() (20 cubes + 4 spheres, 2 lights), then
run training pass per cnn_v3/docs/HOWTO.md §3.
</content>
</entry>
<entry>
<title>docs+feat(cnn_v3): compact context, re-enable shadow in GBufDeferredEffect</title>
<updated>2026-03-22T19:31:45Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-22T19:31:45Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=a2697faa005337c4d8e8e6376d9e57edadf63f44'/>
<id>urn:sha1:a2697faa005337c4d8e8e6376d9e57edadf63f44</id>
<content type='text'>
- TODO/PROJECT_CONTEXT updated to reflect operational pipeline state
- GBufDeferredEffect: shadow re-enabled (albedo * (ambient + diffuse * shadow))
  feat_tex1 binding restored for shadow channel debugging

handoff(Gemini): shadow pass live again — investigate why shadow looks broken.
</content>
</entry>
<entry>
<title>feat(shaders): add ray_sphere snippet, use in gbuf_raster impostor</title>
<updated>2026-03-22T19:28:48Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-22T19:28:48Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=ce22f79c55e68f9fa496a47a528a6978b89e1261'/>
<id>urn:sha1:ce22f79c55e68f9fa496a47a528a6978b89e1261</id>
<content type='text'>
</content>
</entry>
</feed>
