<feed xmlns='http://www.w3.org/2005/Atom'>
<title>demo.git/doc, branch main</title>
<subtitle>Vide-coded 64k demo system</subtitle>
<id>https://git.taar-o.com/demo.git/atom?h=main</id>
<link rel='self' href='https://git.taar-o.com/demo.git/atom?h=main'/>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/'/>
<updated>2026-05-14T17:11:28Z</updated>
<entry>
<title>ans: order-0 rANS coder + WGSL asset compression</title>
<updated>2026-05-14T17:11:28Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-05-14T17:09:39Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=6ef8f578817ee0134fd5867ca3b80590e3eb2368'/>
<id>urn:sha1:6ef8f578817ee0134fd5867ca3b80590e3eb2368</id>
<content type='text'>
Adds src/util/ans.{h,cc}, a per-chunk-adaptive order-0 rANS entropy
coder. Decoder is always built; encoder is gated on ANS_ENABLE_ENCODER
(tools only). Both sides take an optional 256-entry initial_counts
table to seed the adaptive model.

The per-chunk initial state is (1 &lt;&lt; kBits). Higher initial states
(e.g. with a signature packed into the upper bits) force a renorm-emit
at iter 0 that the decoder never consumes, corrupting multi-chunk
streams once stats become skewed.
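
Illustrative sketch only, not the code in src/util/ans.cc: a static
(non-adaptive) order-0 rANS round trip in Python. The constants
SCALE_BITS and L, the byte-wise renormalisation and the helper names
are assumptions made for this example. It starts the encoder at the
lower bound L of the state interval and checks that the decoder ends
on that same state once every symbol has been consumed.

```python
from collections import Counter

SCALE_BITS = 12
M = 2 ** SCALE_BITS   # scaled frequencies sum to M (assumed value)
L = 2 ** 23           # lower bound of the state interval (assumed value)


def build_model(data):
    """Scale byte counts of non-empty data so they sum to exactly M."""
    counts = Counter(data)
    freq = {s: max(1, c * M // len(data)) for s, c in counts.items()}
    while sum(freq.values()) != M:
        top = max(freq, key=freq.get)  # nudge the most frequent symbol
        freq[top] += 1 if M > sum(freq.values()) else -1
    cum, total = {}, 0
    for s in sorted(freq):
        cum[s] = total
        total += freq[s]
    return freq, cum


def encode(data, freq, cum):
    x, out = L, []                     # initial state is the lower bound
    for s in reversed(data):           # rANS encodes in reverse order
        f = freq[s]
        x_max = (L // M) * 256 * f
        while x >= x_max:              # renormalise: emit low bytes
            out.append(x % 256)
            x //= 256
        x = (x // f) * M + (x % f) + cum[s]
    for _ in range(4):                 # flush the final 31-bit state
        out.append(x % 256)
        x //= 256
    out.reverse()                      # decoder reads front to back
    return bytes(out)


def decode(blob, n, freq, cum):
    x, pos, out = int.from_bytes(blob[:4], "big"), 4, []
    symbols = sorted(freq)
    for _ in range(n):
        slot = x % M
        s = next(t for t in symbols if cum[t] + freq[t] > slot)
        x = freq[s] * (x // M) + slot - cum[s]
        while L > x:                   # renormalise: pull bytes back in
            x = x * 256 + blob[pos]
            pos += 1
        out.append(s)
    return bytes(out), x


data = b"fn main() { return vec4f(0.0); }\n" * 8
freq, cum = build_model(data)
blob = encode(data, freq, cum)
decoded, final_state = decode(blob, len(data), freq, cum)
assert decoded == data
assert final_state == L   # decoder lands back on the encoder's start state
print(len(data), len(blob))
```

The final-state check is the property a fixed, known initial state
makes available to a decoder in this sketch; how ans.cc uses it is not
shown here.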

Asset pipeline:
- AssetRecord gains 'compression' and 'uncompressed_size' fields.
- asset_packer scans every WGSL file to build a corpus-wide byte
  histogram, then ANS-encodes each shader using that histogram as the
  seed. Histogram and accessor are emitted alongside the asset table.
  Round-trip verification runs at pack time for every compressed
  asset; failures fall back to uncompressed storage.
- asset_manager decompresses on first GetAsset(), caches the
  heap-allocated buffer, and DropAsset / ReloadAssetsFromFile free it
  along with the procedural cache.
- Disk-load (dev) builds are unchanged: WGSL paths stay as filenames.

Tests:
- src/tests/util/test_ans.cc: roundtrip variants (empty, single byte,
  single-symbol run, all-zeros, random uniform/skewed, repeated ASCII),
  seeded-vs-uniform compression, rejection of mismatched counts /
  corruption / truncation, PeekUncompressedSize.
- 37/37 tests pass in dev builds, 36/36 with STRIP_ALL.

Compression observed: WGSL shaders shrink to ~0.62-0.71x of their
original size in the main workspace (81 of 105 assets qualify).

Docs:
- doc/ANS.md (new): algorithm, bitstream, API, asset pipeline
  integration, compression numbers, limitations, tests.
- doc/ASSET_SYSTEM.md: new Compression section + updated technical
  guarantees for compressed assets.
- doc/COMPLETED.md: May 2026 entry.
- PROJECT_CONTEXT.md: Build status line mentions WGSL ANS compression.
- CLAUDE.md, GEMINI.md: tier-3 build doc list includes ANS.md.
</content>
</entry>
<entry>
<title>docs: consolidate and sync docs with current codebase state</title>
<updated>2026-03-29T08:15:38Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-29T08:15:38Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=e22256e374694fd92cc55ba198d3f7b1911713fe'/>
<id>urn:sha1:e22256e374694fd92cc55ba198d3f7b1911713fe</id>
<content type='text'>
- PROJECT_CONTEXT.md: fix effect count (12→18), shader count (27→37),
  update CNN v3 pipeline description, tighten Next Up section
- TODO.md: fix priority numbering, restore GPU PCM synthesis as pending,
  streamline CNN v3 section, consolidate Future items
- doc/SEQUENCE.md: effect count 12→18
- cnn_v3/README.md: phases 1–7→1–9, test count 36→38, add phases 8–9
- cnn_v3/docs/HOWTO.md: fix dataset layout blender/photos→full/simple,
  update test counts 36→38 throughout
- doc/COMPLETED.md: archive FFT/timing/OLA fixes, remove false GPU PCM claim
- src/audio/audio_engine.cc: fix step comment numbering (6→5)
- src/audio/synth.cc: remove stale fractional_pos tempo-scaling comment

handoff(Gemini): docs now accurate — 18 effects, 37 shaders, 38/38 tests,
GPU PCM synthesis back in TODO as pending, CNN v3 dataset layout corrected.
</content>
</entry>
<entry>
<title>docs(3d): document set_direct_render() in Y-flip rule section</title>
<updated>2026-03-29T00:51:21Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-29T00:51:21Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=dc9ddd618b12e7cadcfaf87bfb42af86f6a00386'/>
<id>urn:sha1:dc9ddd618b12e7cadcfaf87bfb42af86f6a00386</id>
<content type='text'>
</content>
</entry>
<entry>
<title>fix(cnn_v3): remove dec0 ReLU, load FiLM MLP at runtime</title>
<updated>2026-03-27T06:59:00Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-27T06:59:00Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=fb13e67acbc7d7dd2974a456fcb134966c47cee0'/>
<id>urn:sha1:fb13e67acbc7d7dd2974a456fcb134966c47cee0</id>
<content type='text'>
Two bugs blocking training convergence:

1. dec0 ReLU before sigmoid constrained output to [0.5,1.0] — network
   could never produce dark pixels. Removed F.relu in train_cnn_v3.py
   and max(0,…) in cnn_v3_dec0.wgsl. Test vectors regenerated.

2. set_film_params() used hardcoded heuristics instead of the trained MLP.
   Added CNNv3FilmMlp struct + load_film_mlp() to cnn_v3_effect.h/.cc.
   MLP auto-loaded from ASSET_WEIGHTS_CNN_V3_FILM_MLP at construction;
   Linear(5→16)→ReLU→Linear(16→72) runs CPU-side each frame.
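
Illustrative check for bug 1, in plain Python rather than the PyTorch
or WGSL code touched by this commit: sigmoid applied after ReLU can
never go below 0.5, because ReLU clamps its input to be non-negative
and sigmoid(0) is 0.5. The sample inputs are arbitrary.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def relu(v):
    return max(0.0, v)


xs = [-6.0, -1.0, 0.0, 1.0, 6.0]
with_relu = [sigmoid(relu(v)) for v in xs]
without_relu = [sigmoid(v) for v in xs]

print(min(with_relu))     # 0.5: dark outputs are unreachable
print(min(without_relu))  # close to 0 once the ReLU is removed
```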

36/36 tests pass. Parity max_err=4.88e-4 unchanged.

handoff(Gemini): retrain from scratch — needs ≥50 samples (currently 11).
See cnn_v3/docs/HOWTO.md §2-3.
</content>
</entry>
<entry>
<title>docs: update temporal feedback docs — wire_dag auto-wiring, F16X8 format</title>
<updated>2026-03-23T07:07:31Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-23T07:07:31Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=193fdc276a43ad94a961aa9214be27cfa879d225'/>
<id>urn:sha1:193fdc276a43ad94a961aa9214be27cfa879d225</id>
<content type='text'>
- HOWTO.md: replace manual set_cnn_output_node() instructions with
  wire_dag() auto-wiring explanation; add timeline.seq snippet as canonical
  wiring example; document F16X8/GBUF_ALBEDO format requirement
- CNN_V3.md: fix prev_tex format (rgba16float, not rgba8unorm);
  mark timeline.seq CNNv3Effect TODO as done
- SEQUENCE.md: already updated in previous commit (wire_dag pattern,
  format-matching rule, sink guard)

Co-Authored-By: Claude Sonnet 4.6 &lt;noreply@anthropic.com&gt;
</content>
</entry>
<entry>
<title>feat(gbuffer): wire_dag() + find_downstream_output() for temporal feedback</title>
<updated>2026-03-23T06:54:18Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-23T06:54:18Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=491a3c1ccbd0f46be655e97d2e3697135df6e3a2'/>
<id>urn:sha1:491a3c1ccbd0f46be655e97d2e3697135df6e3a2</id>
<content type='text'>
- Add Effect::wire_dag() virtual (called from init_effect_nodes after full DAG built)
- Add Effect::find_downstream_output() protected helper (first downstream consumer output)
- GBufferEffect::wire_dag() auto-sets cnn_output_node_ via find_downstream_output,
  guarding against sink (external view, null texture)
- GBufferEffect::post_render() null-checks src texture before CopyTextureToTexture
- Tests: find_downstream_output cases + wire_dag integration in test_effect_base
- Doc: SEQUENCE.md updated with wire_dag pattern, helper contract, and sink guard

Co-Authored-By: Claude Sonnet 4.6 &lt;noreply@anthropic.com&gt;
</content>
</entry>
<entry>
<title>feat(cnn_v3): GBufferEffect temporal feedback via post_render()</title>
<updated>2026-03-23T06:31:14Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-23T06:31:14Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=1e3813355e37f903314ec2069ff788c6f69becfd'/>
<id>urn:sha1:1e3813355e37f903314ec2069ff788c6f69becfd</id>
<content type='text'>
- Add Effect::post_render() virtual hook, called after all effects in
  the sequence have rendered each frame. Default is no-op.
- Sequence::render_effects() runs a second pass invoking post_render()
  on all DAG nodes after the render pass completes.
- GBufferEffect: declare internal node_prev_tex_ (U8X4_NORM) for
  persistent prev-frame CNN output. post_render() copies cnn_output_node_
  → node_prev_tex_ via CopyTextureToTexture. render() binds node_prev_tex_
  as prev_cnn (binding 6) — zero on frame 0 (matches training convention).
- Expose set_cnn_output_node(name) API; call once at setup.
- Drop brittle ping-pong / input_nodes_[0] fallback.
- Update doc/SEQUENCE.md: post_render() semantics, frame execution order,
  temporal feedback canonical pattern, node types table with G-buffer types.
- Update cnn_v3/docs/HOWTO.md: temporal feedback wiring section.
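
Illustrative sketch of the frame order described above, in Python with
stand-in classes. The class names and string "textures" are
hypothetical; the real code is C++ and copies GPU textures.

```python
class Effect:
    def render(self, frame):
        pass

    def post_render(self):
        pass  # default is a no-op


class CnnStandIn(Effect):
    def __init__(self):
        self.output = None

    def render(self, frame):
        self.output = f"cnn_out_{frame}"


class GBufferStandIn(Effect):
    def __init__(self, cnn):
        self.cnn = cnn
        self.prev = "zero"   # frame 0 sees an all-zero previous frame
        self.bound = []      # what render() bound as prev_cnn each frame

    def render(self, frame):
        self.bound.append(self.prev)

    def post_render(self):
        # Runs after every effect has rendered, so the CNN output is final.
        self.prev = self.cnn.output


def render_effects(effects, frame):
    for e in effects:        # first pass: render
        e.render(frame)
    for e in effects:        # second pass: post_render hooks
        e.post_render()


cnn = CnnStandIn()
gbuf = GBufferStandIn(cnn)
for frame in range(3):
    render_effects([gbuf, cnn], frame)
print(gbuf.bound)  # ['zero', 'cnn_out_0', 'cnn_out_1']
```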

36/36 tests passing.

handoff(Gemini): prev.rgb temporal feedback now correct and generic.
Set set_cnn_output_node("sink") (or CNN output node name) once at setup.
</content>
</entry>
<entry>
<title>fix(cnn_v3): shadow pass — 5 bugs fixed, labels in gbuf_view</title>
<updated>2026-03-22T22:17:50Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-22T22:17:50Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=8fd3eda0ed069b1a817261f8f4d6a35c565b3fe4'/>
<id>urn:sha1:8fd3eda0ed069b1a817261f8f4d6a35c565b3fe4</id>
<content type='text'>
1. Camera Y-inversion: proj.m[5] = -proj.m[5] in upload_scene_data
   + WGPUFrontFace_CCW on raster pipeline.
2. Shadow formula: replace shadowWithStoredDistance with 64-step
   IQ soft shadow (8*d/t, unbounded).
3. Local→world SDF scale: d *= length(obj.model[0].xyz).
4. Shadow bias: use rasterized normal from normal_mat_tex (binding 4)
   instead of light direction — fixes terminator self-shadow on spheres.
5. ShaderComposer: GBufViewEffect now resolves #include via
   ShaderComposer::Get().Compose().
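
Illustrative sketch of items 2 and 3, in Python rather than WGSL: a
64-step sphere-traced soft shadow using the min(8*d/t) penumbra term,
with the SDF evaluated in local space and rescaled to world units. The
unit-sphere scene, t_min, t_max and the hit threshold are assumptions
for this example. What "unbounded" means in the shader is not modelled
here.

```python
import math

K = 8.0      # penumbra sharpness, the "8" in 8*d/t
STEPS = 64


def sphere_sdf_world(p, center, scale):
    # Unit sphere in local space; multiply by the uniform scale to get
    # a world-space distance (item 3).
    local = [(p[i] - center[i]) / scale for i in range(3)]
    d_local = math.sqrt(sum(c * c for c in local)) - 1.0
    return d_local * scale


def soft_shadow(origin, light_dir, center, scale, t_min=0.02, t_max=50.0):
    res, t = 1.0, t_min
    for _ in range(STEPS):
        p = [origin[i] + light_dir[i] * t for i in range(3)]
        d = sphere_sdf_world(p, center, scale)
        if 0.001 > d:
            return 0.0               # ray hit the occluder: full shadow
        res = min(res, K * d / t)    # near misses darken the result
        t += d
        if t > t_max:
            break
    return res


up = [0.0, 1.0, 0.0]
print(soft_shadow([0.0, 0.0, 0.0], up, [0.0, 5.0, 0.0], 1.0))   # blocked
print(soft_shadow([0.0, 0.0, 0.0], up, [10.0, 5.0, 0.0], 1.0))  # clear
print(soft_shadow([0.0, 0.0, 0.0], up, [1.5, 5.0, 0.0], 1.0))   # penumbra
```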

Also: per-tile channel labels in gbuf_view.wgsl via debug_str.
Scene simplified to 1 cube + 1 sphere for debugging (restore TODO).
Scale propagation for pulsating sphere confirmed correct end-to-end.

handoff(Gemini): shadow validated. Next: restore full scene in
GBufferEffect::set_scene() (20 cubes + 4 spheres, 2 lights), then
run training pass per cnn_v3/docs/HOWTO.md §3.
</content>
</entry>
<entry>
<title>feat(cnn_v3): add G-buffer visualizer + web sample loader (Phase 7)</title>
<updated>2026-03-22T15:21:25Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-22T15:21:25Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=159ca2ca19345515cdfebed9fd88646730492cd2'/>
<id>urn:sha1:159ca2ca19345515cdfebed9fd88646730492cd2</id>
<content type='text'>
C++ GBufViewEffect: renders all 20 feature channels from feat_tex0/feat_tex1
in a 4×5 tiled grid. Custom BGL with WGPUTextureSampleType_Uint; bind group
rebuilt per frame via wgpuRenderPipelineGetBindGroupLayout.

Web tool: "Load sample directory" button — webkitdirectory picker, FULL_PACK_SHADER
compute (matches gbuf_pack.wgsl packing), runFromFeat() skips photo-pack step,
computePSNR() readback + comparison vs target.png side-by-side.

36/36 tests pass. Docs updated: HOWTO.md §9, README, PROJECT_CONTEXT, TODO,
COMPLETED.

handoff(Gemini): CNN v3 Phase 7 done. Next: run train_cnn_v3.py (see HOWTO §3).
</content>
</entry>
<entry>
<title>feat(cnn_v3): Phase 5 complete — parity validation passing (36/36 tests)</title>
<updated>2026-03-21T08:51:58Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-21T08:51:58Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=673a24215b2670007317060325256059d1448f3b'/>
<id>urn:sha1:673a24215b2670007317060325256059d1448f3b</id>
<content type='text'>
- Add test_cnn_v3_parity.cc: zero_weights + random_weights tests
- Add gen_test_vectors.py: PyTorch reference implementation for enc0/enc1/bn/dec1/dec0
- Add test_vectors.h: generated C header with enc0, dec1, output expected values
- Fix declare_nodes(): intermediate textures at fractional resolutions (W/2, W/4)
  using new NodeRegistry::default_width()/default_height() getters
- Add layer-by-layer readback (enc0, dec1) for regression coverage
- Final parity: enc0 max_err=1.95e-3, dec1 max_err=1.95e-3, out max_err=4.88e-4

handoff(Claude): CNN v3 parity done. Next: train_cnn_v3.py (FiLM MLP training).
</content>
</entry>
</feed>
