<feed xmlns='http://www.w3.org/2005/Atom'>
<title>demo.git/cmake, branch main</title>
<subtitle>Vibe-coded 64k demo system</subtitle>
<id>https://git.taar-o.com/demo.git/atom?h=main</id>
<link rel='self' href='https://git.taar-o.com/demo.git/atom?h=main'/>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/'/>
<updated>2026-05-20T21:21:59Z</updated>
<entry>
<title>fix: code review cleanup — bugs, dead code, factorization, simplification</title>
<updated>2026-05-20T21:21:59Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-05-20T20:44:44Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=a91f89c8ea15665853176c05597760d0fcf6e0df'/>
<id>urn:sha1:a91f89c8ea15665853176c05597760d0fcf6e0df</id>
<content type='text'>
Bugs:
- B1: fix dead tempo debug (prev_tempo captured after assignment)
- B2: fix ReloadAssetsFromFile leak for disk-loaded assets; simplify DropAsset
- B3: fix get_free_pool_slot leak (unregister synth + free data on reuse)
- B4: volatile -&gt; std::atomic with acquire/release in miniaudio_backend, synth
- B5: fix unaligned reads in scene_loader (memcpy-based read_f32/read_u32)
- B6: fix shader module + BGL + pipeline layout leaks in gpu.cc, pipeline_builder

Dead code:
- D1: remove unused particle_defs.h
- D3: remove create_post_process_pipeline_simple (zero callers)
- D4: remove empty gpu_draw()
- D5: remove write-only Hybrid3D::initialized_
- D6: remove legacy pending buffer path in audio.cc

Factorization:
- F1: Effect::run_fullscreen_pass() replaces boilerplate in 5 effects
- F2: particle_common.wgsl snippet, #include in 3 WGSL shaders
- F3: gpu_create_shader_module() helper, used in 3 call sites
- F5: get_world_aabb() shared between bvh.cc and physics.cc
- F6: samples_to_seconds() replaces 6 inline expressions
- F7: gpu_create_linear/nearest_sampler use SamplerCache; add nearest() preset

Simplification:
- S9+S1: WgslSamplerType param; Scene2Effect collapsed to thin wrapper
- S4: FFT heap allocs -&gt; stack arrays (zero allocs on hot path)
- S5: ObjectType::CUBE documented as legacy alias for BOX; default changed
- S6: bind group dirty-flag in Renderer3D; remove duplicate pipeline set
- S7: create_gpu_procedural() helper in texture_manager (~80 lines removed)

37/37 tests passing.

handoff(Claude): code review batch — all items verified, no regressions.
</content>
</entry>
<entry>
<title>ans: order-0 rANS coder + WGSL asset compression</title>
<updated>2026-05-14T17:11:28Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-05-14T17:09:39Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=6ef8f578817ee0134fd5867ca3b80590e3eb2368'/>
<id>urn:sha1:6ef8f578817ee0134fd5867ca3b80590e3eb2368</id>
<content type='text'>
Adds src/util/ans.{h,cc}, a per-chunk-adaptive order-0 rANS entropy
coder. Decoder is always built; encoder is gated on ANS_ENABLE_ENCODER
(tools only). Both sides take an optional 256-entry initial_counts
table to seed the adaptive model.

The per-chunk initial state is (1 &lt;&lt; kBits). Higher initial states
(e.g. with a signature packed into the upper bits) force a renorm-emit
at iter 0 that the decoder never consumes, corrupting multi-chunk
streams once stats become skewed.

Asset pipeline:
- AssetRecord gains 'compression' and 'uncompressed_size' fields.
- asset_packer scans every WGSL file to build a corpus-wide byte
  histogram, then ANS-encodes each shader using that histogram as the
  seed. Histogram and accessor are emitted alongside the asset table.
  Round-trip verification runs at pack time for every compressed
  asset; failures fall back to uncompressed storage.
- asset_manager decompresses on first GetAsset(), caches the
  heap-allocated buffer, and DropAsset / ReloadAssetsFromFile free it
  along with the procedural cache.
- Disk-load (dev) builds are unchanged: WGSL paths stay as filenames.

Tests:
- src/tests/util/test_ans.cc: roundtrip variants (empty, single byte,
  single-symbol run, all-zeros, random uniform/skewed, repeated ASCII),
  seeded-vs-uniform compression, rejection of mismatched counts /
  corruption / truncation, PeekUncompressedSize.
- 37/37 dev, 36/36 STRIP_ALL.

Compression observed: WGSL shaders shrink to ~0.62-0.71x in the main
workspace (81 of 105 assets qualify).

Docs:
- doc/ANS.md (new): algorithm, bitstream, API, asset pipeline
  integration, compression numbers, limitations, tests.
- doc/ASSET_SYSTEM.md: new Compression section + updated technical
  guarantees for compressed assets.
- doc/COMPLETED.md: May 2026 entry.
- PROJECT_CONTEXT.md: Build status line mentions WGSL ANS compression.
- CLAUDE.md, GEMINI.md: tier-3 build doc list includes ANS.md.
</content>
</entry>
<entry>
<title>feat(cnn_v3): add infer_cnn_v3.py + rewrite cnn_test for v3 parity</title>
<updated>2026-03-25T07:07:53Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-25T07:07:53Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=64095c683f15e8bd7c19d32041fcc81b1bd6c214'/>
<id>urn:sha1:64095c683f15e8bd7c19d32041fcc81b1bd6c214</id>
<content type='text'>
- cnn_v3/training/infer_cnn_v3.py: PyTorch inference tool; simple mode
  (single PNG, zeroed geometry) and full mode (sample directory); supports
  --identity-film (γ=1 β=0) to match C++ default, --cond for FiLM MLP,
  --blend, --debug-hex for pixel comparison
- tools/cnn_test.cc: full rewrite, v3 only; packs 20-channel features
  on CPU (training format: [0,1] oct normals, pyrdown mip), uploads to
  GPU, runs CNNv3Effect, reads back RGBA16Float, saves PNG; --sample-dir
  for full G-buffer input, --weights for .bin override, --debug-hex
- cmake/DemoTests.cmake: add cnn_v3/src include path, drop unused
  offscreen_render_target.cc from cnn_test sources
- cnn_v3/docs/HOWTO.md: new §10 documenting both tools, comparison
  workflow, and feature-format convention (training vs runtime)

handoff(Gemini): cnn_test + infer_cnn_v3.py ready for parity testing.
Run both with --identity-film / --debug-hex on same image to compare.
</content>
</entry>
<entry>
<title>feat(cnn_v3): GBufDeferredEffect — simple deferred render (albedo * shadow)</title>
<updated>2026-03-22T18:58:04Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-22T18:58:04Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=9bf9b0aa0573f77bd667e6976a8bb413153daa1d'/>
<id>urn:sha1:9bf9b0aa0573f77bd667e6976a8bb413153daa1d</id>
<content type='text'>
New effect unpacks feat_tex0/feat_tex1 and outputs albedo * shadow.
Replaces CNNv3Effect in cnn_v3_test sequence until training is complete.
37/37 tests passing.

handoff(Gemini): GBufDeferredEffect wired in timeline; CNN v3 pipeline: GBufferEffect → GBufDeferredEffect → sink.
</content>
</entry>
<entry>
<title>feat(cnn_v3): add G-buffer visualizer + web sample loader (Phase 7)</title>
<updated>2026-03-22T15:21:25Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-22T15:21:25Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=159ca2ca19345515cdfebed9fd88646730492cd2'/>
<id>urn:sha1:159ca2ca19345515cdfebed9fd88646730492cd2</id>
<content type='text'>
C++ GBufViewEffect: renders all 20 feature channels from feat_tex0/feat_tex1
in a 4×5 tiled grid. Custom BGL with WGPUTextureSampleType_Uint; bind group
rebuilt per frame via wgpuRenderPipelineGetBindGroupLayout.

Web tool: "Load sample directory" button — webkitdirectory picker, FULL_PACK_SHADER
compute (matches gbuf_pack.wgsl packing), runFromFeat() skips photo-pack step,
computePSNR() readback + comparison vs target.png side-by-side.

36/36 tests pass. Docs updated: HOWTO.md §9, README, PROJECT_CONTEXT, TODO,
COMPLETED.

handoff(Gemini): CNN v3 Phase 7 done. Next: run train_cnn_v3.py (see HOWTO §3).
</content>
</entry>
<entry>
<title>feat(cnn_v3): Phase 5 complete — parity validation passing (36/36 tests)</title>
<updated>2026-03-21T08:51:58Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-21T08:51:58Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=673a24215b2670007317060325256059d1448f3b'/>
<id>urn:sha1:673a24215b2670007317060325256059d1448f3b</id>
<content type='text'>
- Add test_cnn_v3_parity.cc: zero_weights + random_weights tests
- Add gen_test_vectors.py: PyTorch reference implementation for enc0/enc1/bn/dec1/dec0
- Add test_vectors.h: generated C header with enc0, dec1, output expected values
- Fix declare_nodes(): intermediate textures at fractional resolutions (W/2, W/4)
  using new NodeRegistry::default_width()/default_height() getters
- Add layer-by-layer readback (enc0, dec1) for regression coverage
- Final parity: enc0 max_err=1.95e-3, dec1 max_err=1.95e-3, out max_err=4.88e-4

handoff(Claude): CNN v3 parity done. Next: train_cnn_v3.py (FiLM MLP training).
</content>
</entry>
<entry>
<title>feat(cnn_v3): Phase 4 complete — CNNv3Effect C++ + FiLM uniform upload</title>
<updated>2026-03-21T07:52:53Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-21T07:52:53Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=fe008df92f7a68d81c9bedb4328da7001e0775f0'/>
<id>urn:sha1:fe008df92f7a68d81c9bedb4328da7001e0775f0</id>
<content type='text'>
- cnn_v3/src/cnn_v3_effect.{h,cc}: full Effect subclass with 5 compute
  passes (enc0→enc1→bottleneck→dec1→dec0), shared weights storage buffer,
  per-pass uniform buffers, set_film_params() API
- Fixed WGSL/C++ struct alignment: vec3u has align=16, so CnnV3Params4ch
  is 64 bytes and CnnV3ParamsEnc1 is 96 bytes (not 48/80)
- Weight offsets computed as explicit formulas (e.g. 20*4*9+4) for clarity
- Registered in CMake, shaders.h/cc, demo_effects.h, test_demo_effects.cc
- 35/35 tests pass

handoff(Gemini): CNN v3 Phase 5 next — parity validation (Python ref vs WGSL)
</content>
</entry>
<entry>
<title>feat(cnn_v3): Phase 1 complete - GBufferEffect integrated + HOWTO playbook</title>
<updated>2026-03-20T08:22:18Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-20T08:22:18Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=a10cabbe3a5ae05730c2e76493e42554ee6037ba'/>
<id>urn:sha1:a10cabbe3a5ae05730c2e76493e42554ee6037ba</id>
<content type='text'>
- Wire GBufferEffect into demo build: assets.txt, DemoSourceLists.cmake,
  demo_effects.h, shaders.h/cc. ShaderComposer::Compose() applied to
  gbuf_raster.wgsl (resolves #include "common_uniforms").
- Add GBufferEffect construction test. 35/35 passing.
- Write cnn_v3/docs/HOWTO.md: G-buffer wiring, training data prep,
  training plan, per-pixel validation workflow, phase status table,
  troubleshooting guide.
- Add project hooks: remind to update HOWTO.md on cnn_v3/ edits;
  warn on direct str_view(*_wgsl) usage bypassing ShaderComposer.
- Update PROJECT_CONTEXT.md and TODO.md: Phase 1 done,
  Phase 3 (WGSL U-Net shaders) is next active.

handoff(Gemini): CNN v3 Phase 3 is next - WGSL enc/dec/bottleneck/FiLM
shaders in cnn_v3/shaders/. See cnn_v3/docs/CNN_V3.md Architecture
section and cnn_v3/docs/HOWTO.md section 3 for spec. GBufferEffect
outputs feat_tex0 + feat_tex1 (rgba32uint, 20ch, 32 bytes/pixel).
C++ CNNv3Effect (Phase 4) takes those as input nodes.
</content>
</entry>
<entry>
<title>fix(build): move generated assets to per-build binary dir</title>
<updated>2026-03-12T20:04:38Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-12T20:04:38Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=67c6f166748a4012122a6e978f2f7dbdaee503cc'/>
<id>urn:sha1:67c6f166748a4012122a6e978f2f7dbdaee503cc</id>
<content type='text'>
assets_data.cc was shared in src/generated/ across all builds. When a
native debug build regenerated it with --disk_load (file paths), then
build_win compiled with STRIP_ALL=ON, the disk-load fopen() path was
compiled out, leaving raw macOS paths as shader source content — causing
wgpu WGSL parse errors on Wine.

Fix: GEN_DEMO_H/CC and all stamps now live in CMAKE_CURRENT_BINARY_DIR/
src/generated/ so each build dir independently generates assets in the
correct mode (embedded vs disk-load). Added CMAKE_CURRENT_BINARY_DIR/src
to CORE_INCLUDES so the binary-dir assets.h is resolved first.

handoff(Gemini): build system fix — assets are now per-build-dir; tests
35/35 pass; build_win produces embedded shaders.

Co-Authored-By: Claude Sonnet 4.6 &lt;noreply@anthropic.com&gt;
</content>
</entry>
<entry>
<title>fix(assets): regenerate assets when DEMO_STRIP_ALL toggles</title>
<updated>2026-03-12T19:52:15Z</updated>
<author>
<name>skal</name>
<email>pascal.massimino@gmail.com</email>
</author>
<published>2026-03-12T19:52:15Z</published>
<link rel='alternate' type='text/html' href='https://git.taar-o.com/demo.git/commit/?id=747250a4e4d2a1dd9109b30b3fcfad05d526b408'/>
<id>urn:sha1:747250a4e4d2a1dd9109b30b3fcfad05d526b408</id>
<content type='text'>
Write asset_packer_mode.flag at configure time so that switching between
disk-load (debug) and embedded (STRIP_ALL) modes invalidates assets_data.cc.
Previously a stale embedded assets_data.cc caused fopen() on WGSL bytes,
silently failing all shader snippet registration (vs_main not found crash).

Co-Authored-By: Claude Sonnet 4.6 &lt;noreply@anthropic.com&gt;
</content>
</entry>
</feed>
