demo.git - Vide-coded 64k demo system

Age	Commit message	Author
2026-03-26	fix(audio): P1-P3 fixes from audio code review	skal
	P1 — correctness bugs: - tracker.cc: move delete[] loop before pool reset so guard condition is valid - audio_engine: replace tracker_reset() with tracker_init() in reset()/seek() so synth IDs are re-registered after synth_init() clears spectrogram slots - spectrogram_resource_manager: set spec.version in load_procedural() (was UB) P2 — minor bugs: - synth.cc: move pan clamp unconditionally before debug-only block - gen.cc: remove dead `freq` variable in generate_note_spectrogram() - tracker.cc: remove duplicate g_sample_synth_cache clear loop P3 — cleanup: - Replace hardcoded 32000.0f with RING_BUFFER_SAMPLE_RATE (5 sites) - audio.cc: extract clip_samples() helper, remove duplicated clip loops - audio_engine: inline update_silent(), remove no-op prewarm_for_time_range() - Remove stale comments: stdio.h include, NEW: labels, CACHING block, NOTE: - Move TODO(timing) drift notes from source to TODO.md handoff(Gemini): audio review implemented, 36/36 tests passing
2026-03-26	update the weights	skal

2026-03-26	chore(scripts): document and guard mingw-w64 setup	skal
	- check_all.sh: guard Windows cross-compile step with command -v check instead of failing unconditionally; print install hint when skipped - project_init.sh: add Linux/WSL setup path alongside existing macOS path — installs build-essential/cmake/glfw/wgpu-native via apt-get, prints mingw-w64 install hint if cross-compile toolchain is absent
2026-03-26	fix(src/platform): code review cleanup	skal
	- platform.h: deduplicate str_view/label_view/platform_wgpu_wait_any (were identical in WIN32 and default branches); remove dead platform_wgpu_set_error_callback and its never-used callback typedefs - platform.cc: remove glfwSetWindowUserPointer(nullptr) + stale comment block; drop if-guard on user pointer fixup in platform_poll; remove redundant aspect_ratio recompute (framebuffer_size_callback owns it) - headless_platform.cc: write g_virtual_time back to state->time in platform_poll; remove never-set g_should_close variable - stub_types.h: remove dead platform_wgpu_set_error_callback and callback typedefs; add comment on non-empty WGPURenderPassColorAttachment handoff(Gemini): platform layer cleaned up, no behaviour change
2026-03-26	chore(src/gpu): remove stale cnn_v1/v2 artifacts and dead comments	skal
	- demo_effects.h: drop commented-out cnn_v1/v2 includes (will be removed) - gpu.cc: replace stale "V2:" migration comments with accurate descriptions - effect.h, sequence.h: drop redundant #ifndef guards (kept #pragma once) handoff(Gemini): stale comment cleanup in src/gpu/ — no logic changes
2026-03-26	fix(cnn_v3/training): fix defaults and help strings across py tools	skal
	- train_cnn_v3.py: enc-channels 4,8→8,16; checkpoint-every 50→100; add help strings for epochs/batch-size/lr/checkpoint-dir - gen_test_vectors.py: add help strings for --W/--H/--seed - export_cnn_v3_weights.py: fix --output help string (export/→export)
2026-03-26	feat(cnn_v3): upgrade architecture to enc_channels=[8,16]	skal
	Double encoder capacity: enc0 4→8ch, enc1 8→16ch, bottleneck 16→16ch, dec1 32→8ch, dec0 16→4ch. Total weights 2476→7828 f16 (~15.3 KB). FiLM MLP output 40→72 params (L1: 16×40→16×72). 16-ch textures split into _lo/_hi rgba32uint pairs (enc1, bottleneck). enc0 and dec1 textures changed from rgba16float to rgba32uint (8ch). GBUF_RGBA32UINT node gains CopySrc for parity test readback. - WGSL shaders: all 5 passes rewritten for new channel counts - C++ CNNv3Effect: new weight offsets/sizes, 8ch uniform structs - Web tool (shaders.js + tester.js): matching texture formats and bindings - Parity test: readback_rgba32uint_8ch helper, updated vector counts - Training scripts: default enc_channels=[8,16], updated docstrings - Docs + architecture PNG regenerated handoff(Gemini): CNN v3 [8,16] upgrade complete. All code, tests, web tool, training scripts, and docs updated. Next: run training pass.
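The weight and FiLM totals quoted above can be cross-checked with a small sketch. The layer table is an assumption, not taken from the source: 3×3 kernels with one bias per output channel, a 20-channel feature input, U-Net skip concatenation, and FiLM modulating the enc0, enc1, dec1 and dec0 outputs. Under those assumptions the arithmetic reproduces the commit's figures.

```python
# Hypothetical layer table: (name, in_channels, out_channels, kernel_size).
# Input widths assume a 20-channel feature input and skip concatenation.
LAYERS = [
    ("enc0", 20, 8, 3),
    ("enc1", 8, 16, 3),
    ("bottleneck", 16, 16, 3),
    ("dec1", 32, 8, 3),  # 16 (bottleneck) + 16 (enc1 skip), assumed
    ("dec0", 16, 4, 3),  # 8 (dec1) + 8 (enc0 skip), assumed
]


def conv_params(cin, cout, k):
    """Weights plus one bias per output channel for a k×k convolution."""
    return cin * cout * k * k + cout


def total_weights(layers):
    return sum(conv_params(cin, cout, k) for _, cin, cout, k in layers)


def film_outputs(channels):
    """One gamma and one beta per modulated channel."""
    return 2 * sum(channels)


total = total_weights(LAYERS)
size_kib = total * 2 / 1024          # f16 = 2 bytes per weight
film = film_outputs([8, 16, 8, 4])   # enc0, enc1, dec1, dec0 (assumed)
print(total, round(size_kib, 1), film)
```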
2026-03-25	fix(cnn_v3/training): rebuild optimizer before loading state on resume past FiLM warmup	skal
	When resuming a checkpoint saved after the FiLM warmup phase, the optimizer was created with frozen (fewer) param groups, causing a size mismatch when loading the saved optimizer state. Fix: detect ckpt['epoch'] >= film_warmup, unfreeze FiLM MLP, and rebuild the optimizer before loading its state dict. handoff(Gemini): train_cnn_v3.py --resume now works past epoch 1500.
2026-03-25	fix(cnn_v3/tools): rename Output→Dec0 in viz panel; fix BN weight cnt 72→584	skal
	- Layer viz button was labeled 'Output' instead of 'Dec0' - BN parseWeights cnt was stale (old 1×1 conv size); now 8×8×9+8=584 handoff(Gemini): web tool only, no C++ or shader changes
2026-03-25	feat(cnn_v3/training): load prev.png when available; document web tool prev gap	skal
	- assemble_features() accepts optional prev ndarray (None → zeros) - _load_sample() loads prev.png if present, else None - __getitem__ slices/resizes prev alongside other channels - TODO.md: note that cnn_v3/tools/shaders.js hardcodes prev=0 in both pack shaders while C++ gbuf_pack.wgsl reads a real prev_cnn texture handoff(Gemini): prev.png now used in training when present; web tool gap documented in TODO.md
2026-03-25	feat(cnn_v3): 3×3 dilated bottleneck + Sobel loss + FiLM warmup + architecture PNG	skal
	- Replace 1×1 pointwise bottleneck with Conv(8→8, 3×3, dilation=2): effective RF grows from ~13px to ~29px at ¼res (~+1 KB weights) - Add Sobel edge loss in training (--edge-loss-weight, default 0.1) - Add FiLM 2-phase training: freeze MLP for warmup epochs then unfreeze at lr×0.1 (--film-warmup-epochs, default 50) - Update weight layout: BN 72→584 f16, total 1964→2476 f16 (4952 B) - Cascade offsets in C++ effect, JS tool, export/gen_test_vectors scripts - Regenerate test_vectors.h (1238 u32); parity max_err=9.77e-04 - Generate dark-theme U-Net+FiLM architecture PNG (gen_architecture_png.py) - Replace ASCII art in CNN_V3.md and HOW_TO_CNN.md with PNG embed handoff(Gemini): bottleneck dilation + Sobel loss + FiLM warmup landed. Next: run first real training pass (see cnn_v3/docs/HOWTO.md §3).
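The size figures in this commit (BN 72→584 f16, 2476 f16 total, 4952 B, 1238 u32) are consistent with storing two f16 values per u32. A minimal sketch of that packing follows. The helper name `pack_f16_pairs` is hypothetical, and the low-half-first ordering is an assumption, since the source does not state the word layout.

```python
import struct


def pack_f16_pairs(values):
    """Pack floats as IEEE f16, two per u32, first value in the low 16 bits.

    The low/high ordering is an assumption made for illustration.
    """
    values = list(values)
    if len(values) % 2:
        values.append(0.0)  # pad odd counts with a zero weight
    words = []
    for lo, hi in zip(values[0::2], values[1::2]):
        (lo_bits,) = struct.unpack("<H", struct.pack("<e", lo))
        (hi_bits,) = struct.unpack("<H", struct.pack("<e", hi))
        words.append(lo_bits | (hi_bits << 16))
    return words


bn_old = 8 * 8 * 1 * 1 + 8           # 1×1 pointwise bottleneck + bias
bn_new = 8 * 8 * 3 * 3 + 8           # 3×3 bottleneck + bias
total_new = 1964 - bn_old + bn_new   # only the bottleneck block changes size
words = pack_f16_pairs([0.0] * total_new)
print(bn_old, bn_new, total_new, total_new * 2, len(words))
```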
2026-03-25	update weights	skal

2026-03-25	feat(cnn_v3/training): add --single-sample option + doc fixes	skal
	- train_cnn_v3.py: --single-sample <dir> implies --full-image + --batch-size 1 - cnn_v3_utils.py: CNNv3Dataset accepts single_sample= kwarg (explicit override) - HOWTO.md: document --single-sample workflow, fix pack_photo_sample.py usage (--target required) - HOW_TO_CNN.md: fix GBufferEffect seq input (prev_cnn→source), fix binary name (demo→demo64k), add --resume to flag table, remove stale "pack without target" block handoff(Gemini): --single-sample <dir> added to train_cnn_v3.py; docs audited and corrected
2026-03-25	feat(cnn_v3): add infer_cnn_v3.py + rewrite cnn_test for v3 parity	skal
	- cnn_v3/training/infer_cnn_v3.py: PyTorch inference tool; simple mode (single PNG, zeroed geometry) and full mode (sample directory); supports --identity-film (γ=1 β=0) to match C++ default, --cond for FiLM MLP, --blend, --debug-hex for pixel comparison - tools/cnn_test.cc: full rewrite, v3 only; packs 20-channel features on CPU (training format: [0,1] oct normals, pyrdown mip), uploads to GPU, runs CNNv3Effect, reads back RGBA16Float, saves PNG; --sample-dir for full G-buffer input, --weights for .bin override, --debug-hex - cmake/DemoTests.cmake: add cnn_v3/src include path, drop unused offscreen_render_target.cc from cnn_test sources - cnn_v3/docs/HOWTO.md: new §10 documenting both tools, comparison workflow, and feature-format convention (training vs runtime) handoff(Gemini): cnn_test + infer_cnn_v3.py ready for parity testing. Run both with --identity-film / --debug-hex on same image to compare.
2026-03-25	feat(cnn_v3/tools): embed default weights in HTML tool; add --html export flag	skal
	- cnn_v3/tools/weights.js: new file — base64-encoded cnn_v3_weights.bin + cnn_v3_film_mlp.bin; loaded at startup so the tool works without dropping files - tester.js: preload() falls back to embedded weights.js constants when fetch fails; logs "Loaded embedded" vs "Preloaded" to distinguish the two paths - index.html: load weights.js before tester.js - export_cnn_v3_weights.py: add --html / --html-output flags that call update_weights_js() to regenerate weights.js after a training run - HOW_TO_CNN.md: update pipeline diagram, §3 export commands, §7 HTML tool section (file table, workflow, weights.js description), Appendix A handoff(Gemini): weights.js now the canonical source for HTML tool defaults; regenerate with `uv run export_cnn_v3_weights.py <ckpt> --output ... --html`
2026-03-25	update weights	skal

2026-03-24	add more samples + tool html	skal

2026-03-24	move and improve the 'adjust.html' tool	skal

2026-03-24	fix(fft): make bit_reverse_permute static, remove from public API and tests	skal

2026-03-24	fix(fft): replace iterative twiddle with direct cosf/sinf, add tests A-E	skal
	fft_radix2 now computes wr=cosf(anglek)/wi=sinf(anglek) directly per k, eliminating float drift over long iteration runs. Iterative approach documented in comment for reference. Tests A-E added (bit-reverse, small-N DFT, twiddle drift, DCT small/large N). arrays_match tolerance reverted to 5e-3. TODO.md updated. handoff(Gemini): fft twiddle fix complete, 38/38 tests passing.
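For illustration, a radix-2 FFT that evaluates each twiddle factor directly from its angle, rather than multiplying a running twiddle by a step factor, might look like the sketch below. It is a Python rendering of the idea only, not the project's C++ code; the naive DFT is included as a reference for comparison.

```python
import cmath
import math


def bit_reverse_permute(a):
    """Reorder a in place so index i holds the element at bit-reversed i."""
    n = len(a)
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]


def fft_radix2(a):
    """In-place forward FFT; len(a) must be a power of two."""
    n = len(a)
    assert n and n & (n - 1) == 0
    bit_reverse_permute(a)
    size = 2
    while size <= n:
        half = size // 2
        for k in range(half):
            # Direct evaluation per k: no error accumulates across iterations.
            angle = -2.0 * math.pi * k / size
            w = complex(math.cos(angle), math.sin(angle))
            for start in range(0, n, size):
                u = a[start + k]
                v = a[start + k + half] * w
                a[start + k] = u + v
                a[start + k + half] = u - v
        size *= 2
    return a


def dft_naive(x):
    """O(n^2) reference transform used only for checking."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
```

The trade-off is one cos/sin pair per butterfly column instead of one complex multiply, in exchange for twiddles whose error does not grow with the iteration count.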
2026-03-23	test(fft): re-enable DCT tests, document twiddle accumulation bug	skal
	- Remove unused variable `bits` in bit_reverse_permute - Re-enable previously skipped DCT correctness tests (impulse at N/2, sinusoidal, complex inputs) with tolerance bumped to 2e-2 - Close TODO for FFT-DCT discrepancy investigation - Add detailed TODO for fixing twiddle factor accumulation bug in fft_radix2 (root cause of sign errors at large N), with step-by-step test plan (components A–E) handoff(Gemini): FFT twiddle bug plan in TODO.md §"Fix FFT twiddle factor accumulation bug". Tests currently pass at 2e-2; target <1e-5 after fix.
2026-03-23	add a registration tool	skal

2026-03-23	docs: update temporal feedback docs — wire_dag auto-wiring, F16X8 format	skal
	- HOWTO.md: replace manual set_cnn_output_node() instructions with wire_dag() auto-wiring explanation; add timeline.seq snippet as canonical wiring example; document F16X8/GBUF_ALBEDO format requirement - CNN_V3.md: fix prev_tex format (rgba16float, not rgba8unorm); mark timeline.seq CNNv3Effect TODO as done - SEQUENCE.md: already updated in previous commit (wire_dag pattern, format-matching rule, sink guard) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23	fix(cnn_v3_debug): add CNNv3Effect to debug sequence for prev.r/g/b temporal feedback	skal
	timeline.seq is the canonical source — timeline.cc was wrongly hand-edited. Add CNNv3Effect + cnn_out (gbuf_albedo) node to cnn_v3_debug sequence so wire_dag() can wire GBufferEffect.cnn_output_node_ correctly. Also fix node_prev_tex_ NodeType: F16X8 (Rgba16Float+CopyDst) to match CNNv3Effect output format (GBUF_ALBEDO = Rgba16Float). Regenerated timeline.cc via: python3 tools/seq_compiler.py workspaces/main/timeline.seq Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23	feat(gbuffer): wire_dag() + find_downstream_output() for temporal feedback	skal
	- Add Effect::wire_dag() virtual (called from init_effect_nodes after full DAG built) - Add Effect::find_downstream_output() protected helper (first downstream consumer output) - GBufferEffect::wire_dag() auto-sets cnn_output_node_ via find_downstream_output, guarding against sink (external view, null texture) - GBufferEffect::post_render() null-checks src texture before CopyTextureToTexture - Tests: find_downstream_output cases + wire_dag integration in test_effect_base - Doc: SEQUENCE.md updated with wire_dag pattern, helper contract, and sink guard Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23	feat(cnn_v3): GBufferEffect temporal feedback via post_render()	skal
	- Add Effect::post_render() virtual hook, called after all effects in the sequence have rendered each frame. Default is no-op. - Sequence::render_effects() runs a second pass invoking post_render() on all DAG nodes after the render pass completes. - GBufferEffect: declare internal node_prev_tex_ (U8X4_NORM) for persistent prev-frame CNN output. post_render() copies cnn_output_node_ → node_prev_tex_ via CopyTextureToTexture. render() binds node_prev_tex_ as prev_cnn (binding 6) — zero on frame 0 (matches training convention). - Expose set_cnn_output_node(name) API; call once at setup. - Drop brittle ping-pong / input_nodes_[0] fallback. - Update doc/SEQUENCE.md: post_render() semantics, frame execution order, temporal feedback canonical pattern, node types table with G-buffer types. - Update cnn_v3/docs/HOWTO.md: temporal feedback wiring section. 36/36 tests passing. handoff(Gemini): prev.rgb temporal feedback now correct and generic. Set set_cnn_output_node("sink") (or CNN output node name) once at setup.
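The two-pass frame order described above (render every effect, then call post_render() on every effect) can be modeled in a few lines. This is a toy sketch with hypothetical Python classes standing in for the C++ ones; strings stand in for textures and `None` stands in for the zeroed previous frame.

```python
class Effect:
    def render(self, frame):
        pass

    def post_render(self, frame):
        pass  # default is a no-op


class CnnEffect(Effect):
    def __init__(self):
        self.output = None

    def render(self, frame):
        self.output = f"cnn_out@{frame}"  # stand-in for the output texture


class GBufferEffect(Effect):
    def __init__(self, cnn):
        self.cnn = cnn
        self.prev_tex = None   # "zero" on frame 0
        self.seen_prev = []    # what render() bound as prev_cnn each frame

    def render(self, frame):
        self.seen_prev.append(self.prev_tex)

    def post_render(self, frame):
        # Copy the CNN output only if it exists (null-check before the copy).
        if self.cnn.output is not None:
            self.prev_tex = self.cnn.output


def render_effects(effects, frame):
    for e in effects:       # pass 1: all effects render
        e.render(frame)
    for e in effects:       # pass 2: hooks run after the whole frame
        e.post_render(frame)


cnn = CnnEffect()
gbuf = GBufferEffect(cnn)
for frame in range(3):
    render_effects([gbuf, cnn], frame)
print(gbuf.seen_prev)
```

Because the copy happens in the second pass, the G-buffer stage always sees the CNN output of the previous frame, regardless of where the CNN effect sits in the sequence.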
2026-03-23	feat(cnn_v3): shadow→dif migration complete (ch18)	skal
	Replace raw shadow (ch18) with dif = max(0,dot(normal,KEY_LIGHT))*shadow across all layers. Channel count stays 20, weight shapes unchanged. - gbuf_pack.wgsl: t1.z = pack4x8unorm(mip2.g, mip2.b, dif, transp); t1.w = 0u - gbuf_deferred.wgsl: read dif from unpack4x8unorm(t1.z).z - gbuf_view.wgsl: revert to 4×5 grid, ch18=dif label, ch19=trns label - tools/shaders.js: FULL_PACK_SHADER adds oct_decode + computes dif - cnn_v3_utils.py: assemble_features() computes dif on-the-fly via oct_decode - docs: CNN_V3.md, HOWTO.md, HOW_TO_CNN.md, GBUF_DIF_MIGRATION.md updated handoff(Gemini): shadow→dif migration done, ready for first training pass
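As a rough illustration of the dif channel and its packing, the sketch below mirrors the formula dif = max(0, dot(normal, KEY_LIGHT)) * shadow and the behavior of WGSL's pack4x8unorm / unpack4x8unorm builtins in Python. The KEY_LIGHT direction used here is a made-up value; the real constant lives in the shaders.

```python
import math

KEY_LIGHT = (0.0, 1.0, 0.0)  # hypothetical direction, for illustration only


def dif(normal, key_light, shadow):
    """Lambert term gated by the shadow factor."""
    ndotl = sum(n * l for n, l in zip(normal, key_light))
    return max(0.0, ndotl) * shadow


def pack4x8unorm(x, y, z, w):
    """Quantize four [0,1] floats to 8 bits each; x lands in the low byte."""
    def q(v):
        return int(math.floor(0.5 + 255.0 * min(1.0, max(0.0, v))))
    return q(x) | (q(y) << 8) | (q(z) << 16) | (q(w) << 24)


def unpack4x8unorm(u):
    return tuple(((u >> shift) & 0xFF) / 255.0 for shift in (0, 8, 16, 24))


# Third component carries dif, matching unpack4x8unorm(t1.z).z in the commit.
d = dif((0.0, 1.0, 0.0), KEY_LIGHT, 0.5)
word = pack4x8unorm(0.2, 0.4, d, 1.0)
print(hex(word), unpack4x8unorm(word)[2])
```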
2026-03-23	wip(cnn_v3): shadow→dif intermediate + scene tweaks + migration plan	skal
	- gbuf_shadow.wgsl: normal bias 0.05→0.02 - gbuf_pack.wgsl: compute dif=diffuse*shadow, drop shadow from t1.z, store dif in t1.w (INTERMEDIATE — incorrect packing, see migration plan) - gbuf_deferred.wgsl: read dif from t1.w.x (matches intermediate packing) - gbuf_view.wgsl: expand to 4×6 grid, show dif.r/g/b in row 5 (INTERMEDIATE — to be reverted to 4×5 with ch18=dif) - gbuffer_effect.cc: add small hovering sphere (r=0.6) above scene; swap cube/sphere positions; both spheres pulsate - docs/GBUF_DIF_MIGRATION.md: full migration plan with checklist handoff(Claude): intermediate commit — GBUF_DIF_MIGRATION.md §Current State describes what is wrong and the full implementation checklist (5 steps).
2026-03-22	refactor(cnn_v3): simplify sphere SDF in shadow pass, remove per-frame alloc	skal
	gbuf_shadow.wgsl — dfWithID(): - Sphere: replace inv_model local-space transform with direct world-space formula (length(p - center) - radius). Exact, no matrix multiply, no floating-point error from matrix inversion that can corrupt soft-shadow penumbra over 64 march steps. - lp/scale now computed only inside the cases that need them (box/torus/plane) instead of eagerly for every object. gbuffer_effect.cc — upload_scene_data(): - Replace per-frame std::vector<GBufObjectData> heap allocation with a file-static staging buffer s_obj_staging[256]: zero alloc per frame. handoff(Gemini): sphere SDF now exact; shadow march should be cleaner.
2026-03-22	fix(cnn_v3): shadow pass — 5 bugs fixed, labels in gbuf_view	skal
	1. Camera Y-inversion: proj.m[5] = -proj.m[5] in upload_scene_data + WGPUFrontFace_CCW on raster pipeline. 2. Shadow formula: replace shadowWithStoredDistance with 64-step IQ soft shadow (8d/t, unbounded). 3. Local→world SDF scale: d = length(obj.model[0].xyz). 4. Shadow bias: use rasterized normal from normal_mat_tex (binding 4) instead of light direction — fixes terminator self-shadow on spheres. 5. ShaderComposer: GBufViewEffect now resolves #include via ShaderComposer::Get().Compose(). Also: per-tile channel labels in gbuf_view.wgsl via debug_str. Scene simplified to 1 cube + 1 sphere for debugging (restore TODO). Scale propagation for pulsating sphere confirmed correct end-to-end. handoff(Gemini): shadow validated. Next: restore full scene in GBufferEffect::set_scene() (20 cubes + 4 spheres, 2 lights), then run training pass per cnn_v3/docs/HOWTO.md §3.
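Item 2 refers to the soft-shadow march popularized by Inigo Quilez, where the penumbra estimate is the running minimum of k·d/t along the ray. A minimal sketch with k = 8 and 64 steps is shown below. The start offset, far limit, hit threshold and final clamp are assumptions chosen for the example, and the scene is a single world-space sphere.

```python
import math


def sd_sphere(p, center, radius):
    """World-space sphere distance: length(p - center) - radius."""
    return math.dist(p, center) - radius


def soft_shadow(origin, light_dir, scene_sdf, k=8.0, steps=64,
                t_min=0.02, t_max=50.0):
    """Return 0 (fully shadowed) .. 1 (fully lit) along a normalized light ray."""
    res = 1.0
    t = t_min
    for _ in range(steps):
        p = tuple(o + t * d for o, d in zip(origin, light_dir))
        d = scene_sdf(p)
        if d < 1e-4:
            return 0.0               # ray hit an occluder
        res = min(res, k * d / t)    # near misses darken the result
        t += d
        if t > t_max:
            break
    return max(0.0, min(1.0, res))


def scene(p):
    return sd_sphere(p, (0.0, 2.0, 0.0), 1.0)


up = (0.0, 1.0, 0.0)
blocked = soft_shadow((0.0, 0.0, 0.0), up, scene)
penumbra = soft_shadow((1.2, 0.0, 0.0), up, scene)
lit = soft_shadow((10.0, 0.0, 0.0), up, scene)
print(blocked, penumbra, lit)
```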
2026-03-22	docs+feat(cnn_v3): compact context, re-enable shadow in GBufDeferredEffect	skal
	- TODO/PROJECT_CONTEXT updated to reflect operational pipeline state - GBufDeferredEffect: shadow re-enabled (albedo * (ambient + diffuse * shadow)) feat_tex1 binding restored for shadow channel debugging handoff(Gemini): shadow pass live again — investigate why shadow looks broken.
2026-03-22	feat(shaders): add ray_sphere snippet, use in gbuf_raster impostor	skal

2026-03-22	fix(cnn_v3): frontFace_CW for raster pipeline + sphere impostor in gbuf_raster	skal
	- Missing WGPUFrontFace_CW (Y-flipped perspective) caused back faces to render instead of front faces → cubes appeared inside-out. - Sphere objects now use ray-sphere impostor in fs_main: correct silhouette, smooth normal from hit point, and reprojected clip-space depth.
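The impostor technique relies on an analytic ray-sphere intersection: the hit point gives the silhouette and the smooth normal. The sketch below shows that intersection in Python. The function names are hypothetical and are not the contents of the project's ray_sphere snippet.

```python
import math


def ray_sphere(ro, rd, center, radius):
    """Nearest non-negative hit distance along a normalized ray, or None."""
    oc = tuple(o - c for o, c in zip(ro, center))
    b = sum(a * d for a, d in zip(oc, rd))
    c = sum(a * a for a in oc) - radius * radius
    h = b * b - c
    if h < 0.0:
        return None        # ray misses the sphere
    h = math.sqrt(h)
    t = -b - h             # near intersection
    if t < 0.0:
        t = -b + h         # origin is inside the sphere: use the far one
    return t if t >= 0.0 else None


def hit_normal(ro, rd, center, radius, t):
    """Smooth normal at the hit point: (p - center) / radius."""
    p = tuple(o + t * d for o, d in zip(ro, rd))
    return tuple((pi - ci) / radius for pi, ci in zip(p, center))


t_hit = ray_sphere((0.0, 0.0, 5.0), (0.0, 0.0, -1.0), (0.0, 0.0, 0.0), 1.0)
normal = hit_normal((0.0, 0.0, 5.0), (0.0, 0.0, -1.0), (0.0, 0.0, 0.0), 1.0, t_hit)
miss = ray_sphere((2.0, 0.0, 5.0), (0.0, 0.0, -1.0), (0.0, 0.0, 0.0), 1.0)
print(t_hit, normal, miss)
```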
2026-03-22	fix(cnn_v3): resolve #include via ShaderComposer in GBufDeferredEffect	skal
	Raw WGSL was sent to WebGPU without resolving the math/normal include. Also removed unused feat_tex1 binding (shadow dropped for now).
2026-03-22	fix(cnn_v3): deferred render — diffuse only, drop broken shadow	skal

2026-03-22	fix(cnn_v3): use model matrix for normal transform in gbuf_raster	skal
	inv_model applies the inverse rotation → normals pointing inward. model matrix is correct for uniform-scale objects (matches rotating_cube.wgsl).
2026-03-22	refactor(shaders): extract oct-normal encode/decode into math/normal snippet	skal
	New src/shaders/math/normal.wgsl: oct_encode, oct_decode, oct_encode_unorm, oct_decode_unorm. Registered in InitShaderComposer as "math/normal". Removed inline copies from gbuf_raster.wgsl and gbuf_pack.wgsl. 18/18 tests passing.
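For readers unfamiliar with octahedral normal encoding, the following sketch shows the usual formulation behind helpers with these names. It is a Python approximation of the standard technique, not a transcription of normal.wgsl; in particular, the unorm variants' [-1,1] to [0,1] remap is an assumption.

```python
import math


def _sign(v):
    return 1.0 if v >= 0.0 else -1.0


def _fold(x, y):
    """Reflect the lower hemisphere over the diagonals of the octahedron."""
    return (1.0 - abs(y)) * _sign(x), (1.0 - abs(x)) * _sign(y)


def oct_encode(n):
    """Unit vector -> 2D point in [-1, 1]^2."""
    x, y, z = n
    s = abs(x) + abs(y) + abs(z)
    x, y, z = x / s, y / s, z / s
    if z < 0.0:
        x, y = _fold(x, y)
    return (x, y)


def oct_decode(e):
    """2D point in [-1, 1]^2 -> unit vector."""
    x, y = e
    z = 1.0 - abs(x) - abs(y)
    if z < 0.0:
        x, y = _fold(x, y)
    length = math.sqrt(x * x + y * y + z * z)
    return (x / length, y / length, z / length)


def oct_encode_unorm(n):
    return tuple(0.5 * v + 0.5 for v in oct_encode(n))  # assumed remap


def oct_decode_unorm(e):
    return oct_decode(tuple(2.0 * v - 1.0 for v in e))


samples = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0), (0.6, 0.0, -0.8), (-0.6, 0.8, 0.0)]
errors = [
    max(abs(a - b) for a, b in zip(n, oct_decode_unorm(oct_encode_unorm(n))))
    for n in samples
]
print(errors)
```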
2026-03-22	feat(cnn_v3): GBufDeferredEffect — simple deferred render (albedo * shadow)	skal
	New effect unpacks feat_tex0/feat_tex1 and outputs albedo * shadow. Replaces CNNv3Effect in cnn_v3_test sequence until training is complete. 37/37 tests passing. handoff(Gemini): GBufDeferredEffect wired in timeline; CNN v3 pipeline: GBufferEffect → GBufDeferredEffect → sink.
2026-03-22	feat(cnn_v3): wire CNNv3Effect into cnn_v3_test sequence	skal
	Replace GBufViewEffect with CNNv3Effect → Passthrough. G-buffer confirmed working; now running full CNN inference pipeline.
2026-03-22	fix(cnn_v3): call set_scene() in constructor + orbiting camera	skal
	- GBufferEffect::render() was a no-op (scene_ready_=false) because set_scene() was never called from the timeline sequence constructor. Fixed by calling set_scene() at the end of the constructor. - Camera now orbits the scene at 0.3 rad/s (R=6, y=2.5). handoff(Gemini): cnn_v3_test sequence now renders G-buffer + GBufViewEffect with animated orbiting camera.
2026-03-22	refactor(cnn_v3): GBufferEffect cleanup	skal
	Remove dead code and reduce duplication: - drop create_bilinear_sampler() (never called) - drop update_pack_bind_group() stub and pack_bind_group_ member - drop node_feat0_/node_feat1_; use output_nodes_[0/1] directly - Compose({}, src) consistently for all three pipelines - extract clear_r8_node() helper to replace two identical 10-line blocks No behavior change. 36/36 tests pass.
2026-03-22	feat(cnn_v3): Phase 4 — type-aware SDF in shadow pass	skal
	dfWithID() in gbuf_shadow.wgsl now branches on obj.params.x (ObjectType) instead of using sdBox for everything: 0=CUBE → sdBox(lp, vec3(1)) 1=SPHERE → sdSphere(lp, 1.0) 2=PLANE → sdPlane(lp, vec3(0,1,0), obj.params.y) 3=TORUS → sdTorus(lp, vec2(0.8, 0.2)) 36/36 tests pass.
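The type dispatch above can be sketched with the common signed-distance formulas for each primitive. The Python names below are hypothetical, and the plane convention (dot(p, n) + h) is an assumption about how obj.params.y is used.

```python
import math

CUBE, SPHERE, PLANE, TORUS = 0, 1, 2, 3


def sd_box(p, b):
    q = [abs(pi) - bi for pi, bi in zip(p, b)]
    outside = math.sqrt(sum(max(v, 0.0) ** 2 for v in q))
    inside = min(max(q), 0.0)
    return outside + inside


def sd_sphere(p, r):
    return math.sqrt(sum(v * v for v in p)) - r


def sd_plane(p, n, h):
    # Assumed convention: n is normalized, h offsets the plane along n.
    return sum(pi * ni for pi, ni in zip(p, n)) + h


def sd_torus(p, t):
    qx = math.hypot(p[0], p[2]) - t[0]  # distance to the ring in the XZ plane
    return math.hypot(qx, p[1]) - t[1]


def df_object(obj_type, lp, plane_h=0.0):
    """Distance from local-space point lp to the unit primitive of obj_type."""
    if obj_type == CUBE:
        return sd_box(lp, (1.0, 1.0, 1.0))
    if obj_type == SPHERE:
        return sd_sphere(lp, 1.0)
    if obj_type == PLANE:
        return sd_plane(lp, (0.0, 1.0, 0.0), plane_h)
    if obj_type == TORUS:
        return sd_torus(lp, (0.8, 0.2))
    raise ValueError(f"unknown object type {obj_type}")


print(df_object(CUBE, (2.0, 0.0, 0.0)), df_object(TORUS, (0.8, 0.0, 0.0)))
```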
2026-03-22	feat(cnn_v3): GBufferEffect Pass 2 — SDF shadow raymarching	skal
	Implements gbuf_shadow.wgsl: fullscreen render pass that reads depth from Pass 1, reconstructs world-space positions, evaluates a proxy-box SDF for each object (via inv_model), computes soft shadows for both directional lights using shadowWithStoredDistance(), and writes shadow factor to the RGBA8Unorm node_shadow_ target consumed by gbuf_pack.wgsl. Bind layout: B0=GlobalUniforms, B1=ObjectsBuffer (storage-read), B2=texture_depth_2d, B3=GBufLightsUniforms. Sky fragments (depth=1.0) are output as 1.0 (fully lit). Falls back to clear(1.0) if pipeline is not ready. 36/36 tests pass. handoff(Gemini): Pass 2 done. Pass 3 (transparency) still TODO. Phase 4 (type-aware SDF) optional after visual validation.
2026-03-22	feat(cnn_v3): GBufferEffect internal scene + GBufViewEffect debug wiring	skal
	GBufferEffect: - set_scene() now owns Scene/Camera internally; no external pointers needed - 20 randomly rotating cubes (xorshift32 seed, axis-angle animation) - 4 pumping spheres (radius = base_r * (1 + audio_intensity * 0.8)) - Camera at (0,2.5,6) looking at origin; aspect updated per-frame - GBufLightsUniforms: 2 directional lights (warm key + cool fill) - object_type written to ObjectData.params.x (ready for SDF shadow) - shadow/transp nodes cleared via zero-draw render passes (placeholder) - bilinear sampler cached via create_linear_sampler() / sampler_.get() - dead placeholder textures removed GBufViewEffect: - gbuf_view.wgsl: all channels now fully grayscale (removed color tint) - seq_compiler.py: GBufViewEffect added to CLASS_TO_HEADER - timeline.seq: cnn_v3_test uses GBufViewEffect -> sink for debug view Docs: HOWTO.md §1 updated with set_scene() description + §1b implementation plan for Pass 2 SDF shadow (shader spec, bind layout, C++ additions) handoff(Gemini): GBufferEffect has internal scene, 36/36 tests green. Next: implement Pass 2 shadow (gbuf_shadow.wgsl) per §1b plan in HOWTO.md.
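A deterministic scene setup of this kind can be sketched as follows. The shift triple (13, 17, 5) is the classic xorshift32 choice and is an assumption here, as are the seed and the helper names; only the radius formula comes from the commit message.

```python
import math


def xorshift32(state):
    """One step of a 32-bit xorshift generator (assumed shifts 13, 17, 5)."""
    x = state & 0xFFFFFFFF
    x ^= (x << 13) & 0xFFFFFFFF
    x ^= x >> 17
    x ^= (x << 5) & 0xFFFFFFFF
    return x


def next_unit(state):
    """Advance the state and return (new_state, float in [0, 1))."""
    state = xorshift32(state)
    return state, state / 2.0 ** 32


def random_axes(seed, count):
    """Generate `count` normalized rotation axes from a fixed seed."""
    axes = []
    state = seed
    for _ in range(count):
        comps = []
        for _ in range(3):
            state, u = next_unit(state)
            comps.append(2.0 * u - 1.0)
        length = math.sqrt(sum(c * c for c in comps)) or 1.0
        axes.append(tuple(c / length for c in comps))
    return axes


def sphere_radius(base_r, audio_intensity):
    """Pumping radius from the commit: base_r * (1 + audio_intensity * 0.8)."""
    return base_r * (1.0 + audio_intensity * 0.8)


axes = random_axes(seed=12345, count=20)  # hypothetical seed
print(xorshift32(1), len(axes), sphere_radius(0.5, 1.0))
```

A fixed seed keeps the scene reproducible from run to run without storing any per-object data.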
2026-03-22	add full dataset	skal

2026-03-22	feat(cnn_v3/tools): zoom canvas shows region around clicked texel	skal

2026-03-22	fix(cnn_v3/tools): don't destroy feat textures after runFromFeat (breaks viz)	skal

2026-03-22	fix(cnn_v3/tools): refresh zoom canvas when new sample is loaded	skal

2026-03-22	fix(cnn_v3/tools): zoom canvas fits remaining panel space (both axes, rAF measure)	skal

2026-03-22	fix(cnn_v3/tools): fit zoom canvas to panel width by scaling canvas attributes	skal