Age | Commit message | Author
36 hours | feat(cnn_v3): export script + HOW_TO_CNN.md playbook | skal
    - export_cnn_v3_weights.py: .pth → cnn_v3_weights.bin (f16 packed u32) + cnn_v3_film_mlp.bin (f32)
    - HOW_TO_CNN.md: full pipeline playbook (data collection, training, export, C++ wiring, parity, HTML tool)
    - TODO.md: mark export script done
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
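The "f16 packed u32" wording above suggests two half-precision weights stored per 32-bit word. The following is a minimal sketch of that idea, not the actual export_cnn_v3_weights.py code: the function names, the little-endian byte order, the choice of the low half for the first weight, and the zero padding of an odd count are all assumptions.

```python
import struct


def pack_f16_pairs(values):
    """Pack floats as IEEE half precision, two per little-endian u32 word."""
    vals = list(values)
    if len(vals) % 2:
        vals.append(0.0)  # assumed padding rule for an odd weight count
    words = []
    for lo, hi in zip(vals[0::2], vals[1::2]):
        raw = struct.pack("<2e", lo, hi)  # 2 x f16 = 4 bytes, first value in the low half
        words.append(struct.unpack("<I", raw)[0])
    return words


def unpack_f16_pairs(words, count):
    """Inverse of pack_f16_pairs; count trims any padding value."""
    out = []
    for word in words:
        out.extend(struct.unpack("<2e", struct.pack("<I", word)))
    return out[:count]


if __name__ == "__main__":
    packed = pack_f16_pairs([1.0, 2.0, 0.5])
    print([hex(w) for w in packed])
    print(unpack_f16_pairs(packed, 3))
```

Values that are not exactly representable in f16 will come back rounded, which is the quantization a parity test has to tolerate.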
36 hours | feat(cnn_v3): Phase 6 — training script (train_cnn_v3.py + cnn_v3_utils.py) | skal
    - train_cnn_v3.py: CNNv3 U-Net+FiLM model, training loop, CLI
    - cnn_v3_utils.py: image I/O, pyrdown, depth_gradient, assemble_features, apply_channel_dropout, detect_salient_points, CNNv3Dataset
    - Patch-based training (default 64×64) with salient-point extraction (harris/shi-tomasi/fast/gradient/random detectors, pre-cached at init)
    - Channel dropout for geometric/context/temporal channels
    - Random FiLM conditioning per sample for joint MLP+U-Net training
    - docs: HOWTO.md §3 updated with commands and flag reference
    - TODO.md: Phase 6 marked done, export script noted as next step
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
36 hours | docs(cnn_v3): update CNN_V3.md + HOWTO.md to reflect Phases 1-5 complete | skal
    - CNN_V3.md: status line, architecture channel counts (8/16→4/8), FiLM MLP output count (96→40 params), size budget table (real implemented values)
    - HOWTO.md: Phase status table (5→done, add phase 6 training TODO), sections 3-5 rewritten to reflect what exists vs what is still planned
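The 40-parameter FiLM output count can be cross-checked against the Phase 3 shader list further down this log, where enc0, enc1, dec1 and dec0 carry FiLM and the bottleneck does not. The sketch below does that arithmetic; the breakdown is inferred from the channel counts rather than stated by the commit, and the gamma-then-beta slicing order is purely hypothetical.

```python
# Output channels of each FiLM-modulated layer, read off the Phase 3 commit.
FILM_CHANNELS = {"enc0": 4, "enc1": 8, "dec1": 4, "dec0": 4}


def film_param_count(channels):
    """One gamma and one beta per modulated channel."""
    return 2 * sum(channels.values())


def split_film_params(flat, channels):
    """Slice a flat MLP output into per-layer (gamma, beta) lists.

    The gamma-then-beta, layer-by-layer order is an assumption.
    """
    out, cursor = {}, 0
    for name, count in channels.items():
        gamma = flat[cursor:cursor + count]
        beta = flat[cursor + count:cursor + 2 * count]
        out[name] = (gamma, beta)
        cursor += 2 * count
    return out


if __name__ == "__main__":
    print(film_param_count(FILM_CHANNELS))
    print(split_film_params(list(range(40)), FILM_CHANNELS)["enc1"])
```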
36 hours | feat(cnn_v3): Phase 5 complete — parity validation passing (36/36 tests) | skal
    - Add test_cnn_v3_parity.cc: zero_weights + random_weights tests
    - Add gen_test_vectors.py: PyTorch reference implementation for enc0/enc1/bn/dec1/dec0
    - Add test_vectors.h: generated C header with enc0, dec1, output expected values
    - Fix declare_nodes(): intermediate textures at fractional resolutions (W/2, W/4) using new NodeRegistry::default_width()/default_height() getters
    - Add layer-by-layer readback (enc0, dec1) for regression coverage
    - Final parity: enc0 max_err=1.95e-3, dec1 max_err=1.95e-3, out max_err=4.88e-4
    handoff(Claude): CNN v3 parity done. Next: train_cnn_v3.py (FiLM MLP training).
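A parity figure such as max_err=1.95e-3 is a maximum absolute difference between reference values and readback values. As a rough sketch of that comparison (not the code in test_cnn_v3_parity.cc), the helpers below are hypothetical, and the 1/255 tolerance is borrowed from the design-doc commit later in this log; whether the C++ tests use that exact threshold is an assumption.

```python
TOLERANCE = 1.0 / 255.0  # assumed threshold, taken from the design-doc commit


def max_abs_err(reference, actual):
    """Largest absolute element-wise difference between two flat sequences."""
    if len(reference) != len(actual):
        raise ValueError("length mismatch between reference and actual")
    return max((abs(r - a) for r, a in zip(reference, actual)), default=0.0)


def check_parity(name, reference, actual, tol=TOLERANCE):
    """Return (passed, report line) for one layer's readback."""
    err = max_abs_err(reference, actual)
    return err <= tol, f"{name} max_err={err:.2e}"


if __name__ == "__main__":
    ok, line = check_parity("enc0", [0.0, 0.5, 1.0], [0.0, 0.50195, 1.0])
    print(ok, line)
```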
37 hours | docs: session handoff — CNN v3 Phase 4 complete | skal
    - TODO.md: mark Phase 4 done, add FiLM MLP training details (blocked on train_cnn_v3.py), clarify what 'real' set_film_params() requires
    - COMPLETED.md: archive Phase 4 with alignment fix note (vec3u→64/96 bytes)
    handoff(Gemini): next up CNN v3 Phase 5 (parity validation) or train_cnn_v3.py
37 hours | feat(cnn_v3): Phase 4 complete — CNNv3Effect C++ + FiLM uniform upload | skal
    - cnn_v3/src/cnn_v3_effect.{h,cc}: full Effect subclass with 5 compute passes (enc0→enc1→bottleneck→dec1→dec0), shared weights storage buffer, per-pass uniform buffers, set_film_params() API
    - Fixed WGSL/C++ struct alignment: vec3u has align=16, so CnnV3Params4ch is 64 bytes and CnnV3ParamsEnc1 is 96 bytes (not 48/80)
    - Weight offsets computed as explicit formulas (e.g. 20*4*9+4) for clarity
    - Registered in CMake, shaders.h/cc, demo_effects.h, test_demo_effects.cc
    - 35/35 tests pass
    handoff(Gemini): CNN v3 Phase 5 next — parity validation (Python ref vs WGSL)
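The alignment fix above follows from WGSL's layout rules: a vec3u occupies 12 bytes but must start on a 16-byte boundary, and a struct's size is rounded up to its largest member alignment. The sketch below computes offsets under those rules. The field list is invented for illustration and chosen so that tight packing gives 48 bytes while the WGSL layout gives 64; the real CnnV3Params4ch members are not shown in this log.

```python
# (align, size) in bytes for a few WGSL scalar and vector types.
WGSL_TYPES = {
    "u32": (4, 4), "i32": (4, 4), "f32": (4, 4),
    "vec2u": (8, 8), "vec2f": (8, 8),
    "vec3u": (16, 12), "vec3f": (16, 12),
    "vec4u": (16, 16), "vec4f": (16, 16),
}


def round_up(value, multiple):
    return (value + multiple - 1) // multiple * multiple


def struct_layout(fields):
    """Return ({field: byte offset}, total struct size) under WGSL rules."""
    offsets, offset, struct_align = {}, 0, 1
    for name, type_name in fields:
        align, size = WGSL_TYPES[type_name]
        offset = round_up(offset, align)  # pad up to the member's alignment
        offsets[name] = offset
        offset += size
        struct_align = max(struct_align, align)
    return offsets, round_up(offset, struct_align)


# Hypothetical parameter block, not the real struct.
EXAMPLE_FIELDS = [
    ("weight_offset", "u32"),
    ("dims", "vec3u"),
    ("gamma", "vec4f"),
    ("beta", "vec4f"),
]

if __name__ == "__main__":
    offsets, size = struct_layout(EXAMPLE_FIELDS)
    tight = sum(WGSL_TYPES[t][1] for _, t in EXAMPLE_FIELDS)
    print(offsets, size, tight)
```

A C++ mirror of such a struct needs explicit padding after the leading u32 so that its sizeof matches the WGSL size.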
37 hours | feat(cnn_v3): Phase 3 complete — WGSL U-Net inference shaders | skal
    5 compute shaders + cnn_v3/common snippet:
      enc0:       Conv(20→4,3×3) + FiLM + ReLU                        full-res
      enc1:       AvgPool + Conv(4→8,3×3) + FiLM + ReLU               half-res
      bottleneck: AvgPool + Conv(8→8,1×1) + ReLU                      quarter-res
      dec1:       NearestUp + cat(enc1) + Conv(16→4) + FiLM           half-res
      dec0:       NearestUp + cat(enc0) + Conv(8→4) + FiLM + Sigmoid  full-res
    Parity rules: zero-pad conv, AvgPool down, NearestUp, FiLM after conv+bias, skip=concat, OIHW weights+bias layout. Matches PyTorch train_cnn_v3.py forward() exactly.
    Registered in workspaces/main/assets.txt + src/effects/shaders.cc.
    Weight layout + Params struct documented in cnn_v3/docs/HOWTO.md §7.
    Next: Phase 4 — C++ CNNv3Effect + FiLM uniform upload.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
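The layer list above is enough to estimate per-layer parameter counts and running offsets into a shared weight buffer, in the spirit of the 20*4*9+4 formula quoted in the Phase 4 commit. This is a sketch under assumptions: the commit gives no kernel size for dec1 and dec0, so 3×3 is assumed, and the back-to-back layer order in the buffer is also assumed.

```python
# (name, in_channels, out_channels, kernel) from the shader list.
# The 3x3 kernel for dec1/dec0 is an assumption.
LAYERS = [
    ("enc0", 20, 4, 3),
    ("enc1", 4, 8, 3),
    ("bottleneck", 8, 8, 1),
    ("dec1", 16, 4, 3),
    ("dec0", 8, 4, 3),
]


def layer_param_count(cin, cout, kernel):
    """OIHW weights (cout*cin*k*k) followed by one bias per output channel."""
    return cin * cout * kernel * kernel + cout


def weight_offsets(layers):
    """Return ({layer: start offset in elements}, total element count)."""
    offsets, cursor = {}, 0
    for name, cin, cout, kernel in layers:
        offsets[name] = cursor
        cursor += layer_param_count(cin, cout, kernel)
    return offsets, cursor


if __name__ == "__main__":
    offsets, total = weight_offsets(LAYERS)
    print(offsets)
    print(total, "elements,", total * 2, "bytes as f16")
```

Under these assumptions the network has 1964 parameters, a little under 4 KB when stored as f16.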
38 hours | make the heptagon effect more interesting | skal
3 days | feat(cnn_v3): Phase 1 complete - GBufferEffect integrated + HOWTO playbook | skal
    - Wire GBufferEffect into demo build: assets.txt, DemoSourceLists.cmake, demo_effects.h, shaders.h/cc. ShaderComposer::Compose() applied to gbuf_raster.wgsl (resolves #include "common_uniforms").
    - Add GBufferEffect construction test. 35/35 passing.
    - Write cnn_v3/docs/HOWTO.md: G-buffer wiring, training data prep, training plan, per-pixel validation workflow, phase status table, troubleshooting guide.
    - Add project hooks: remind to update HOWTO.md on cnn_v3/ edits; warn on direct str_view(*_wgsl) usage bypassing ShaderComposer.
    - Update PROJECT_CONTEXT.md and TODO.md: Phase 1 done, Phase 3 (WGSL U-Net shaders) is next active.
    handoff(Gemini): CNN v3 Phase 3 is next - WGSL enc/dec/bottleneck/FiLM shaders in cnn_v3/shaders/. See cnn_v3/docs/CNN_V3.md Architecture section and cnn_v3/docs/HOWTO.md section 3 for spec. GBufferEffect outputs feat_tex0 + feat_tex1 (rgba32uint, 20ch, 32 bytes/pixel). C++ CNNv3Effect (Phase 4) takes those as input nodes.
3 days | feat(cnn_v3): G-buffer phase 1 + training infrastructure | skal
    G-buffer (Phase 1):
    - Add NodeTypes GBUF_ALBEDO/DEPTH32/R8/RGBA32UINT to NodeRegistry
    - GBufferEffect: MRT raster pass (albedo+normal_mat+depth) + pack compute
    - Shaders: gbuf_raster.wgsl (MRT), gbuf_pack.wgsl (feature packing, 32B/px)
    - Shadow/SDF passes stubbed (placeholder textures), CMake integration deferred
    Training infrastructure (Phase 2):
    - blender_export.py: headless EXR export with all G-buffer render passes
    - pack_blender_sample.py: EXR → per-channel PNGs (oct-normals, 1/z depth)
    - pack_photo_sample.py: photo → zero-filled G-buffer sample layout
    handoff(Gemini): G-buffer phases 3-5 remain (U-Net shaders, CNNv3Effect, parity)
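"Oct-normals" usually refers to octahedral encoding, which maps a unit normal to two components so it fits the normal.xy slot of the feature buffer. The sketch below shows the common formulation of that mapping; whether pack_blender_sample.py uses exactly this variant, sign convention, or value range is an assumption.

```python
import math


def _sign(value):
    """Sign that treats zero as positive, as the octahedral fold requires."""
    return 1.0 if value >= 0.0 else -1.0


def oct_encode(normal):
    """Map a unit normal (x, y, z) to two components in [-1, 1]."""
    x, y, z = normal
    scale = abs(x) + abs(y) + abs(z)  # project onto the octahedron |x|+|y|+|z|=1
    x, y = x / scale, y / scale
    if z < 0.0:  # fold the lower hemisphere over the diagonals
        x, y = (1.0 - abs(y)) * _sign(x), (1.0 - abs(x)) * _sign(y)
    return x, y


def oct_decode(encoded):
    """Inverse mapping back to a unit normal."""
    x, y = encoded
    z = 1.0 - abs(x) - abs(y)
    if z < 0.0:  # undo the fold
        x, y = (1.0 - abs(y)) * _sign(x), (1.0 - abs(x)) * _sign(y)
    length = math.sqrt(x * x + y * y + z * z)
    return x / length, y / length, z / length


if __name__ == "__main__":
    for n in [(0.0, 0.0, 1.0), (0.6, 0.0, -0.8), (-0.6, 0.0, -0.8)]:
        print(n, oct_encode(n), oct_decode(oct_encode(n)))
```

The input must be non-zero; a zero vector would divide by zero in oct_encode.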
3 days | docs(cnn_v3): full design doc — U-Net + FiLM architecture plan | skal
    - CNN_V3.md: complete design document
      - U-Net enc_channels=[4,8], ~5 KB f16 weights
      - FiLM conditioning (5D → γ/β per level, CPU-side MLP)
      - 20-channel feature buffer, 32 bytes/pixel: two rgba32uint textures
        - feat_tex0: albedo.rgb, normal.xy, depth, depth_grad.xy (f16)
        - feat_tex1: mat_id, prev.rgb, mip1.rgb, mip2.rgb, shadow, transp (u8)
      - 4-pass G-buffer: raster MRT + SDF compute + lighting + pack
      - Per-pixel parity framework: PyTorch / HTML WebGPU / C++ WebGPU (≤1/255)
      - Training pipelines: Blender full G-buffer + photo-only (channel dropout)
      - train_cnn_v3_full.sh spec (modelled on v2 script)
      - HTML tool adaptation plan from cnn_v2/tools/cnn_v2_test/index.html
      - Binary format v3 header spec
      - 8-phase ordered implementation checklist
    - TODO.md: add CNN v3 U-Net+FiLM future task with phases
    - cnn_v3/README.md: update status to design phase
    handoff(Gemini): CNN v3 design complete. Phase 0 (stub G-buffer) unblocks all other phases — one compute shader writing feat_tex0+feat_tex1 with synthetic values from the current framebuffer. See cnn_v3/docs/CNN_V3.md Implementation Checklist.
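The feature-buffer budget above works out as 8 f16 channels (16 bytes) in feat_tex0 plus 12 u8 channels (12 bytes) in feat_tex1, with one u32 of feat_tex1 left over, for 20 channels in two rgba32uint texels. The sketch below packs one pixel that way. The channel order follows the commit text, but the byte order within each u32, the [0,1] range assumed for the u8 channels (including mat_id), and the zeroed spare word are assumptions.

```python
import struct


def pack_feat_tex0(albedo, normal_xy, depth, depth_grad):
    """8 channels as f16, two per u32: one rgba32uint texel (4 words)."""
    values = [*albedo, *normal_xy, depth, *depth_grad]  # 3 + 2 + 1 + 2 = 8
    return struct.unpack("<4I", struct.pack("<8e", *values))


def pack_feat_tex1(mat_id, prev, mip1, mip2, shadow, transp):
    """12 channels as u8, four per u32; the fourth word stays zero."""
    channels = [mat_id, *prev, *mip1, *mip2, shadow, transp]  # 1+3+3+3+1+1 = 12
    quantized = bytes(max(0, min(255, round(c * 255.0))) for c in channels)
    return struct.unpack("<4I", quantized + b"\x00" * 4)


if __name__ == "__main__":
    tex0 = pack_feat_tex0((1.0, 2.0, 0.5), (0.0, 0.0), 0.25, (0.0, 0.0))
    tex1 = pack_feat_tex1(1.0, (1.0,) * 3, (1.0,) * 3, (1.0,) * 3, 1.0, 1.0)
    print([hex(w) for w in tex0])
    print([hex(w) for w in tex1])
```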
3 days | docs: archive stale/completed docs, compact active refs (-1300 lines) | skal
    - Archive WORKSPACE_SYSTEM.md (completed); replace with 36-line operational ref
    - Archive SHADER_REUSE_INVESTIGATION.md (implemented Feb 2026)
    - Archive GPU_PROCEDURAL_PHASE4.md (completed feature)
    - Archive GEOM_BUFFER.md (ideation only, never implemented)
    - Archive SPECTRAL_BRUSH_EDITOR.md (v1 DCT approach, superseded by MQ v2)
    - Update CLAUDE.md Tier 3 refs; point Audio to SPECTRAL_BRUSH_2.md
    - Update TODO.md Task #5 design link to SPECTRAL_BRUSH_2.md
    - Update COMPLETED.md archive index
    handoff(Claude): doc cleanup done, 30 active docs (was 34), -1300 lines
3 days | chore: remove broken seeking test, demote CNN v2 quant to future CNN v3 | skal
    handoff(Gemini): removed test_audio_engine_seeking (broken, not worth fixing); moved CNN v2 8-bit quantization to Future as CNN v3 task.
3 days | add a commit rule | skal
3 days | docs(init): add glfw as macOS brew dependency | skal
    project_init.sh now checks/installs both wgpu-native and glfw via brew. HOWTO.md documents the macOS prerequisites before the build steps.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 days | fix(assets): skip missing binary assets with warning instead of failing | skal
    asset_packer now emits a zero-size empty stub for BINARY assets whose file is not found, and continues with a warning rather than aborting. Allows building without optional assets like cnn_v2_weights.bin.
    handoff(Gemini): asset_packer tolerates missing BINARY files; GetAsset() returns nullptr/size=0 for those assets at runtime.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
10 days | fix(win): accept SuccessSuboptimal in surface texture status check | skal
    Wine/Vulkan returns WGPUSurfaceGetCurrentTextureStatus_SuccessSuboptimal instead of SuccessOptimal, causing the blit pass to be skipped entirely and the window to stay black. Fixed in seq_compiler.py (the source template); regenerated timeline.cc. Tests: 35/35.
    handoff(Gemini): Wine black screen fixed — root cause was status code check rejecting suboptimal swapchain; exe now renders on Wine.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
10 days | fix(build): move generated assets to per-build binary dir | skal
    assets_data.cc was shared in src/generated/ across all builds. When a native debug build regenerated it with --disk_load (file paths), then build_win compiled with STRIP_ALL=ON, the disk-load fopen() path was compiled out, leaving raw macOS paths as shader source content — causing wgpu WGSL parse errors on Wine.
    Fix: GEN_DEMO_H/CC and all stamps now live in CMAKE_CURRENT_BINARY_DIR/src/generated/ so each build dir independently generates assets in the correct mode (embedded vs disk-load). Added CMAKE_CURRENT_BINARY_DIR/src to CORE_INCLUDES so the binary-dir assets.h is resolved first.
    handoff(Gemini): build system fix — assets are now per-build-dir; tests 35/35 pass; build_win produces embedded shaders.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
10 days | fix(assets): regenerate assets when DEMO_STRIP_ALL toggles | skal
    Write asset_packer_mode.flag at configure time so that switching between disk-load (debug) and embedded (STRIP_ALL) modes invalidates assets_data.cc. Previously a stale embedded assets_data.cc caused fopen() on WGSL bytes, silently failing all shader snippet registration (vs_main not found crash).
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
10 days | docs(todo): add Wine black screen investigation task | skal
    demo64k.exe runs under Wine but renders no visuals. Audio/timeline work.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
10 days | fix(win): update wgpu-native to v27, unify Windows/macOS API paths | skal
    - fetch_win_deps.sh: update wgpu-native v0.19.4.1 → v27.0.4.0 (same as macOS)
    - platform.h: remove v0.19 compat shims, Windows now uses WGPUStringView API
    - gpu.cc/gpu.h: remove DEMO_CROSS_COMPILE_WIN32 old-API branches
    - texture_readback.cc, visual_debug.cc, hybrid3d_effect.cc: same cleanup
    - rotating_cube_effect.cc: remove #ifdef guard for depthSlice
    - glfw3webgpu.c: remove old WGPUSurfaceDescriptorFromWindowsHWND branch
    - asset_manager.cc: fix DEMO_STRIP_ALL→STRIP_ALL guard (vs_main was missing in STRIP_ALL Windows builds because disk-loading path ran on embedded data)
    - tracker.cc: skip MP3 assets gracefully in STRIP_ALL builds instead of fatal
    handoff(Gemini): Windows .exe now runs under Wine. demo64k.exe renders frames and progresses through audio timeline. Pre-existing test failures unchanged.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | fix(headless): add missing sampler/texture stubs to gpu_headless.cc | skal
    gpu_create_linear_sampler, gpu_create_nearest_sampler, and gpu_create_dummy_scene_texture were defined only in gpu.cc, causing link errors in DEMO_HEADLESS mode.
    handoff(Gemini): headless mode link error fixed, 35/35 tests should still pass.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | fix(test_3d): correct projection matrix m[5] assertion sign | skal
    m[5] = -1/tan(fov/2); larger FOV yields smaller magnitude (closer to 0), so proj_varied.m[5] > proj.m[5], not <.
    handoff(Gemini): fixed ThreeDSystemTest assertion in test_3d.cc:43
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
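The sign argument in this fix is easy to check numerically. The sketch below evaluates the quoted formula for a narrow and a wide field of view; the function name and the two angles are illustrative, not values taken from test_3d.cc.

```python
import math


def proj_y_scale(fov_radians):
    """The m[5] term as quoted in the commit: -1 / tan(fov / 2)."""
    return -1.0 / math.tan(fov_radians / 2.0)


if __name__ == "__main__":
    narrow = proj_y_scale(math.radians(45.0))
    wide = proj_y_scale(math.radians(90.0))
    # Both are negative; the wider FOV is closer to zero, hence the larger value.
    print(narrow, wide, wide > narrow)
```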
12 days | factorize render_ntsc() | skal
12 days | fix(effects): particle sync and heptagon SDF bugs | skal
    - particles: stagger respawn y by golden-ratio index offset to break per-row synchronization (100 particles per row fell in lock-step)
    - heptagon: fix swapped atan2(x,y)->atan2(y,x) and WGSL % truncation for negative angles (broke SDF for entire lower half of shape)
    handoff(Gemini): heptagon now correct; particles desynchronized
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
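The heptagon bug comes from WGSL's % being a truncated remainder: for floats it keeps the sign of the dividend, so a negative angle is not wrapped into [0, period). The sketch below contrasts that behaviour (math.fmod in Python) with a floored wrap. It illustrates the pitfall only; the exact expression used in the shader fix is not shown in this log, and the helper names are hypothetical.

```python
import math

SECTOR = 2.0 * math.pi / 7.0  # angular width of one heptagon sector


def wrap_truncated(angle, period):
    """Truncated remainder, like WGSL's %: the result keeps the sign of angle."""
    return math.fmod(angle, period)


def wrap_floored(angle, period):
    """Floored modulo: the result always lies in [0, period)."""
    return angle - period * math.floor(angle / period)


if __name__ == "__main__":
    # atan2 takes (y, x); swapping the arguments mirrors the angle about the diagonal.
    angle = math.atan2(-0.5, 1.0)  # a point in the lower half plane
    print(wrap_truncated(angle, SECTOR))  # stays negative
    print(wrap_floored(angle, SECTOR))    # wrapped into [0, SECTOR)
```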
12 days | fix(particles): release compute and render pass encoders | skal
    Missing wgpuComputePassEncoderRelease/wgpuRenderPassEncoderRelease caused per-frame leaks and command buffer corruption in wgpu-native.
    handoff(Gemini): particles encoder leak fixed, 2 lines added to render()
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | change dither_c64() signature to take 'dimension' directly | skal
12 days | docs: document CLASS_TO_HEADER override in EFFECT_WORKFLOW | skal
12 days | fix(seq_compiler): map NtscYiq to ntsc_effect.h | skal
    Add CLASS_TO_HEADER override map for classes that share a header file. NtscYiq lives in ntsc_effect.h alongside Ntsc.
    handoff(Gemini): seq_compiler.py fix for shared-header effect classes.
12 days | ntsc: factor common code into snippet; add RGB and YIQ input variants | skal
    - Extract shared NTSC logic into render/ntsc_common.wgsl snippet
    - sample_ntsc_signal() hook decouples input format from processing
    - ntsc_rgb.wgsl: RGB input (converts via rgba_to_luma_chroma_phase)
    - ntsc_yiq.wgsl: YIQ passthrough for RotatingCube output
    - Add NtscYiq WgslEffect thin wrapper; register both in tests
    handoff(Claude): NTSC refactor complete; NtscYiq ready for timeline use with RotatingCube.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
13 days | rotating_cube: use VSOut, and store to yiq | skal
13 days | fix: use ShaderComposer in RotatingCube; add rule to CODING_STYLE | skal
    rotating_cube_effect.cc was bypassing ShaderComposer, causing #include directives in rotating_cube.wgsl to fail at runtime.
    handoff(Claude): ShaderComposer rule documented and enforced in rotating_cube.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
13 days | style: require 2-line header comment in all .wgsl files | skal
    Add rule to CODING_STYLE.md and apply to ntsc.wgsl.
    handoff(Claude): rule added, ntsc.wgsl patched; scratch_lines and color_c64 already compliant.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
13 days | NTSC: use 6-taps filtering instead of 12-tap | skal
14 days | docs: streamline SEQUENCE.md (12 effects, remove v1 migration notes) | skal
    handoff(Gemini): SEQUENCE.md updated - removed obsolete v1 migration notes, updated effect count 7→12, added absolute-time note, removed completed TODO items.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
14 days | docs: update PROJECT_CONTEXT with new color/color_c64 snippets (27 shaders) | skal
14 days | refactor: mv get_border_col() to color_c64.wgsl as get_border_c64() | skal
14 days | feat: register math/color_c64 snippet in ShaderComposer | skal
14 days | refactor: extract YIQ and C64 dither to common WGSL shaders | skal
    - math/color.wgsl: add rgba_to_yiqa, yiqa_to_rgba, rgba_to_luma_chroma_phase
    - math/color_c64.wgsl: new file with C64 palette, Bayer 8x8, Dither()
    - ntsc.wgsl: include both, remove local duplicates; Dither() now takes xsize/ysize
    handoff(Claude): YIQ/dither helpers now reusable by other effects
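For readers unfamiliar with YIQ, the conversion named in this commit is a 3×3 linear transform of RGB into luma (Y) and two chroma axes (I, Q). The sketch below uses commonly quoted NTSC coefficients; the constants in math/color.wgsl are not shown in this log and may differ, and the alpha pass-through implied by the rgba_to_yiqa name is left out.

```python
def rgb_to_yiq(r, g, b):
    """RGB in [0, 1] to YIQ with commonly quoted NTSC coefficients (assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    i = 0.5959 * r - 0.2746 * g - 0.3213 * b
    q = 0.2115 * r - 0.5227 * g + 0.3112 * b
    return y, i, q


def yiq_to_rgb(y, i, q):
    """Approximate inverse; published coefficients are rounded, so small errors remain."""
    r = y + 0.956 * i + 0.619 * q
    g = y - 0.272 * i - 0.647 * q
    b = y - 1.106 * i + 1.703 * q
    return r, g, b


if __name__ == "__main__":
    yiq = rgb_to_yiq(0.2, 0.5, 0.8)
    print(yiq)
    print(yiq_to_rgb(*yiq))
```

A neutral grey has zero chroma in this space, which is what makes Y usable as a luma signal on its own.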
14 days | add dithering | skal
14 days | ntsc effect for real | skal
2026-03-08 | feat: extend debug_print with full ASCII and debug_str() | skal
    - Replace _dbg_pixel() with _dbg_char(ascii, r, c) covering printable ASCII 0x20-0x7E (95 glyphs, C64-style 8x8 bitmaps)
    - Update debug_f32() to use ASCII codes directly
    - Add debug_str(col, pos, origin, s: vec4u, len) for rendering up to 16 chars packed 4-per-u32 big-endian
    handoff(Claude): debug_print now supports full ASCII strings.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
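The debug_str() argument format described above (up to 16 characters, four per u32, big-endian) can be produced on the host side with a few lines. The helper below is a hypothetical sketch of that packing: the function name, the zero padding of unused slots, and the rejection of non-printable characters are assumptions, not part of the snippet's documented API.

```python
def pack_debug_str(text):
    """Pack up to 16 printable ASCII chars into four u32 words, big-endian.

    Returns (words, length), matching the (s: vec4u, len) argument pair.
    """
    data = text.encode("ascii")
    if len(data) > 16:
        raise ValueError("debug_str holds at most 16 characters")
    if any(b < 0x20 or b > 0x7E for b in data):
        raise ValueError("only printable ASCII 0x20-0x7E has glyphs")
    padded = data.ljust(16, b"\x00")  # assumed padding for unused slots
    words = [int.from_bytes(padded[k:k + 4], "big") for k in range(0, 16, 4)]
    return words, len(data)


if __name__ == "__main__":
    words, length = pack_debug_str("FPS: 60")
    print([hex(w) for w in words], length)
```

With big-endian packing the first character sits in the top byte of the first word, so a shader can extract character k with a shift of 24 - 8 * (k % 4).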
2026-03-08 | fix: negate Y in perspective() to correct rasterized 3D orientation | skal
    The fullscreen post-process VS uses Y-up UVs (uv.y=0 = bottom), so textureSample() Y-flips any rasterized offscreen texture. SDF effects author their content Y-down and look correct after the flip. Rasterized effects (RotatingCube, Hybrid3D) must pre-flip their geometry:
    - mat4::perspective(): m[5] = -t (negated Y scale)
    - Pipelines with cullMode=Back: frontFace = WGPUFrontFace_CW (Y-flip reverses winding, so CW becomes the visible face)
    - Remove incorrect transposes from GlobalUniforms::make(), ObjectData::make(), and Uniforms::make() — mini_math is column-major, no transpose needed for GPU upload
    - Document the convention in doc/3D.md under "Rasterized 3D and the Y-flip rule"
    handoff(Gemini): Y-flip rule now documented; all rasterized 3D pipelines must follow it.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 | fix: register debug/debug_print snippet in ShaderComposer | skal
    Add SHADER_DEBUG_DEBUG_PRINT to assets.txt and register it as "debug/debug_print" in InitShaderComposer() so ntsc.wgsl #include works.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 | feat: add WGSL debug_f32() snippet with C64 8x8 font | skal
    Renders a f32 value (±999.999, 3 decimal digits) at a given screen position using authentic C64 8x8 bitmap glyphs in yellow.
    Usage: col = debug_f32(col, pos.xy, vec2f(10.0, 10.0), value);
    Include: #include "debug/debug_print"
    handoff(Gemini): new snippet at src/shaders/debug/debug_print.wgsl
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 | fix: transpose matrices on GPU upload (row-major → column-major) | skal
    mini_math mat4 is row-major; WGSL mat4x4f is column-major. Matrices uploaded without transposing were interpreted as their own transpose on the GPU, causing RotatingCube and Renderer3D to render upside-down.
    - Add gpu_upload_mat4() to post_process_helper for standalone uploads
    - Add Uniforms::make() to RotatingCube::Uniforms (handles transpose)
    - Add GlobalUniforms::make() and ObjectData::make() to renderer.h
    - Update renderer_draw.cc and visual_debug.cc to use make()
    handoff(Gemini): matrix layout bug fixed across all rasterized effects.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 | feat: new clouds | skal
2026-03-08 | tweak: scene2.wgsl visual adjustments | skal
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 | docs: update ASSET_SYSTEM.md for WGSL disk-load in dev mode | skal
    WGSL shaders now join SPEC/MP3 in being loaded from disk in development mode, enabling shader iteration without rebuild.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 | feat: WGSL asset load-from-disk in dev mode | skal
    - asset_packer: include WGSL in --disk_load path storage (alongside SPEC/MP3)
    - asset_manager: disk-load WGSL assets at runtime when !DEMO_STRIP_ALL
    - DemoCodegen: pass ASSET_PACKER_FLAGS to pack_test_assets so test assets also use disk-load paths in dev mode (fixes pre-existing SPEC/WGSL test failures)
    - test_shader_composer: fix stale assertions (fn test_wgsl → fn snippet_a, correct ordering check)
    35/35 tests passing.
    handoff(Claude): WGSL disk-loading implemented. Shaders now loaded from disk in dev mode, enabling hot-reload without rebuild. Tests fixed.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>