demo.git - Vide-coded 64k demo system

Age	Commit message (Collapse)	Author
14 hours	CNN v2: Alpha channel depth handling and layer visualization	skal
	Training changes: - Changed p3 default depth from 0.0 to 1.0 (far plane semantics) - Extract depth from target alpha channel in both datasets - Consistent alpha-as-depth across training/validation Test tool enhancements (cnn_test): - Added load_depth_from_alpha() for R32Float depth texture - Fixed bind group layout for UnfilterableFloat sampling - Added --save-intermediates with per-channel grayscale composites - Each layer saved as 4x wide PNG (p0-p3 stacked horizontally) - Global layers_composite.png for vertical layer stack overview Investigation notes: - Static features p4-p7 ARE computed and bound correctly - Sin_20_y pattern visibility difference between tools under investigation - Binary weights timestamp (Feb 13 20:36) vs HTML tool (Feb 13 22:12) - Next: Update HTML tool with canonical binary weights handoff(Claude): HTML tool weights update pending - base64 encoded canonical weights ready in /tmp/weights_b64.txt for line 392 replacement. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
20 hours	CNN v2: Fix WebGPU validation error in uniform buffer alignment	skal
	Fix two issues causing validation errors in test_demo: 1. Remove redundant pipeline creation without layout (static_pipeline_) 2. Change vec3<u32> to 3× u32 fields in StaticFeatureParams struct WGSL vec3<u32> aligns to 16 bytes (std140), making struct 32 bytes, while C++ struct was 16 bytes. Explicit fields ensure consistent layout. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
21 hours	CNN v2: Add TODO for flexible feature layout in binary format v3	skal
	Document future enhancement for arbitrary feature vector layouts. Proposed feature descriptor in binary format v3: - Specify feature types, sources, and ordering - Enable runtime experimentation without shader recompilation - Examples: [R,G,B,dx,dy,uv_x,bias] or [mip1.r,mip2.g,laplacian,uv_x,sin20_x,bias] Added TODOs in: - CNN_V2_BINARY_FORMAT.md: Detailed proposal with struct layout - CNN_V2.md: Future extensions section - train_cnn_v2.py: compute_static_features() docstring - cnn_v2_static.wgsl: Shader header comment - cnn_v2_effect.cc: Version check comment Current limitation: Hardcoded [p0,p1,p2,p3,uv_x,uv_y,sin10_x,bias] layout. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
21 hours	CNN v2: Add mip-level support to runtime effect	skal
	Binary format v2 includes mip_level in header (20 bytes, was 16). Effect reads mip_level and passes to static features shader via uniform. Shader samples from correct mip texture based on mip_level. Changes: - export_cnn_v2_weights.py: Header v2 with mip_level field - cnn_v2_effect.h: Add StaticFeatureParams, mip_level member, params buffer - cnn_v2_effect.cc: Read mip_level from weights, create/bind params buffer, update per-frame - cnn_v2_static.wgsl: Accept params uniform, sample from selected mip level Binary format v2: - Header: 20 bytes (magic, version=2, num_layers, total_weights, mip_level) - Backward compatible: v1 weights load with mip_level=0 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
25 hours	CNNv2Effect: Document per-layer kernel sizes support	skal
	Updated comments to clarify that per-layer kernel sizes are supported. Code already handles this correctly via LayerInfo.kernel_size field. Changes: - cnn_v2_effect.h: Add comment about per-layer kernel sizes - cnn_v2_compute.wgsl: Clarify LayerParams provides per-layer config Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
47 hours	test_demo: Add beat-synchronized CNN post-processing with version selection	skal
	- Add --cnn-version <1\|2> flag to select between CNN v1 and v2 - Implement beat_phase modulation for dynamic blend in both CNN effects - Fix CNN v2 per-layer uniform buffer sharing (each layer needs own buffer) - Fix CNN v2 y-axis orientation to match render pass convention - Add Scene1Effect as base visual layer to test_demo timeline - Reorganize CNN v2 shaders into cnn_v2/ subdirectory - Update asset paths and documentation for new shader organization Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2 days	Fix CNN v2 weights validation logic	skal
	FATAL_CHECK triggers when condition is TRUE (error case). Inverted equality checks: magic/version == correct_value would fatal when weights were valid. Changed to != checks to fail on invalid data.
2 days	CNN v2: Complete multi-layer compute execution	skal
	- Create bind groups per layer with ping-pong buffers - Update layer params uniform per dispatch - Execute all layers in sequence with proper input/output swapping - Ready for weight export and end-to-end testing
2 days	CNN v2: storage buffer architecture foundation	skal
	- Add binary weight format (header + layer info + packed f16) - New export_cnn_v2_weights.py for binary weight export - Single cnn_v2_compute.wgsl shader with storage buffer - Load weights in CNNv2Effect::load_weights() - Create layer compute pipeline with 5 bindings - Fast training config: 100 epochs, 3×3 kernels, 8→4→4 channels Next: Complete bind group creation and multi-layer compute execution
2 days	CNN v2 Phase 5: render pipeline implementation	skal
	Complete multi-pass compute execution for CNNv2Effect. Implementation: - Layer texture creation (ping-pong buffers for intermediate results) - Static features compute pipeline with bind group layout - Bind group creation with 5 bindings (input mips + depth + output) - compute() override for multi-pass execution - Static features pass with proper workgroup dispatch Architecture: - Static features: 8×f16 packed as 4×u32 (RGBD + UV + sin + bias) - Layer buffers: 2×RGBA32Uint textures (8 channels f16 each) - Input mips: 3 levels (0, 1, 2) for multi-scale features - Workgroup size: 8×8 threads Status: - Static features compute pass functional - Layer pipeline infrastructure ready - All 36/36 tests passing Next: Layer shader integration, multi-layer execution Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2 days	CNN v2: parametric static features - Phases 1-4	skal
	Infrastructure for enhanced CNN post-processing with 7D feature input. Phase 1: Shaders - Static features compute (RGBD + UV + sin10_x + bias → 8×f16) - Layer template (convolution skeleton, packing/unpacking) - 3 mip level support for multi-scale features Phase 2: C++ Effect - CNNv2Effect class (multi-pass architecture) - Texture management (static features, layer buffers) - Build integration (CMakeLists, assets, tests) Phase 3: Training Pipeline - train_cnn_v2.py: PyTorch model with static feature concatenation - export_cnn_v2_shader.py: f32→f16 quantization, WGSL generation - Configurable architecture (kernels, channels) Phase 4: Validation - validate_cnn_v2.sh: End-to-end pipeline - Checkpoint → shaders → build → test images Tests: 36/36 passing Next: Complete render pipeline implementation (bind groups, multi-pass) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2 days	fix: suppress spurious shader snippet and auxiliary texture warnings	skal
	- Add render/scene_query_mode to known placeholders in VerifyIncludes - Remove warning for duplicate auxiliary texture registration (valid for multiple CNNEffect stacks) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
3 days	fix: add missing beat_phase parameter to stub gpu_draw implementations	skal
	Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
3 days	feat: implement beat-based timing system	skal
	BREAKING CHANGE: Timeline format now uses beats as default unit ## Core Changes Uniform Structure (32 bytes maintained): - Added `beat_time` (absolute beats for musical animation) - Added `beat_phase` (fractional 0-1 for smooth oscillation) - Renamed `beat` → `beat_phase` - Kept `time` (physical seconds, tempo-independent) Seq Compiler: - Default: all numbers are beats (e.g., `5`, `16.5`) - Explicit seconds: `2.5s` suffix - Explicit beats: `5b` suffix (optional clarity) Runtime: - Effects receive both physical time and beat time - Variable tempo affects audio only (visual uses physical time) - Beat calculation from audio time: `beat_time = audio_time * BPM / 60` ## Migration - Existing timelines: converted with explicit 's' suffix - New content: use beat notation (musical alignment) - Backward compatible via explicit notation ## Benefits - Musical alignment: sequences sync to bars/beats - BPM independence: timing preserved on BPM changes - Shader capabilities: animate to musical time - Clean separation: tempo scaling vs. visual rendering ## Testing - Build: ✅ Complete - Tests: ✅ 34/36 passing (94%) - Demo: ✅ Ready handoff(Claude): Beat-based timing system implemented. Variable tempo only affects audio sample triggering. Visual effects use physical_time (constant) and beat_time (musical). Shaders can now animate to beats. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
3 days	fix: Release cached samplers before WGPU shutdown	skal
	SamplerCache singleton never released samplers, causing device to retain references at shutdown. Add clear() method and call before fixture cleanup. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
3 days	fix: Resolve WGPU threading crash in cnn_test	skal
	- Release queue reference after submit in texture_readback - Add final wgpuDevicePoll before cleanup to sync GPU work Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
3 days	fix: Register cnn_conv1x1 snippet and add verification	skal
	- Add cnn_conv1x1 to shader composer registration - Add VerifyIncludes() to detect missing snippet registrations - STRIP_ALL-protected verification warns about unregistered includes - Fixes cnn_test runtime failure loading cnn_layer.wgsl Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
3 days	fix: Use actual resolution in RotatingCubeEffect	skal
	Hardcoded vec2(1280.0f, 720.0f) → u.resolution Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
3 days	fix: Use initialized_ flag instead of ctx_.device check	skal
	ctx_.device exists before init() but Renderer3D not initialized yet. Changed guard from !ctx_.device to !initialized_ flag. Set initialized_ = true after renderer_.init() in both effects. All 36 tests pass. Demo runs without crash. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
3 days	fix: Complete auxiliary texture initialization fix	skal
	Root cause: After swapping init/resize order, effects with Renderer3D crashed because resize() called before init() tried to use uninitialized GPU resources. Changes: - Add guards in FlashCubeEffect::resize() and Hybrid3DEffect::resize() to check ctx_.device before calling renderer_.resize() - Remove lazy initialization remnants from CircleMaskEffect and CNNEffect - Register auxiliary textures directly in init() (width_/height_ already set) - Remove ensure_texture() methods and texture_initialized_ flags All 36 tests passing. Demo runs without crashes. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
3 days	fix: Call resize() before init() to set dimensions first	skal
	Simpler solution than lazy initialization: effects need correct dimensions during init() to register auxiliary textures. Changed initialization order in MainSequence: - resize() sets width_/height_ FIRST - init() can then use correct dimensions Reverted lazy initialization complexity. One-line fix. Tests: All 36 tests passing, demo runs without error Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
3 days	refactor: Use lazy initialization for auxiliary textures	skal
	Prevents init/resize ordering bug and avoids unnecessary reallocation. Changes: - Auxiliary textures created on first use (compute/update_bind_group) - Added ensure_texture() methods to defer registration until resize() - Added early return in resize() if dimensions unchanged - Removed texture registration from init() methods Benefits: - No reallocation on window resize if dimensions match - Texture created with correct dimensions from start - Memory saved if effect never renders Tests: All 36 tests passing Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
3 days	fix: Resolve auxiliary texture resolution mismatch bug	skal
	Auxiliary textures were created during init() using default dimensions (1280x720) before resize() was called with actual window size. This caused compute shaders to receive uniforms with correct resolution but render to wrong-sized textures. Changes: - Add MainSequence::resize_auxiliary_texture() to recreate textures - Override resize() in CircleMaskEffect to resize circle_mask texture - Override resize() in CNNEffect to resize captured_frame texture - Bind groups are recreated with new texture views after resize Tests: All 36 tests passing Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
3 days	refactor: Simplify effect render API and fix uniform initialization	skal
	Root cause: Uniform buffers created but not initialized before bind group creation, causing undefined UV coordinates in circle_mask_compute.wgsl. Changes: - Add get_common_uniforms() helper to Effect base class - Refactor render()/compute() signatures: 5 params → CommonPostProcessUniforms& - Fix uninitialized uniforms in CircleMaskEffect and CNNEffect - Update all 19 effect implementations and headers - Fix WGSL syntax error in FlashEffect (u.audio_intensity → audio_intensity) - Update test files (test_sequence.cc) Benefits: - Cleaner API: construct uniforms once per frame, reuse across effects - More maintainable: CommonPostProcessUniforms changes need no call site updates - Fixes UV coordinate bug in circle_mask_compute.wgsl All 36 tests passing (100%) handoff(Claude): Effect API refactor complete
3 days	add --save-intermediates to train.py and cnn_test	skal

3 days	fix: CNN test tool GPU readback with wgpuDevicePoll	skal
	Fixed buffer mapping callback mode mismatch causing Unknown status. Changed from WaitAnyOnly+ProcessEvents to AllowProcessEvents+DevicePoll. Readback now functional but CNN output incorrect (all white). Issue isolated to tool-specific binding/uniform setup - CNNEffect in demo works correctly. Technical details: - WGPUCallbackMode_WaitAnyOnly requires wgpuInstanceWaitAny - Using wgpuInstanceProcessEvents with WaitAnyOnly never fires callback - Fixed by using AllowProcessEvents mode + wgpuDevicePoll - Removed debug output and platform warnings Status: 36/36 tests pass, readback works, CNN shader issue remains. handoff(Claude): CNN test tool readback fixed, output debugging needed
3 days	feat: Add CNN shader testing tool with GPU texture readback	skal
	Core GPU Utility (texture_readback): - Reusable synchronous texture-to-CPU readback (~150 lines) - STRIP_ALL guards (0 bytes in release builds) - Handles COPY_BYTES_PER_ROW_ALIGNMENT (256-byte alignment) - Refactored OffscreenRenderTarget to use new utility CNN Test Tool (cnn_test): - Standalone PNG→3-layer CNN→PNG/PPM tool (~450 lines) - --blend parameter (0.0-1.0) for final layer mixing - --format option (png/ppm) for output format - ShaderComposer integration for include resolution Build Integration: - Added texture_readback.cc to GPU_SOURCES (both sections) - Tool target with STB_IMAGE support Testing: - All 36 tests pass (100%) - Processes 64×64 and 555×370 images successfully - Ground-truth validation setup complete Known Issues: - BUG: Tool produces black output (uninitialized input texture) - First intermediate texture not initialized before layer loop - MSE 64860 vs Python ground truth (expected <10) - Fix required: Copy input to intermediate[0] before processing Documentation: - doc/CNN_TEST_TOOL.md - Full technical reference - Updated PROJECT_CONTEXT.md and COMPLETED.md handoff(Claude): CNN test tool foundation complete, needs input init bugfix Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
4 days	fix: Use ClampToEdge sampler for CNN to avoid edge wrapping	skal
	PyTorch Conv2d uses zero-padding; shader was using Repeat mode which wraps edges. ClampToEdge better approximates zero-padding behavior. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
4 days	feat: Add Scene1 effect from ShaderToy (raymarching cube & sphere)	skal
	Converted ShaderToy shader (Saturday cubism experiment) to Scene1Effect following EFFECT_WORKFLOW.md automation guidelines. Changes: - Created Scene1Effect (.h, .cc) as scene effect (not post-process) - Converted GLSL to WGSL with manual fixes: - Replaced RESOLUTION/iTime with uniforms.resolution/time - Fixed const expressions (normalize not allowed in const) - Converted mainImage() to fs_main() return value - Manual matrix rotation for scene transformation - Added shader asset to workspaces/main/assets.txt - Registered in CMakeLists.txt (both GPU_SOURCES sections) - Added to demo_effects.h and shaders declarations - Added to timeline.seq at 22.5s for 10s duration - Added to test_demo_effects.cc scene_effects list Shader features: - Raymarching cube and sphere with ground plane - Reflections and soft shadows - Sky rendering with sun and horizon glow - ACES tonemapping and sRGB output - Time-based rotation animation Tests: All effects tests passing (5/9 scene, 9/9 post-process) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
4 days	refactor: Factor WGPU boilerplate into builder pattern helpers	skal
	Add BindGroupLayoutBuilder, BindGroupBuilder, RenderPipelineBuilder, and SamplerCache to reduce repetitive WGPU code. Refactor post_process_helper, cnn_effect, and rotating_cube_effect. Changes: - Bind group creation: 19 instances, 14→4 lines each - Pipeline creation: 30-50→8 lines - Sampler deduplication: 6 instances → cached - Total boilerplate reduction: -122 lines across 3 files Builder pattern prevents binding index errors and consolidates platform-specific #ifdef in fewer locations. Binary size unchanged (6.3M debug). Tests pass. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
4 days	fix: Resolve CNN effect black screen bug (framebuffer capture + uniforms)	skal
	Two bugs causing black screen when CNN post-processing activated: 1. Framebuffer capture timing: Capture ran inside post-effect loop after ping-pong swaps, causing layers 1+ to capture wrong buffer. Moved capture before loop to copy framebuffer_a once before post-chain starts. 2. Missing uniforms update: CNNEffect never updated uniforms_ buffer, leaving uniforms.resolution uninitialized (0,0). UV calculation p.xy/uniforms.resolution produced NaN, causing all texture samples to return black. Added uniforms update in update_bind_group(). Files modified: - src/gpu/effect.cc: Capture before post-chain (lines 308-346) - src/gpu/effects/cnn_effect.cc: Add uniforms update (lines 132-142) - workspaces/main/shaders/cnn/cnn_layer.wgsl: Remove obsolete comment - doc/CNN_DEBUG.md: Historical debugging doc - CLAUDE.md: Reference CNN_DEBUG.md in historical section Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
4 days	fix: Capture scene framebuffer before post-processing for CNN effect	skal
	CNNEffect's "original" input was black because FadeEffect (priority 1) ran before CNNEffect (priority 1), fading the scene. Changed framebuffer capture to use framebuffer_a (scene output) instead of current_input (post-chain). Also add seq_compiler validation to detect post-process priority collisions within and across concurrent sequences, preventing similar render order issues. Updated stub_types.h WGPULoadOp enum values to match webgpu.h spec. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
4 days	feat: Add multi-layer CNN support with framebuffer capture and blend control	skal
	Implements automatic layer chaining and generic framebuffer capture API for multi-layer neural network effects with proper original input preservation. Key changes: - Effect::needs_framebuffer_capture() - generic API for pre-render capture - MainSequence: auto-capture to "captured_frame" auxiliary texture - CNNEffect: multi-layer support via layer_index/total_layers params - seq_compiler: expands "layers=N" to N chained effect instances - Shader: @binding(4) original_input available to all layers - Training: generates layer switches and original input binding - Blend: mix(original, result, blend_amount) uses layer 0 input Timeline syntax: CNNEffect layers=3 blend=0.7 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
4 days	feat: Add CNN post-processing effect with modular WGSL architecture	skal
	Implements multi-layer convolutional neural network shader for stylized post-processing of 3D rendered scenes: Core Components: - CNNEffect: C++ effect class with single-layer rendering (expandable to multi-pass) - Modular WGSL snippets: cnn_activation, cnn_conv3x3/5x5/7x7, cnn_weights_generated - Placeholder identity-like weights for initial testing (to be replaced by trained weights) Architecture: - Flexible kernel sizes (3×3, 5×5, 7×7) via separate snippet files - ShaderComposer integration (#include resolution) - Residual connections (input + processed output) - Supports parallel convolutions (design ready, single conv implemented) Size Impact: - ~3-4 KB shader code (snippets + main shader) - ~2-4 KB weights (depends on network architecture when trained) - Total: ~5-8 KB (acceptable for 64k demo) Testing: - CNNEffect added to test_demo_effects.cc - 36/36 tests passing (100%) Next Steps: - Training script (scripts/train_cnn.py) to generate real weights - Multi-layer rendering with ping-pong textures - Weight quantization for size optimization handoff(Claude): CNN effect foundation complete, ready for training integration
5 days	feat: Add headless mode for testing without GPU	skal
	Implements DEMO_HEADLESS build option for fast iteration cycles: - Functional GPU/platform stubs (not pure no-ops like STRIP_EXTERNAL_LIBS) - Audio and timeline systems work normally - No rendering overhead - Useful for CI, audio development, timeline validation Files added: - doc/HEADLESS_MODE.md - Documentation - src/gpu/headless_gpu.cc - Validated GPU stubs - src/platform/headless_platform.cc - Time simulation (60Hz) - scripts/test_headless.sh - End-to-end test script Usage: cmake -B build_headless -DDEMO_HEADLESS=ON cmake --build build_headless -j4 ./build_headless/demo64k --headless --duration 30 Progress printed every 5s. Compatible with --dump_wav mode. handoff(Claude): Task #76 follow-up - headless mode complete
5 days	feat: Implement Task #76 external library size measurement	skal
	- Use ma_backend_null for audio (100-200KB savings) - Stub platform/gpu abstractions instead of external APIs - Add DEMO_STRIP_EXTERNAL_LIBS build mode - Create stub_types.h with minimal WebGPU opaque types - Add scripts/measure_size.sh for automated measurement Results: Demo=4.4MB, External=2.0MB (69% vs 31%) handoff(Claude): Task #76 complete. Binary compiles but doesn't run (size measurement only).
5 days	fix: Add ShaderComposer to CircleMaskEffect pipelines	skal
	CircleMaskEffect was creating shader modules directly without using ShaderComposer, causing #include directives to fail at runtime. Changes: - Add ShaderComposer.Compose() for compute and render shaders - Include shader_composer.h header Fixes demo64k crash on CircleMaskEffect initialization. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
5 days	refactor: Deduplicate CommonUniforms with #include in WGSL shaders	skal
	Replace redundant CommonUniforms struct definitions across 13 shaders with #include "common_uniforms" directive. Integrate ShaderComposer preprocessing into all shader creation pipelines. Changes: - Replace 9-line CommonUniforms definitions with single #include line - Add ShaderComposer.Compose() to create_post_process_pipeline() - Add ShaderComposer.Compose() to gpu_create_render_pass() - Add ShaderComposer.Compose() to gpu_create_compute_pass() - Add InitShaderComposer() calls to test_effect_base and test_demo_effects - Update test_shader_compilation to compose shaders before validation Net reduction: 83 lines of duplicate code eliminated All 35 tests passing (100%) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
5 days	refactor: Generic sampler system for composites	skal
	Replace hardcoded linear_sampler_ with configurable sampler map. - SamplerType enum (LinearClamp, LinearRepeat, NearestClamp, NearestRepeat) - get_or_create_sampler() for lazy sampler creation - Default to LinearClamp for backward compatibility Eliminates hardcoded assumptions, more flexible for future use cases.
5 days	feat: GPU procedural Phase 4 - texture composition	skal
	Multi-input composite shaders with sampler support. - Dynamic bind group layouts (N input textures + 1 sampler) - dispatch_composite() for multi-input compute dispatch - create_gpu_composite_texture() API - gen_blend.wgsl and gen_mask.wgsl shaders Guarded with #if !defined(STRIP_GPU_COMPOSITE) for easy removal. Tests: - Blend two noise textures - Mask noise with grid - Multi-stage composite (composite of composites) Size: ~830 bytes (2 shaders + dispatch logic) handoff(Claude): GPU procedural Phase 4 complete
5 days	refactor: Unify TextureManager compute pipeline management	skal
	Replace individual pipeline pointers with map-based system. - Changed from 3 pointers to std::map<string, ComputePipelineInfo> - Unified get_or_create_compute_pipeline() for lazy init - Unified dispatch_compute() for all shaders - Simplified create_gpu_*_texture() methods (~390 lines removed) handoff(Claude): GPU procedural texture refactoring complete
5 days	feat(gpu): Phase 2 - Add gen_perlin and gen_grid GPU compute shaders	skal
	Complete Phase 2 implementation: - gen_perlin.wgsl: FBM with configurable octaves, amplitude decay - gen_grid.wgsl: Grid pattern with configurable spacing/thickness - TextureManager extensions: create_gpu_perlin_texture(), create_gpu_grid_texture() - Asset packer now validates gen_noise, gen_perlin, gen_grid for PROC_GPU() - 3 compute pipelines (lazy-init on first use) Shader parameters: - gen_perlin: seed, frequency, amplitude, amplitude_decay, octaves (32 bytes) - gen_grid: width, height, grid_size, thickness (16 bytes) test_3d_render migration: - Replaced CPU sky texture (gen_perlin) with GPU version - Replaced CPU noise texture (gen_noise) with GPU version - Added new GPU grid texture (256x256, 32px grid, 2px lines) Size impact: - gen_perlin.wgsl: ~200 bytes (compressed) - gen_grid.wgsl: ~100 bytes (compressed) - Total Phase 2 code: ~300 bytes - Cumulative (Phase 1+2): ~600 bytes Testing: - All 34 tests passing (100%) - test_gpu_procedural validates all generators - test_3d_render uses 3 GPU textures (noise, perlin, grid) Next: Phase 3 - Variable dimensions, async generation, pipeline caching Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
5 days	feat(gpu): Add GPU procedural texture generation system	skal
	Phase 1 implementation complete: - GPU compute shader for noise generation (gen_noise.wgsl) - TextureManager extensions: create_gpu_noise_texture(), dispatch_noise_compute() - Asset packer PROC_GPU() syntax support with validation - ShaderComposer integration for #include resolution - Zero CPU memory overhead (GPU-only textures) - Init-time and on-demand generation modes Technical details: - 8×8 workgroup size for 256×256 textures - UniformBuffer for params (width, height, seed, frequency) - Storage texture binding (rgba8unorm, write-only) - Lazy pipeline compilation on first use - ~300 bytes code (Phase 1) Testing: - New test: test_gpu_procedural.cc (passes) - All 34 tests passing (100%) Future phases: - Phase 2: Add gen_perlin, gen_grid compute shaders - Phase 3: Variable dimensions, async generation Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
5 days	cleanup: remove unused include from uniform_helper.h	skal

5 days	Refactor Effect class to centralize common uniforms management	skal
	Moved to Effect base class. Updated all subclasses to use the base member, removing redundant declarations and initializations. Cleaned up by removing redundant class definitions and including specific headers. Fixed a typo in DistortEffect constructor.
5 days	feat: WGSL Uniform Buffer Validation & Consolidation (Task #75)	skal
	- Added to validate WGSL/C++ struct alignment. - Integrated validation into . - Standardized uniform usage in , , , . - Renamed generic to specific names in WGSL and C++ to avoid collisions. - Added and updated . - handoff(Gemini): Completed Task #75.
5 days	chore: Apply clang-format to C++ source files	skal
	This commit applies clang-format to the project's C++ source files (.h and .cc) according to the project's contributing guidelines. This includes minor adjustments to whitespace, line endings, and header includes for consistency.
5 days	fix: Resolve DemoEffectsTest SEGFAULT and update GEMINI.md	skal
	- Fixed a persistent SEGFAULT in DemoEffectsTest, allowing all 33 tests to pass (100% test coverage). - The fix involved addressing uniform buffer alignment, resource initialization order, and minor code adjustments in affected effects. - Updated GEMINI.md to reflect the completion of Task #74 and set the focus on Task #75: WGSL Uniform Buffer Validation & Consolidation. handoff(Gemini): Addressed the DemoEffectsTest crash and updated the project state. Next up is Task #75 for robust uniform buffer validation.
5 days	fix: Increase dummy uniform buffer to 32 bytes for alignment	skal
	Related to Task #74. The dummy buffer used when effect_params is null must be 32 bytes to match CommonPostProcessUniforms size, not 16 bytes. Prevents potential validation errors when binding group expects 32-byte uniform buffer at binding 3.
5 days	fix: Resolve WebGPU uniform buffer alignment issues (Task #74)	skal
	Fixed multiple WGSL/C++ struct alignment mismatches causing validation errors: Padding fixes: - fade_effect.cc: Changed EffectParams padding from vec3<f32> to _pad0/1/2 - theme_modulation_effect.cc: Same padding fix for EffectParams - Root cause: WGSL vec3<f32> has 16-byte alignment, creating 32-byte structs ODR violation fix: - demo_effects.h: Added includes for fade_effect.h, theme_modulation_effect.h - Removed incomplete forward declarations (88 bytes) conflicting with complete definitions (96 bytes), causing heap buffer overflow in make_shared Member shadowing cleanup: - Renamed Effect::uniforms_ shadowing members to descriptive names: - FadeEffect: uniforms_ -> common_uniforms_ - FlashEffect: uniforms_ -> flash_uniforms_ - ThemeModulationEffect: uniforms_ -> common_uniforms_ Results: - demo64k runs without crashes - 33/33 tests passing (100%) - Added Task #75: WGSL uniform validation tool handoff(Claude): Uniform buffer alignment debugged and fixed