| Age | Commit message (Collapse) | Author |
|
Training changes:
- Changed p3 default depth from 0.0 to 1.0 (far plane semantics)
- Extract depth from target alpha channel in both datasets
- Consistent alpha-as-depth across training/validation
Test tool enhancements (cnn_test):
- Added load_depth_from_alpha() for R32Float depth texture
- Fixed bind group layout for UnfilterableFloat sampling
- Added --save-intermediates with per-channel grayscale composites
- Each layer saved as 4x wide PNG (p0-p3 stacked horizontally)
- Global layers_composite.png for vertical layer stack overview
Investigation notes:
- Static features p4-p7 ARE computed and bound correctly
- Sin_20_y pattern visibility difference between tools under investigation
- Binary weights timestamp (Feb 13 20:36) vs HTML tool (Feb 13 22:12)
- Next: Update HTML tool with canonical binary weights
handoff(Claude): HTML tool weights update pending - base64 encoded
canonical weights ready in /tmp/weights_b64.txt for line 392 replacement.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Fix two issues causing validation errors in test_demo:
1. Remove redundant pipeline creation without layout (static_pipeline_)
2. Change vec3<u32> to 3× u32 fields in StaticFeatureParams struct
WGSL vec3<u32> aligns to 16 bytes (std140), making struct 32 bytes,
while C++ struct was 16 bytes. Explicit fields ensure consistent layout.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Document future enhancement for arbitrary feature vector layouts.
Proposed feature descriptor in binary format v3:
- Specify feature types, sources, and ordering
- Enable runtime experimentation without shader recompilation
- Examples: [R,G,B,dx,dy,uv_x,bias] or [mip1.r,mip2.g,laplacian,uv_x,sin20_x,bias]
Added TODOs in:
- CNN_V2_BINARY_FORMAT.md: Detailed proposal with struct layout
- CNN_V2.md: Future extensions section
- train_cnn_v2.py: compute_static_features() docstring
- cnn_v2_static.wgsl: Shader header comment
- cnn_v2_effect.cc: Version check comment
Current limitation: Hardcoded [p0,p1,p2,p3,uv_x,uv_y,sin10_x,bias] layout.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Binary format v2 includes mip_level in header (20 bytes, was 16).
Effect reads mip_level and passes to static features shader via uniform.
Shader samples from correct mip texture based on mip_level.
Changes:
- export_cnn_v2_weights.py: Header v2 with mip_level field
- cnn_v2_effect.h: Add StaticFeatureParams, mip_level member, params buffer
- cnn_v2_effect.cc: Read mip_level from weights, create/bind params buffer, update per-frame
- cnn_v2_static.wgsl: Accept params uniform, sample from selected mip level
Binary format v2:
- Header: 20 bytes (magic, version=2, num_layers, total_weights, mip_level)
- Backward compatible: v1 weights load with mip_level=0
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Updated comments to clarify that per-layer kernel sizes are supported.
Code already handles this correctly via LayerInfo.kernel_size field.
Changes:
- cnn_v2_effect.h: Add comment about per-layer kernel sizes
- cnn_v2_compute.wgsl: Clarify LayerParams provides per-layer config
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
- Add --cnn-version <1|2> flag to select between CNN v1 and v2
- Implement beat_phase modulation for dynamic blend in both CNN effects
- Fix CNN v2 per-layer uniform buffer sharing (each layer needs own buffer)
- Fix CNN v2 y-axis orientation to match render pass convention
- Add Scene1Effect as base visual layer to test_demo timeline
- Reorganize CNN v2 shaders into cnn_v2/ subdirectory
- Update asset paths and documentation for new shader organization
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
FATAL_CHECK triggers when condition is TRUE (error case).
Inverted equality checks: magic/version == correct_value
would fatal when weights were valid.
Changed to != checks to fail on invalid data.
|
|
- Create bind groups per layer with ping-pong buffers
- Update layer params uniform per dispatch
- Execute all layers in sequence with proper input/output swapping
- Ready for weight export and end-to-end testing
|
|
- Add binary weight format (header + layer info + packed f16)
- New export_cnn_v2_weights.py for binary weight export
- Single cnn_v2_compute.wgsl shader with storage buffer
- Load weights in CNNv2Effect::load_weights()
- Create layer compute pipeline with 5 bindings
- Fast training config: 100 epochs, 3×3 kernels, 8→4→4 channels
Next: Complete bind group creation and multi-layer compute execution
|
|
Complete multi-pass compute execution for CNNv2Effect.
Implementation:
- Layer texture creation (ping-pong buffers for intermediate results)
- Static features compute pipeline with bind group layout
- Bind group creation with 5 bindings (input mips + depth + output)
- compute() override for multi-pass execution
- Static features pass with proper workgroup dispatch
Architecture:
- Static features: 8×f16 packed as 4×u32 (RGBD + UV + sin + bias)
- Layer buffers: 2×RGBA32Uint textures (8 channels f16 each)
- Input mips: 3 levels (0, 1, 2) for multi-scale features
- Workgroup size: 8×8 threads
Status:
- Static features compute pass functional
- Layer pipeline infrastructure ready
- All 36/36 tests passing
Next: Layer shader integration, multi-layer execution
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Infrastructure for enhanced CNN post-processing with 7D feature input.
Phase 1: Shaders
- Static features compute (RGBD + UV + sin10_x + bias → 8×f16)
- Layer template (convolution skeleton, packing/unpacking)
- 3 mip level support for multi-scale features
Phase 2: C++ Effect
- CNNv2Effect class (multi-pass architecture)
- Texture management (static features, layer buffers)
- Build integration (CMakeLists, assets, tests)
Phase 3: Training Pipeline
- train_cnn_v2.py: PyTorch model with static feature concatenation
- export_cnn_v2_shader.py: f32→f16 quantization, WGSL generation
- Configurable architecture (kernels, channels)
Phase 4: Validation
- validate_cnn_v2.sh: End-to-end pipeline
- Checkpoint → shaders → build → test images
Tests: 36/36 passing
Next: Complete render pipeline implementation (bind groups, multi-pass)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
- Add render/scene_query_mode to known placeholders in VerifyIncludes
- Remove warning for duplicate auxiliary texture registration (valid for multiple CNNEffect stacks)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
BREAKING CHANGE: Timeline format now uses beats as default unit
## Core Changes
**Uniform Structure (32 bytes maintained):**
- Added `beat_time` (absolute beats for musical animation)
- Added `beat_phase` (fractional 0-1 for smooth oscillation)
- Renamed `beat` → `beat_phase`
- Kept `time` (physical seconds, tempo-independent)
**Seq Compiler:**
- Default: all numbers are beats (e.g., `5`, `16.5`)
- Explicit seconds: `2.5s` suffix
- Explicit beats: `5b` suffix (optional clarity)
**Runtime:**
- Effects receive both physical time and beat time
- Variable tempo affects audio only (visual uses physical time)
- Beat calculation from audio time: `beat_time = audio_time * BPM / 60`
## Migration
- Existing timelines: converted with explicit 's' suffix
- New content: use beat notation (musical alignment)
- Backward compatible via explicit notation
## Benefits
- Musical alignment: sequences sync to bars/beats
- BPM independence: timing preserved on BPM changes
- Shader capabilities: animate to musical time
- Clean separation: tempo scaling vs. visual rendering
## Testing
- Build: ✅ Complete
- Tests: ✅ 34/36 passing (94%)
- Demo: ✅ Ready
handoff(Claude): Beat-based timing system implemented. Variable tempo
only affects audio sample triggering. Visual effects use physical_time
(constant) and beat_time (musical). Shaders can now animate to beats.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
SamplerCache singleton never released samplers, causing device to retain
references at shutdown. Add clear() method and call before fixture cleanup.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
- Release queue reference after submit in texture_readback
- Add final wgpuDevicePoll before cleanup to sync GPU work
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
- Add cnn_conv1x1 to shader composer registration
- Add VerifyIncludes() to detect missing snippet registrations
- STRIP_ALL-protected verification warns about unregistered includes
- Fixes cnn_test runtime failure loading cnn_layer.wgsl
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Hardcoded vec2(1280.0f, 720.0f) → u.resolution
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
ctx_.device exists before init() but Renderer3D not initialized yet.
Changed guard from !ctx_.device to !initialized_ flag.
Set initialized_ = true after renderer_.init() in both effects.
All 36 tests pass. Demo runs without crash.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Root cause: After swapping init/resize order, effects with Renderer3D crashed
because resize() called before init() tried to use uninitialized GPU resources.
Changes:
- Add guards in FlashCubeEffect::resize() and Hybrid3DEffect::resize() to
check ctx_.device before calling renderer_.resize()
- Remove lazy initialization remnants from CircleMaskEffect and CNNEffect
- Register auxiliary textures directly in init() (width_/height_ already set)
- Remove ensure_texture() methods and texture_initialized_ flags
All 36 tests passing. Demo runs without crashes.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Simpler solution than lazy initialization: effects need correct
dimensions during init() to register auxiliary textures.
Changed initialization order in MainSequence:
- resize() sets width_/height_ FIRST
- init() can then use correct dimensions
Reverted lazy initialization complexity. One-line fix.
Tests: All 36 tests passing, demo runs without error
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Prevents init/resize ordering bug and avoids unnecessary reallocation.
Changes:
- Auxiliary textures created on first use (compute/update_bind_group)
- Added ensure_texture() methods to defer registration until resize()
- Added early return in resize() if dimensions unchanged
- Removed texture registration from init() methods
Benefits:
- No reallocation on window resize if dimensions match
- Texture created with correct dimensions from start
- Memory saved if effect never renders
Tests: All 36 tests passing
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Auxiliary textures were created during init() using default dimensions
(1280x720) before resize() was called with actual window size. This
caused compute shaders to receive uniforms with correct resolution but
render to wrong-sized textures.
Changes:
- Add MainSequence::resize_auxiliary_texture() to recreate textures
- Override resize() in CircleMaskEffect to resize circle_mask texture
- Override resize() in CNNEffect to resize captured_frame texture
- Bind groups are recreated with new texture views after resize
Tests: All 36 tests passing
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Root cause: Uniform buffers created but not initialized before bind group
creation, causing undefined UV coordinates in circle_mask_compute.wgsl.
Changes:
- Add get_common_uniforms() helper to Effect base class
- Refactor render()/compute() signatures: 5 params → CommonPostProcessUniforms&
- Fix uninitialized uniforms in CircleMaskEffect and CNNEffect
- Update all 19 effect implementations and headers
- Fix WGSL syntax error in FlashEffect (u.audio_intensity → audio_intensity)
- Update test files (test_sequence.cc)
Benefits:
- Cleaner API: construct uniforms once per frame, reuse across effects
- More maintainable: CommonPostProcessUniforms changes need no call site updates
- Fixes UV coordinate bug in circle_mask_compute.wgsl
All 36 tests passing (100%)
handoff(Claude): Effect API refactor complete
|
|
|
|
Fixed buffer mapping callback mode mismatch causing Unknown status.
Changed from WaitAnyOnly+ProcessEvents to AllowProcessEvents+DevicePoll.
Readback now functional but CNN output incorrect (all white).
Issue isolated to tool-specific binding/uniform setup - CNNEffect
in demo works correctly.
Technical details:
- WGPUCallbackMode_WaitAnyOnly requires wgpuInstanceWaitAny
- Using wgpuInstanceProcessEvents with WaitAnyOnly never fires callback
- Fixed by using AllowProcessEvents mode + wgpuDevicePoll
- Removed debug output and platform warnings
Status: 36/36 tests pass, readback works, CNN shader issue remains.
handoff(Claude): CNN test tool readback fixed, output debugging needed
|
|
Core GPU Utility (texture_readback):
- Reusable synchronous texture-to-CPU readback (~150 lines)
- STRIP_ALL guards (0 bytes in release builds)
- Handles COPY_BYTES_PER_ROW_ALIGNMENT (256-byte alignment)
- Refactored OffscreenRenderTarget to use new utility
CNN Test Tool (cnn_test):
- Standalone PNG→3-layer CNN→PNG/PPM tool (~450 lines)
- --blend parameter (0.0-1.0) for final layer mixing
- --format option (png/ppm) for output format
- ShaderComposer integration for include resolution
Build Integration:
- Added texture_readback.cc to GPU_SOURCES (both sections)
- Tool target with STB_IMAGE support
Testing:
- All 36 tests pass (100%)
- Processes 64×64 and 555×370 images successfully
- Ground-truth validation setup complete
Known Issues:
- BUG: Tool produces black output (uninitialized input texture)
- First intermediate texture not initialized before layer loop
- MSE 64860 vs Python ground truth (expected <10)
- Fix required: Copy input to intermediate[0] before processing
Documentation:
- doc/CNN_TEST_TOOL.md - Full technical reference
- Updated PROJECT_CONTEXT.md and COMPLETED.md
handoff(Claude): CNN test tool foundation complete, needs input init bugfix
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
PyTorch Conv2d uses zero-padding; shader was using Repeat mode which
wraps edges. ClampToEdge better approximates zero-padding behavior.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Converted ShaderToy shader (Saturday cubism experiment) to Scene1Effect
following EFFECT_WORKFLOW.md automation guidelines.
**Changes:**
- Created Scene1Effect (.h, .cc) as scene effect (not post-process)
- Converted GLSL to WGSL with manual fixes:
- Replaced RESOLUTION/iTime with uniforms.resolution/time
- Fixed const expressions (normalize not allowed in const)
- Converted mainImage() to fs_main() return value
- Manual matrix rotation for scene transformation
- Added shader asset to workspaces/main/assets.txt
- Registered in CMakeLists.txt (both GPU_SOURCES sections)
- Added to demo_effects.h and shaders declarations
- Added to timeline.seq at 22.5s for 10s duration
- Added to test_demo_effects.cc scene_effects list
**Shader features:**
- Raymarching cube and sphere with ground plane
- Reflections and soft shadows
- Sky rendering with sun and horizon glow
- ACES tonemapping and sRGB output
- Time-based rotation animation
**Tests:** All effects tests passing (5/9 scene, 9/9 post-process)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Add BindGroupLayoutBuilder, BindGroupBuilder, RenderPipelineBuilder,
and SamplerCache to reduce repetitive WGPU code. Refactor
post_process_helper, cnn_effect, and rotating_cube_effect.
Changes:
- Bind group creation: 19 instances, 14→4 lines each
- Pipeline creation: 30-50→8 lines
- Sampler deduplication: 6 instances → cached
- Total boilerplate reduction: -122 lines across 3 files
Builder pattern prevents binding index errors and consolidates
platform-specific #ifdef in fewer locations. Binary size unchanged
(6.3M debug). Tests pass.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Two bugs causing black screen when CNN post-processing activated:
1. Framebuffer capture timing: Capture ran inside post-effect loop after
ping-pong swaps, causing layers 1+ to capture wrong buffer. Moved
capture before loop to copy framebuffer_a once before post-chain starts.
2. Missing uniforms update: CNNEffect never updated uniforms_ buffer,
leaving uniforms.resolution uninitialized (0,0). UV calculation
p.xy/uniforms.resolution produced NaN, causing all texture samples
to return black. Added uniforms update in update_bind_group().
Files modified:
- src/gpu/effect.cc: Capture before post-chain (lines 308-346)
- src/gpu/effects/cnn_effect.cc: Add uniforms update (lines 132-142)
- workspaces/main/shaders/cnn/cnn_layer.wgsl: Remove obsolete comment
- doc/CNN_DEBUG.md: Historical debugging doc
- CLAUDE.md: Reference CNN_DEBUG.md in historical section
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
CNNEffect's "original" input was black because FadeEffect (priority 1) ran
before CNNEffect (priority 1), fading the scene. Changed framebuffer capture
to use framebuffer_a (scene output) instead of current_input (post-chain).
Also add seq_compiler validation to detect post-process priority collisions
within and across concurrent sequences, preventing similar render order issues.
Updated stub_types.h WGPULoadOp enum values to match webgpu.h spec.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Implements automatic layer chaining and generic framebuffer capture API for
multi-layer neural network effects with proper original input preservation.
Key changes:
- Effect::needs_framebuffer_capture() - generic API for pre-render capture
- MainSequence: auto-capture to "captured_frame" auxiliary texture
- CNNEffect: multi-layer support via layer_index/total_layers params
- seq_compiler: expands "layers=N" to N chained effect instances
- Shader: @binding(4) original_input available to all layers
- Training: generates layer switches and original input binding
- Blend: mix(original, result, blend_amount) uses layer 0 input
Timeline syntax: CNNEffect layers=3 blend=0.7
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Implements multi-layer convolutional neural network shader for stylized
post-processing of 3D rendered scenes:
**Core Components:**
- CNNEffect: C++ effect class with single-layer rendering (expandable to multi-pass)
- Modular WGSL snippets: cnn_activation, cnn_conv3x3/5x5/7x7, cnn_weights_generated
- Placeholder identity-like weights for initial testing (to be replaced by trained weights)
**Architecture:**
- Flexible kernel sizes (3×3, 5×5, 7×7) via separate snippet files
- ShaderComposer integration (#include resolution)
- Residual connections (input + processed output)
- Supports parallel convolutions (design ready, single conv implemented)
**Size Impact:**
- ~3-4 KB shader code (snippets + main shader)
- ~2-4 KB weights (depends on network architecture when trained)
- Total: ~5-8 KB (acceptable for 64k demo)
**Testing:**
- CNNEffect added to test_demo_effects.cc
- 36/36 tests passing (100%)
**Next Steps:**
- Training script (scripts/train_cnn.py) to generate real weights
- Multi-layer rendering with ping-pong textures
- Weight quantization for size optimization
handoff(Claude): CNN effect foundation complete, ready for training integration
|
|
Implements DEMO_HEADLESS build option for fast iteration cycles:
- Functional GPU/platform stubs (not pure no-ops like STRIP_EXTERNAL_LIBS)
- Audio and timeline systems work normally
- No rendering overhead
- Useful for CI, audio development, timeline validation
Files added:
- doc/HEADLESS_MODE.md - Documentation
- src/gpu/headless_gpu.cc - Validated GPU stubs
- src/platform/headless_platform.cc - Time simulation (60Hz)
- scripts/test_headless.sh - End-to-end test script
Usage:
cmake -B build_headless -DDEMO_HEADLESS=ON
cmake --build build_headless -j4
./build_headless/demo64k --headless --duration 30
Progress printed every 5s. Compatible with --dump_wav mode.
handoff(Claude): Task #76 follow-up - headless mode complete
|
|
- Use ma_backend_null for audio (100-200KB savings)
- Stub platform/gpu abstractions instead of external APIs
- Add DEMO_STRIP_EXTERNAL_LIBS build mode
- Create stub_types.h with minimal WebGPU opaque types
- Add scripts/measure_size.sh for automated measurement
Results: Demo=4.4MB, External=2.0MB (69% vs 31%)
handoff(Claude): Task #76 complete. Binary compiles but doesn't run (size measurement only).
|
|
CircleMaskEffect was creating shader modules directly without using
ShaderComposer, causing #include directives to fail at runtime.
Changes:
- Add ShaderComposer.Compose() for compute and render shaders
- Include shader_composer.h header
Fixes demo64k crash on CircleMaskEffect initialization.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Replace redundant CommonUniforms struct definitions across 13 shaders
with #include "common_uniforms" directive. Integrate ShaderComposer
preprocessing into all shader creation pipelines.
Changes:
- Replace 9-line CommonUniforms definitions with single #include line
- Add ShaderComposer.Compose() to create_post_process_pipeline()
- Add ShaderComposer.Compose() to gpu_create_render_pass()
- Add ShaderComposer.Compose() to gpu_create_compute_pass()
- Add InitShaderComposer() calls to test_effect_base and test_demo_effects
- Update test_shader_compilation to compose shaders before validation
Net reduction: 83 lines of duplicate code eliminated
All 35 tests passing (100%)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Replace hardcoded linear_sampler_ with configurable sampler map.
- SamplerType enum (LinearClamp, LinearRepeat, NearestClamp, NearestRepeat)
- get_or_create_sampler() for lazy sampler creation
- Default to LinearClamp for backward compatibility
Eliminates hardcoded assumptions, more flexible for future use cases.
|
|
Multi-input composite shaders with sampler support.
- Dynamic bind group layouts (N input textures + 1 sampler)
- dispatch_composite() for multi-input compute dispatch
- create_gpu_composite_texture() API
- gen_blend.wgsl and gen_mask.wgsl shaders
Guarded with #if !defined(STRIP_GPU_COMPOSITE) for easy removal.
Tests:
- Blend two noise textures
- Mask noise with grid
- Multi-stage composite (composite of composites)
Size: ~830 bytes (2 shaders + dispatch logic)
handoff(Claude): GPU procedural Phase 4 complete
|
|
Replace individual pipeline pointers with map-based system.
- Changed from 3 pointers to std::map<string, ComputePipelineInfo>
- Unified get_or_create_compute_pipeline() for lazy init
- Unified dispatch_compute() for all shaders
- Simplified create_gpu_*_texture() methods (~390 lines removed)
handoff(Claude): GPU procedural texture refactoring complete
|
|
Complete Phase 2 implementation:
- gen_perlin.wgsl: FBM with configurable octaves, amplitude decay
- gen_grid.wgsl: Grid pattern with configurable spacing/thickness
- TextureManager extensions: create_gpu_perlin_texture(), create_gpu_grid_texture()
- Asset packer now validates gen_noise, gen_perlin, gen_grid for PROC_GPU()
- 3 compute pipelines (lazy-init on first use)
Shader parameters:
- gen_perlin: seed, frequency, amplitude, amplitude_decay, octaves (32 bytes)
- gen_grid: width, height, grid_size, thickness (16 bytes)
test_3d_render migration:
- Replaced CPU sky texture (gen_perlin) with GPU version
- Replaced CPU noise texture (gen_noise) with GPU version
- Added new GPU grid texture (256x256, 32px grid, 2px lines)
Size impact:
- gen_perlin.wgsl: ~200 bytes (compressed)
- gen_grid.wgsl: ~100 bytes (compressed)
- Total Phase 2 code: ~300 bytes
- Cumulative (Phase 1+2): ~600 bytes
Testing:
- All 34 tests passing (100%)
- test_gpu_procedural validates all generators
- test_3d_render uses 3 GPU textures (noise, perlin, grid)
Next: Phase 3 - Variable dimensions, async generation, pipeline caching
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
Phase 1 implementation complete:
- GPU compute shader for noise generation (gen_noise.wgsl)
- TextureManager extensions: create_gpu_noise_texture(), dispatch_noise_compute()
- Asset packer PROC_GPU() syntax support with validation
- ShaderComposer integration for #include resolution
- Zero CPU memory overhead (GPU-only textures)
- Init-time and on-demand generation modes
Technical details:
- 8×8 workgroup size for 256×256 textures
- UniformBuffer for params (width, height, seed, frequency)
- Storage texture binding (rgba8unorm, write-only)
- Lazy pipeline compilation on first use
- ~300 bytes code (Phase 1)
Testing:
- New test: test_gpu_procedural.cc (passes)
- All 34 tests passing (100%)
Future phases:
- Phase 2: Add gen_perlin, gen_grid compute shaders
- Phase 3: Variable dimensions, async generation
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
|
|
|
Moved to Effect base class. Updated all subclasses to use the base member, removing redundant declarations and initializations. Cleaned up by removing redundant class definitions and including specific headers. Fixed a typo in DistortEffect constructor.
|
|
- Added to validate WGSL/C++ struct alignment.
- Integrated validation into .
- Standardized uniform usage in , , , .
- Renamed generic to specific names in WGSL and C++ to avoid collisions.
- Added and updated .
- handoff(Gemini): Completed Task #75.
|
|
This commit applies clang-format to the project's C++ source files (.h and .cc) according to the project's contributing guidelines. This includes minor adjustments to whitespace, line endings, and header includes for consistency.
|
|
- Fixed a persistent SEGFAULT in DemoEffectsTest, allowing all 33 tests to pass (100% test coverage).
- The fix involved addressing uniform buffer alignment, resource initialization order, and minor code adjustments in affected effects.
- Updated GEMINI.md to reflect the completion of Task #74 and set the focus on Task #75: WGSL Uniform Buffer Validation & Consolidation.
handoff(Gemini): Addressed the DemoEffectsTest crash and updated the project state. Next up is Task #75 for robust uniform buffer validation.
|
|
Related to Task #74. The dummy buffer used when effect_params is null
must be 32 bytes to match CommonPostProcessUniforms size, not 16 bytes.
Prevents potential validation errors when binding group expects 32-byte
uniform buffer at binding 3.
|
|
Fixed multiple WGSL/C++ struct alignment mismatches causing validation errors:
Padding fixes:
- fade_effect.cc: Changed EffectParams padding from vec3<f32> to _pad0/1/2
- theme_modulation_effect.cc: Same padding fix for EffectParams
- Root cause: WGSL vec3<f32> has 16-byte alignment, creating 32-byte structs
ODR violation fix:
- demo_effects.h: Added includes for fade_effect.h, theme_modulation_effect.h
- Removed incomplete forward declarations (88 bytes) conflicting with
complete definitions (96 bytes), causing heap buffer overflow in make_shared
Member shadowing cleanup:
- Renamed Effect::uniforms_ shadowing members to descriptive names:
- FadeEffect: uniforms_ -> common_uniforms_
- FlashEffect: uniforms_ -> flash_uniforms_
- ThemeModulationEffect: uniforms_ -> common_uniforms_
Results:
- demo64k runs without crashes
- 33/33 tests passing (100%)
- Added Task #75: WGSL uniform validation tool
handoff(Claude): Uniform buffer alignment debugged and fixed
|