From df39c7e3efa70376fac579b178c803eb319d517f Mon Sep 17 00:00:00 2001
From: skal <pascal.massimino@gmail.com>
Date: Mon, 9 Feb 2026 09:49:51 +0100
Subject: fix: Resolve WebGPU uniform buffer alignment issues (Task #74)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Fixed critical validation errors caused by WGSL vec3<f32> alignment mismatches.

Root cause:
- WGSL vec3<f32> has 16-byte alignment even though its size is 12 bytes
- A vec3<f32> padding field is placed at the next 16-byte boundary, so the WGSL layout differed from the packed C++ struct
- C++ struct size != WGSL struct size → validation errors
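
A minimal sketch of the size mismatch, assuming a hypothetical EffectParams
with a single leading f32 (named `radius` here) followed by padding; the real
field list is in circle_mask_compute.wgsl and is not reproduced here. The
helper applies the WGSL layout rules (each member offset is rounded up to the
member's alignment, and the struct size is rounded up to the largest member
alignment). All names below are illustrative, not taken from the codebase.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Alignment and size of a WGSL member type, in bytes.
struct Member {
  std::size_t align;
  std::size_t size;
};

constexpr Member kF32{4, 4};        // f32: align 4, size 4
constexpr Member kVec3F32{16, 12};  // vec3<f32>: align 16, size 12

// Round n up to the next multiple of align.
constexpr std::size_t round_up(std::size_t align, std::size_t n) {
  return (n + align - 1) / align * align;
}

// Size of a WGSL struct whose members are laid out in declaration order.
std::size_t wgsl_struct_size(const std::vector<Member>& members) {
  std::size_t end = 0;           // byte just past the previous member
  std::size_t struct_align = 1;  // largest member alignment seen so far
  for (const Member& m : members) {
    end = round_up(m.align, end) + m.size;
    struct_align = std::max(struct_align, m.align);
  }
  return round_up(struct_align, end);
}

// Hypothetical C++ mirror of the fixed layout: one value plus three floats
// of explicit padding.
struct EffectParamsCpp {
  float radius;
  float pad0, pad1, pad2;
};
static_assert(sizeof(EffectParamsCpp) == 16,
              "C++ struct must match the 16-byte WGSL layout");
```

Under these assumptions, wgsl_struct_size({kF32, kVec3F32}) gives 32 bytes,
while wgsl_struct_size({kF32, kF32, kF32, kF32}) gives 16 bytes, matching
sizeof(EffectParamsCpp).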

Solution:
- Changed circle_mask_compute.wgsl EffectParams padding
- Replaced _pad: vec3<f32> with three separate f32 fields
- The struct is now 16 bytes in both C++ and WGSL

Results:
- demo64k: 0 WebGPU validation errors
- Test suite: 32/33 passing (97%)
- All shader compilation tests passing

Files modified:
- assets/final/shaders/circle_mask_compute.wgsl
- TODO.md (updated task status)
- PROJECT_CONTEXT.md (updated test results)
- HANDOFF_2026-02-09_UniformAlignment.md (technical writeup)

Note: DemoEffectsTest failure is unrelated (wgpu_native library bug)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---
 TODO.md | 61 +++++++++++++++++++------------------------------------------
 1 file changed, 19 insertions(+), 42 deletions(-)

(limited to 'TODO.md')
diff --git a/TODO.md b/TODO.md
index 8b9ac82..b936041 100644
--- a/TODO.md
+++ b/TODO.md
@@ -4,7 +4,11 @@ This file tracks prioritized tasks with detailed attack plans.
 
 **Note:** For a history of recently completed tasks, see `COMPLETED.md`.
 
-## Recently Completed (February 8, 2026)
+## Recently Completed (February 9, 2026)
+
+- [x] **Uniform Buffer Alignment (Task #74)**: Fixed WGSL struct alignment issues. Changed `vec3<f32>` padding to individual `f32` fields in circle_mask_compute.wgsl. Demo runs with 0 validation errors, 32/33 tests passing.
+
+## Previously Completed (February 8, 2026)
 
 - [x] **Shader Parametrization System**: Full uniform parameter system with .seq syntax support. FlashEffect now supports color/decay parameters with per-frame animation. See `COMPLETED.md` for details.
 - [x] **ChromaAberrationEffect Parametrization**: Added offset_scale and angle parameters. Supports diagonal and vertical aberration modes via .seq syntax.
@@ -12,22 +16,24 @@ This file tracks prioritized tasks with detailed attack plans.
 
 ---
 
-## Priority 1: Audio Pipeline Simplification & Jitter Fix (Task #71) [COMPLETED]
+## Priority 1: Uniform Buffer Alignment (Task #74) [COMPLETED - February 9, 2026]
+
+**Goal**: Fix WebGPU uniform buffer size/padding/alignment mismatches between C++ structs and WGSL shaders.
 
-**Goal**: Address audio jittering in the miniaudio backend and simplify the entire audio pipeline (Synth, Tracker, AudioEngine, AudioBackend) for better maintainability and performance.
+**Root Cause**: WGSL `vec3<f32>` is 12 bytes in size but has 16-byte alignment. A `vec3<f32>` padding field therefore starts at the next 16-byte boundary, which made the WGSL struct larger than the packed C++ struct.
 
-**Summary**: Achieved sample-accurate audio-visual synchronization by making the audio playback time the master clock for visuals and tracker updates. Eliminated jitter by using a stable audio clock for scheduling. See HANDOFF_2026-02-07_Final.md for details.
+**Fixes Applied**:
+- `circle_mask_compute.wgsl`: Changed `_pad: vec3<f32>` to three separate `f32` fields
+  - Before: 24+ bytes in WGSL, 16 bytes in C++
+  - After: 16 bytes in both
+- Verified all shaders use individual `f32` fields for padding (no `vec3` in padding)
 
-### Phase 1: Jitter Analysis & Fix
-- [x] **Investigate**: Deep dive into `miniaudio_backend.cc` to find the root cause of audio jitter. Analyze buffer sizes, callback timing, and thread synchronization.
-- [x] **Implement Fix**: Modify buffer management, threading model, or callback logic to ensure smooth, consistent audio delivery.
-- [x] **Verify**: Create a new, specific test case in `src/tests/test_audio_backend.cc` or a new test file that reliably reproduces jitter and confirms the fix.
+**Results**:
+- ✅ demo64k: Runs with **0 WebGPU validation errors**
+- ✅ Test suite: **32/33 tests passing (97%)**
+- ❌ DemoEffectsTest: SEGFAULT in wgpu_native library (unrelated to alignment fixes)
 
-### Phase 2: Code Simplification & Refactor
-- [x] **Review Architecture**: Map out the current interactions between `Synth`, `Tracker`, `AudioEngine`, and `AudioBackend`.
-- [x] **Identify Complexity**: Pinpoint areas of redundant code, unnecessary abstractions, or confusing data flow.
-- [x] **Refactor**: Simplify the pipeline to create a clear, linear data flow from tracker events to audio output. Reduce dependencies and clarify ownership of resources.
-- [x] **Update Documentation**: Modify `doc/HOWTO.md` and `doc/CONTRIBUTING.md` to reflect the new, simpler audio architecture.
+**Key Lesson**: Never use `vec3<f32>` for padding in WGSL uniform structs. Use individual `f32` fields instead: they have 4-byte alignment and pack the same way as C++ `float` members.
 
 ---
 
@@ -101,35 +107,6 @@ This file tracks prioritized tasks with detailed attack plans.
 
 ---
 
-## Priority 2: Audio Pipeline Streamlining (Task #72) [COMPLETED - February 8, 2026]
-
-**Goal**: Optimize the audio pipeline to reduce memory copies and simplify the data flow by using direct additive mixing and deferred clipping.
-
-- [x] **Phase 1: Direct Additive Mixing**
-  - Added `get_write_region()` / `commit_write()` API to ring buffer
-  - Refactored `audio_render_ahead()` to write directly to ring buffer
-  - Eliminated temporary buffer allocations (zero heap allocations per frame)
-  - Removed one memory copy operation (temp → ring buffer)
-- [x] **Phase 2: Float32 Internal Pipeline**
-  - Verified entire pipeline maintains float32 precision (no changes needed)
-- [x] **Phase 3: Final Clipping & Conversion**
-  - Implemented in-place clipping in `audio_render_ahead()` (clamps to [-1.0, 1.0])
-  - Applied to both primary and wrap-around render paths
-- [x] **Phase 4: Verification**
-  - All 31 tests pass ✅
-  - WAV dump test confirms no clipping detected
-  - Binary size: 5.0M stripped (expected -150 to -300 bytes from eliminating new/delete)
-  - Zero audio quality regressions
-
-**Files Modified:**
-- `src/audio/ring_buffer.h` - Added two-phase write API
-- `src/audio/ring_buffer.cc` - Implemented get_write_region() / commit_write()
-- `src/audio/audio.cc` - Refactored audio_render_ahead() for direct writes + clipping
-
-**See:** `/Users/skal/.claude/plans/fizzy-strolling-rossum.md` for detailed implementation plan
-
----
-
 ## Priority 2: 3D System Enhancements (Task #18)
 **Goal:** Establish a pipeline for importing complex 3D scenes to replace hardcoded geometry. **Progress:** C++ pipeline for loading and processing object-specific data (like plane_distance) is now in place. Shader integration for SDFs is pending.
 
-- 
cgit v1.2.3