8 files changed, 654 insertions, 55 deletions
diff --git a/doc/ARCHITECTURE.md b/doc/ARCHITECTURE.md
index 97413de..4c36ec5 100644
--- a/doc/ARCHITECTURE.md
+++ b/doc/ARCHITECTURE.md
@@ -18,11 +18,26 @@ Detailed system architecture for the 64k demo project.
 
 **Effect**: Abstract base for visual elements. Supports `compute` and `render` phases.
 
-**Sequence**: Timeline of effects with start/end times.
+**Sequence**: Timeline of effects with start/end times defined in beats.
 
 **MainSequence**: Top-level coordinator and framebuffer manager.
 
-**seq_compiler**: Transpiles workspace `timeline.seq` into C++ `timeline.cc`.
+**seq_compiler**: Transpiles workspace `timeline.seq` (beat-based) into C++ `timeline.cc` (seconds).
+
+### Beat-Based Timing
+
+**Timeline Notation**: Sequences authored in musical beats (default) or explicit seconds (`s` suffix).
+
+**Compile-Time Conversion**: `seq_compiler` converts beats → seconds using the BPM. Effects activate at physical seconds.
+
+**Uniform Timing**: Effects receive three timing values:
+- `time` - Physical seconds (constant, unaffected by tempo)
+- `beat_time` - Musical beats (from audio playback clock)
+- `beat_phase` - Fractional beat 0.0-1.0
+
+**Tempo Separation**: Variable tempo scales `music_time` for audio triggering only. Visual rendering uses constant physical time with optional beat synchronization.
+
+See `doc/BEAT_TIMING.md` for details.
 
 ---
 
@@ -42,10 +57,10 @@ Detailed system architecture for the 64k demo project.
 Real-time additive synthesis from spectrograms via FFT-based IDCT (O(N log N)). Stereo output (32kHz, 16-bit, interleaved L/R). Uses orthonormal DCT-II/DCT-III transforms with Numerical Recipes reordering method.
 
 ### Variable Tempo
-Music time abstraction with configurable tempo_scale. Tempo changes don't affect pitch.
+Music time abstraction with configurable `tempo_scale`. Tempo changes don't affect pitch. **Visual effects unaffected** - they use physical time, not tempo-scaled music time.
 
 ### Event-Based Tracker
-Individual TrackerEvents trigger as separate voices with dynamic beat calculation. Notes within patterns respect tempo scaling.
+Individual TrackerEvents trigger as separate voices with dynamic beat calculation. Notes within patterns respect tempo scaling. Triggers based on `music_time` (tempo-scaled).
 
 ### Backend Abstraction
 `AudioBackend` interface with `MiniaudioBackend` (production), `MockAudioBackend` (testing), and `WavDumpBackend` (offline rendering).
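The tempo separation described in the ARCHITECTURE.md change above can be illustrated with a minimal sketch. It assumes two clocks advanced once per frame; the `Clocks` class and its `advance` method are hypothetical names for illustration, not code from the project.

```python
from dataclasses import dataclass


@dataclass
class Clocks:
    """Hypothetical pair of clocks; the names are illustrative only."""
    physical_time: float = 0.0  # seconds, drives visual effects
    music_time: float = 0.0     # tempo-scaled seconds, drives audio triggers

    def advance(self, dt: float, tempo_scale: float) -> None:
        # Physical time always advances by the real frame delta.
        self.physical_time += dt
        # Music time stretches or compresses with the tempo scale.
        self.music_time += dt * tempo_scale


clocks = Clocks()
for _ in range(4):  # four frames of 0.25 s each at double tempo
    clocks.advance(0.25, tempo_scale=2.0)
print(clocks.physical_time, clocks.music_time)
```

After one physical second at `tempo_scale = 2.0`, audio events see two seconds of music time while visuals still see one second.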
diff --git a/doc/BEAT_TIMING.md b/doc/BEAT_TIMING.md
new file mode 100644
index 0000000..cf7f377
--- /dev/null
+++ b/doc/BEAT_TIMING.md
@@ -0,0 +1,272 @@
+# Beat-Based Timing System
+
+## Overview
+
+The demo uses **beat-based timing** for visual effect sequences, so sequences stay aligned with the music when the BPM changes. All timeline sequences are authored in beats (musical time) and converted to physical seconds by `seq_compiler` at compile time.
+
+**Key Principle:** Variable tempo only affects audio sample triggering. Visual effects run at constant physical time with optional beat-synchronized animation.
+
+---
+
+## Quick Start
+
+### Timeline Authoring
+```seq
+# BPM 120
+SEQUENCE 0 0 "Intro"         # Beat 0 (bar 1)
+  EFFECT + Flash 0 2         # Beats 0-2 (half bar)
+  EFFECT + Fade 4 8          # Beats 4-8 (full bar)
+
+SEQUENCE 16 1 "Drop"         # Beat 16 (bar 5)
+  EFFECT + Heptagon 0 16     # 4 bars
+```
+
+**Conversion:** At 120 BPM, 1 beat = 0.5 seconds, 4 beats = 2 seconds
+
+### Shader Animation
+```wgsl
+@group(0) @binding(2) var<uniform> uniforms: CommonUniforms;
+
+// Use beat_time for musical animation
+let bar_cycle = uniforms.beat_time / 4.0;  // Bars
+let pulse = sin(bar_cycle * TAU);
+
+// Use beat_phase for smooth per-beat effects
+let wave = sin(uniforms.beat_phase * TAU);
+
+// Use time for constant-speed physics
+let rotation = uniforms.time * TAU;
+```
+
+---
+
+## Uniform Structure
+
+All effects receive `CommonPostProcessUniforms` with timing data:
+
+```cpp
+struct CommonPostProcessUniforms {
+  vec2 resolution;        // Screen dimensions (pixels)
+  float aspect_ratio;     // Width/height ratio
+  float time;             // Physical seconds (constant, unaffected by tempo)
+  float beat_time;        // Absolute beats (musical time from audio clock)
+  float beat_phase;       // Fractional beat 0.0-1.0 (smooth oscillation)
+  float audio_intensity;  // Audio peak for beat sync
+  float _pad;             // Alignment padding
+};  // 32 bytes
+```
+
+**Use Cases:**
+- `time`: Physics animation, constant-speed rotation/movement
+- `beat_time`: Bar-based patterns, musical synchronization
+- `beat_phase`: Smooth per-beat pulse/wave effects
+
+---
+
+## Timeline Format
+
+### Time Notation
+
+**Default:** Beats (no suffix)
+```seq
+SEQUENCE 0 0      # Beat 0
+  EFFECT + Flash 0 4    # Beats 0-4
+```
+
+**Explicit Seconds:** Use `s` suffix (rare)
+```seq
+SEQUENCE 2.5s 0   # 2.5 physical seconds
+  EFFECT + Flash 0 4    # Still uses beats for duration
+```
+
+**Explicit Beats:** Use `b` suffix (optional clarity)
+```seq
+SEQUENCE 8b 0     # Same as "8"
+  EFFECT + Flash 0b 4b  # Same as "0 4"
+```
+
+### BPM Declaration
+
+**Required** in all timeline files:
+```seq
+# BPM 120
+```
+
+Specifies beats per minute, used to convert beats to seconds when the timeline is compiled.
+
+---
+
+## Architecture
+
+### Timing Flow
+
+```
+Platform Clock (physical seconds)
+    │
+    ├──► Physical Time ─────────────┐
+    │    (constant)                 │
+    │                               │
+    └──► Audio Time ────┐           │
+         (playback)     │           │
+                        ▼           │
+                 Beat Calculation   │
+                 (audio × BPM / 60) │
+                        │           │
+                        ▼           ▼
+                   Visual Effects Rendering
+                   (time + beat_time + beat_phase)
+```
+
+### Key Insight
+
+**Variable tempo changes `music_time`** (used for audio event triggering), but **visual effects receive `time` (physical)** and **`beat_time` (from the audio playback clock)**. Neither is derived from tempo-scaled music time.
+
+This separation ensures:
+- ✅ Visual effects animate at constant speed in physical time
+- ✅ Beat-synced animations track actual audio playback
+- ✅ Tempo changes don't cause visual stuttering
+
+---
+
+## Implementation
+
+### Beat Calculation (Runtime)
+
+```cpp
+// main.cc - Calculate from audio playback time
+const float absolute_beat_time = current_audio_time * g_tracker_score.bpm / 60.0f;
+const float beat_phase = fmodf(absolute_beat_time, 1.0f);
+
+// Pass to GPU rendering
+gpu_draw(visual_peak, aspect_ratio, physical_time, absolute_beat_time, beat_phase);
+```
+
+### Timeline Compilation
+
+```cpp
+// seq_compiler.cc - Convert beats to seconds at compile time (simplified)
+std::string convert_to_time(const std::string& value, float bpm) {
+  // Explicit seconds ("2.5s"): strip the suffix and pass the number through
+  if (value.back() == 's') return value.substr(0, value.size() - 1);
+
+  // Default: treat as beats (std::stof ignores an optional trailing 'b')
+  const float beat = std::stof(value);
+  return std::to_string(beat * 60.0f / bpm);
+}
+```
+
+**Result:** Generated `timeline.cc` contains physical seconds for effect activation.
+
+---
+
+## Migration
+
+### Existing Timelines
+
+Existing timelines were migrated with an explicit `s` suffix to preserve their timing:
+```seq
+SEQUENCE 2.50s 0          # Physical seconds preserved
+  EFFECT + Flash 0.00s 1.00s
+```
+
+### New Content
+
+Use beat notation (recommended):
+```seq
+# BPM 140
+SEQUENCE 0 0 "Intro"
+  EFFECT + Flash 0 4      # 4 beats = 1.71s @ 140 BPM
+  EFFECT + Fade 4 8       # 4 beats = 1.71s
+```
+
+**Benefits:**
+- Natural musical alignment (bars/beats)
+- BPM changes don't break timing
+- Easier to author to music
+
+---
+
+## Examples
+
+### Four-Bar Pattern
+```seq
+# BPM 120
+SEQUENCE 0 0 "Verse 1"
+  EFFECT - Background 0 16    # 4 bars background
+  EFFECT + Flash 0 1          # First beat flash
+  EFFECT + Pulse 4 5          # Second bar pulse
+  EFFECT + Fade 15 16         # Final beat fade
+```
+
+### Multi-Bar Sequence
+```seq
+SEQUENCE 16 0 "Chorus"        # Bar 5
+  EFFECT + Heptagon 0 32      # 8 bars (full chorus)
+  EFFECT + Particles 8 24     # Bars 7-10 (middle 4 bars)
+```
+
+### Beat-Synced Shader
+```wgsl
+fn fragment_main(@location(0) uv: vec2<f32>) -> @location(0) vec4<f32> {
+  // Pulse every bar (4 beats)
+  let bar_phase = fract(uniforms.beat_time / 4.0);
+  let bar_pulse = smoothstep(0.0, 0.1, bar_phase) *
+                  (1.0 - smoothstep(0.9, 1.0, bar_phase));
+
+  // Smooth per-beat wave
+  let beat_wave = sin(uniforms.beat_phase * TAU);
+
+  // Combine (grayscale output; multiply by an effect color as needed)
+  let intensity = bar_pulse * 0.5 + beat_wave * 0.3;
+  return vec4<f32>(vec3<f32>(intensity), 1.0);
+}
+```
+
+---
+
+## Troubleshooting
+
+### Shader Compilation Error: "invalid accessor 'beat'"
+
+**Cause:** Old shader using `uniforms.beat` (deprecated field)
+
+**Fix:** Use `uniforms.beat_phase` or `uniforms.beat_time`
+
+```wgsl
+// OLD (error):
+let x = uniforms.beat;
+
+// NEW:
+let x = uniforms.beat_phase;  // For 0-1 fractional
+let y = uniforms.beat_time;   // For absolute beats
+```
+
+### Timeline Parse Error
+
+**Cause:** Missing BPM declaration
+
+**Fix:** Add BPM at top of file
+```seq
+# BPM 120    ← Required
+SEQUENCE 0 0
+```
+
+### Effects Start at Wrong Time
+
+**Cause:** Mixing beats and seconds without explicit suffixes
+
+**Fix:** Be explicit
+```seq
+SEQUENCE 8 0      # 8 beats (not 8 seconds)
+SEQUENCE 8s 0     # 8 seconds (explicit)
+SEQUENCE 8b 0     # 8 beats (explicit, same as first)
+```
+
+---
+
+## See Also
+
+- **Format Reference:** `doc/SEQUENCE.md` - Complete .seq syntax
+- **Implementation:** `BEAT_TIMING_SUMMARY.md` - Technical details
+- **Effect Creation:** `doc/EFFECT_WORKFLOW.md` - Adding new effects
+- **Timeline Editor:** `tools/timeline_editor/README.md` - Visual editing
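The two conversions documented in BEAT_TIMING.md above (timeline tokens to seconds, and audio time to `beat_time` / `beat_phase`) can be sketched in a few lines. This is a minimal illustration that assumes the suffix rules stated in the doc; `to_seconds` and `beat_uniforms` are hypothetical helper names, not functions from `seq_compiler.cc` or `main.cc`.

```python
import math


def to_seconds(value: str, bpm: float) -> float:
    """Convert a timeline token ("8", "8b", "2.5s") to physical seconds."""
    if value.endswith("s"):
        return float(value[:-1])      # explicit seconds pass through
    if value.endswith("b"):
        value = value[:-1]            # optional explicit-beat suffix
    return float(value) * 60.0 / bpm  # beats -> seconds


def beat_uniforms(audio_time: float, bpm: float) -> tuple[float, float]:
    """Return (beat_time, beat_phase) for a given audio playback time."""
    beat_time = audio_time * bpm / 60.0
    beat_phase = math.fmod(beat_time, 1.0)
    return beat_time, beat_phase


print(to_seconds("4", 120.0))      # 2.0
print(to_seconds("2.5s", 120.0))   # 2.5
print(beat_uniforms(1.25, 120.0))  # (2.5, 0.5)
```

At 120 BPM, 1.25 seconds of audio playback is halfway through the third beat, so `beat_phase` is 0.5.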
diff --git a/doc/CNN_BIAS_FIX_2026-02.md b/doc/CNN_BIAS_FIX_2026-02.md
new file mode 100644
index 0000000..26db8eb
--- /dev/null
+++ b/doc/CNN_BIAS_FIX_2026-02.md
@@ -0,0 +1,85 @@
+# CNN Bias Accumulation Fix (2026-02-11)
+
+## Problem
+Bias was being added multiple times in shader convolution loops (once per kernel position), causing a mismatch between PyTorch training and WGSL inference.
+
+## Root Cause
+**Location**: `training/train_cnn.py:381, 398`
+
+When exporting weights to WGSL, bias was replicated for every kernel position. The shader loops through positions doing:
+```wgsl
+sum += dot(weights[pos], rgbd) + dot(weights[pos+1], in1);  // in1.w = 1.0
+```
+
+For 3×3 kernel (9 positions), bias added 9×. For 5×5, added 25×.
+
+## Fix
+Divide bias by `num_positions` during export:
+```python
+# Final layer (7→1)
+v1.append(f"{bias[0] / num_positions:.6f}")
+
+# Inner layers (7→4)
+v1.append(f"{bias[out_c] / num_positions:.6f}")
+```
+
+Shader accumulates (bias / num_positions) × num_positions = original bias (correct).
+
+---
+
+## Additional Improvements
+
+### 1. RGBA Output Support
+**train_cnn.py**: Now saves 4-channel RGBA PNG preserving alpha from input:
+```python
+alpha = img_tensor[0, 3:4, :, :].permute(1, 2, 0).numpy()
+output_rgba = np.concatenate([output, alpha], axis=2)
+Image.fromarray((output_rgba * 255).astype(np.uint8), mode='RGBA')
+```
+
+Intermediate layers also save RGBA if 4-channel.
+
+### 2. Debug Hex Output
+**Both tools** support `--debug-hex` to print first 8 pixels as hex:
+```bash
+./training/train_cnn.py --infer input.png --export-only checkpoint.pth --debug-hex
+./build/cnn_test input.png output.png --debug-hex
+```
+
+Output format: `[0] 0xRRGGBBAA` for pixel-level comparison.
+
+### 3. Cleanup
+Removed sRGB/linear_png debug code from `cnn_test.cc` (simplified PNG saving).
+
+---
+
+## Files Modified
+- `training/train_cnn.py`: Bias fix, RGBA output, --debug-hex
+- `tools/cnn_test.cc`: --debug-hex, remove linear_png
+- `workspaces/main/shaders/cnn/cnn_weights_generated.wgsl`: Regenerated with fixed bias
+
+## Testing
+```bash
+# Train with fixed export
+./training/train_cnn.py --input training/input/ --target training/output/ \
+  --layers 3 --kernel_sizes 3,3,3 --epochs 5000
+
+# Generate ground truth
+./training/train_cnn.py --infer input.png --export-only checkpoint.pth \
+  --output ground_truth.png --debug-hex
+
+# Run GPU tool
+./build/cnn_test input.png tool_output.png --debug-hex
+
+# Compare hex output for first 8 pixels
+```
+
+---
+
+## Status
+✅ Bias accumulation bug fixed
+✅ RGBA output with alpha preservation
+✅ Debug hex comparison tool
+✅ Weights regenerated
+
+Commit: `8ff8c56`
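The bias accumulation bug described in CNN_BIAS_FIX_2026-02.md above can be reproduced with a small scalar model. This sketch assumes a single input channel and a single output; `conv_sum` is a hypothetical stand-in for the shader loop, not code from `train_cnn.py` or the WGSL shaders.

```python
def conv_sum(inputs, weights, bias_per_position):
    """Mimic the shader loop: the bias slot is added at every kernel position."""
    total = 0.0
    for x, w in zip(inputs, weights):
        total += x * w + bias_per_position * 1.0  # in1.w = 1.0 carries the bias
    return total


num_positions = 9  # 3x3 kernel
inputs = [0.5] * num_positions
weights = [0.25] * num_positions
bias = 0.75

# Reference result: the bias is added exactly once, as in PyTorch.
reference = sum(x * w for x, w in zip(inputs, weights)) + bias
# Old export: the full bias is stored per position, so it is added 9 times.
buggy = conv_sum(inputs, weights, bias)
# Fixed export: the bias is pre-divided, so the loop adds it once in total.
fixed = conv_sum(inputs, weights, bias / num_positions)

print(reference, buggy, fixed)
```

The buggy result exceeds the reference by `(num_positions - 1) * bias`, while the fixed result matches the reference up to floating-point rounding.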
diff --git a/doc/CNN_FLATTEN_ANALYSIS.md b/doc/CNN_FLATTEN_ANALYSIS.md
new file mode 100644
index 0000000..88f3db6
--- /dev/null
+++ b/doc/CNN_FLATTEN_ANALYSIS.md
@@ -0,0 +1,189 @@
+# CNN Shader Flatten Mode - Technical Analysis
+
+**Status:** Analysis complete - flatten mode NOT RECOMMENDED
+
+**Date:** February 2026
+
+---
+
+## Context
+
+Current CNN architecture uses **3 sequential render passes** (linear chaining):
+- **Layer 0:** 5×5 conv (7→4 channels) → framebuffer
+- **Layer 1:** 3×3 conv (7→4 channels) → reads L0 output, writes framebuffer
+- **Layer 2:** 3×3 conv (7→1 channel) → reads L1 output, blends with original
+
+Proposed **"flatten mode"**: Collapse all layers into **single shader pass** using intermediate arrays, eliminating framebuffer read/write between layers.
+
+---
+
+## Current Architecture
+
+**Shader Structure:**
+- 1 pipeline with layer branching (`layer_index` uniform)
+- 5 bindings: sampler, input texture, uniforms, layer params, original capture
+- Total shader size: ~8 KB (snippets + weights)
+
+**Performance Profile:**
+- 3 render pass dispatches
+- 2 framebuffer writes + reads between layers
+- Memory bandwidth: ~2× framebuffer size per layer
+- Register pressure: Low (per-layer isolation)
+
+**Weight Buffer:** 290 vec4s (4.6 KB) - already unified
+
+---
+
+## Flatten Approaches Evaluated
+
+### Option A: Full Flatten (All 3 Layers)
+
+**Cascading Receptive Field:**
+
+To compute final output at position (x, y):
+- Layer 2 needs 3×3 neighborhood of Layer 1 outputs
+- Each Layer 1 output needs 3×3 neighborhood of Layer 0 outputs
+- Each Layer 0 output needs 5×5 neighborhood of input samples
+
+**Effective input sampling:** 9×9 pixels (vs current 5×5 max)
+
+**Intermediate Storage (per thread/pixel):**
+```
+Layer 0 outputs: 5×5 positions × 4 channels = 100 floats
+Layer 1 outputs: 3×3 positions × 4 channels =  36 floats
+                                   TOTAL = 136 floats (544 bytes)
+```
+
+**GPU Register Pressure:**
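+- Register files are finite per SM and shared across all resident warps (e.g. 64K 32-bit registers, 256 KB, on recent NVIDIA GPUs)
+- 544 bytes/thread = 136 registers for intermediates alone → few resident threads per SM (**low occupancy**), with likely spills to local memory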
+- Current multi-pass: ~4-8 bytes/thread (high occupancy)
+
+**Pros:**
+- 1 dispatch vs 3 (reduce CPU overhead)
+- Zero framebuffer bandwidth between layers
+
+**Cons:**
+- **Severe register pressure** (544 bytes vs ~4-8 bytes per thread, roughly 70-140× more)
+- Reduced occupancy → potential performance loss
+- Complex shader (harder debug, larger binary)
+- 9×9 input sampling
+
+**Assessment:** ❌ **Not Recommended**
+Register cost outweighs bandwidth savings.
+
+---
+
+### Option B: Partial Flatten (Layers 1 + 2)
+
+Keep Layer 0 separate, flatten only Layers 1 and 2.
+
+**Pass Structure:**
+1. **Pass 1:** Layer 0 (5×5 conv) → framebuffer
+2. **Pass 2 (flattened):** Compute Layer 1 + Layer 2 in single shader
+
+**Intermediate Storage:**
+```
+Layer 0 samples: 3×3 × 4 = 36 floats (read once)
+Layer 1 outputs: 3×3 × 4 = 36 floats (computed)
+                 TOTAL = 72 floats (288 bytes)
+```
+
+**Receptive Field:** 5×5 Layer 0 samples required for 3×3 Layer 1 outputs
+
+**Pros:**
+- 2 passes vs 3 (33% reduction)
+- 1 framebuffer write saved
+- More manageable register usage
+
+**Cons:**
+- Still significant register pressure (288 bytes vs ~8 bytes baseline)
+- Medium complexity increase
+- Layer 0 (heaviest kernel) still separate
+
+**Assessment:** ⚠️ **Marginal Benefit**
+Saves 1 pass but register cost still high.
+
+---
+
+### Option C: Keep Current Multi-Pass ✅
+
+**Rationale:**
+- Current architecture well-suited to GPU design (high throughput via parallelism)
+- Minimal register usage → high occupancy → hides memory latency
+- Framebuffer bandwidth cost < register pressure cost
+- Clean separation aids debugging/iteration
+- Modular (easy to add/remove layers)
+
+**Alternative Optimizations (if bandwidth critical):**
+1. Merge passes via render pass load/store ops (Vulkan subpasses)
+2. Reduce intermediate channel count (4→3 or 2)
+3. Hybrid: Compute shaders + workgroup shared memory
+4. Layer pruning (2-layer vs 3-layer quality comparison)
+
+---
+
+## Recommendation
+
+**✅ Keep current multi-pass architecture**
+
+### Decision Matrix
+
+| Factor | Multi-Pass | Partial Flatten | Full Flatten |
+|--------|-----------|----------------|--------------|
+| Register pressure | ✅ Low | ⚠️ High | ❌ Extreme |
+| Occupancy | ✅ High | ⚠️ Medium | ❌ Low |
+| Memory bandwidth | ⚠️ Medium | ✅ Lower | ✅ Lowest |
+| Shader complexity | ✅ Simple | ⚠️ Medium | ❌ High |
+| Debuggability | ✅ Easy | ⚠️ Harder | ❌ Very hard |
+| Binary size | ✅ Small | ⚠️ Larger | ⚠️ Largest |
+
+**Modern GPU Architecture Favors:**
+- High parallelism (many small threads) over complex threads
+- Hiding latency via occupancy over minimizing operations
+- Managing memory bandwidth through caching rather than eliminating traffic
+
+---
+
+## Alternative: Compute Shader + Shared Memory
+
+**If bandwidth becomes critical:**
+- Use compute shader with workgroup shared memory
+- Load tile + halo into shared memory (4-pixel border per side, from the 9×9 receptive field)
+- Compute all 3 layers for tile interior (avoids redundant sampling)
+- Requires explicit synchronization (`workgroupBarrier`)
+
+**Trade-offs:**
+- ✅ Low register pressure + low bandwidth
+- ❌ Compute pipeline complexity (no render pass integration)
+- ❌ Tile edge handling
+- ❌ Larger code size
+
+---
+
+## Conclusion
+
+Current 3-pass architecture is **appropriate for demo64k**:
+- Size-efficient (modular shaders)
+- Performance adequate (bandwidth not bottleneck)
+- Maintainable (clean layer isolation)
+
+**Flatten mode not recommended** unless profiling reveals specific bandwidth constraint.
+
+### Size Optimization Alternatives (Better ROI)
+
+If size optimization critical, focus on:
+1. **Weight quantization:** 4.6 KB → ~2 KB (8-bit or 4-bit quantization)
+2. **Kernel size reduction:** 5×5 → 3×3 for Layer 0 (200 vec4s → 72 vec4s)
+3. **Channel reduction:** 7 inputs → 4 inputs (remove UV/grayscale channels)
+
+These yield better size/performance than shader architecture changes.
+
+---
+
+## References
+
+- `doc/CNN_EFFECT.md` - CNN implementation details
+- `doc/CNN.md` - High-level CNN design
+- `src/gpu/effects/cnn_effect.cc` - Current implementation
+- `workspaces/main/shaders/cnn_*.wgsl` - Shader snippets
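The receptive-field and storage figures for Option A in CNN_FLATTEN_ANALYSIS.md above (9×9 input window, 136 floats, 544 bytes) follow from the kernel sizes alone. The sketch below recomputes them, assuming stride-1 convolutions and 4 channels per intermediate layer; both function names are hypothetical.

```python
def receptive_field(kernel_sizes):
    """Side length of the input window seen by one output pixel (stride 1)."""
    side = 1
    for k in kernel_sizes:
        side += k - 1
    return side


def intermediate_floats(kernel_sizes, channels=4):
    """Floats of per-pixel intermediate storage when all layers share one pass.

    For each layer except the last, the later layers need a square window of
    its outputs; the window side is the receptive field of the layers after it.
    """
    total = 0
    for i in range(len(kernel_sizes) - 1):
        side = receptive_field(kernel_sizes[i + 1:])
        total += side * side * channels
    return total


kernels = [5, 3, 3]                      # Layer 0, Layer 1, Layer 2
print(receptive_field(kernels))          # 9 -> 9x9 input window
print(intermediate_floats(kernels))      # 136 floats
print(intermediate_floats(kernels) * 4)  # 544 bytes at 4 bytes per float
```

Changing `kernels` to `[3, 3, 3]` shows how shrinking Layer 0 reduces the input window but leaves the intermediate storage unchanged, since storage depends only on the layers after each intermediate.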
diff --git a/doc/CONTRIBUTING.md b/doc/CONTRIBUTING.md
index 98df873..d7ef88a 100644
--- a/doc/CONTRIBUTING.md
+++ b/doc/CONTRIBUTING.md
@@ -153,8 +153,8 @@ To ensure consistency and prevent alignment-related issues:
 2. **Mirror in C++:** Create corresponding C++ structs that mirror WGSL definitions.
 3. **`static_assert` for Size:** Every C++ struct must have a `static_assert` verifying size matches WGSL.
 4. **Standard Bindings:**
-   - **Binding 2:** Always use `CommonPostProcessUniforms` for per-frame data (resolution, time, beat).
+   - **Binding 2:** Always use `CommonPostProcessUniforms` for per-frame data (resolution, time, beat_time, beat_phase, audio_intensity).
    - **Binding 3:** Use effect-specific parameter structs for unique data.
-5. **Shader Consistency:** Ensure WGSL shaders correctly declare uniforms at specified bindings.
+5. **Shader Consistency:** Use `ShaderComposer` to include `common_uniforms` snippet. Reference `CommonUniforms` struct in WGSL shaders.
 6. **Validation Script:** Run `tools/validate_uniforms.py` to catch discrepancies.
-7. **Documentation:** Refer to `doc/UNIFORM_BUFFER_GUIDELINES.md` for detailed alignment rules.
+7. **Documentation:** Refer to `doc/UNIFORM_BUFFER_GUIDELINES.md` for detailed alignment rules and `doc/BEAT_TIMING.md` for timing usage.
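The 32-byte size that the `static_assert` rule in CONTRIBUTING.md above enforces can be cross-checked outside the compiler. This is a minimal sketch using Python's `struct` module with the field order from this change; it is an illustration only and not the project's `tools/validate_uniforms.py`.

```python
import struct

# Field order taken from the CommonPostProcessUniforms struct in this change.
FIELDS = [
    ("resolution", "2f"),
    ("aspect_ratio", "f"),
    ("time", "f"),
    ("beat_time", "f"),
    ("beat_phase", "f"),
    ("audio_intensity", "f"),
    ("_pad", "f"),
]

# "<" selects little-endian with no implicit padding between fields.
layout = "<" + "".join(fmt for _, fmt in FIELDS)
size = struct.calcsize(layout)

# Byte offset of each field, accumulated in declaration order.
offsets = {}
cursor = 0
for name, fmt in FIELDS:
    offsets[name] = cursor
    cursor += struct.calcsize("<" + fmt)

print(size)  # 32
print(offsets)
```

Because every member after `resolution` is a 4-byte `f32`, no member needs extra alignment padding; the trailing `_pad` only rounds the struct up to 32 bytes.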
diff --git a/doc/EFFECT_WORKFLOW.md b/doc/EFFECT_WORKFLOW.md
index d68d148..e453b63 100644
--- a/doc/EFFECT_WORKFLOW.md
+++ b/doc/EFFECT_WORKFLOW.md
@@ -37,6 +37,16 @@ void render(WGPURenderPassEncoder pass,
             const CommonPostProcessUniforms& uniforms) override;
 ```
 
+**Uniforms Available:**
+```cpp
+uniforms.time;             // Physical seconds (constant speed)
+uniforms.beat_time;        // Musical beats (bar synchronization)
+uniforms.beat_phase;       // Fractional beat 0.0-1.0 (smooth oscillation)
+uniforms.audio_intensity;  // Audio peak for beat sync
+uniforms.resolution;       // Screen dimensions
+uniforms.aspect_ratio;     // Width/height ratio
+```
+
 **Template:** See `tools/shadertoy/template.*` or use `convert_shadertoy.py`
 
 ### 2. Add Shader to Assets
diff --git a/doc/SEQUENCE.md b/doc/SEQUENCE.md
index 68bd129..03d0c45 100644
--- a/doc/SEQUENCE.md
+++ b/doc/SEQUENCE.md
@@ -20,13 +20,13 @@ Sequence files (`.seq`) define the timeline and layering of visual effects. They
 ```
 # BPM 120
 ```
-Specifies beats per minute. Used to convert beat notation to seconds.
+Specifies beats per minute. Required. Used to convert beats to physical seconds when the timeline is compiled.
 
 ### END_DEMO Directive
 ```
 END_DEMO <time>
 ```
-Optional auto-exit time. Supports beat notation (`64b`) or seconds (`32.0`).
+Optional auto-exit time in beats (e.g., `64` or `64b`) or explicit seconds (`32.0s`).
 
 ### SEQUENCE Declaration
 ```
@@ -35,10 +35,10 @@ SEQUENCE <global_start> <priority> ["optional_name"] [optional_end]
 ```
 
 **Parameters:**
-- `global_start`: Sequence start time (beats or seconds)
+- `global_start`: Sequence start time in beats (default) or explicit seconds (`2.5s`)
 - `priority`: Render order (0-9 for scenes, 10+ for post-processing)
 - `"optional_name"`: Optional display name for Gantt charts
-- `[optional_end]`: Optional sequence end time (forces effect termination)
+- `[optional_end]`: Optional sequence end time in beats (forces effect termination)
 
 **Examples:**
 ```
@@ -60,34 +60,47 @@ EFFECT <+|=|-> <EffectClassName> <local_start> <local_end> [constructor_args...]
 
 **Parameters:**
 - `EffectClassName`: C++ class from `demo_effects.h`
-- `local_start`, `local_end`: Time relative to sequence start
+- `local_start`, `local_end`: Time in beats relative to sequence start
 - `constructor_args`: Optional (rarely used, most effects use standard params only)
 
 ---
 
 ## Time Notation
 
+**Beat-based timing (default):** All times are in musical beats, ensuring alignment regardless of BPM changes.
+
 | Notation | Example | Description |
 |----------|---------|-------------|
-| Integer beats | `0`, `64`, `128` | No decimal point = beats |
-| Explicit beats | `0b`, `64b`, `128b` | Suffix 'b' = beats |
-| Decimal seconds | `0.0`, `32.0`, `64.0` | Decimal point = seconds |
-| Explicit seconds | `32.0s`, `64.0s` | Suffix 's' = seconds |
+| **Beats (default)** | `0`, `4`, `16` | Integer or decimal, no suffix |
+| Explicit beats | `4b`, `16.5b` | Optional 'b' suffix for clarity |
+| Explicit seconds | `2.0s`, `8.25s` | Suffix 's' for physical time (rare) |
+
+**Conversion:** At 120 BPM, 1 beat = 0.5 seconds, 4 beats = 2 seconds
 
-**At 120 BPM:** Beat 64 = 32.0 seconds, Beat 120 = 60.0 seconds
+**Why beats?**
+- Musical alignment: Sequences stay synchronized to music structure
+- BPM independence: Changing BPM preserves musical timing
+- Intuitive authoring: Timeline matches bars/beats
 
 ---
 
 ## Runtime Parameters
 
-All effects receive these parameters every frame in `render()`:
+All effects receive these parameters every frame in `render()` via `CommonPostProcessUniforms`:
 
 | Parameter | Type | Description |
 |-----------|------|-------------|
-| `time` | float | Global time in seconds |
-| `beat` | float | Current beat fraction (0.0-1.0) |
-| `intensity` | float | Audio peak (0.0-1.0, for beat sync) |
-| `aspect_ratio` | float | Screen width/height |
+| `resolution` | vec2 | Screen dimensions in pixels |
+| `aspect_ratio` | float | Screen width/height ratio |
+| `time` | float | Physical time in seconds (unaffected by tempo) |
+| `beat_time` | float | Musical time in beats (absolute) |
+| `beat_phase` | float | Fractional beat (0.0-1.0 within current beat) |
+| `audio_intensity` | float | Audio peak (0.0-1.0, for beat sync) |
+
+**Use cases:**
+- `time`: Physics-based animation (constant speed)
+- `beat_time`: Musical animation (sync to bars/beats)
+- `beat_phase`: Smooth oscillation per beat
 
 ---
 
@@ -108,47 +121,55 @@ All effects receive these parameters every frame in `render()`:
 
 ### Basic Sequence
 ```
+# BPM 120
 SEQUENCE 0 0
-  EFFECT + FlashEffect 0.0 0.5     # Priority 0
-  EFFECT + HeptagonEffect 0.2 10   # Priority 1
+  EFFECT + FlashEffect 0 1         # Priority 0, beats 0-1 (0-0.5s @ 120 BPM)
+  EFFECT + HeptagonEffect 0.4 20   # Priority 1, beats 0.4-20
 ```
 
 ### Same Priority Layering
 ```
 SEQUENCE 0 0
-  EFFECT + Flash 0.0 0.5           # Priority 0
-  EFFECT = Fade 0.1 0.3            # Priority 0 (same layer)
-  EFFECT + Other 0.2 3             # Priority 1
+  EFFECT + Flash 0 1               # Priority 0
+  EFFECT = Fade 0.2 0.6            # Priority 0 (same layer)
+  EFFECT + Other 0.4 6             # Priority 1
 ```
 
 ### Background Elements
 ```
 SEQUENCE 0 0
-  EFFECT - FlashCube 0 10          # Priority -1 (background)
-  EFFECT = BgEffect 0 5            # Priority -1 (same layer)
-  EFFECT + MainEffect 0 10         # Priority 0 (foreground)
+  EFFECT - FlashCube 0 20          # Priority -1 (background)
+  EFFECT = BgEffect 0 10           # Priority -1 (same layer)
+  EFFECT + MainEffect 0 20         # Priority 0 (foreground)
 ```
 
 ### Sequence with Explicit End
 ```
-SEQUENCE 8b 0 [5.0]
-  EFFECT + Particles 0 120         # Runs until 5s (sequence end)
+SEQUENCE 16 0 [10]
+  EFFECT + Particles 0 240         # Runs until beat 26 (sequence end at 16+10)
 ```
 
 ### Post-Processing Chain
 ```
 SEQUENCE 0 10
-  EFFECT + GaussianBlur 0 60       # Applied first
-  EFFECT + ChromaAberration 0 60   # Applied second
-  EFFECT + Solarize 0 60           # Applied last
+  EFFECT + GaussianBlur 0 120      # Applied first
+  EFFECT + ChromaAberration 0 120  # Applied second
+  EFFECT + Solarize 0 120          # Applied last
 ```
 
-### Music-Synchronized
+### Four-Bar Sequence (16 beats)
 ```
 # BPM 120
-SEQUENCE 0b 0
-  EFFECT + Flash 0b 1b             # Beat 0-1 (0-0.5s)
-  EFFECT + Heptagon 4b 8b          # Beat 4-8 (2-4s)
+SEQUENCE 0 0 "Intro"
+  EFFECT + Flash 0 1               # First beat
+  EFFECT + Heptagon 4 12           # Second bar through third bar
+  EFFECT + Fade 15 16              # Final beat
+```
+
+### Explicit Physical Time (Rare)
+```
+SEQUENCE 2.5s 0 "Intro timing"     # Start at 2.5 physical seconds
+  EFFECT + Fade 0 4                # Fade from beat 0-4 (relative)
 ```
 
 ---
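The "Sequence with Explicit End" example in SEQUENCE.md above implies that an effect's global window is its local window offset by the sequence start and clipped at the sequence end. The sketch below works through that arithmetic. It assumes the bracketed value is a length in beats relative to the sequence start, as the "16+10" comment suggests; `effect_window` and `beats_to_seconds` are hypothetical helpers.

```python
def effect_window(seq_start, local_start, local_end, seq_length=None):
    """Return the global (start, end) of an effect in beats.

    seq_length mirrors the bracketed [optional_end] value, interpreted here
    as beats relative to the sequence start (an assumption).
    """
    start = seq_start + local_start
    end = seq_start + local_end
    if seq_length is not None:
        end = min(end, seq_start + seq_length)  # sequence end cuts the effect
    return start, end


def beats_to_seconds(beats, bpm):
    """Convert beats to physical seconds at a fixed BPM."""
    return beats * 60.0 / bpm


start, end = effect_window(16, 0, 240, seq_length=10)
print(start, end)  # 16 26
print(beats_to_seconds(start, 120.0), beats_to_seconds(end, 120.0))  # 8.0 13.0
```

At 120 BPM the `Particles` effect from the example would therefore run from 8.0 s to 13.0 s instead of its nominal 240 beats.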
diff --git a/doc/UNIFORM_BUFFER_GUIDELINES.md b/doc/UNIFORM_BUFFER_GUIDELINES.md
index ac02223..93999d8 100644
--- a/doc/UNIFORM_BUFFER_GUIDELINES.md
+++ b/doc/UNIFORM_BUFFER_GUIDELINES.md
@@ -19,7 +19,7 @@ Structs are padded to the alignment of their largest member. Any trailing space
 To maintain consistency and facilitate efficient rendering, a standard pattern for uniform buffer usage is established:
 
 - **Binding 0 & 1:** Reserved for Sampler and Texture access (handled by `pp_update_bind_group`).
-- **Binding 2:** **Common Uniforms** (`CommonPostProcessUniforms` or similar). This buffer should contain frequently used data like resolution, aspect ratio, time, beat, and audio intensity.
+- **Binding 2:** **Common Uniforms** (`CommonPostProcessUniforms` or similar). This buffer should contain frequently used data like resolution, aspect ratio, physical time, beat time, beat phase, and audio intensity.
 - **Binding 3:** **Effect-Specific Parameters**. This buffer holds parameters unique to a particular effect (e.g., `strength`, `speed`, `fade_amount`).
 
 This pattern ensures that common data is shared efficiently across effects, while effect-specific data remains isolated.
@@ -34,20 +34,26 @@ When defining uniform structs in WGSL, adhere to the following:
 - **Use `vec2<f32>` for 8-byte padding:** If you need 8 bytes of padding, use `_pad0: vec2<f32>` instead of `_pad0: f32, _pad1: f32` for potentially better clarity and to leverage WGSL's type system.
 - **Minimize Padding:** Only add padding where required by alignment rules to reduce memory usage.
 
-**Example (CommonPostProcessUniforms / HeptagonUniforms):**
+**Example (CommonPostProcessUniforms):**
 
 ```wgsl
 struct CommonUniforms {
-  resolution: vec2<f32>,
-  _pad0: vec2<f32>, // 8 bytes padding to align subsequent members
-  aspect_ratio: f32,
-  time: f32,
-  beat: f32,
-  audio_intensity: f32,
+  resolution: vec2<f32>,      // Screen dimensions (8 bytes)
+  aspect_ratio: f32,          // Width/height ratio (4 bytes)
+  time: f32,                  // Physical seconds, unaffected by tempo (4 bytes)
+  beat_time: f32,             // Musical time in beats (4 bytes)
+  beat_phase: f32,            // Fractional beat 0.0-1.0 (4 bytes)
+  audio_intensity: f32,       // Audio peak for beat sync (4 bytes)
+  _pad: f32,                  // Alignment padding (4 bytes)
 };
-// Expected size: 32 bytes
+// Total size: 32 bytes (8 f32 values)
 ```
 
+**Use cases:**
+- `time`: Constant-speed physics animation
+- `beat_time`: Musical bar/beat synchronization
+- `beat_phase`: Smooth per-beat oscillation
+
 **Example (EffectParams with f32 members):**
 
 ```wgsl
@@ -73,14 +79,15 @@ For every WGSL uniform struct, a corresponding C++ struct must exist. This C++ s
 
 ```cpp
 struct CommonPostProcessUniforms {
-  vec2 resolution;    // 8 bytes
-  float _pad[2];      // 8 bytes padding (matches vec2<f32> in WGSL)
-  float aspect_ratio; // 4 bytes
-  float time;         // 4 bytes
-  float beat;         // 4 bytes
-  float audio_intensity; // 4 bytes
+  vec2 resolution;        // 8 bytes - screen dimensions
+  float aspect_ratio;     // 4 bytes - width/height ratio
+  float time;             // 4 bytes - physical seconds
+  float beat_time;        // 4 bytes - musical beats
+  float beat_phase;       // 4 bytes - fractional beat 0-1
+  float audio_intensity;  // 4 bytes - audio peak
+  float _pad;             // 4 bytes - alignment padding
 };
-static_assert(sizeof(CommonPostProcessUniforms) == 32, 
+static_assert(sizeof(CommonPostProcessUniforms) == 32,
               "CommonPostProcessUniforms must be 32 bytes for WGSL alignment");
 ```