diff options
| author | skal <pascal.massimino@gmail.com> | 2026-02-08 10:12:35 +0100 |
|---|---|---|
| committer | skal <pascal.massimino@gmail.com> | 2026-02-08 10:12:35 +0100 |
| commit | d2d20763ac61f59187d261bb7d6dedcab525bc54 (patch) | |
| tree | 1412707ca48fa8e95f531ee021303c4a21a63da1 | |
| parent | 71736cc2f935a910c737bbfe0683d286688b4d73 (diff) | |
docs(audio): Update AUDIO_LIFECYCLE_REFACTOR.md and AUDIO_TIMING_ARCHITECTURE.md\n\nfeat(audio): Converted AUDIO_LIFECYCLE_REFACTOR.md from a design plan to a description of the implemented AudioEngine and SpectrogramResourceManager architecture, detailing the lazy-loading strategy and seeking capabilities.\n\ndocs(audio): Updated AUDIO_TIMING_ARCHITECTURE.md to reflect current code discrepancies in timing, BPM, and peak decay. Renamed sections to clarify current vs. proposed architecture and outlined a detailed implementation plan for Task #71.\n\nhandoff(Gemini): Finished updating audio documentation to reflect current state and future tasks.
| -rw-r--r-- | doc/AUDIO_LIFECYCLE_REFACTOR.md | 1337 | ||||
| -rw-r--r-- | doc/AUDIO_TIMING_ARCHITECTURE.md | 458 |
2 files changed, 159 insertions, 1636 deletions
diff --git a/doc/AUDIO_LIFECYCLE_REFACTOR.md b/doc/AUDIO_LIFECYCLE_REFACTOR.md index 70aaff6..f5e0045 100644 --- a/doc/AUDIO_LIFECYCLE_REFACTOR.md +++ b/doc/AUDIO_LIFECYCLE_REFACTOR.md @@ -1,197 +1,16 @@ -# Audio System Lifecycle Refactor Plan +# Audio System Architecture ## Problem Statement -The current audio system has a fragile initialization order dependency between `synth_init()` and `tracker_init()`: +The legacy audio system had a fragile initialization order dependency: `tracker_init()` had to be called after `synth_init()` because the synth would clear all registered spectrograms. This was brittle, non-obvious, and made testing difficult, as tests needed to respect this internal implementation detail. -- `synth_init()` clears ALL registered spectrograms (global state reset) -- `tracker_init()` registers spectrograms with the synth -- If called in wrong order → spectrograms cleared → no audio / test failures +## Implemented Solution: The AudioEngine -**Current Workarounds:** -- Tests must call `synth_init()` before `tracker_init()` -- `tracker_init()` re-registers everything on every call (memory overhead) -- Hidden bugs when `audio_init()` internally calls `synth_init()` +To solve this, the audio system was refactored into a unified, lifecycle-managed architecture centered around two new classes: `AudioEngine` and `SpectrogramResourceManager`. This design eliminates initialization-order dependencies and provides clear ownership of resources. -**Why This is Bad:** -- Brittle: Easy to break by calling init functions in wrong order -- Non-obvious: No compile-time or runtime checks for correct usage -- Wasteful: Re-allocates and re-registers on every init -- Hard to test: Tests must know internal implementation details -- Prevents composition: Can't have multiple tracker instances +### Architecture Overview ---- - -## Design Goals - -1. **Order Independence:** Init functions should work in any order -2. **Explicit Ownership:** Clear who owns what resources -3. **Testability:** Easy to mock, inject, and test components -4. **Memory Safety:** No leaks, no dangling pointers -5. **Size Conscious:** Minimal overhead for 64k binary goal - ---- - -## Proposed Solutions (3 Options) - -### Option A: Unified AudioEngine Class (Recommended) - -**Concept:** Single top-level class that manages synth and tracker as private members. - -```cpp -class AudioEngine { - public: - // Lifecycle - void init(); - void shutdown(); - void reset(); // Clear all state - - // Synth interface (delegates to private synth_) - int register_spectrogram(const Spectrogram* spec); - void trigger_voice(int spec_id, float volume, float pan); - void render(float* output, int num_frames); - - // Tracker interface (delegates to private tracker_) - void load_music_data(const TrackerScore* score); - void update_tracker(float music_time); - - private: - Synth synth_; // Embedded instance - Tracker tracker_; // Embedded instance - bool initialized_ = false; -}; -``` - -**Pros:** -- Single initialization point: `engine.init()` sets up everything -- Order independence: Internal init sequence is correct by design -- Clear ownership: Engine owns both synth and tracker -- Easy to test: Mock the entire engine or inject test data - -**Cons:** -- Larger refactor: Need to update all callsites -- Increased coupling: Synth and tracker bundled together -- Slightly larger binary (vtable overhead if using virtual methods) - -**Implementation Steps:** -1. Create `AudioEngine` class in `src/audio/audio_engine.h/cc` -2. Move synth and tracker global state into class members -3. Update `audio.cc` to use `AudioEngine` internally -4. Refactor tests to use `AudioEngine` API -5. Gradually migrate production code -6. Remove old `synth_init()` / `tracker_init()` globals - ---- - -### Option B: Registration Handle System - -**Concept:** Tracker holds registration handles and manages synth resources explicitly. - -```cpp -class Tracker { - public: - void init(Synth* synth); // Takes synth pointer, registers spectrograms - void shutdown(); // Unregisters spectrograms - void update(float music_time); - - private: - Synth* synth_ = nullptr; - std::vector<int> registered_spec_ids_; // Tracks what we registered -}; - -// Usage: -Synth synth; -synth.init(); - -Tracker tracker; -tracker.init(&synth); // Registers spectrograms - -// Later, if synth is reset: -synth.reset(); -tracker.re_register(); // Tracker knows what to re-register -``` - -**Pros:** -- Explicit dependencies: Tracker knows it depends on Synth -- Order enforced by API: Can't init tracker without synth pointer -- Resource cleanup: Tracker can unregister on shutdown -- Smaller refactor: Only changes tracker implementation - -**Cons:** -- Still has global synth state (just less fragile) -- Tracker needs to track registered IDs (extra bookkeeping) -- Multiple trackers would need careful coordination - -**Implementation Steps:** -1. Add `Synth* synth_` member to Tracker -2. Change `tracker_init()` to `tracker_init(Synth* synth)` -3. Store registered spec IDs in tracker -4. Add `tracker_re_register()` to handle synth resets -5. Update all callsites to pass synth pointer - ---- - -### Option C: Synth Resource Manager with Reference Counting - -**Concept:** Spectrograms are ref-counted, auto-cleanup when no refs remain. - -```cpp -class Synth { - public: - // Returns a handle that auto-releases on destruction - SpectrogramHandle register_spectrogram(const Spectrogram* spec); - - void trigger_voice(const SpectrogramHandle& handle, float volume, float pan); -}; - -class SpectrogramHandle { - public: - ~SpectrogramHandle() { if (synth_) synth_->unregister(id_); } - SpectrogramHandle(const SpectrogramHandle&); // Ref count++ - int id() const { return id_; } - - private: - Synth* synth_; - int id_; -}; -``` - -**Pros:** -- RAII: Automatic cleanup, no leaks -- Safe against synth resets: Handles become invalid automatically -- Modern C++ design: Follows std::shared_ptr pattern - -**Cons:** -- Complex implementation: Ref counting overhead -- Larger binary: More code for handle management -- Doesn't solve init order (still need synth before tracker) - -**Implementation Steps:** -1. Create `SpectrogramHandle` class with ref counting -2. Update `synth_register_spectrogram()` to return handle -3. Add `synth_unregister()` internal API -4. Update tracker to store handles instead of IDs -5. Test thoroughly (ref counting bugs are subtle) - ---- - -## Recommended Approach: Option A (AudioEngine) + Resource Manager - -**Why Option A?** -- Most robust: Completely solves initialization order issues -- Future-proof: Easy to extend with more audio subsystems -- Testable: Single point of dependency injection -- Size acceptable: ~500 bytes overhead (acceptable for robustness) - -### Component Separation: Addressing Asset/Procedural Spectrograms - -**Current Problem:** -Tracker currently handles 3 distinct responsibilities: -1. Loading assets via `GetAsset()` (AssetManager dependency) -2. Generating procedural spectrograms (owns allocated memory) -3. Pattern sequencing (triggers events at correct times) - -**Proposed Architecture:** +The final architecture separates responsibilities into distinct, manageable components: ``` ┌─────────────────┐ @@ -215,1128 +34,72 @@ Tracker currently handles 3 distinct responsibilities: └─────────────┘ └─────────────┘ ``` -**SpectrogramResourceManager Responsibilities:** -- Load asset spectrograms from AssetManager -- Generate procedural spectrograms from NoteParams -- Own all allocated memory (consistent ownership) -- Provide unified interface for both types -- Cache resources to avoid re-generation - -**Migration Path (Incremental):** -1. Create `AudioEngine` class alongside existing globals -2. Update tests to use `AudioEngine` first (validate correctness) -3. Add `audio_engine.cc` to build, mark old functions deprecated -4. Migrate `main.cc` and production code -5. Remove old `synth_init()` / `tracker_init()` in final cleanup - -**Estimated Effort:** -- Week 1: Design + implement AudioEngine class -- Week 2: Migrate tests + validate -- Week 3: Migrate production code -- Week 4: Cleanup + documentation - ---- - -## Alternative: Keep Current Design with Better Documentation - -If refactor is too costly, improve current system: - -1. **Add Assertions:** - ```cpp - void tracker_init() { - assert(synth_is_initialized() && "Must call synth_init() first!"); - // ... rest of init - } - ``` - -2. **Add State Tracking:** - ```cpp - static bool g_synth_initialized = false; - void synth_init() { - // ... init code - g_synth_initialized = true; - } - ``` - -3. **Document Init Order:** - - Update `CONTRIBUTING.md` with explicit init order requirements - - Add comments to every init function - - Create example code snippets - -**Pros:** Zero refactor cost -**Cons:** Doesn't solve underlying fragility - ---- - -## Decision Matrix - -| Criterion | Option A (Engine) | Option B (Handles) | Option C (RefCount) | Status Quo | -|-----------|-------------------|-------------------|-------------------|------------| -| Order Independence | ✅ Full | ⚠️ Partial | ❌ No | ❌ No | -| Testability | ✅ Excellent | ✅ Good | ⚠️ Moderate | ❌ Poor | -| Memory Safety | ✅ Full | ✅ Good | ✅ Excellent | ⚠️ Manual | -| Refactor Effort | 🔴 High | 🟡 Medium | 🔴 High | 🟢 None | -| Binary Size Impact | +500B | +200B | +1KB | 0B | -| Future Extensibility | ✅ High | ⚠️ Medium | ⚠️ Medium | ❌ Low | - -**Recommendation:** Option A for production, Status Quo improvements for short-term +### 1. AudioEngine ---- - -## Detailed Design: SpectrogramResourceManager +The `AudioEngine` acts as a high-level facade for the entire audio subsystem. It is the single entry point for initialization, updates, and shutdown. -### Interface +**Responsibilities:** +- Manages the lifecycle of the `Synth`, `Tracker`, and `SpectrogramResourceManager`. +- Guarantees the correct initialization order internally. +- Provides a unified API for updating the audio state (`update()`) and seeking (`seek()`). +- Abstracts away the complexity of resource loading and synth registration. +**Before (Fragile Initialization):** ```cpp -class SpectrogramResourceManager { - public: - // Lifecycle - void init(); - void shutdown(); // Frees all owned memory - - // Lazy loading API (loads on first access) - const Spectrogram* get_or_load_asset(AssetId asset_id); - const Spectrogram* get_or_generate_procedural(int sample_id, const NoteParams& params); - - // Explicit loading (for pre-warming if needed) - int preload_asset(AssetId asset_id); - int preload_procedural(int sample_id, const NoteParams& params); - - // Query API - const Spectrogram* get_spectrogram(int resource_id) const; - bool is_loaded(int resource_id) const; - - // Cache management - void clear_cache(); - int get_cache_size() const; - - private: - struct Resource { - Spectrogram spec; - float* owned_data; // nullptr if asset (not owned) - AssetId asset_id; // ASSET_LAST_ID if procedural - bool is_procedural; - }; - - Resource resources_[MAX_SPECTROGRAMS]; - int next_slot_ = 0; -}; +// In main.cc or tests +synth_init(); +tracker_init(); +// ... +tracker_update(music_time); ``` -### Usage in AudioEngine (Lazy Loading) - +**After (Robust Initialization):** ```cpp -class AudioEngine { - public: - void init() { - synth_.init(); - resource_mgr_.init(); - tracker_.init(&synth_, &resource_mgr_); - } - - void load_music_data(const TrackerScore* score, - const NoteParams* samples, - const AssetId* sample_assets, - uint32_t sample_count) { - // Register sample metadata (but don't load yet!) - for (uint32_t i = 0; i < sample_count; ++i) { - if (sample_assets[i] != AssetId::ASSET_LAST_ID) { - resource_mgr_.register_asset(i, sample_assets[i]); // Metadata only - } else { - resource_mgr_.register_procedural(i, samples[i]); // Metadata only - } - } - - tracker_.load_score(score); - } - - void update(float music_time, float dt) { - // Pre-warm samples needed in next 2 seconds - tracker_.prewarm_lookahead(music_time, music_time + 2.0f); - - // Update tracker (triggers events) - tracker_.update(music_time); - } - - void trigger_sample(int sample_id, float volume, float pan) { - // Load on-demand if not cached - const Spectrogram* spec = resource_mgr_.get_or_load(sample_id); - - // Register with synth (lazy registration) - int synth_id = get_or_register_synth_id(sample_id, spec); - - // Trigger voice - synth_.trigger_voice(synth_id, volume, pan); - } - - private: - int get_or_register_synth_id(int sample_id, const Spectrogram* spec) { - if (sample_to_synth_id_[sample_id] == -1) { - // First use: register with synth - sample_to_synth_id_[sample_id] = synth_.register_spectrogram(spec); - } - return sample_to_synth_id_[sample_id]; - } - - Synth synth_; - Tracker tracker_; - SpectrogramResourceManager resource_mgr_; - int sample_to_synth_id_[256] = {-1}; // -1 = not registered yet -}; -``` +// In main.cc or tests +#include "audio/audio_engine.h" -**Timeline:** -``` -t=0.0s: init() - No samples loaded (instant startup) -t=0.0s: load_music_data() - Register metadata only (~0ms) -t=0.0s: update(0.0) - Pre-warm samples for t=0-2s (load 3-5 samples) -t=0.0s: Pattern triggers → trigger_sample() → Instant (pre-warmed) -t=1.0s: update(1.0) - Pre-warm samples for t=1-3s (load 2-3 more) -t=2.0s: update(2.0) - Pre-warm samples for t=2-4s (load 2-3 more) -... -Total samples loaded by t=10s: Maybe 10-12 (not all 19) +AudioEngine g_audio_engine; +g_audio_engine.init(); +// ... +g_audio_engine.update(music_time); ``` -### Memory Ownership Rules +### 2. SpectrogramResourceManager -**Clear Ownership:** -1. **Assets:** AssetManager owns, ResourceManager borrows (pointer only) -2. **Procedurals:** ResourceManager owns (allocated with `new[]`, freed in `shutdown()`) -3. **Synth:** Only stores pointers, never owns data +This class centralizes all resource loading and memory management for spectrograms, for both binary assets and procedurally generated notes. -**Lifetime Guarantees:** -- Assets live until program exit (static data) -- Procedurals live until `ResourceManager::shutdown()` -- Synth references valid as long as ResourceManager exists +**Responsibilities:** +- **Loading:** Handles loading `.spec` files from the `AssetManager` and generating procedural spectrograms from `NoteParams`. +- **Ownership:** Enforces clear memory ownership rules. It *borrows* pointers to static assets but *owns* the memory for any generated procedural spectrograms, which it frees upon shutdown. +- **Lazy Loading & Caching:** Resources are not loaded when the application starts. Instead, their metadata is registered. The actual data is loaded on-demand the first time a sample is needed. Once loaded, it is cached for future use. This results in faster startup times and more efficient memory usage. --- -## Lazy Loading & On-Demand Strategy - -### Problem with Eager Loading - -**Current plan says:** "Load all resources ONCE at init" - -**Issues:** -- Long startup delay (load all 19 samples before demo starts) -- Wasted memory (samples used only once stay loaded) -- No benefit for streaming or late-game content - -**Example:** Demo is 60s, sample only used at t=45s → why load it at t=0? - -### Proposed: Lazy Loading with Cache - -**Strategy:** -1. **Load on first trigger** - Don't load until sample is actually needed -2. **Cache aggressively** - Keep loaded samples in memory (demo is short, ~60s) -3. **Optional pre-warming** - Load predicted samples 1-2 seconds ahead -4. **Eviction policy** - Unload samples that won't be used again (optional) - -### Implementation: Two-Tier Cache - -```cpp -class SpectrogramResourceManager { - private: - struct CacheEntry { - Spectrogram spec; - float* owned_data; // nullptr if not loaded yet - AssetId asset_id; - NoteParams proc_params; // For procedurals - - enum State { - UNLOADED, // Not loaded yet - LOADING, // Load in progress (async) - LOADED, // Ready to use - EVICTED // Was loaded, now evicted - }; - State state = UNLOADED; - float last_access_time; // For LRU eviction - }; - - CacheEntry cache_[MAX_SPECTROGRAMS]; - - public: - // Lazy loading (loads if not cached) - const Spectrogram* get_or_load(int sample_id) { - CacheEntry& entry = cache_[sample_id]; - - if (entry.state == LOADED) { - entry.last_access_time = current_time(); - return &entry.spec; // Cache hit! - } - - if (entry.state == UNLOADED || entry.state == EVICTED) { - load_now(&entry); // Load synchronously - entry.state = LOADED; - } - - return &entry.spec; - } - - private: - void load_now(CacheEntry* entry) { - if (entry->asset_id != ASSET_LAST_ID) { - // ASSET: Borrow pointer from AssetManager - size_t size; - const uint8_t* data = GetAsset(entry->asset_id, &size); - - #if defined(SUPPORT_COMPRESSED_ASSETS) - // ON-DEMAND DECOMPRESSION - if (is_compressed(data)) { - entry->owned_data = decompress_asset(data, size); - entry->spec.spectral_data_a = entry->owned_data; - } else { - entry->spec.spectral_data_a = (const float*)(data + sizeof(SpecHeader)); - } - #else - // Direct pointer (no decompression) - entry->spec.spectral_data_a = (const float*)(data + sizeof(SpecHeader)); - #endif - - } else { - // PROCEDURAL: Generate on-demand - int frames; - std::vector<float> data = generate_note_spectrogram(entry->proc_params, &frames); - - entry->owned_data = new float[data.size()]; - memcpy(entry->owned_data, data.data(), data.size() * sizeof(float)); - entry->spec.spectral_data_a = entry->owned_data; - entry->spec.num_frames = frames; - } - } -}; -``` - -### On-Demand Decompression (Future) - -When assets are compressed (Task #27), decompress lazily: - -```cpp -const Spectrogram* get_or_load(int sample_id) { - CacheEntry& entry = cache_[sample_id]; - - if (entry.state == LOADED) { - return &entry.spec; // Already decompressed - } - - // Load compressed asset - const uint8_t* compressed = GetAsset(entry->asset_id, &size); - - // Decompress into owned buffer - entry->owned_data = decompress_zlib(compressed, size, &decompressed_size); - entry->spec.spectral_data_a = entry->owned_data; - entry->state = LOADED; - - return &entry.spec; -} - -void evict(int sample_id) { - // Free decompressed data, can re-decompress from asset later - delete[] cache_[sample_id].owned_data; - cache_[sample_id].owned_data = nullptr; - cache_[sample_id].state = EVICTED; -} -``` +## Lazy Loading and Pre-warming Strategy -### Pre-warming Strategy +The system uses a lazy-loading approach to minimize startup delay and memory footprint. -**Idea:** Predict samples needed in next 1-2 seconds, load ahead of time - -```cpp -class Tracker { - public: - void update(float music_time) { - // 1. Trigger patterns at current time - trigger_patterns_at(music_time); - - // 2. Pre-warm samples needed in next 2 seconds - prewarm_lookahead(music_time, music_time + 2.0f); - } - - private: - void prewarm_lookahead(float start_time, float end_time) { - // Scan upcoming pattern triggers - for (const auto& trigger : upcoming_triggers_) { - if (trigger.time > start_time && trigger.time < end_time) { - const TrackerPattern& pattern = get_pattern(trigger.pattern_id); - - // Load all samples used in this pattern - for (const auto& event : pattern.events) { - resource_mgr_->preload(event.sample_id); // Async or sync - } - } - } - } -}; -``` +**Workflow:** +1. **`AudioEngine::init()`**: The resource manager is initialized, but no samples are loaded. +2. **`AudioEngine::load_music_data()`**: The tracker score is parsed, and metadata for all required samples (both asset-based and procedural) is *registered* with the `SpectrogramResourceManager`. No data is loaded at this stage. +3. **`AudioEngine::update()`**: During the main loop, the tracker identifies which samples will be needed in the near future (a 1-2 second lookahead window). It then asks the resource manager to "pre-warm" these samples, loading them into the cache just before they are needed. +4. **Triggering a sample**: When a tracker event fires, the `AudioEngine` requests the sample from the resource manager. Since it was pre-warmed, it's a fast cache hit. The engine then ensures the spectrogram is registered with the low-level synth and triggers the voice. **Benefits:** -- No loading stutter when pattern triggers -- Spreads load cost over time (not all at init) -- Can load asynchronously in background thread (if needed) - -### Cache Eviction Policy (Optional, Compile-Time Flag) - -**Demo knows its timeline:** Effects can explicitly release resources when done. - -**Explicit Resource Management (Preferred):** -```cpp -class Effect { - public: - void on_effect_end() override { - // Effect knows it won't need these samples again - resource_mgr_->release(SAMPLE_EXPLOSION); - resource_mgr_->release(SAMPLE_AMBIENT_WIND); - } -}; - -// In sequence compiler: -EFFECT ExplosionEffect 10.0 12.0 // Uses SAMPLE_EXPLOSION -// After t=12.0, explosion sample can be evicted -``` - -**Automatic Eviction (Under `ENABLE_CACHE_EVICTION` flag):** -```cpp -#if defined(ENABLE_CACHE_EVICTION) -void try_evict_lru() { - if (get_cache_memory_usage() > MAX_CACHE_SIZE) { - int lru_id = find_lru_entry(); - if (current_time() - cache_[lru_id].last_access_time > 10.0f) { - evict(lru_id); - } - } -} -#endif /* ENABLE_CACHE_EVICTION */ -``` - -**Eviction strategies (all optional):** -1. **Explicit** (recommended) - Effect calls `release()` when done -2. **LRU** (fallback) - Evict least recently used if memory tight -3. **Timeline hints** - Sequence compiler marks "last use" of samples - -**Recommendation:** -- Production: Explicit release only (demo knows timeline) -- Development: Enable LRU eviction for safety (`-DENABLE_CACHE_EVICTION`) -- Final 64k build: Disable eviction (cache everything, simple code) - ---- - -### Comparison: Eager vs Lazy Loading - -| Aspect | Eager (Load All at Init) | Lazy (Load on Demand) | Lazy + Pre-warm | -|--------|-------------------------|----------------------|-----------------| -| **Startup Time** | ❌ Slow (load all 19) | ✅ Fast (load 0) | ✅ Fast (load 0) | -| **First Trigger Latency** | ✅ Instant (cached) | ❌ Stutter (load now) | ✅ Instant (pre-loaded) | -| **Memory Usage** | ❌ High (all loaded) | ✅ Low (only used) | ⚠️ Medium (active + lookahead) | -| **Complexity** | ✅ Simple | ⚠️ Medium | ❌ Complex | -| **Best For** | Short demos, few samples | Long demos, many samples | **Recommended for 64k** | - -**Recommendation:** **Lazy + Pre-warm (1-2s lookahead)** -- Fast startup (no eager loading) -- No stutter (pre-warming prevents load-on-trigger) -- Memory efficient (only loads active + upcoming samples) - ---- - -### Benefits Over Current Design - -| Aspect | Current (Tracker-owned) | Proposed (ResourceManager) | -|--------|------------------------|---------------------------| -| **Ownership** | Split (assets borrowed, procedurals owned) | Centralized (clear rules) | -| **Coupling** | Tracker → AssetManager + gen | ResourceManager → AssetManager + gen | -| **Testability** | Hard (global GetAsset) | Easy (inject mock manager) | -| **Reusability** | Tracker-specific | Usable by any audio component | -| **Memory Safety** | Manual (delete[] in init) | RAII in shutdown() | - -### Integration with AssetManager - -**Option 1: Direct Dependency (Simple)** -```cpp -int SpectrogramResourceManager::load_asset_spectrogram(AssetId id) { - size_t size; - const uint8_t* data = GetAsset(id, &size); // Direct call - // ... parse and store -} -``` - -**Option 2: Injected Interface (Testable)** -```cpp -class IAssetProvider { - public: - virtual const uint8_t* get_asset(AssetId id, size_t* size) = 0; -}; - -class SpectrogramResourceManager { - public: - void set_asset_provider(IAssetProvider* provider) { - asset_provider_ = provider; - } - - int load_asset_spectrogram(AssetId id) { - const uint8_t* data = asset_provider_->get_asset(id, &size); - // ... parse and store - } - - private: - IAssetProvider* asset_provider_ = nullptr; -}; - -// Production: Adapter for AssetManager -class AssetManagerAdapter : public IAssetProvider { - public: - const uint8_t* get_asset(AssetId id, size_t* size) override { - return GetAsset(id, size); // Delegate to global - } -}; -``` - -**Recommendation:** Option 1 for now (simpler), Option 2 if we refactor AssetManager too +- **Fast Startup:** The application doesn't pay the cost of loading all audio assets up front. +- **No Stutter:** Pre-warming ensures that samples are already in memory when they need to be played, preventing stutter caused by load-on-trigger. +- **Memory Efficient:** Only the samples that are actively being used or are coming up soon are held in memory. --- -## Implementation Plan (Option A) - -### Phase 1: Design & Prototype (5-7 days) - -- [ ] Create `src/audio/spectrogram_resource_manager.h/cc` - - [ ] Implement lazy loading cache (UNLOADED → LOADED states) - - [ ] Add `get_or_load()` API for on-demand loading - - [ ] Add `register_*()` API for metadata-only registration - - [ ] Implement resource loading (assets + procedurals) - - [ ] Add memory ownership tracking - - [ ] Write unit tests for lazy loading behavior -- [ ] Create `src/audio/audio_engine.h/cc` with class interface -- [ ] Implement `AudioEngine::init()` / `shutdown()` / `reset()` -- [ ] Add delegation methods for synth/tracker APIs -- [ ] Integrate ResourceManager into AudioEngine -- [ ] **Implement `AudioEngine::seek()` for timeline scrubbing** - - [ ] Add `Synth::reset()` and `Synth::set_time()` - - [ ] Add `Tracker::reset()` and `Tracker::set_time()` - - [ ] Add `Tracker::update_silent()` for fast-forward - - [ ] Implement pre-warming for target time range - - [ ] Add seeking tests (forward, backward, edge cases) -- [ ] Write unit tests for `AudioEngine` lifecycle with resources - -### Phase 2: Test Migration (3-5 days) - -- [ ] Update `test_tracker.cc` to use `AudioEngine` -- [ ] Update `test_tracker_timing.cc` -- [ ] Update `test_variable_tempo.cc` -- [ ] Update `test_wav_dump.cc` -- [ ] Ensure 100% test pass rate - -### Phase 3: Production Integration (5-7 days) - -- [ ] Update `audio.cc` to use `AudioEngine` internally -- [ ] Update `main.cc` demo loop -- [ ] Update all effect classes that use audio -- [ ] Add backwards compatibility shims (temporary) - -### Phase 4: Cleanup (2-3 days) - -- [ ] Remove old `synth_init()` / `tracker_init()` functions -- [ ] Remove global synth/tracker state -- [ ] Update documentation (HOWTO.md, CONTRIBUTING.md) -- [ ] Remove compatibility shims - -### Phase 5: Optimization (Optional, 2-3 days) - -- [ ] Profile binary size impact -- [ ] Optimize if >1KB overhead -- [ ] Consider `#if !defined(STRIP_ALL)` for test-only features -- [ ] **Optional:** Add async loading (background thread for pre-warming) - - Load samples in background while demo runs - - Requires mutex for cache access - - Benefit: Zero stutter even for large samples - - Cost: +200-300 bytes for threading code - ---- - -## Future: Async Loading (Post-MVP) - -**Idea:** Pre-warm samples in background thread to avoid any stutter +## Timeline Seeking & Scrubbing (For Debugging) -```cpp -class SpectrogramResourceManager { - public: - void prewarm_async(int sample_id) { - if (cache_[sample_id].state == UNLOADED) { - cache_[sample_id].state = LOADING; - - // Enqueue background load - thread_pool_.enqueue([this, sample_id]() { - load_now(&cache_[sample_id]); - - std::lock_guard lock(cache_mutex_); - cache_[sample_id].state = LOADED; - }); - } - } - - const Spectrogram* get_or_load(int sample_id) { - std::lock_guard lock(cache_mutex_); - - CacheEntry& entry = cache_[sample_id]; - - if (entry.state == LOADED) { - return &entry.spec; // Ready! - } - - if (entry.state == LOADING) { - // Wait for background load to finish - wait_for_load(sample_id); - } else { - // Not started, load synchronously (fallback) - load_now(&entry); - entry.state = LOADED; - } - - return &entry.spec; - } - - private: - std::mutex cache_mutex_; - ThreadPool thread_pool_; -}; -``` - -**Benefits:** -- Zero latency: All loads happen in background -- Demo runs at full framerate during loading - -**Costs:** -- +200-300 bytes for threading code -- Complexity: Need mutexes, thread management -- May not be needed for 64k (samples are small, load quickly) - -**Recommendation:** Only implement if profiling shows stutter issues - ---- - -## Success Criteria - -1. **No Init Order Failures:** Tests pass regardless of call order -2. **Memory Clean:** No leaks detected by sanitizers -3. **Size Budget:** <1KB binary size increase -4. **API Simplicity:** Single `engine.init()` call for users -5. **Test Coverage:** 100% coverage of lifecycle edge cases - ---- - -## Open Questions - -1. Should `AudioEngine` be a singleton or instance-based? - - **Proposal:** Instance-based for testability, but provide global getter for convenience - -2. How to handle legacy code during migration? - - **Proposal:** Keep old functions as deprecated wrappers for 1-2 releases - -3. Should synth and tracker be members or pointers? - - **Proposal:** Direct members (smaller, faster, simpler ownership) - -4. What about backend abstraction (MiniaudioBackend, MockBackend)? - - **Proposal:** AudioEngine owns backend pointer, injected via `set_backend()` - ---- - -## Timeline Seeking & Scrubbing (Critical for Development) - -### Requirement: Robust Seeking for Debugging - -**Need to support:** -- Jump forward: t=5s → t=45s (skip to specific scene) -- Rewind: t=45s → t=5s (go back to debug earlier issue) -- Fast-forward: Simulate t=0-30s without audio (load samples, update state) -- Frame-by-frame: t → t+0.016s (precise debugging) - -**Properties:** -- ✅ Robust: Works for any timeline position -- ✅ Correct: Audio/visuals match target time exactly -- ⚠️ Performance: Doesn't need to be fast (debugging tool) -- ✅ Pre-warming: Must load samples for target time range - -### Implementation: AudioEngine::seek() - -```cpp -class AudioEngine { - public: - // Seek to arbitrary point in timeline - void seek(float target_time) { - #if !defined(STRIP_ALL) // Only in debug builds - - // 1. Reset synth state (clear all active voices) - synth_.reset(); - - // 2. Reset tracker state (re-scan patterns from scratch) - tracker_.reset(); - tracker_.set_time(target_time); - - // 3. Pre-warm samples for target time range - float prewarm_start = std::max(0.0f, target_time - 1.0f); - float prewarm_end = target_time + 2.0f; - tracker_.prewarm_lookahead(prewarm_start, prewarm_end); - - // 4. Simulate tracker up to target time (no audio rendering) - // This ensures tracker internal state is correct - const float dt = 0.1f; // Simulate in 0.1s steps - for (float t = 0.0f; t < target_time; t += dt) { - tracker_.update_silent(t); // Update state only, no audio - } - - // 5. Final update at exact target time - tracker_.update(target_time); - - // 6. Notify effects of seek event (optional) - sequence_.on_seek(target_time); - - current_time_ = target_time; - - #endif /* !defined(STRIP_ALL) */ - } - - // Fast-forward: Simulate time range without rendering - void fast_forward(float from_time, float to_time) { - const float dt = 0.1f; - for (float t = from_time; t < to_time; t += dt) { - tracker_.update_silent(t); - sequence_.update_silent(t, dt); // Update effects without rendering - } - } - - private: - float current_time_ = 0.0f; -}; -``` - -### Tracker Support for Seeking - -```cpp -class Tracker { - public: - void reset() { - // Clear all active patterns - for (auto& pattern : active_patterns_) { - pattern.active = false; - } - last_trigger_idx_ = 0; - } - - void set_time(float time) { - current_music_time_ = time; - - // Re-scan score to find patterns that should be active at this time - for (uint32_t i = 0; i < score_->num_triggers; ++i) { - const auto& trigger = score_->triggers[i]; - - if (trigger.time_sec <= time) { - // This pattern should be active (or was active) - // Determine if still active based on pattern duration - const auto& pattern = patterns_[trigger.pattern_id]; - float pattern_end = trigger.time_sec + get_pattern_duration(pattern); - - if (time < pattern_end) { - // Pattern is currently active, add it - activate_pattern(trigger.pattern_id, trigger.time_sec); - - // Fast-forward pattern to current beat - update_pattern_to_time(trigger.pattern_id, time); - } - - last_trigger_idx_ = i + 1; - } - } - } - - void update_silent(float time) { - // Update tracker state without triggering voices - // Used for fast-forward and seeking - update_internal(time, /*trigger_audio=*/false); - } - - private: - void update_pattern_to_time(int pattern_id, float target_time) { - // Advance pattern's event index to match target time - ActivePattern& active = find_active_pattern(pattern_id); - const TrackerPattern& pattern = patterns_[pattern_id]; - - float elapsed = target_time - active.start_music_time; - float beat_duration = 60.0f / score_->bpm; - float elapsed_beats = elapsed / beat_duration; - - // Advance event index past all events before target time - while (active.next_event_idx < pattern.num_events) { - if (pattern.events[active.next_event_idx].beat > elapsed_beats) { - break; - } - active.next_event_idx++; - } - } -}; -``` - -### Synth Support for Seeking - -```cpp -class Synth { - public: - void reset() { - // Clear all active voices (stop all sound immediately) - for (auto& voice : voices_) { - voice.active = false; - } - - // Reset time tracking - #if !defined(STRIP_ALL) - g_elapsed_time_sec = 0.0f; - #endif - } - - void set_time(float time) { - #if !defined(STRIP_ALL) - g_elapsed_time_sec = time; - #endif - } -}; -``` - -### Effect Sequence Support for Seeking - -```cpp -class Sequence { - public: - void on_seek(float target_time) { - // Notify all effects that might be active at target time - for (auto& layer : layers_) { - for (auto& entry : layer.effects) { - if (entry.start_time <= target_time && target_time < entry.end_time) { - // Effect is active at target time - entry.effect->on_seek(target_time); - } else if (target_time < entry.start_time) { - // Effect hasn't started yet, reset it - entry.effect->on_reset(); - } - } - } - } -}; - -class Effect { - public: - virtual void on_seek(float time) { - // Optional: Reset effect state to match target time - // Example: Particle system re-simulates to target time - } - - virtual void on_reset() { - // Optional: Clear all effect state - } -}; -``` - -### Usage Examples - -**Jump to specific scene:** -```cpp -// User presses 'J' key to jump to scene 3 (at t=45s) -if (key_pressed('J')) { - audio_engine.seek(45.0f); - camera.set_position(scene3_camera_pos); // Sync visuals too -} -``` - -**Rewind to beginning:** -```cpp -if (key_pressed('R')) { - audio_engine.seek(0.0f); -} -``` - -**Frame-by-frame debugging:** -```cpp -if (key_pressed('N')) { // Next frame - float dt = 1.0f / 60.0f; - audio_engine.update(current_time, dt); - current_time += dt; -} - -if (key_pressed('P')) { // Previous frame - float dt = 1.0f / 60.0f; - audio_engine.seek(std::max(0.0f, current_time - dt)); -} -``` - -**Fast-forward to load all samples:** -```cpp -// Pre-load all samples by fast-forwarding through entire demo -void preload_all_samples() { - audio_engine.fast_forward(0.0f, demo_duration); - audio_engine.seek(0.0f); // Reset to beginning -} -``` - -### Seeking Guarantees - -**Invariants after `seek(t)`:** -1. ✅ Tracker state matches time `t` (patterns active, event indices correct) -2. ✅ Synth has no active voices (silent state) -3. ✅ Samples for range [t-1s, t+2s] are pre-warmed -4. ✅ Next `update(t)` will trigger correct events -5. ✅ Audio output matches what would have played at time `t` - -**Edge Cases Handled:** -- Seek to t=0: Reset everything (initial state) -- Seek beyond demo end: Clamp to valid range -- Seek to same time twice: No-op (already there) -- Seek while audio playing: Stop all voices first - -### Performance Characteristics - -**Seek cost:** -- Synth reset: O(MAX_VOICES) ~ 64 voices → <1ms -- Tracker reset: O(MAX_PATTERNS) ~ 20 patterns → <1ms -- Pattern re-scan: O(num_triggers) ~ 100 triggers → <1ms -- Pre-warming: O(samples_in_range) ~ 5 samples → 5-10ms -- Total: **~10-20ms** (acceptable for debugging) - -**Optimization (if needed):** -- Cache tracker state snapshots every 5s -- Seek to nearest snapshot, then fast-forward delta -- Reduces re-scan cost for long demos - -### Testing Seeking Robustness - -```cpp -void test_seeking() { - AudioEngine engine; - engine.init(); - - // Test 1: Forward seek - engine.seek(10.0f); - assert(engine.get_time() == 10.0f); - - // Test 2: Backward seek (rewind) - engine.seek(5.0f); - assert(engine.get_time() == 5.0f); - - // Test 3: Seek to beginning - engine.seek(0.0f); - assert(engine.get_time() == 0.0f); - - // Test 4: Multiple seeks in sequence - engine.seek(20.0f); - engine.seek(30.0f); - engine.seek(15.0f); - assert(engine.get_time() == 15.0f); - - // Test 5: Verify samples loaded correctly after seek - engine.seek(45.0f); - const auto* spec = engine.get_resource_manager().get_spectrogram(SAMPLE_ENDING); - assert(spec != nullptr); // Sample for ending should be loaded -} -``` - ---- - -## FAQ - -### Q: How does this handle procedural vs asset spectrograms? - -**A:** Via a dedicated `SpectrogramResourceManager`: - -- **Assets:** Loaded from AssetManager, ResourceManager stores pointer only (doesn't own) -- **Procedurals:** Generated via `generate_note_spectrogram()`, ResourceManager owns memory -- **Unified Interface:** Both types returned as `const Spectrogram*`, caller doesn't care about source - -**Key Insight:** Separating resource loading from pattern sequencing allows: -- Tracker focuses on timing/triggering (single responsibility) -- Resources loaded once at init (no per-frame overhead) -- Easy to add new spectrogram sources (streaming, compressed, etc.) - -### Q: Does AudioEngine depend on AssetManager? - -**A:** Indirectly, via ResourceManager: - -``` -AudioEngine → ResourceManager → AssetManager (via GetAsset) - ↘ Tracker (no asset dependency) - ↘ Synth (no asset dependency) -``` - -**Benefits:** -- Only one component knows about assets (ResourceManager) -- Easy to mock for testing (inject fake ResourceManager) -- If AssetManager is refactored, only ResourceManager needs updating - -### Q: What happens to the current caching mechanism? - -**A:** Improved and centralized: - -**Current (Tracker-owned):** -```cpp -static int g_sample_synth_cache[256]; // Maps sample_id → synth_id -``` - -**Proposed (AudioEngine-owned):** -```cpp -class AudioEngine { - private: - int sample_to_synth_id_[256]; // Clearer ownership -}; -``` - -**Why better?** -- Cache lifetime matches engine lifetime (no stale data) -- Cleared automatically on `engine.reset()` -- Testable (can verify cache contents) - -### Q: How does seeking/scrubbing work for debugging? - -**A:** Robust seek API that resets state and pre-warms samples: - -```cpp -audio_engine.seek(45.0f); // Jump to t=45s - -// What happens internally: -// 1. synth.reset() - Clear all active voices -// 2. tracker.reset() - Clear active patterns -// 3. tracker.set_time(45.0) - Re-scan score, activate patterns at t=45 -// 4. prewarm(43-47s) - Load samples for target range -// 5. update(45.0) - Trigger events at exact time -``` - -**Supported operations:** -- **Jump forward:** `seek(45.0f)` - Skip to specific scene -- **Rewind:** `seek(5.0f)` - Go back to earlier time -- **Fast-forward:** `fast_forward(0, 30)` - Simulate without audio -- **Frame-by-frame:** `seek(t + 0.016)` - Step through frames - -**Performance:** ~10-20ms per seek (acceptable for debugging) - -**Guarantees:** -- ✅ Audio/visuals match target time exactly -- ✅ Tracker state correct (patterns active, event indices) -- ✅ Samples pre-warmed (no stutter on next update) -- ✅ Works for any timeline position (0 → end) - -**Development workflow:** -```cpp -// Press 'J' to jump to problematic scene at t=37s -if (key_pressed('J')) { - audio_engine.seek(37.0f); - // Audio and visuals now at t=37s, ready to debug -} -``` - -### Q: Why not load all samples at init? - -**A:** Lazy loading with pre-warming is more efficient: - -**Problems with eager loading:** -- Slow startup: Load all 19 samples before t=0 (~50-100ms delay) -- Wasted memory: Sample used only at t=45s stays loaded from t=0 -- No benefit: 60s demo doesn't need all samples loaded upfront - -**Lazy + Pre-warm strategy:** -``` -t=0.0s: Load 0 samples (instant startup) -t=0.0s: Pre-warm 3-5 samples for t=0-2s range (background) -t=1.0s: Pre-warm 2-3 more for t=1-3s range -By t=10s: Only ~10 samples loaded (not all 19) -``` - -**Benefits:** -- ✅ Instant startup (no initial loading) -- ✅ No trigger stutter (pre-warming prevents load-on-access) -- ✅ Memory efficient (only active + upcoming samples) -- ✅ Spreads load cost over time - -**Future: On-demand decompression (Task #27)** -When assets are compressed, decompress only when needed: -- Compressed asset in binary: 2KB -- Decompressed in memory: 8KB (4x larger) -- Only decompress samples actually used (save memory) - -### Q: How does this affect binary size? - -**Estimated overhead:** -- `SpectrogramResourceManager` (basic): ~300 bytes -- `SpectrogramResourceManager` (with lazy loading): ~500 bytes -- `AudioEngine` wrapper: ~200 bytes -- Total: **~700 bytes** (1.1% of 64KB budget) - -**Savings from removing global state:** ~100 bytes -**Net impact:** **~600 bytes** (acceptable for robustness + lazy loading) - ---- - -## Compile-Time Configuration Flags - -**Optional features controlled by CMake flags:** - -```cmake -# CMakeLists.txt -option(DEMO_ENABLE_CACHE_EVICTION "Enable automatic cache eviction (LRU)" OFF) -option(DEMO_ENABLE_TIMELINE_SEEKING "Enable timeline seeking for debugging" ON) -option(DEMO_ENABLE_ASYNC_LOADING "Enable background thread pre-warming" OFF) -``` - -**Usage in code:** - -```cpp -// Cache eviction (optional, for long demos or low memory) -#if defined(DEMO_ENABLE_CACHE_EVICTION) -void SpectrogramResourceManager::try_evict_lru() { - // ... LRU eviction logic -} -#endif - -// Timeline seeking (enabled by default in debug, stripped in STRIP_ALL) -#if defined(DEMO_ENABLE_TIMELINE_SEEKING) && !defined(STRIP_ALL) -void AudioEngine::seek(float target_time) { - // ... seeking logic -} -#endif - -// Async loading (optional, for zero-latency pre-warming) -#if defined(DEMO_ENABLE_ASYNC_LOADING) -void SpectrogramResourceManager::prewarm_async(int sample_id) { - thread_pool_.enqueue([this, sample_id]() { - load_now(&cache_[sample_id]); - }); -} -#endif -``` - -**Recommended configurations:** - -| Build Type | Cache Eviction | Timeline Seeking | Async Loading | -|------------|---------------|-----------------|---------------| -| **Development** | ON (safety) | ON (debugging) | OFF (simplicity) | -| **Testing** | ON (coverage) | ON (test seeks) | OFF (deterministic) | -| **Release** | OFF (explicit) | OFF (no debug) | OFF (size) | -| **Final 64k** | OFF | OFF (`STRIP_ALL`) | OFF | - -**Size impact:** -- Cache eviction: +150 bytes -- Timeline seeking: +300 bytes (stripped in final) -- Async loading: +250 bytes - -**Binary size budget:** -``` -Base AudioEngine: 500 bytes -+ Lazy loading: +200 bytes -+ Seeking (debug only): +300 bytes (stripped) -+ Cache eviction (opt): +150 bytes (disabled) -+ Async loading (opt): +250 bytes (disabled) -──────────────────────────────────── -Development build: 1150 bytes -Final 64k build: 700 bytes -``` - ---- +A key feature enabled by this refactor is robust timeline seeking, which is invaluable for development and debugging. The `AudioEngine::seek()` method (`#if !defined(STRIP_ALL)`) allows jumping to any point in the demo timeline. -## References +**`seek(target_time)` Process:** +1. **Reset:** The synth and tracker states are completely reset, clearing all active voices and pattern states. +2. **State Reconstruction:** The tracker re-scans the musical score from the beginning up to `target_time` to determine which patterns should be active. +3. **Pre-warming:** The resource manager pre-loads all samples needed for the time range around `target_time`. +4. **Ready:** The audio system is now in the exact state it would have been if the demo had run normally up to `target_time`, ready for playback to resume. -- Current issue: Commit 7721f57 "fix(audio): Resolve tracker test failures..." -- Related: `doc/TRACKER.md`, `doc/ASSET_SYSTEM.md` -- Design patterns: Facade pattern, Dependency Injection, Resource Manager pattern +This allows developers to instantly jump to a specific scene to debug audio or visual issues without having to watch the demo from the start. diff --git a/doc/AUDIO_TIMING_ARCHITECTURE.md b/doc/AUDIO_TIMING_ARCHITECTURE.md index 9ac3927..7b125d0 100644 --- a/doc/AUDIO_TIMING_ARCHITECTURE.md +++ b/doc/AUDIO_TIMING_ARCHITECTURE.md @@ -1,26 +1,53 @@ -# Audio Timing Architecture - Proper Solution (February 7, 2026) +# Audio Timing Architecture - Analysis and Proposed Solution (February 8, 2026) ## Problem Statement **Original Issue:** "demo is still flashing a lot" due to audio-visual timing mismatch. **Root Causes:** -1. Multiple time sources with no clear hierarchy -2. Hardcoded latency constants (400ms) in solution proposals -3. Beat calculation using wrong time source -4. Peak decay rate not matched to music tempo +1. Multiple time sources with no clear hierarchy. +2. Beat calculation for visuals uses the wrong time source (physical time instead of audio playback time). +3. Hardcoded peak decay rate does not adapt to the music's tempo. +4. Hardcoded BPM values in some places prevent data-driven tempo changes. --- -## Correct Architecture ✅ +## Current State Analysis (The Problem) -### Single Source of Truth: Physical Clock +The current implementation suffers from several discrepancies that lead to audio-visual desynchronization. + +### Discrepancy #1: Wrong Time Source for Beat Calculation +Both `main.cc` and `test_demo.cc` use the physical wall clock time to calculate the current musical beat for visual effects. However, the audio peak is measured at the moment the audio is actually played, which can be hundreds of milliseconds later due to audio buffering. + +**`main.cc` (Incorrect Logic):** ```cpp -platform_get_time() → ONE authoritative wall clock from OS +// current_time is derived from platform_state.time (physical clock) +const double current_time = platform_state.time + seek_time; +// Beat is calculated from physical time +const float beat_time = (float)current_time * g_tracker_score.bpm / 60.0f; +// Peak is from audio playback time (a different clock) +const float raw_peak = audio_get_realtime_peak(); + +// MISMATCH: `beat_time` and `raw_peak` are out of sync! ``` +This is the primary cause of the AV sync issue. + +### Discrepancy #2: Hardcoded Configuration -**Everything else derives from this:** +The system relies on several hardcoded values instead of being data-driven. + +- **Hardcoded BPM:** `src/test_demo.cc` hardcodes the BPM to `120.0f`, ignoring the value in the track file. +- **Hardcoded Peak Decay:** `src/audio/backend/miniaudio_backend.cc` uses a fixed decay rate of `0.5f`, which does not adapt to different tempos. A fast song will have its visual peak decay too slowly, and a slow song too quickly. +- **No `tracker_get_bpm()` API:** There is no formal function to get the BPM from the tracker; code accesses the global `g_tracker_score.bpm` directly. + +--- + +## Proposed Architecture (The Solution) ✅ + +### Single Source of Truth: Physical Clock + +The core principle remains: `platform_get_time()` is the one authoritative wall clock from the OS. All other time representations should derive from it in a managed way. ``` Physical Time (platform_get_time()) @@ -46,14 +73,16 @@ Physical Time (platform_get_time()) | Time Source | Purpose | How to Get | Use For | |-------------|---------|------------|---------| | **Physical Time** | Wall clock, frame deltas | `platform_get_time()` | dt calculations, physics | -| **Audio Playback Time** | What's being HEARD | `audio_get_playback_time()` | Audio-visual sync, beat display | +| **Audio Playback Time** | What's being HEARD | `audio_get_playback_time()` | **Audio-visual sync, beat display** | | **Music Time** | Tracker time (tempo-scaled) | `g_music_time` | Tracker event triggering | --- -## Implementation: test_demo.cc +## Correct Implementation Example + +This demonstrates how `test_demo.cc` and `main.cc` *should* be implemented to achieve proper synchronization. -### Before (Wrong ❌) +### Before (Current Incorrect State ❌) ```cpp const double current_time = platform_state.time; // Physical time @@ -61,19 +90,17 @@ const double current_time = platform_state.time; // Physical time // Beat calculation based on physical time const float beat_time = (float)current_time * 120.0f / 60.0f; -// But peak is measured at audio playback time (400ms behind!) +// But peak is measured at audio playback time (e.g. 400ms behind!) const float raw_peak = audio_get_realtime_peak(); // MISMATCH: beat and peak are from different time sources! ``` -**Problem:** Visual beat shows beat 2 (physical time), but peak shows beat 1 (audio playback time). +**Problem:** Visual beat shows beat 2 (from physical time), but peak is for beat 1 (from audio playback time). -### After (Correct ✅) +### After (Proposed Correct State ✅) ```cpp -const double physical_time = platform_state.time; // For dt calculations - // Audio playback time: what's being HEARD right now const float audio_time = audio_get_playback_time(); @@ -83,10 +110,10 @@ const float beat_time = audio_time * 120.0f / 60.0f; // Peak is measured at audio playback time const float raw_peak = audio_get_realtime_peak(); -// SYNCHRONIZED: beat and peak are from same time source! +// SYNCHRONIZED: beat and peak are from the same time source! ``` -**Result:** Visual beat shows beat 1, peak shows beat 1 → synchronized! ✅ +**Result:** Visual beat shows beat 1, peak is for beat 1 → synchronized! ✅ --- @@ -99,354 +126,87 @@ float audio_get_playback_time() { return (float)total_samples / (RING_BUFFER_SAMPLE_RATE * RING_BUFFER_CHANNELS); } ``` - **Key Points:** -1. **Tracks samples consumed** by audio callback (not samples rendered) -2. **Automatically accounts for ring buffer latency** (no hardcoded constants!) -3. **Self-consistent** with `audio_get_realtime_peak()` (measured at same moment) - -**Example Timeline:** -``` -T(physical) = 1.00s: - → Ring buffer has 400ms of lookahead - → Audio callback is playing samples rendered at T=0.60s (music time) - → total_read counter reflects 0.60s worth of samples - → audio_get_playback_time() returns 0.60s - → audio_get_realtime_peak() measured from samples at 0.60s - → Beat calculation: 0.60s * 2 = 1.2 → beat 1 - → SYNCHRONIZED! ✅ -``` - ---- - -## Remaining Issues: Data-Driven Configuration - -### Issue #1: Hardcoded Decay Rate - -**Current** (miniaudio_backend.cc:166): -```cpp -realtime_peak_ *= 0.5f; // Hardcoded: 50% per callback -``` - -**Problem:** Decay rate should match music tempo, not be hardcoded! - -**Proposed Solution:** -```cpp -// AudioBackend should query decay rate from audio system: -float decay_rate = audio_get_peak_decay_rate(); // Returns BPM-adjusted rate -realtime_peak_ *= decay_rate; -``` - -**How to calculate:** -```cpp -// In audio system (based on current BPM): -float audio_get_peak_decay_rate() { - const float bpm = tracker_get_bpm(); // e.g., 120 - const float beat_interval = 60.0f / bpm; // e.g., 0.5s - const float callback_interval = 0.128f; // Measured from device - - // Decay to 10% within one beat: - // decay_rate^(beat_interval / callback_interval) = 0.1 - // decay_rate = 0.1^(callback_interval / beat_interval) - - const float num_callbacks_per_beat = beat_interval / callback_interval; - return powf(0.1f, 1.0f / num_callbacks_per_beat); -} -``` - -**Result:** At 120 BPM, decay to 10% in 0.5s (1 beat). At 60 BPM, decay to 10% in 1.0s (1 beat). Adapts automatically! +1. **Tracks samples consumed** by the audio callback (what is being heard). +2. **Automatically accounts for ring buffer latency** without hardcoded constants. +3. **Is self-consistent** with `audio_get_realtime_peak()`, which is measured at the same moment. --- -### Issue #2: Hardcoded BPM - -**Current** (test_demo.cc:305): -```cpp -const float beat_time = audio_time * 120.0f / 60.0f; // Hardcoded BPM -``` - -**Problem:** BPM should come from tracker/music data! - -**Proposed Solution:** -```cpp -// Tracker should expose BPM: -const float bpm = tracker_get_bpm(); // From TrackerScore -const float beat_time = audio_time * bpm / 60.0f; - -// Or even better, tracker calculates beat directly: -const float beat = tracker_get_current_beat(audio_time); -``` - -**Implementation:** -```cpp -// In tracker.h/cc: -float tracker_get_bpm() { - return g_tracker_score.bpm; // From parsed .track file -} +## Implementation Plan (Task #71) -float tracker_get_current_beat(float audio_time) { - return audio_time * (g_tracker_score.bpm / 60.0f); -} -``` +To fix the jitter and simplify the audio pipeline, the following steps should be taken. -**Result:** Change BPM in `.track` file → everything updates automatically! +### Phase 1: Fix Core Timing Synchronization ---- +1. **Update `main.cc` and `test_demo.cc`:** Modify the main loop and test loop to use `audio_get_playback_time()` as the time source for all beat and visual calculations. This is the highest priority fix. -### Issue #3: No API for "What time is it in sequence world?" +### Phase 2: Implement Data-Driven Configuration -**User's Suggestion:** -> "ask the AudioSystem or demo system (MainSequence?) what 'time' it is in the sequence world" +1. **Create `tracker_get_bpm()` API:** + * **Purpose:** Provide a formal API to read the BPM from the active `.track` file, removing the need for hardcoded values. + * **Implementation:** + ```cpp + // In tracker.h: + float tracker_get_bpm(); -**Current Approach:** Each system tracks its own time independently -- test_demo.cc: Uses `audio_get_playback_time()` directly -- main.cc: Uses `platform_state.time + seek_time` -- MainSequence: Uses `global_time` parameter passed to `render_frame()` + // In tracker.cc: + float tracker_get_bpm() { return g_tracker_score.bpm; } + ``` + * **Adoption:** Update `main.cc` and `test_demo.cc` to call this function instead of using `g_tracker_score.bpm` directly or hardcoding `120.0f`. -**Problem:** No central "what time should I use?" API +2. **Implement BPM-Aware Peak Decay Rate:** + * **Purpose:** Calculate the peak decay rate dynamically based on the current BPM, so that visual feedback feels consistent across different tempos. + * **Implementation:** + ```cpp + // In audio.h (or similar): + float audio_get_peak_decay_rate(); -**Proposed API:** -```cpp -// In MainSequence or AudioEngine: -class TimeProvider { - public: - // Returns: Current time in "sequence world" (accounting for all latencies) - float get_current_time() const { - return audio_get_playback_time(); // Use audio playback time - } + // In audio.cc: + float audio_get_peak_decay_rate() { + const float bpm = tracker_get_bpm(); + const float beat_interval = 60.0f / bpm; // e.g., 0.5s at 120 BPM + const float callback_interval = 0.128f; // Approx. from device + const float num_callbacks_per_beat = beat_interval / callback_interval; + // Decay to 10% within one beat + return powf(0.1f, 1.0f / num_callbacks_per_beat); + } - // Returns: Current beat (fractional) - float get_current_beat() const { - return get_current_time() * (bpm_ / 60.0f); - } + // In miniaudio_backend.cc: + // Replace realtime_peak_ *= 0.5f; with: + realtime_peak_ *= audio_get_peak_decay_rate(); + ``` - // Returns: Current peak (synchronized with current time) - float get_current_peak() const { - return audio_get_realtime_peak(); // Already synchronized - } -}; +### Phase 3: Architectural Simplification -// Usage in test_demo.cc or main.cc: -const float time = g_time_provider.get_current_time(); -const float beat = g_time_provider.get_current_beat(); -const float peak = g_time_provider.get_current_peak(); +1. **Implement a `TimeProvider` Class:** + * **Purpose:** Centralize all timing queries into a single, authoritative source to simplify the logic in the main loop and effects system. + * **Design:** + ```cpp + // In a new file, e.g., audio/time_provider.h: + class TimeProvider { + public: + // Returns current time for AV sync (what's being heard) + float get_current_time() const { return audio_get_playback_time(); } -// All guaranteed to be synchronized! -``` + // Returns current beat (fractional, BPM-aware) + float get_current_beat() const; -**Benefits:** -- Single point of query for all timing -- Hides implementation details (ring buffer, latency, etc.) -- Easy to change timing strategy globally -- Clear contract: "This is the time to use for audio-visual sync" + // Returns current peak (synchronized with current time) + float get_current_peak() const { return audio_get_realtime_peak(); } + + // Returns current BPM + float get_bpm() const { return tracker_get_bpm(); } + }; + ``` + * **Migration:** Gradually refactor `main.cc`, `test_demo.cc`, and visual effects to get all timing information from an instance of this class, removing the need to pass time, beat, and peak as parameters. --- -## Next Steps - -### Completed ✅ -1. ✅ Use `audio_get_playback_time()` instead of physical time for beat calculation -2. ✅ Faster decay rate (0.5 instead of 0.7) to prevent constant flashing -3. ✅ Peak meter visualization to verify timing visually -4. ✅ No hardcoded latency constants (system queries its own state) - -### Future Work (Deferred) - -#### Task: Add tracker_get_bpm() API -**Purpose:** Read BPM from `.track` file instead of hardcoding in test_demo.cc/main.cc - -**Implementation:** -```cpp -// In tracker.h: -float tracker_get_bpm(); // Returns g_tracker_score.bpm - -// In tracker.cc: -float tracker_get_bpm() { - return g_tracker_score.bpm; -} - -// Usage in test_demo.cc/main.cc: -const float bpm = tracker_get_bpm(); // Instead of hardcoded 120.0f -const float beat_time = audio_time * (bpm / 60.0f); -``` - -**Benefits:** -- Change BPM in `.track` file → everything updates automatically -- No hardcoded BPM values in demo code -- Supports variable BPM (future enhancement) - ---- - -#### Task: BPM-Aware Peak Decay Rate -**Purpose:** Calculate decay rate based on current BPM to match beat interval - -**Implementation:** -```cpp -// In audio.h: -float audio_get_peak_decay_rate(); // BPM-adjusted decay - -// In audio.cc: -float audio_get_peak_decay_rate() { - const float bpm = tracker_get_bpm(); - const float beat_interval = 60.0f / bpm; // e.g., 0.5s at 120 BPM - const float callback_interval = 0.128f; // Measured from device - - // Decay to 10% within one beat: - const float n = beat_interval / callback_interval; - return powf(0.1f, 1.0f / n); -} - -// In miniaudio_backend.cc: -realtime_peak_ *= audio_get_peak_decay_rate(); // Instead of hardcoded 0.5f -``` - -**Benefits:** -- Peak decays in exactly 1 beat (regardless of BPM) -- At 120 BPM: decay = 0.5 (500ms fade) -- At 60 BPM: decay = 0.7 (1000ms fade) -- Adapts automatically to tempo changes - ---- - -#### Task: TimeProvider Class (Architectural) -**Purpose:** Centralize all timing queries with single source of truth - -**Design:** -```cpp -// In audio/time_provider.h: -class TimeProvider { - public: - TimeProvider(); - - // Returns: Current time in "sequence world" (what's being heard) - float get_current_time() const { - return audio_get_playback_time(); - } - - // Returns: Current beat (fractional, BPM-aware) - float get_current_beat() const { - const float bpm = tracker_get_bpm(); - return get_current_time() * (bpm / 60.0f); - } - - // Returns: Current peak (synchronized with current time) - float get_current_peak() const { - return audio_get_realtime_peak(); - } - - // Returns: Current BPM - float get_bpm() const { - return tracker_get_bpm(); - } -}; - -// Usage in test_demo.cc, main.cc, effects: -extern TimeProvider g_time_provider; // Global or MainSequence member - -const float time = g_time_provider.get_current_time(); -const float beat = g_time_provider.get_current_beat(); -const float peak = g_time_provider.get_current_peak(); - -// All guaranteed to be synchronized! -``` - -**Integration with MainSequence:** -```cpp -class MainSequence { - public: - TimeProvider time_provider; - - void render_frame(float global_time, float beat, float peak, - float aspect_ratio, WGPUSurface surface) { - // Effects can query: time_provider.get_current_time() etc. - } -}; -``` - -**Benefits:** -- Single point of query for all timing -- Hides implementation details (ring buffer, latency) -- Easy to change timing strategy globally -- Clear contract: "This is the time for audio-visual sync" -- No more passing time parameters everywhere - -**Migration Path:** -1. Create TimeProvider class -2. Expose as global or MainSequence member -3. Gradually migrate test_demo.cc, main.cc, effects to use it -4. Remove time/beat/peak parameters from render functions -5. Everything queries TimeProvider directly - ---- - -### Design Principles Established - -1. ✅ **Single physical clock:** `platform_get_time()` is the only wall clock -2. ✅ **Systems expose their state:** `audio_get_playback_time()` knows its latency -3. ✅ **No hardcoded constants:** System queries its own state dynamically -4. ✅ **Data-driven configuration:** BPM from tracker, decay from BPM (future) -5. ✅ **Synchronized time sources:** Beat and peak from same moment - ---- - -## Testing Verification - -### With Peak Meter Visualization - -Run `./build/test_demo` and observe: -- ✅ Red bar extends when kicks hit (beats 0, 2, 4, ...) -- ✅ Bar width matches FlashEffect intensity -- ✅ Bar decays before next beat (no constant red bar) -- ✅ Snares show narrower bar width (~50-70%) - -### With Peak Logging - -Run `./build/test_demo --log-peaks peaks.txt` and verify: -```bash -# Expected pattern (120 BPM, kicks every 1s): -Beat 0 (T=0.0s): High peak (kick) -Beat 1 (T=0.5s): Medium peak (snare) -Beat 2 (T=1.0s): High peak (kick) -Beat 3 (T=1.5s): Medium peak (snare) -... -``` - -### Console Output - -Should show: -``` -[AudioT=0.06, Beat=0, Frac=0.13, Peak=1.00] ← Kick -[AudioT=0.58, Beat=1, Frac=0.15, Peak=0.62] ← Snare (quieter) -[AudioT=1.09, Beat=2, Frac=0.18, Peak=0.16] ← Decayed (between beats) -[AudioT=2.62, Beat=5, Frac=0.25, Peak=1.00] ← Kick -``` - -**No more constant Peak=1.00 from beat 15 onward!** - ---- - -## Summary - -### What We Fixed -1. ✅ **Use audio playback time** instead of physical time for beat calculation -2. ✅ **Faster decay** (0.5 instead of 0.7) to match beat interval -3. ✅ **No hardcoded latency** - system queries its own state - -### What Still Needs Improvement -1. ⚠️ **BPM should come from tracker** (not hardcoded 120) -2. ⚠️ **Decay rate should be calculated from BPM** (not hardcoded 0.5) -3. ⚠️ **Centralized TimeProvider** for all timing queries - -### Key Insight (User's Contribution) -> "There should be a unique tick-source somewhere, that is the real physical_time. Then, we shouldn't hardcode the constants like 400ms, but really ask the AudioSystem or demo system (MainSequence?) what 'time' it is in the sequence world." - -**This is the correct architectural principle!** ✅ -- ONE physical clock (platform_get_time) -- Systems expose their own state (audio_get_playback_time) -- No hardcoded constants - query the system -- Data-driven configuration (BPM from tracker) - ---- +### Design Principles to Uphold -*Created: February 7, 2026* -*Architectural discussion and implementation complete* +1. ✅ **Single physical clock:** `platform_get_time()` is the only wall clock. +2. ✅ **Systems expose their state:** `audio_get_playback_time()` reports its true playback position. +3. ✅ **No hardcoded constants:** System queries its own state dynamically. +4. ✅ **Data-driven configuration:** BPM comes from the tracker, decay rate comes from BPM. +5. ✅ **Synchronized time sources:** Beat calculations and peak measurements must use the same clock (`audio_get_playback_time`). |
