diff options
| author | skal <pascal.massimino@gmail.com> | 2026-02-08 10:12:35 +0100 |
|---|---|---|
| committer | skal <pascal.massimino@gmail.com> | 2026-02-08 10:12:35 +0100 |
| commit | d2d20763ac61f59187d261bb7d6dedcab525bc54 (patch) | |
| tree | 1412707ca48fa8e95f531ee021303c4a21a63da1 /doc/AUDIO_LIFECYCLE_REFACTOR.md | |
| parent | 71736cc2f935a910c737bbfe0683d286688b4d73 (diff) | |
docs(audio): Update AUDIO_LIFECYCLE_REFACTOR.md and AUDIO_TIMING_ARCHITECTURE.md\n\nfeat(audio): Converted AUDIO_LIFECYCLE_REFACTOR.md from a design plan to a description of the implemented AudioEngine and SpectrogramResourceManager architecture, detailing the lazy-loading strategy and seeking capabilities.\n\ndocs(audio): Updated AUDIO_TIMING_ARCHITECTURE.md to reflect current code discrepancies in timing, BPM, and peak decay. Renamed sections to clarify current vs. proposed architecture and outlined a detailed implementation plan for Task #71.\n\nhandoff(Gemini): Finished updating audio documentation to reflect current state and future tasks.
Diffstat (limited to 'doc/AUDIO_LIFECYCLE_REFACTOR.md')
| -rw-r--r-- | doc/AUDIO_LIFECYCLE_REFACTOR.md | 1337 |
1 files changed, 50 insertions, 1287 deletions
diff --git a/doc/AUDIO_LIFECYCLE_REFACTOR.md b/doc/AUDIO_LIFECYCLE_REFACTOR.md index 70aaff6..f5e0045 100644 --- a/doc/AUDIO_LIFECYCLE_REFACTOR.md +++ b/doc/AUDIO_LIFECYCLE_REFACTOR.md @@ -1,197 +1,16 @@ -# Audio System Lifecycle Refactor Plan +# Audio System Architecture ## Problem Statement -The current audio system has a fragile initialization order dependency between `synth_init()` and `tracker_init()`: +The legacy audio system had a fragile initialization order dependency: `tracker_init()` had to be called after `synth_init()` because the synth would clear all registered spectrograms. This was brittle, non-obvious, and made testing difficult, as tests needed to respect this internal implementation detail. -- `synth_init()` clears ALL registered spectrograms (global state reset) -- `tracker_init()` registers spectrograms with the synth -- If called in wrong order → spectrograms cleared → no audio / test failures +## Implemented Solution: The AudioEngine -**Current Workarounds:** -- Tests must call `synth_init()` before `tracker_init()` -- `tracker_init()` re-registers everything on every call (memory overhead) -- Hidden bugs when `audio_init()` internally calls `synth_init()` +To solve this, the audio system was refactored into a unified, lifecycle-managed architecture centered around two new classes: `AudioEngine` and `SpectrogramResourceManager`. This design eliminates initialization-order dependencies and provides clear ownership of resources. -**Why This is Bad:** -- Brittle: Easy to break by calling init functions in wrong order -- Non-obvious: No compile-time or runtime checks for correct usage -- Wasteful: Re-allocates and re-registers on every init -- Hard to test: Tests must know internal implementation details -- Prevents composition: Can't have multiple tracker instances +### Architecture Overview ---- - -## Design Goals - -1. **Order Independence:** Init functions should work in any order -2. **Explicit Ownership:** Clear who owns what resources -3. **Testability:** Easy to mock, inject, and test components -4. **Memory Safety:** No leaks, no dangling pointers -5. **Size Conscious:** Minimal overhead for 64k binary goal - ---- - -## Proposed Solutions (3 Options) - -### Option A: Unified AudioEngine Class (Recommended) - -**Concept:** Single top-level class that manages synth and tracker as private members. - -```cpp -class AudioEngine { - public: - // Lifecycle - void init(); - void shutdown(); - void reset(); // Clear all state - - // Synth interface (delegates to private synth_) - int register_spectrogram(const Spectrogram* spec); - void trigger_voice(int spec_id, float volume, float pan); - void render(float* output, int num_frames); - - // Tracker interface (delegates to private tracker_) - void load_music_data(const TrackerScore* score); - void update_tracker(float music_time); - - private: - Synth synth_; // Embedded instance - Tracker tracker_; // Embedded instance - bool initialized_ = false; -}; -``` - -**Pros:** -- Single initialization point: `engine.init()` sets up everything -- Order independence: Internal init sequence is correct by design -- Clear ownership: Engine owns both synth and tracker -- Easy to test: Mock the entire engine or inject test data - -**Cons:** -- Larger refactor: Need to update all callsites -- Increased coupling: Synth and tracker bundled together -- Slightly larger binary (vtable overhead if using virtual methods) - -**Implementation Steps:** -1. Create `AudioEngine` class in `src/audio/audio_engine.h/cc` -2. Move synth and tracker global state into class members -3. Update `audio.cc` to use `AudioEngine` internally -4. Refactor tests to use `AudioEngine` API -5. Gradually migrate production code -6. Remove old `synth_init()` / `tracker_init()` globals - ---- - -### Option B: Registration Handle System - -**Concept:** Tracker holds registration handles and manages synth resources explicitly. - -```cpp -class Tracker { - public: - void init(Synth* synth); // Takes synth pointer, registers spectrograms - void shutdown(); // Unregisters spectrograms - void update(float music_time); - - private: - Synth* synth_ = nullptr; - std::vector<int> registered_spec_ids_; // Tracks what we registered -}; - -// Usage: -Synth synth; -synth.init(); - -Tracker tracker; -tracker.init(&synth); // Registers spectrograms - -// Later, if synth is reset: -synth.reset(); -tracker.re_register(); // Tracker knows what to re-register -``` - -**Pros:** -- Explicit dependencies: Tracker knows it depends on Synth -- Order enforced by API: Can't init tracker without synth pointer -- Resource cleanup: Tracker can unregister on shutdown -- Smaller refactor: Only changes tracker implementation - -**Cons:** -- Still has global synth state (just less fragile) -- Tracker needs to track registered IDs (extra bookkeeping) -- Multiple trackers would need careful coordination - -**Implementation Steps:** -1. Add `Synth* synth_` member to Tracker -2. Change `tracker_init()` to `tracker_init(Synth* synth)` -3. Store registered spec IDs in tracker -4. Add `tracker_re_register()` to handle synth resets -5. Update all callsites to pass synth pointer - ---- - -### Option C: Synth Resource Manager with Reference Counting - -**Concept:** Spectrograms are ref-counted, auto-cleanup when no refs remain. - -```cpp -class Synth { - public: - // Returns a handle that auto-releases on destruction - SpectrogramHandle register_spectrogram(const Spectrogram* spec); - - void trigger_voice(const SpectrogramHandle& handle, float volume, float pan); -}; - -class SpectrogramHandle { - public: - ~SpectrogramHandle() { if (synth_) synth_->unregister(id_); } - SpectrogramHandle(const SpectrogramHandle&); // Ref count++ - int id() const { return id_; } - - private: - Synth* synth_; - int id_; -}; -``` - -**Pros:** -- RAII: Automatic cleanup, no leaks -- Safe against synth resets: Handles become invalid automatically -- Modern C++ design: Follows std::shared_ptr pattern - -**Cons:** -- Complex implementation: Ref counting overhead -- Larger binary: More code for handle management -- Doesn't solve init order (still need synth before tracker) - -**Implementation Steps:** -1. Create `SpectrogramHandle` class with ref counting -2. Update `synth_register_spectrogram()` to return handle -3. Add `synth_unregister()` internal API -4. Update tracker to store handles instead of IDs -5. Test thoroughly (ref counting bugs are subtle) - ---- - -## Recommended Approach: Option A (AudioEngine) + Resource Manager - -**Why Option A?** -- Most robust: Completely solves initialization order issues -- Future-proof: Easy to extend with more audio subsystems -- Testable: Single point of dependency injection -- Size acceptable: ~500 bytes overhead (acceptable for robustness) - -### Component Separation: Addressing Asset/Procedural Spectrograms - -**Current Problem:** -Tracker currently handles 3 distinct responsibilities: -1. Loading assets via `GetAsset()` (AssetManager dependency) -2. Generating procedural spectrograms (owns allocated memory) -3. Pattern sequencing (triggers events at correct times) - -**Proposed Architecture:** +The final architecture separates responsibilities into distinct, manageable components: ``` ┌─────────────────┐ @@ -215,1128 +34,72 @@ Tracker currently handles 3 distinct responsibilities: └─────────────┘ └─────────────┘ ``` -**SpectrogramResourceManager Responsibilities:** -- Load asset spectrograms from AssetManager -- Generate procedural spectrograms from NoteParams -- Own all allocated memory (consistent ownership) -- Provide unified interface for both types -- Cache resources to avoid re-generation - -**Migration Path (Incremental):** -1. Create `AudioEngine` class alongside existing globals -2. Update tests to use `AudioEngine` first (validate correctness) -3. Add `audio_engine.cc` to build, mark old functions deprecated -4. Migrate `main.cc` and production code -5. Remove old `synth_init()` / `tracker_init()` in final cleanup - -**Estimated Effort:** -- Week 1: Design + implement AudioEngine class -- Week 2: Migrate tests + validate -- Week 3: Migrate production code -- Week 4: Cleanup + documentation - ---- - -## Alternative: Keep Current Design with Better Documentation - -If refactor is too costly, improve current system: - -1. **Add Assertions:** - ```cpp - void tracker_init() { - assert(synth_is_initialized() && "Must call synth_init() first!"); - // ... rest of init - } - ``` - -2. **Add State Tracking:** - ```cpp - static bool g_synth_initialized = false; - void synth_init() { - // ... init code - g_synth_initialized = true; - } - ``` - -3. **Document Init Order:** - - Update `CONTRIBUTING.md` with explicit init order requirements - - Add comments to every init function - - Create example code snippets - -**Pros:** Zero refactor cost -**Cons:** Doesn't solve underlying fragility - ---- - -## Decision Matrix - -| Criterion | Option A (Engine) | Option B (Handles) | Option C (RefCount) | Status Quo | -|-----------|-------------------|-------------------|-------------------|------------| -| Order Independence | ✅ Full | ⚠️ Partial | ❌ No | ❌ No | -| Testability | ✅ Excellent | ✅ Good | ⚠️ Moderate | ❌ Poor | -| Memory Safety | ✅ Full | ✅ Good | ✅ Excellent | ⚠️ Manual | -| Refactor Effort | 🔴 High | 🟡 Medium | 🔴 High | 🟢 None | -| Binary Size Impact | +500B | +200B | +1KB | 0B | -| Future Extensibility | ✅ High | ⚠️ Medium | ⚠️ Medium | ❌ Low | - -**Recommendation:** Option A for production, Status Quo improvements for short-term +### 1. AudioEngine ---- - -## Detailed Design: SpectrogramResourceManager +The `AudioEngine` acts as a high-level facade for the entire audio subsystem. It is the single entry point for initialization, updates, and shutdown. -### Interface +**Responsibilities:** +- Manages the lifecycle of the `Synth`, `Tracker`, and `SpectrogramResourceManager`. +- Guarantees the correct initialization order internally. +- Provides a unified API for updating the audio state (`update()`) and seeking (`seek()`). +- Abstracts away the complexity of resource loading and synth registration. +**Before (Fragile Initialization):** ```cpp -class SpectrogramResourceManager { - public: - // Lifecycle - void init(); - void shutdown(); // Frees all owned memory - - // Lazy loading API (loads on first access) - const Spectrogram* get_or_load_asset(AssetId asset_id); - const Spectrogram* get_or_generate_procedural(int sample_id, const NoteParams& params); - - // Explicit loading (for pre-warming if needed) - int preload_asset(AssetId asset_id); - int preload_procedural(int sample_id, const NoteParams& params); - - // Query API - const Spectrogram* get_spectrogram(int resource_id) const; - bool is_loaded(int resource_id) const; - - // Cache management - void clear_cache(); - int get_cache_size() const; - - private: - struct Resource { - Spectrogram spec; - float* owned_data; // nullptr if asset (not owned) - AssetId asset_id; // ASSET_LAST_ID if procedural - bool is_procedural; - }; - - Resource resources_[MAX_SPECTROGRAMS]; - int next_slot_ = 0; -}; +// In main.cc or tests +synth_init(); +tracker_init(); +// ... +tracker_update(music_time); ``` -### Usage in AudioEngine (Lazy Loading) - +**After (Robust Initialization):** ```cpp -class AudioEngine { - public: - void init() { - synth_.init(); - resource_mgr_.init(); - tracker_.init(&synth_, &resource_mgr_); - } - - void load_music_data(const TrackerScore* score, - const NoteParams* samples, - const AssetId* sample_assets, - uint32_t sample_count) { - // Register sample metadata (but don't load yet!) - for (uint32_t i = 0; i < sample_count; ++i) { - if (sample_assets[i] != AssetId::ASSET_LAST_ID) { - resource_mgr_.register_asset(i, sample_assets[i]); // Metadata only - } else { - resource_mgr_.register_procedural(i, samples[i]); // Metadata only - } - } - - tracker_.load_score(score); - } - - void update(float music_time, float dt) { - // Pre-warm samples needed in next 2 seconds - tracker_.prewarm_lookahead(music_time, music_time + 2.0f); - - // Update tracker (triggers events) - tracker_.update(music_time); - } - - void trigger_sample(int sample_id, float volume, float pan) { - // Load on-demand if not cached - const Spectrogram* spec = resource_mgr_.get_or_load(sample_id); - - // Register with synth (lazy registration) - int synth_id = get_or_register_synth_id(sample_id, spec); - - // Trigger voice - synth_.trigger_voice(synth_id, volume, pan); - } - - private: - int get_or_register_synth_id(int sample_id, const Spectrogram* spec) { - if (sample_to_synth_id_[sample_id] == -1) { - // First use: register with synth - sample_to_synth_id_[sample_id] = synth_.register_spectrogram(spec); - } - return sample_to_synth_id_[sample_id]; - } - - Synth synth_; - Tracker tracker_; - SpectrogramResourceManager resource_mgr_; - int sample_to_synth_id_[256] = {-1}; // -1 = not registered yet -}; -``` +// In main.cc or tests +#include "audio/audio_engine.h" -**Timeline:** -``` -t=0.0s: init() - No samples loaded (instant startup) -t=0.0s: load_music_data() - Register metadata only (~0ms) -t=0.0s: update(0.0) - Pre-warm samples for t=0-2s (load 3-5 samples) -t=0.0s: Pattern triggers → trigger_sample() → Instant (pre-warmed) -t=1.0s: update(1.0) - Pre-warm samples for t=1-3s (load 2-3 more) -t=2.0s: update(2.0) - Pre-warm samples for t=2-4s (load 2-3 more) -... -Total samples loaded by t=10s: Maybe 10-12 (not all 19) +AudioEngine g_audio_engine; +g_audio_engine.init(); +// ... +g_audio_engine.update(music_time); ``` -### Memory Ownership Rules +### 2. SpectrogramResourceManager -**Clear Ownership:** -1. **Assets:** AssetManager owns, ResourceManager borrows (pointer only) -2. **Procedurals:** ResourceManager owns (allocated with `new[]`, freed in `shutdown()`) -3. **Synth:** Only stores pointers, never owns data +This class centralizes all resource loading and memory management for spectrograms, for both binary assets and procedurally generated notes. -**Lifetime Guarantees:** -- Assets live until program exit (static data) -- Procedurals live until `ResourceManager::shutdown()` -- Synth references valid as long as ResourceManager exists +**Responsibilities:** +- **Loading:** Handles loading `.spec` files from the `AssetManager` and generating procedural spectrograms from `NoteParams`. +- **Ownership:** Enforces clear memory ownership rules. It *borrows* pointers to static assets but *owns* the memory for any generated procedural spectrograms, which it frees upon shutdown. +- **Lazy Loading & Caching:** Resources are not loaded when the application starts. Instead, their metadata is registered. The actual data is loaded on-demand the first time a sample is needed. Once loaded, it is cached for future use. This results in faster startup times and more efficient memory usage. --- -## Lazy Loading & On-Demand Strategy - -### Problem with Eager Loading - -**Current plan says:** "Load all resources ONCE at init" - -**Issues:** -- Long startup delay (load all 19 samples before demo starts) -- Wasted memory (samples used only once stay loaded) -- No benefit for streaming or late-game content - -**Example:** Demo is 60s, sample only used at t=45s → why load it at t=0? - -### Proposed: Lazy Loading with Cache - -**Strategy:** -1. **Load on first trigger** - Don't load until sample is actually needed -2. **Cache aggressively** - Keep loaded samples in memory (demo is short, ~60s) -3. **Optional pre-warming** - Load predicted samples 1-2 seconds ahead -4. **Eviction policy** - Unload samples that won't be used again (optional) - -### Implementation: Two-Tier Cache - -```cpp -class SpectrogramResourceManager { - private: - struct CacheEntry { - Spectrogram spec; - float* owned_data; // nullptr if not loaded yet - AssetId asset_id; - NoteParams proc_params; // For procedurals - - enum State { - UNLOADED, // Not loaded yet - LOADING, // Load in progress (async) - LOADED, // Ready to use - EVICTED // Was loaded, now evicted - }; - State state = UNLOADED; - float last_access_time; // For LRU eviction - }; - - CacheEntry cache_[MAX_SPECTROGRAMS]; - - public: - // Lazy loading (loads if not cached) - const Spectrogram* get_or_load(int sample_id) { - CacheEntry& entry = cache_[sample_id]; - - if (entry.state == LOADED) { - entry.last_access_time = current_time(); - return &entry.spec; // Cache hit! - } - - if (entry.state == UNLOADED || entry.state == EVICTED) { - load_now(&entry); // Load synchronously - entry.state = LOADED; - } - - return &entry.spec; - } - - private: - void load_now(CacheEntry* entry) { - if (entry->asset_id != ASSET_LAST_ID) { - // ASSET: Borrow pointer from AssetManager - size_t size; - const uint8_t* data = GetAsset(entry->asset_id, &size); - - #if defined(SUPPORT_COMPRESSED_ASSETS) - // ON-DEMAND DECOMPRESSION - if (is_compressed(data)) { - entry->owned_data = decompress_asset(data, size); - entry->spec.spectral_data_a = entry->owned_data; - } else { - entry->spec.spectral_data_a = (const float*)(data + sizeof(SpecHeader)); - } - #else - // Direct pointer (no decompression) - entry->spec.spectral_data_a = (const float*)(data + sizeof(SpecHeader)); - #endif - - } else { - // PROCEDURAL: Generate on-demand - int frames; - std::vector<float> data = generate_note_spectrogram(entry->proc_params, &frames); - - entry->owned_data = new float[data.size()]; - memcpy(entry->owned_data, data.data(), data.size() * sizeof(float)); - entry->spec.spectral_data_a = entry->owned_data; - entry->spec.num_frames = frames; - } - } -}; -``` - -### On-Demand Decompression (Future) - -When assets are compressed (Task #27), decompress lazily: - -```cpp -const Spectrogram* get_or_load(int sample_id) { - CacheEntry& entry = cache_[sample_id]; - - if (entry.state == LOADED) { - return &entry.spec; // Already decompressed - } - - // Load compressed asset - const uint8_t* compressed = GetAsset(entry->asset_id, &size); - - // Decompress into owned buffer - entry->owned_data = decompress_zlib(compressed, size, &decompressed_size); - entry->spec.spectral_data_a = entry->owned_data; - entry->state = LOADED; - - return &entry.spec; -} - -void evict(int sample_id) { - // Free decompressed data, can re-decompress from asset later - delete[] cache_[sample_id].owned_data; - cache_[sample_id].owned_data = nullptr; - cache_[sample_id].state = EVICTED; -} -``` +## Lazy Loading and Pre-warming Strategy -### Pre-warming Strategy +The system uses a lazy-loading approach to minimize startup delay and memory footprint. -**Idea:** Predict samples needed in next 1-2 seconds, load ahead of time - -```cpp -class Tracker { - public: - void update(float music_time) { - // 1. Trigger patterns at current time - trigger_patterns_at(music_time); - - // 2. Pre-warm samples needed in next 2 seconds - prewarm_lookahead(music_time, music_time + 2.0f); - } - - private: - void prewarm_lookahead(float start_time, float end_time) { - // Scan upcoming pattern triggers - for (const auto& trigger : upcoming_triggers_) { - if (trigger.time > start_time && trigger.time < end_time) { - const TrackerPattern& pattern = get_pattern(trigger.pattern_id); - - // Load all samples used in this pattern - for (const auto& event : pattern.events) { - resource_mgr_->preload(event.sample_id); // Async or sync - } - } - } - } -}; -``` +**Workflow:** +1. **`AudioEngine::init()`**: The resource manager is initialized, but no samples are loaded. +2. **`AudioEngine::load_music_data()`**: The tracker score is parsed, and metadata for all required samples (both asset-based and procedural) is *registered* with the `SpectrogramResourceManager`. No data is loaded at this stage. +3. **`AudioEngine::update()`**: During the main loop, the tracker identifies which samples will be needed in the near future (a 1-2 second lookahead window). It then asks the resource manager to "pre-warm" these samples, loading them into the cache just before they are needed. +4. **Triggering a sample**: When a tracker event fires, the `AudioEngine` requests the sample from the resource manager. Since it was pre-warmed, it's a fast cache hit. The engine then ensures the spectrogram is registered with the low-level synth and triggers the voice. **Benefits:** -- No loading stutter when pattern triggers -- Spreads load cost over time (not all at init) -- Can load asynchronously in background thread (if needed) - -### Cache Eviction Policy (Optional, Compile-Time Flag) - -**Demo knows its timeline:** Effects can explicitly release resources when done. - -**Explicit Resource Management (Preferred):** -```cpp -class Effect { - public: - void on_effect_end() override { - // Effect knows it won't need these samples again - resource_mgr_->release(SAMPLE_EXPLOSION); - resource_mgr_->release(SAMPLE_AMBIENT_WIND); - } -}; - -// In sequence compiler: -EFFECT ExplosionEffect 10.0 12.0 // Uses SAMPLE_EXPLOSION -// After t=12.0, explosion sample can be evicted -``` - -**Automatic Eviction (Under `ENABLE_CACHE_EVICTION` flag):** -```cpp -#if defined(ENABLE_CACHE_EVICTION) -void try_evict_lru() { - if (get_cache_memory_usage() > MAX_CACHE_SIZE) { - int lru_id = find_lru_entry(); - if (current_time() - cache_[lru_id].last_access_time > 10.0f) { - evict(lru_id); - } - } -} -#endif /* ENABLE_CACHE_EVICTION */ -``` - -**Eviction strategies (all optional):** -1. **Explicit** (recommended) - Effect calls `release()` when done -2. **LRU** (fallback) - Evict least recently used if memory tight -3. **Timeline hints** - Sequence compiler marks "last use" of samples - -**Recommendation:** -- Production: Explicit release only (demo knows timeline) -- Development: Enable LRU eviction for safety (`-DENABLE_CACHE_EVICTION`) -- Final 64k build: Disable eviction (cache everything, simple code) - ---- - -### Comparison: Eager vs Lazy Loading - -| Aspect | Eager (Load All at Init) | Lazy (Load on Demand) | Lazy + Pre-warm | -|--------|-------------------------|----------------------|-----------------| -| **Startup Time** | ❌ Slow (load all 19) | ✅ Fast (load 0) | ✅ Fast (load 0) | -| **First Trigger Latency** | ✅ Instant (cached) | ❌ Stutter (load now) | ✅ Instant (pre-loaded) | -| **Memory Usage** | ❌ High (all loaded) | ✅ Low (only used) | ⚠️ Medium (active + lookahead) | -| **Complexity** | ✅ Simple | ⚠️ Medium | ❌ Complex | -| **Best For** | Short demos, few samples | Long demos, many samples | **Recommended for 64k** | - -**Recommendation:** **Lazy + Pre-warm (1-2s lookahead)** -- Fast startup (no eager loading) -- No stutter (pre-warming prevents load-on-trigger) -- Memory efficient (only loads active + upcoming samples) - ---- - -### Benefits Over Current Design - -| Aspect | Current (Tracker-owned) | Proposed (ResourceManager) | -|--------|------------------------|---------------------------| -| **Ownership** | Split (assets borrowed, procedurals owned) | Centralized (clear rules) | -| **Coupling** | Tracker → AssetManager + gen | ResourceManager → AssetManager + gen | -| **Testability** | Hard (global GetAsset) | Easy (inject mock manager) | -| **Reusability** | Tracker-specific | Usable by any audio component | -| **Memory Safety** | Manual (delete[] in init) | RAII in shutdown() | - -### Integration with AssetManager - -**Option 1: Direct Dependency (Simple)** -```cpp -int SpectrogramResourceManager::load_asset_spectrogram(AssetId id) { - size_t size; - const uint8_t* data = GetAsset(id, &size); // Direct call - // ... parse and store -} -``` - -**Option 2: Injected Interface (Testable)** -```cpp -class IAssetProvider { - public: - virtual const uint8_t* get_asset(AssetId id, size_t* size) = 0; -}; - -class SpectrogramResourceManager { - public: - void set_asset_provider(IAssetProvider* provider) { - asset_provider_ = provider; - } - - int load_asset_spectrogram(AssetId id) { - const uint8_t* data = asset_provider_->get_asset(id, &size); - // ... parse and store - } - - private: - IAssetProvider* asset_provider_ = nullptr; -}; - -// Production: Adapter for AssetManager -class AssetManagerAdapter : public IAssetProvider { - public: - const uint8_t* get_asset(AssetId id, size_t* size) override { - return GetAsset(id, size); // Delegate to global - } -}; -``` - -**Recommendation:** Option 1 for now (simpler), Option 2 if we refactor AssetManager too +- **Fast Startup:** The application doesn't pay the cost of loading all audio assets up front. +- **No Stutter:** Pre-warming ensures that samples are already in memory when they need to be played, preventing stutter caused by load-on-trigger. +- **Memory Efficient:** Only the samples that are actively being used or are coming up soon are held in memory. --- -## Implementation Plan (Option A) - -### Phase 1: Design & Prototype (5-7 days) - -- [ ] Create `src/audio/spectrogram_resource_manager.h/cc` - - [ ] Implement lazy loading cache (UNLOADED → LOADED states) - - [ ] Add `get_or_load()` API for on-demand loading - - [ ] Add `register_*()` API for metadata-only registration - - [ ] Implement resource loading (assets + procedurals) - - [ ] Add memory ownership tracking - - [ ] Write unit tests for lazy loading behavior -- [ ] Create `src/audio/audio_engine.h/cc` with class interface -- [ ] Implement `AudioEngine::init()` / `shutdown()` / `reset()` -- [ ] Add delegation methods for synth/tracker APIs -- [ ] Integrate ResourceManager into AudioEngine -- [ ] **Implement `AudioEngine::seek()` for timeline scrubbing** - - [ ] Add `Synth::reset()` and `Synth::set_time()` - - [ ] Add `Tracker::reset()` and `Tracker::set_time()` - - [ ] Add `Tracker::update_silent()` for fast-forward - - [ ] Implement pre-warming for target time range - - [ ] Add seeking tests (forward, backward, edge cases) -- [ ] Write unit tests for `AudioEngine` lifecycle with resources - -### Phase 2: Test Migration (3-5 days) - -- [ ] Update `test_tracker.cc` to use `AudioEngine` -- [ ] Update `test_tracker_timing.cc` -- [ ] Update `test_variable_tempo.cc` -- [ ] Update `test_wav_dump.cc` -- [ ] Ensure 100% test pass rate - -### Phase 3: Production Integration (5-7 days) - -- [ ] Update `audio.cc` to use `AudioEngine` internally -- [ ] Update `main.cc` demo loop -- [ ] Update all effect classes that use audio -- [ ] Add backwards compatibility shims (temporary) - -### Phase 4: Cleanup (2-3 days) - -- [ ] Remove old `synth_init()` / `tracker_init()` functions -- [ ] Remove global synth/tracker state -- [ ] Update documentation (HOWTO.md, CONTRIBUTING.md) -- [ ] Remove compatibility shims - -### Phase 5: Optimization (Optional, 2-3 days) - -- [ ] Profile binary size impact -- [ ] Optimize if >1KB overhead -- [ ] Consider `#if !defined(STRIP_ALL)` for test-only features -- [ ] **Optional:** Add async loading (background thread for pre-warming) - - Load samples in background while demo runs - - Requires mutex for cache access - - Benefit: Zero stutter even for large samples - - Cost: +200-300 bytes for threading code - ---- - -## Future: Async Loading (Post-MVP) - -**Idea:** Pre-warm samples in background thread to avoid any stutter +## Timeline Seeking & Scrubbing (For Debugging) -```cpp -class SpectrogramResourceManager { - public: - void prewarm_async(int sample_id) { - if (cache_[sample_id].state == UNLOADED) { - cache_[sample_id].state = LOADING; - - // Enqueue background load - thread_pool_.enqueue([this, sample_id]() { - load_now(&cache_[sample_id]); - - std::lock_guard lock(cache_mutex_); - cache_[sample_id].state = LOADED; - }); - } - } - - const Spectrogram* get_or_load(int sample_id) { - std::lock_guard lock(cache_mutex_); - - CacheEntry& entry = cache_[sample_id]; - - if (entry.state == LOADED) { - return &entry.spec; // Ready! - } - - if (entry.state == LOADING) { - // Wait for background load to finish - wait_for_load(sample_id); - } else { - // Not started, load synchronously (fallback) - load_now(&entry); - entry.state = LOADED; - } - - return &entry.spec; - } - - private: - std::mutex cache_mutex_; - ThreadPool thread_pool_; -}; -``` - -**Benefits:** -- Zero latency: All loads happen in background -- Demo runs at full framerate during loading - -**Costs:** -- +200-300 bytes for threading code -- Complexity: Need mutexes, thread management -- May not be needed for 64k (samples are small, load quickly) - -**Recommendation:** Only implement if profiling shows stutter issues - ---- - -## Success Criteria - -1. **No Init Order Failures:** Tests pass regardless of call order -2. **Memory Clean:** No leaks detected by sanitizers -3. **Size Budget:** <1KB binary size increase -4. **API Simplicity:** Single `engine.init()` call for users -5. **Test Coverage:** 100% coverage of lifecycle edge cases - ---- - -## Open Questions - -1. Should `AudioEngine` be a singleton or instance-based? - - **Proposal:** Instance-based for testability, but provide global getter for convenience - -2. How to handle legacy code during migration? - - **Proposal:** Keep old functions as deprecated wrappers for 1-2 releases - -3. Should synth and tracker be members or pointers? - - **Proposal:** Direct members (smaller, faster, simpler ownership) - -4. What about backend abstraction (MiniaudioBackend, MockBackend)? - - **Proposal:** AudioEngine owns backend pointer, injected via `set_backend()` - ---- - -## Timeline Seeking & Scrubbing (Critical for Development) - -### Requirement: Robust Seeking for Debugging - -**Need to support:** -- Jump forward: t=5s → t=45s (skip to specific scene) -- Rewind: t=45s → t=5s (go back to debug earlier issue) -- Fast-forward: Simulate t=0-30s without audio (load samples, update state) -- Frame-by-frame: t → t+0.016s (precise debugging) - -**Properties:** -- ✅ Robust: Works for any timeline position -- ✅ Correct: Audio/visuals match target time exactly -- ⚠️ Performance: Doesn't need to be fast (debugging tool) -- ✅ Pre-warming: Must load samples for target time range - -### Implementation: AudioEngine::seek() - -```cpp -class AudioEngine { - public: - // Seek to arbitrary point in timeline - void seek(float target_time) { - #if !defined(STRIP_ALL) // Only in debug builds - - // 1. Reset synth state (clear all active voices) - synth_.reset(); - - // 2. Reset tracker state (re-scan patterns from scratch) - tracker_.reset(); - tracker_.set_time(target_time); - - // 3. Pre-warm samples for target time range - float prewarm_start = std::max(0.0f, target_time - 1.0f); - float prewarm_end = target_time + 2.0f; - tracker_.prewarm_lookahead(prewarm_start, prewarm_end); - - // 4. Simulate tracker up to target time (no audio rendering) - // This ensures tracker internal state is correct - const float dt = 0.1f; // Simulate in 0.1s steps - for (float t = 0.0f; t < target_time; t += dt) { - tracker_.update_silent(t); // Update state only, no audio - } - - // 5. Final update at exact target time - tracker_.update(target_time); - - // 6. Notify effects of seek event (optional) - sequence_.on_seek(target_time); - - current_time_ = target_time; - - #endif /* !defined(STRIP_ALL) */ - } - - // Fast-forward: Simulate time range without rendering - void fast_forward(float from_time, float to_time) { - const float dt = 0.1f; - for (float t = from_time; t < to_time; t += dt) { - tracker_.update_silent(t); - sequence_.update_silent(t, dt); // Update effects without rendering - } - } - - private: - float current_time_ = 0.0f; -}; -``` - -### Tracker Support for Seeking - -```cpp -class Tracker { - public: - void reset() { - // Clear all active patterns - for (auto& pattern : active_patterns_) { - pattern.active = false; - } - last_trigger_idx_ = 0; - } - - void set_time(float time) { - current_music_time_ = time; - - // Re-scan score to find patterns that should be active at this time - for (uint32_t i = 0; i < score_->num_triggers; ++i) { - const auto& trigger = score_->triggers[i]; - - if (trigger.time_sec <= time) { - // This pattern should be active (or was active) - // Determine if still active based on pattern duration - const auto& pattern = patterns_[trigger.pattern_id]; - float pattern_end = trigger.time_sec + get_pattern_duration(pattern); - - if (time < pattern_end) { - // Pattern is currently active, add it - activate_pattern(trigger.pattern_id, trigger.time_sec); - - // Fast-forward pattern to current beat - update_pattern_to_time(trigger.pattern_id, time); - } - - last_trigger_idx_ = i + 1; - } - } - } - - void update_silent(float time) { - // Update tracker state without triggering voices - // Used for fast-forward and seeking - update_internal(time, /*trigger_audio=*/false); - } - - private: - void update_pattern_to_time(int pattern_id, float target_time) { - // Advance pattern's event index to match target time - ActivePattern& active = find_active_pattern(pattern_id); - const TrackerPattern& pattern = patterns_[pattern_id]; - - float elapsed = target_time - active.start_music_time; - float beat_duration = 60.0f / score_->bpm; - float elapsed_beats = elapsed / beat_duration; - - // Advance event index past all events before target time - while (active.next_event_idx < pattern.num_events) { - if (pattern.events[active.next_event_idx].beat > elapsed_beats) { - break; - } - active.next_event_idx++; - } - } -}; -``` - -### Synth Support for Seeking - -```cpp -class Synth { - public: - void reset() { - // Clear all active voices (stop all sound immediately) - for (auto& voice : voices_) { - voice.active = false; - } - - // Reset time tracking - #if !defined(STRIP_ALL) - g_elapsed_time_sec = 0.0f; - #endif - } - - void set_time(float time) { - #if !defined(STRIP_ALL) - g_elapsed_time_sec = time; - #endif - } -}; -``` - -### Effect Sequence Support for Seeking - -```cpp -class Sequence { - public: - void on_seek(float target_time) { - // Notify all effects that might be active at target time - for (auto& layer : layers_) { - for (auto& entry : layer.effects) { - if (entry.start_time <= target_time && target_time < entry.end_time) { - // Effect is active at target time - entry.effect->on_seek(target_time); - } else if (target_time < entry.start_time) { - // Effect hasn't started yet, reset it - entry.effect->on_reset(); - } - } - } - } -}; - -class Effect { - public: - virtual void on_seek(float time) { - // Optional: Reset effect state to match target time - // Example: Particle system re-simulates to target time - } - - virtual void on_reset() { - // Optional: Clear all effect state - } -}; -``` - -### Usage Examples - -**Jump to specific scene:** -```cpp -// User presses 'J' key to jump to scene 3 (at t=45s) -if (key_pressed('J')) { - audio_engine.seek(45.0f); - camera.set_position(scene3_camera_pos); // Sync visuals too -} -``` - -**Rewind to beginning:** -```cpp -if (key_pressed('R')) { - audio_engine.seek(0.0f); -} -``` - -**Frame-by-frame debugging:** -```cpp -if (key_pressed('N')) { // Next frame - float dt = 1.0f / 60.0f; - audio_engine.update(current_time, dt); - current_time += dt; -} - -if (key_pressed('P')) { // Previous frame - float dt = 1.0f / 60.0f; - audio_engine.seek(std::max(0.0f, current_time - dt)); -} -``` - -**Fast-forward to load all samples:** -```cpp -// Pre-load all samples by fast-forwarding through entire demo -void preload_all_samples() { - audio_engine.fast_forward(0.0f, demo_duration); - audio_engine.seek(0.0f); // Reset to beginning -} -``` - -### Seeking Guarantees - -**Invariants after `seek(t)`:** -1. ✅ Tracker state matches time `t` (patterns active, event indices correct) -2. ✅ Synth has no active voices (silent state) -3. ✅ Samples for range [t-1s, t+2s] are pre-warmed -4. ✅ Next `update(t)` will trigger correct events -5. ✅ Audio output matches what would have played at time `t` - -**Edge Cases Handled:** -- Seek to t=0: Reset everything (initial state) -- Seek beyond demo end: Clamp to valid range -- Seek to same time twice: No-op (already there) -- Seek while audio playing: Stop all voices first - -### Performance Characteristics - -**Seek cost:** -- Synth reset: O(MAX_VOICES) ~ 64 voices → <1ms -- Tracker reset: O(MAX_PATTERNS) ~ 20 patterns → <1ms -- Pattern re-scan: O(num_triggers) ~ 100 triggers → <1ms -- Pre-warming: O(samples_in_range) ~ 5 samples → 5-10ms -- Total: **~10-20ms** (acceptable for debugging) - -**Optimization (if needed):** -- Cache tracker state snapshots every 5s -- Seek to nearest snapshot, then fast-forward delta -- Reduces re-scan cost for long demos - -### Testing Seeking Robustness - -```cpp -void test_seeking() { - AudioEngine engine; - engine.init(); - - // Test 1: Forward seek - engine.seek(10.0f); - assert(engine.get_time() == 10.0f); - - // Test 2: Backward seek (rewind) - engine.seek(5.0f); - assert(engine.get_time() == 5.0f); - - // Test 3: Seek to beginning - engine.seek(0.0f); - assert(engine.get_time() == 0.0f); - - // Test 4: Multiple seeks in sequence - engine.seek(20.0f); - engine.seek(30.0f); - engine.seek(15.0f); - assert(engine.get_time() == 15.0f); - - // Test 5: Verify samples loaded correctly after seek - engine.seek(45.0f); - const auto* spec = engine.get_resource_manager().get_spectrogram(SAMPLE_ENDING); - assert(spec != nullptr); // Sample for ending should be loaded -} -``` - ---- - -## FAQ - -### Q: How does this handle procedural vs asset spectrograms? - -**A:** Via a dedicated `SpectrogramResourceManager`: - -- **Assets:** Loaded from AssetManager, ResourceManager stores pointer only (doesn't own) -- **Procedurals:** Generated via `generate_note_spectrogram()`, ResourceManager owns memory -- **Unified Interface:** Both types returned as `const Spectrogram*`, caller doesn't care about source - -**Key Insight:** Separating resource loading from pattern sequencing allows: -- Tracker focuses on timing/triggering (single responsibility) -- Resources loaded once at init (no per-frame overhead) -- Easy to add new spectrogram sources (streaming, compressed, etc.) - -### Q: Does AudioEngine depend on AssetManager? - -**A:** Indirectly, via ResourceManager: - -``` -AudioEngine → ResourceManager → AssetManager (via GetAsset) - ↘ Tracker (no asset dependency) - ↘ Synth (no asset dependency) -``` - -**Benefits:** -- Only one component knows about assets (ResourceManager) -- Easy to mock for testing (inject fake ResourceManager) -- If AssetManager is refactored, only ResourceManager needs updating - -### Q: What happens to the current caching mechanism? - -**A:** Improved and centralized: - -**Current (Tracker-owned):** -```cpp -static int g_sample_synth_cache[256]; // Maps sample_id → synth_id -``` - -**Proposed (AudioEngine-owned):** -```cpp -class AudioEngine { - private: - int sample_to_synth_id_[256]; // Clearer ownership -}; -``` - -**Why better?** -- Cache lifetime matches engine lifetime (no stale data) -- Cleared automatically on `engine.reset()` -- Testable (can verify cache contents) - -### Q: How does seeking/scrubbing work for debugging? - -**A:** Robust seek API that resets state and pre-warms samples: - -```cpp -audio_engine.seek(45.0f); // Jump to t=45s - -// What happens internally: -// 1. synth.reset() - Clear all active voices -// 2. tracker.reset() - Clear active patterns -// 3. tracker.set_time(45.0) - Re-scan score, activate patterns at t=45 -// 4. prewarm(43-47s) - Load samples for target range -// 5. update(45.0) - Trigger events at exact time -``` - -**Supported operations:** -- **Jump forward:** `seek(45.0f)` - Skip to specific scene -- **Rewind:** `seek(5.0f)` - Go back to earlier time -- **Fast-forward:** `fast_forward(0, 30)` - Simulate without audio -- **Frame-by-frame:** `seek(t + 0.016)` - Step through frames - -**Performance:** ~10-20ms per seek (acceptable for debugging) - -**Guarantees:** -- ✅ Audio/visuals match target time exactly -- ✅ Tracker state correct (patterns active, event indices) -- ✅ Samples pre-warmed (no stutter on next update) -- ✅ Works for any timeline position (0 → end) - -**Development workflow:** -```cpp -// Press 'J' to jump to problematic scene at t=37s -if (key_pressed('J')) { - audio_engine.seek(37.0f); - // Audio and visuals now at t=37s, ready to debug -} -``` - -### Q: Why not load all samples at init? - -**A:** Lazy loading with pre-warming is more efficient: - -**Problems with eager loading:** -- Slow startup: Load all 19 samples before t=0 (~50-100ms delay) -- Wasted memory: Sample used only at t=45s stays loaded from t=0 -- No benefit: 60s demo doesn't need all samples loaded upfront - -**Lazy + Pre-warm strategy:** -``` -t=0.0s: Load 0 samples (instant startup) -t=0.0s: Pre-warm 3-5 samples for t=0-2s range (background) -t=1.0s: Pre-warm 2-3 more for t=1-3s range -By t=10s: Only ~10 samples loaded (not all 19) -``` - -**Benefits:** -- ✅ Instant startup (no initial loading) -- ✅ No trigger stutter (pre-warming prevents load-on-access) -- ✅ Memory efficient (only active + upcoming samples) -- ✅ Spreads load cost over time - -**Future: On-demand decompression (Task #27)** -When assets are compressed, decompress only when needed: -- Compressed asset in binary: 2KB -- Decompressed in memory: 8KB (4x larger) -- Only decompress samples actually used (save memory) - -### Q: How does this affect binary size? - -**Estimated overhead:** -- `SpectrogramResourceManager` (basic): ~300 bytes -- `SpectrogramResourceManager` (with lazy loading): ~500 bytes -- `AudioEngine` wrapper: ~200 bytes -- Total: **~700 bytes** (1.1% of 64KB budget) - -**Savings from removing global state:** ~100 bytes -**Net impact:** **~600 bytes** (acceptable for robustness + lazy loading) - ---- - -## Compile-Time Configuration Flags - -**Optional features controlled by CMake flags:** - -```cmake -# CMakeLists.txt -option(DEMO_ENABLE_CACHE_EVICTION "Enable automatic cache eviction (LRU)" OFF) -option(DEMO_ENABLE_TIMELINE_SEEKING "Enable timeline seeking for debugging" ON) -option(DEMO_ENABLE_ASYNC_LOADING "Enable background thread pre-warming" OFF) -``` - -**Usage in code:** - -```cpp -// Cache eviction (optional, for long demos or low memory) -#if defined(DEMO_ENABLE_CACHE_EVICTION) -void SpectrogramResourceManager::try_evict_lru() { - // ... LRU eviction logic -} -#endif - -// Timeline seeking (enabled by default in debug, stripped in STRIP_ALL) -#if defined(DEMO_ENABLE_TIMELINE_SEEKING) && !defined(STRIP_ALL) -void AudioEngine::seek(float target_time) { - // ... seeking logic -} -#endif - -// Async loading (optional, for zero-latency pre-warming) -#if defined(DEMO_ENABLE_ASYNC_LOADING) -void SpectrogramResourceManager::prewarm_async(int sample_id) { - thread_pool_.enqueue([this, sample_id]() { - load_now(&cache_[sample_id]); - }); -} -#endif -``` - -**Recommended configurations:** - -| Build Type | Cache Eviction | Timeline Seeking | Async Loading | -|------------|---------------|-----------------|---------------| -| **Development** | ON (safety) | ON (debugging) | OFF (simplicity) | -| **Testing** | ON (coverage) | ON (test seeks) | OFF (deterministic) | -| **Release** | OFF (explicit) | OFF (no debug) | OFF (size) | -| **Final 64k** | OFF | OFF (`STRIP_ALL`) | OFF | - -**Size impact:** -- Cache eviction: +150 bytes -- Timeline seeking: +300 bytes (stripped in final) -- Async loading: +250 bytes - -**Binary size budget:** -``` -Base AudioEngine: 500 bytes -+ Lazy loading: +200 bytes -+ Seeking (debug only): +300 bytes (stripped) -+ Cache eviction (opt): +150 bytes (disabled) -+ Async loading (opt): +250 bytes (disabled) -──────────────────────────────────── -Development build: 1150 bytes -Final 64k build: 700 bytes -``` - ---- +A key feature enabled by this refactor is robust timeline seeking, which is invaluable for development and debugging. The `AudioEngine::seek()` method (`#if !defined(STRIP_ALL)`) allows jumping to any point in the demo timeline. -## References +**`seek(target_time)` Process:** +1. **Reset:** The synth and tracker states are completely reset, clearing all active voices and pattern states. +2. **State Reconstruction:** The tracker re-scans the musical score from the beginning up to `target_time` to determine which patterns should be active. +3. **Pre-warming:** The resource manager pre-loads all samples needed for the time range around `target_time`. +4. **Ready:** The audio system is now in the exact state it would have been if the demo had run normally up to `target_time`, ready for playback to resume. -- Current issue: Commit 7721f57 "fix(audio): Resolve tracker test failures..." -- Related: `doc/TRACKER.md`, `doc/ASSET_SYSTEM.md` -- Design patterns: Facade pattern, Dependency Injection, Resource Manager pattern +This allows developers to instantly jump to a specific scene to debug audio or visual issues without having to watch the demo from the start. |
