| author | skal <pascal.massimino@gmail.com> | 2026-02-07 11:00:27 +0100 |
|---|---|---|
| committer | skal <pascal.massimino@gmail.com> | 2026-02-07 11:00:27 +0100 |
| commit | 13e41ff17ba91c07197e318b3235373aef845023 (patch) | |
| tree | ca56d233a05948b9c0b49230cf71d41903574cb9 /TODO.md | |
| parent | 375414e1d790c5a2b521aa457b5d77b8cf620b40 (diff) | |
docs(todo): Add Task #69 - Convert audio pipeline to clipped int16
Added low-priority task to convert audio processing from float32 to
clipped int16 for faster/easier processing and reduced memory footprint.
Scope: Three-phase approach (output → mixing → full pipeline)
Trade-offs: Quality vs performance/size
Priority: Low (final optimization only, if 64k budget requires it)
Benefits:
- Simpler arithmetic (no float operations)
- Smaller memory footprint (2 bytes vs 4 bytes)
- Hardware-native format (eliminates conversion)
- Natural clipping behavior
Testing requirements documented for quality validation.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Diffstat (limited to 'TODO.md')
| -rw-r--r-- | TODO.md | 50 |
1 files changed, 50 insertions, 0 deletions
@@ -327,6 +327,56 @@ This file tracks prioritized tasks with detailed attack plans.
   - **Benefits**: Quantify FFT speedup, validate SIMD optimizations, identify regressions
   - **Priority**: Very Low (nice-to-have for future optimization work)
 
+- [ ] **Task #69: Convert Audio Pipeline to Clipped Int16**: Use clipped int16 for all audio processing
+  - **Current**: Audio pipeline uses float32 throughout (generation, mixing, synthesis, output)
+  - **Goal**: Convert to clipped int16 for faster/easier processing and reduced memory footprint
+  - **Rationale**:
+    - Simpler arithmetic (no float operations)
+    - Smaller memory footprint (2 bytes vs 4 bytes per sample)
+    - Hardware-native format (most audio devices use int16)
+    - Eliminates float→int16 conversion at output stage
+    - Natural clipping behavior (overflow wraps/clips automatically)
+  - **Scope**:
+    - Output path: Definitely convert (backends, WAV dump)
+    - Synthesis: Consider keeping float32 for quality (IDCT produces float)
+    - Mixing: Could use int16 with proper overflow handling
+    - Asset storage: Already int16 in .spec files
+  - **Implementation Phases**:
+    1. **Phase 1: Output Only** (Minimal change, ~50 lines)
+       - Convert `synth_render()` output from float to int16
+       - Update `MiniaudioBackend` and `WavDumpBackend` to accept int16
+       - Keep all internal processing as float
+       - **Benefit**: Eliminates final conversion step
+    2. **Phase 2: Mixing Stage** (Moderate change, ~200 lines)
+       - Convert voice mixing to int16 arithmetic
+       - Add saturation/clipping logic
+       - Keep IDCT output as float, convert after synthesis
+       - **Benefit**: Faster mixing, reduced memory bandwidth
+    3. **Phase 3: Full Pipeline** (Large change, ~500+ lines)
+       - Convert spectrograms from float to int16 storage
+       - Modify IDCT to output int16 directly
+       - All synthesis in int16
+       - **Benefit**: Maximum size reduction and performance
+  - **Trade-offs**:
+    - Quality loss: 16-bit resolution vs 32-bit float precision
+    - Dynamic range: Limited to [-32768, 32767]
+    - Clipping: Must handle overflow carefully in mixing stage
+    - Code complexity: Saturation arithmetic more complex than float
+  - **Testing Requirements**:
+    - Verify no audible quality degradation
+    - Ensure clipping behavior matches float version
+    - Check mixing overflow doesn't cause artifacts
+    - Validate WAV dumps bit-identical to hardware output
+  - **Size Impact**:
+    - Phase 1: Negligible (~50 bytes)
+    - Phase 2: Small reduction (~100-200 bytes, faster code)
+    - Phase 3: Large reduction (50% memory, ~1-2KB code savings)
+  - **Priority**: Low (final optimization, after size budget is tight)
+  - **Notes**:
+    - This is a FINAL optimization task, only if 64k budget requires it
+    - Quality must be validated - may not be worth the trade-off
+    - Consider keeping float for procedural generation quality
+
 ### Developer Tools
 - [ ] **Task #66: External Asset Loading for Debugging**: mmap() asset files instead of embedded data
   - **Current**: All assets embedded in `assets_data.cc` (regenerate on every asset change)
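
The Phase 1 and Phase 2 steps of Task #69 (float to clipped int16 at the output, then int16 mixing with saturation) could look roughly like the C++17 sketch below. The helper names `float_to_i16_clipped`, `add_sat_i16`, and `mix_into_i16` are hypothetical and do not appear in the commit; the nominal [-1, 1] float range and the 32767 scale factor are likewise assumptions. The sketch also makes one point explicit: in C++, narrowing an out-of-range sum to int16 wraps (modular since C++20, implementation-defined before that) rather than clipping, so the sum has to be widened and clamped by hand.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the commit): convert one float sample,
// assumed to be nominally in [-1, 1], to a clipped int16 sample.
// NaN input is not handled here.
inline int16_t float_to_i16_clipped(float x) {
  const float scaled = x * 32767.0f;  // assumed scale factor
  const float clamped = std::clamp(scaled, -32768.0f, 32767.0f);
  return static_cast<int16_t>(clamped);  // truncates toward zero
}

// Hypothetical helper: saturating add of two int16 samples.
// The sum is computed in 32 bits so it cannot overflow, then clamped
// to the int16 range before narrowing.
inline int16_t add_sat_i16(int16_t a, int16_t b) {
  const int32_t sum = static_cast<int32_t>(a) + static_cast<int32_t>(b);
  return static_cast<int16_t>(
      std::clamp<int32_t>(sum, INT16_MIN, INT16_MAX));
}

// Hypothetical helper: mix `src` into `dst` in place, sample by sample,
// using the saturating add above.
inline void mix_into_i16(int16_t* dst, const int16_t* src, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    dst[i] = add_sat_i16(dst[i], src[i]);
  }
}
```

Clamping after every add keeps the code small, but it is not equivalent to summing all voices in a wider accumulator and clamping once: intermediate clipping can change the result when voices partially cancel. Which variant matches the float version's clipping behavior is something the testing requirements in the task would need to settle.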
