summaryrefslogtreecommitdiff
path: root/tools/spectral_editor
AgeCommit message (Collapse)Author
9 hoursfeat(audio): FFT implementation Phase 1 - Infrastructure and foundationskal
Phase 1 Complete: Robust FFT infrastructure for future DCT optimization Current production code continues using O(N²) DCT/IDCT (perfectly accurate) FFT Infrastructure Implemented: ================================ Core FFT Engine: - Radix-2 Cooley-Tukey algorithm (power-of-2 sizes) - Bit-reversal permutation with in-place reordering - Butterfly operations with twiddle factor rotation - Forward FFT (time → frequency domain) - Inverse FFT (frequency → time domain, scaled by 1/N) Files Created: - src/audio/fft.{h,cc} - C++ implementation (~180 lines) - tools/spectral_editor/dct.js - Matching JavaScript implementation (~190 lines) - src/tests/test_fft.cc - Comprehensive test suite (~220 lines) Matching C++/JavaScript Implementation: - Identical algorithm structure in both languages - Same constant values (π, scaling factors) - Same floating-point operations for consistency - Enables spectral editor to match demo output exactly DCT-II via FFT (Experimental): - Double-and-mirror method implemented - dct_fft() and idct_fft() functions created - Works but accumulates numerical error (~1e-3 vs 1e-4 for direct method) - IDCT round-trip has ~3.6% error - needs algorithm refinement Build System Integration: - Added src/audio/fft.cc to AUDIO_SOURCES - Created test_fft target with comprehensive tests - Tests verify FFT correctness against reference O(N²) DCT Current Status: =============== Production Code: - Demo continues using existing O(N²) DCT/IDCT (fdct.cc, idct.cc) - Perfectly accurate, no changes to audio output - Zero risk to existing functionality FFT Infrastructure: - Core FFT engine verified correct (forward/inverse tested) - Provides foundation for future optimization - C++/JavaScript parity ensures editor consistency Known Issues: - DCT-via-FFT has small numerical errors (tolerance 1e-3 vs 1e-4) - IDCT-via-FFT round-trip error ~3.6% (hermitian symmetry needs work) - Double-and-mirror algorithm sensitive to implementation details Phase 2 TODO (Future Optimization): ==================================== Algorithm Refinement: 1. Research alternative DCT-via-FFT algorithms (FFTW, scipy, Numerical Recipes) 2. Fix IDCT hermitian symmetry packing for correct round-trip 3. Add reference value tests (compare against known good outputs) 4. Minimize error accumulation (currently ~10× higher than direct method) Performance Validation: 5. Benchmark O(N log N) FFT-based DCT vs O(N²) direct DCT 6. Confirm speedup justifies complexity (for N=512: 512² vs 512×log₂(512) = 262,144 vs 4,608) 7. Measure actual performance gain in spectral editor (JavaScript) Integration: 8. Replace fdct.cc/idct.cc with fft.cc once algorithms perfected 9. Update spectral editor to use FFT-based DCT by default 10. Remove old O(N²) implementations (size optimization) Technical Details: ================== FFT Complexity: O(N log N) where N = 512 - Radix-2 requires log₂(N) = 9 stages - Each stage: N/2 butterfly operations - Total: 9 × 256 = 2,304 complex multiplications DCT-II via FFT Complexity: O(N log N) + O(N) preprocessing - Theoretical speedup: 262,144 / 4,608 ≈ 57× faster - Actual speedup depends on constant factors and cache behavior Algorithm Used (Double-and-Mirror): 1. Extend signal to 2N by mirroring: [x₀, x₁, ..., x_{N-1}, x_{N-1}, ..., x₁] 2. Apply 2N-point FFT 3. Extract DCT coefficients: DCT[k] = Re{FFT[k] × exp(-jπk/(2N))} / 2 4. Apply DCT-II normalization: √(1/N) for k=0, √(2/N) otherwise References: - Numerical Recipes (Press et al.) - FFT algorithms - "A Fast Cosine Transform" (Chen, Smith, Fralick, 1977) - FFTW documentation - DCT implementation strategies Size Impact: - Added ~600 lines of code (fft.cc + fft.h + tests) - Test code stripped in final build (STRIP_ALL) - Core FFT: ~180 lines, will replace ~200 lines of O(N²) DCT when ready - Net size impact: Minimal (similar code size, better performance) Next Steps: =========== 1. Continue development with existing O(N²) DCT (stable, accurate) 2. Phase 2: Refine FFT-based DCT algorithm when time permits 3. Integrate once numerical accuracy matches reference (< 1e-4 tolerance) handoff(Claude): FFT Phase 1 complete. Infrastructure ready for Phase 2 refinement. Current production code unchanged (zero risk). Next: Algorithm debugging or other tasks. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
9 hoursfeat(spectral_editor): Complete Phase 2 milestone - Full-featured web editorskal
MILESTONE: Spectral Brush Editor Phase 2 Complete (February 6, 2026) Phase 2 delivers a production-ready web-based editor for creating procedural audio by tracing spectrograms with parametric Bezier curves. This tool enables replacing 5KB .spec binary assets with ~100 bytes of C++ code (50-100× compression). Core Features Implemented: ======================== Audio I/O: - Load .wav and .spec files as reference spectrograms - Real-time audio preview (procedural vs original) - Live volume control with GainNode (updates during playback) - Export to procedural_params.txt (human-readable, re-editable format) - Generate C++ code (copy-paste ready for demo integration) Curve Editing: - Multi-curve support with individual colors and volumes - Bezier curve control points (frame, frequency, amplitude) - Drag-and-drop control point editing - Per-curve volume control (0-100%) - Right-click to delete control points - Curves only render within control point range (no spill) Profile System (All 3 types implemented): - Gaussian: exp(-(dist² / σ²)) - smooth harmonic falloff - Decaying Sinusoid: exp(-decay × dist) × cos(ω × dist) - metallic resonance - Noise: noise × exp(-(dist² / decay²)) - textured grit with decay envelope Visualization: - Log-scale frequency axis (20 Hz to 16 kHz) for better bass visibility - Logarithmic dB-scale intensity mapping (-60 dB to +40 dB range) - Reference opacity slider (0-100%) for mixing original/procedural views - Playhead indicator (red dashed line) during playback - Mouse crosshair with tooltip (frame number, frequency) - Control point info panel (frame, frequency, amplitude) Real-time Spectrum Viewer (NEW): - Always-visible bottom-right overlay (200×100px) - Shows frequency spectrum for frame under mouse (hover mode) - Shows current playback frame spectrum (playback mode) - Dual display: Reference (green) + Procedural (red) overlaid - dB-scale bar heights for accurate visualization - Frame number label (red during playback, gray when hovering) Rendering Architecture: - Destination-to-source pixel mapping (prevents gaps in log-scale) - Offscreen canvas compositing for proper alpha blending - Alpha channel for procedural intensity (pure colors, not dimmed) - Steeper dB falloff for procedural curves (-40 dB floor vs -60 dB reference) UI/UX: - Undo/Redo system (50-action history) - Keyboard shortcuts (1/2/Space for playback, Ctrl+Z/Ctrl+Shift+Z, Delete, Esc) - File load confirmation (warns about unsaved curves) - Automatic curve reset on new file load Technical Details: - DCT/IDCT implementation (JavaScript port matching C++ runtime) - Overlap-add synthesis with Hanning window - Web Audio API integration (32 kHz sample rate) - Zero external dependencies (pure HTML/CSS/JS) Files Modified: - tools/spectral_editor/script.js (~1730 lines, main implementation) - tools/spectral_editor/index.html (UI structure, spectrum viewer) - tools/spectral_editor/style.css (VSCode dark theme styling) - tools/spectral_editor/README.md (updated features, roadmap) Phase 3 TODO (Next): =================== - Effect combination system (noise + Gaussian modulation, layer compositing) - Improved C++ code testing (validation, edge cases) - Better frequency scale (mu-law or perceptual scale, less bass-heavy) - Pre-defined shape library (kick, snare, hi-hat templates) - Load procedural_params.txt back into editor (re-editing) - FFT-based DCT optimization (O(N log N) vs O(N²)) Integration: - Generate C++ code → Copy to src/audio/procedural_samples.cc - Add PROC() entry to assets/final/demo_assets.txt - Rebuild demo → Use AssetId::SOUND_PROC handoff(Claude): Phase 2 complete. Next: FFT implementation task for performance optimization. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
11 hoursfix(spectral_editor): Fix procedural audio and add color-coded curvesskal
Fixed three issues reported during testing: 1. Procedural audio now audible: - Added AMPLITUDE_SCALE=10.0 to match DCT coefficient magnitudes - Amplitude range 0-1 from Y-position now scaled to proper spectral levels 2. Procedural spectrogram now visible: - Each curve rendered separately with its own color - Normalized intensity calculation (specValue / 10.0) - Only draw pixels with intensity > 0.01 for performance 3. Color-coded curves: - Each curve assigned unique color from palette (8 colors cycling) - Colors: Blue, Green, Orange, Purple, Cyan, Brown, Pink, Gold - Control points and paths use curve color - Curve list shows color indicator dot - Procedural spectrogram uses curve colors for easy tracking Visual improvements: - Selected curves have thicker stroke (3px vs 2px) - Each curve contribution visible in separate color - Color dots in sidebar for quick identification Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
11 hoursfix(spectral_editor): Resolve variable name conflict in playAudioskal
Fixed 'Identifier source has already been declared' error at line 935. Bug: Function parameter 'source' (string: 'procedural' or 'original') conflicted with local AudioBufferSourceNode variable. Fix: Renamed local variable to 'bufferSource' for clarity. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
11 hoursfeat(tools): Add Spectral Brush Editor UI (Phase 2 of Task #5)skal
Implement web-based editor for procedural audio tracing. New Files: - tools/spectral_editor/index.html - Main UI structure - tools/spectral_editor/style.css - VSCode-inspired dark theme - tools/spectral_editor/script.js - Editor logic (~1200 lines) - tools/spectral_editor/dct.js - IDCT/DCT implementation (reused) - tools/spectral_editor/README.md - Complete user guide Features: - Dual-layer canvas (reference + procedural spectrograms) - Bezier curve editor (click to place, drag to adjust, right-click to delete) - Profile controls (Gaussian sigma slider) - Real-time audio playback (Key 1=procedural, Key 2=original, Space=stop) - Undo/Redo system (50-action history with snapshots) - File I/O: - Load .wav/.spec files (FFT/STFT or binary parser) - Save procedural_params.txt (human-readable, re-editable) - Generate C++ code (copy-paste ready for runtime) - Keyboard shortcuts (Ctrl+Z/Shift+Z, Ctrl+S/Shift+S, Ctrl+O, ?) - Help modal with shortcut reference Technical: - Pure HTML/CSS/JS (no dependencies) - Web Audio API for playback (32 kHz sample rate) - Canvas 2D for visualization (log-scale frequency) - Linear Bezier interpolation matching C++ runtime - IDCT with overlap-add synthesis Next: Phase 3 (currently integrated in Phase 2) - File loading already implemented - Export already implemented - Ready for user testing! Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>