# Spectral Brush Editor (Task #5)

## Concept

The **Spectral Brush Editor** is a web-based tool for creating procedural audio by tracing spectrograms with parametric curves. It enables compact representation of audio samples for 64k demos.

### Goal

Replace large `.spec` asset files with tiny procedural C++ code:

- **Before:** 5 KB binary `.spec` file
- **After:** ~100 bytes of C++ code calling `draw_bezier_curve()`

### Workflow

```
.wav file → Load in editor → Trace with spectral brushes → Export params + C++ code
                                                                    ↓
                                                                 (later)
procedural_params.txt → Load in editor → Adjust curves → Re-export
```

---

## Core Primitive: "Spectral Brush"

A spectral brush consists of two components:

### 1. Central Curve (Bezier)

Traces a path through time-frequency space:

```
{freq_bin, amplitude} = bezier(frame_number)
```

**Properties:**

- Control points: `[(frame, freq_hz, amplitude), ...]`
- Interpolation: Linear (between control points)
- Future: Cubic Bezier, Catmull-Rom splines

**Example:**

```javascript
control_points: [
  {frame: 0,   freq_hz: 200.0, amplitude: 0.9},  // Attack
  {frame: 20,  freq_hz: 80.0,  amplitude: 0.7},  // Sustain
  {frame: 100, freq_hz: 50.0,  amplitude: 0.0}   // Decay
]
```

### 2. Vertical Profile

At each frame, applies a shape **vertically** in frequency bins around the central curve.

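
The two components above can be sketched in a few lines of C++. This is a minimal illustration, not the project's implementation: `evaluate_bezier_linear` reuses the name and signature declared in the runtime API section, but its body, the clamping outside the control range, the hypothetical helpers `hz_to_bin` and `stamp_profile`, the linear Hz-to-bin mapping, and the frame-major buffer layout are all assumptions.

```cpp
#include <cmath>
#include <cstddef>

// Piecewise-linear interpolation through control points sorted by frame.
// Assumption: values are clamped to the first/last point outside the range.
float evaluate_bezier_linear(const float* control_frames,
                             const float* control_values,
                             int n_points, float frame) {
  if (n_points <= 0) return 0.0f;
  if (frame <= control_frames[0]) return control_values[0];
  if (frame >= control_frames[n_points - 1]) return control_values[n_points - 1];
  for (int i = 0; i + 1 < n_points; ++i) {
    const float f0 = control_frames[i];
    const float f1 = control_frames[i + 1];
    if (frame <= f1) {
      const float t = (f1 > f0) ? (frame - f0) / (f1 - f0) : 0.0f;
      return control_values[i] + t * (control_values[i + 1] - control_values[i]);
    }
  }
  return control_values[n_points - 1];
}

// Hypothetical helper: map Hz to a fractional bin.
// Assumption: dct_size bins cover 0..Nyquist linearly.
float hz_to_bin(float freq_hz, float sample_rate, int dct_size) {
  return freq_hz / (0.5f * sample_rate) * static_cast<float>(dct_size);
}

// Hypothetical helper: add amp * profile(|bin - center_bin|) to one frame.
// Assumption: frame-major layout, spec[frame * dct_size + bin].
void stamp_profile(float* spec, int dct_size, int frame, float center_bin,
                   float amp, float (*profile)(float dist)) {
  float* column = spec + static_cast<std::size_t>(frame) * dct_size;
  for (int bin = 0; bin < dct_size; ++bin) {
    const float dist = std::fabs(static_cast<float>(bin) - center_bin);
    column[bin] += amp * profile(dist);
  }
}
```

With the example control points, frame 10 lies halfway between the first two points, so the interpolated frequency is 140 Hz; under the assumed mapping, 200 Hz at a 32000 Hz sample rate with 512 bins lands at bin 6.4.
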

**Profile Types:**

#### Gaussian (Smooth harmonic)

```
amplitude(dist) = exp(-(dist² / σ²))
```

- **σ (sigma)**: Width in frequency bins
- Use case: Musical tones, bass notes, melodic lines

#### Decaying Sinusoid (Textured/resonant)

```
amplitude(dist) = exp(-decay * dist) * cos(ω * dist)
```

- **decay**: Falloff rate
- **ω (omega)**: Oscillation frequency
- Use case: Metallic sounds, bells, resonant tones

#### Noise (Textured/gritty)

```
amplitude(dist) = random(seed, dist) * noise_amplitude
```

- **seed**: Deterministic RNG seed
- Use case: Hi-hats, cymbals, textured sounds

#### Composite (Combinable)

```javascript
{
  type: "composite",
  operation: "add" | "subtract" | "multiply",
  profiles: [
    {type: "gaussian", sigma: 30.0},
    {type: "noise", amplitude: 0.1, seed: 42}
  ]
}
```

---

## Visual Model

```
Frequency (bins)
    ^
    |
512 |                                  *         ← Gaussian profile (frame 80)
    |                        *        * *
256 |       ****            * *        *         ← Gaussian profile (frame 50)
    |     **    **          * *
128 |   **    Curve  *       *                   ← Central Bezier curve
    |  **             *
 64 | **               *
    | *                 *
  0 +--*--*--*--*--*--*--*--*---→ Time (frames)
    0  10 20 30 40 50 60 70 80

At each frame:
1. Evaluate curve → get freq_bin_0 and amplitude
2. Draw profile vertically at that frame
```

---

## File Formats

### A. `procedural_params.txt` (Human-readable, re-editable)

```text
# Kick drum spectral brush definition
METADATA dct_size=512 num_frames=100 sample_rate=32000

CURVE bezier
  CONTROL_POINT 0 200.0 0.9    # frame, freq_hz, amplitude
  CONTROL_POINT 20 80.0 0.7
  CONTROL_POINT 50 60.0 0.3
  CONTROL_POINT 100 50.0 0.0
  PROFILE gaussian sigma=30.0
  PROFILE_ADD noise amplitude=0.1 seed=42
END_CURVE

CURVE bezier
  CONTROL_POINT 0 500.0 0.5
  CONTROL_POINT 30 300.0 0.2
  PROFILE decaying_sinusoid decay=0.15 frequency=0.8
END_CURVE
```

**Purpose:**

- Load back into editor for re-editing
- Human-readable, version-control friendly
- Can be hand-tweaked in text editor

### B. C++ Code (Ready to compile)

```cpp
// Generated from procedural_params.txt
// File: src/audio/gen_kick_procedural.cc

#include "audio/spectral_brush.h"

void gen_kick_procedural(float* spec, int dct_size, int num_frames) {
  // Curve 0: Low-frequency punch with noise texture
  {
    const float frames[] = {0.0f, 20.0f, 50.0f, 100.0f};
    const float freqs[] = {200.0f, 80.0f, 60.0f, 50.0f};
    const float amps[] = {0.9f, 0.7f, 0.3f, 0.0f};
    draw_bezier_curve(spec, dct_size, num_frames,
                      frames, freqs, amps, 4,
                      PROFILE_GAUSSIAN, 30.0f);
    draw_bezier_curve_add(spec, dct_size, num_frames,
                          frames, freqs, amps, 4,
                          PROFILE_NOISE, 0.1f, 42.0f);
  }

  // Curve 1: High-frequency attack
  {
    const float frames[] = {0.0f, 30.0f};
    const float freqs[] = {500.0f, 300.0f};
    const float amps[] = {0.5f, 0.2f};
    draw_bezier_curve(spec, dct_size, num_frames,
                      frames, freqs, amps, 2,
                      PROFILE_DECAYING_SINUSOID, 0.15f, 0.8f);
  }
}

// Usage in demo_assets.txt:
// KICK_PROC, PROC(gen_kick_procedural), NONE, "Procedural kick drum"
```

**Purpose:**

- Copy-paste into `src/audio/procedural_samples.cc`
- Compile directly into demo
- Zero runtime parsing overhead

---

## C++ Runtime API

### New Files

#### `src/audio/spectral_brush.h`

```cpp
#pragma once
#include <cstdint>  // uint32_t

enum ProfileType {
  PROFILE_GAUSSIAN = 0,
  PROFILE_DECAYING_SINUSOID = 1,
  PROFILE_NOISE = 2
};

// Evaluate linear Bezier interpolation at frame t
float evaluate_bezier_linear(const float* control_frames,
                             const float* control_values,
                             int n_points,
                             float frame);

// Draw spectral brush: Bezier curve with vertical profile
void draw_bezier_curve(float* spectrogram,
                       int dct_size,
                       int num_frames,
                       const float* control_frames,
                       const float* control_freqs_hz,
                       const float* control_amps,
                       int n_control_points,
                       ProfileType profile_type,
                       float profile_param1,
                       float profile_param2 = 0.0f);

// Additive variant (for compositing profiles)
void draw_bezier_curve_add(float* spectrogram,
                           int dct_size,
                           int num_frames,
                           const float* control_frames,
                           const float* control_freqs_hz,
                           const float* control_amps,
                           int n_control_points,
                           ProfileType profile_type,
                           float profile_param1,
                           float profile_param2 = 0.0f);

// Profile evaluation
float evaluate_profile(ProfileType type,
                       float distance,
                       float param1,
                       float param2);

// Home-brew deterministic RNG (small, portable)
uint32_t spectral_brush_rand(uint32_t seed);
```

**Future Extensions:**

- Cubic Bezier interpolation: `evaluate_bezier_cubic()`
- Generic loader: `gen_from_params(const PrimitiveData*)`

---

## Editor Architecture

### Technology Stack

- **HTML5 Canvas**: Spectrogram visualization
- **Web Audio API**: Playback (IDCT → audio)
- **Pure JavaScript**: No dependencies
- **Reuse from existing editor**: `dct.js` (IDCT implementation)

### Editor UI Layout

```
┌─────────────────────────────────────────────────────────────┐
│ Spectral Brush Editor                           [Load .wav] │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌───────────────────────────────────────────────────┐      │
│  │                                                   │      │
│  │ CANVAS (Spectrogram Display)                      │  T   │
│  │                                                   │  O   │
│  │ • Background: Reference .wav (gray, transparent)  │  O   │
│  │ • Foreground: Procedural curves (colored)         │  L   │
│  │ • Bezier control points (draggable circles)       │  S   │
│  │                                                   │      │
│  │                                                   │   [+]│
│  │                                                   │   [-]│
│  │                                                   │   [x]│
│  └───────────────────────────────────────────────────┘      │
│                                                             │
├─────────────────────────────────────────────────────────────┤
│ Profile: [Gaussian ▾]    Sigma: [████████░░] 30.0           │
│ Curves:  [Curve 0 ▾]     Amplitude: [██████░░░░] 0.8        │
├─────────────────────────────────────────────────────────────┤
│ [1] Play Procedural   [2] Play Original   [Space] Pause     │
│ [Save Params]  [Generate C++]  [Undo]  [Redo]               │
└─────────────────────────────────────────────────────────────┘
```

### Features (Phase 1: Minimal Working Version)

#### Editing

- Click to place Bezier control points
- Drag to adjust control points (frame, frequency, amplitude)
- Delete control points (right-click or Delete key)
- Undo/Redo support (action history)

#### Visualization

- Dual-layer canvas:
  - **Background**: Reference
    spectrogram (semi-transparent gray)
  - **Foreground**: Procedural spectrogram (colored)
- Log-scale frequency axis (musical perception)
- Control points: Draggable circles with labels

#### Audio Playback

- **Key '1'**: Play procedural sound
- **Key '2'**: Play original .wav
- **Space**: Play/pause toggle

#### File I/O

- Load .wav or .spec (reference sound)
- Save procedural_params.txt (re-editable)
- Generate C++ code (copy-paste ready)

#### Undo/Redo

- Action history with snapshots
- Commands: Add control point, Move control point, Delete control point, Change profile
- **Ctrl+Z**: Undo
- **Ctrl+Shift+Z**: Redo

### Keyboard Shortcuts

| Key | Action |
|-----|--------|
| **1** | Play procedural sound |
| **2** | Play original .wav |
| **Space** | Play/pause |
| **Delete** | Delete selected control point |
| **Esc** | Deselect all |
| **Ctrl+Z** | Undo |
| **Ctrl+Shift+Z** | Redo |
| **Ctrl+S** | Save procedural_params.txt |
| **Ctrl+Shift+S** | Generate C++ code |

---

## Implementation Plan

### Phase 1: C++ Runtime (Foundation)

**Files:** `src/audio/spectral_brush.h`, `src/audio/spectral_brush.cc`

**Tasks:**

- [ ] Define API (`ProfileType`, `draw_bezier_curve()`, etc.)
- [ ] Implement linear Bezier interpolation
- [ ] Implement Gaussian profile evaluation
- [ ] Implement home-brew RNG (for future noise support)
- [ ] Add unit tests (`src/tests/test_spectral_brush.cc`)

**Deliverable:** Compiles, tests pass

---

### Phase 2: Editor Core

**Files:** `tools/spectral_editor/index.html`, `script.js`, `style.css`, `dct.js` (reuse)

**Tasks:**

- [ ] HTML structure (canvas, controls, file input)
- [ ] Canvas rendering (dual-layer: reference + procedural)
- [ ] Bezier curve editor (place/drag/delete control points)
- [ ] Profile controls (Gaussian sigma slider)
- [ ] Real-time spectrogram rendering
- [ ] Audio playback (IDCT → Web Audio API)
- [ ] Undo/Redo system

**Deliverable:** Interactive editor, can trace .wav files

---

### Phase 3: File I/O

**Tasks:**

- [ ] Load .wav (decode, FFT/STFT → spectrogram)
- [ ] Load .spec (binary format parser)
- [ ] Save procedural_params.txt (text format writer)
- [ ] Generate C++ code (code generation template)
- [ ] Load procedural_params.txt (re-editing workflow)

**Deliverable:** Full save/load cycle works

---

### Phase 4: Polish & Documentation

**Tasks:**

- [ ] UI refinements (tooltips, visual feedback)
- [ ] Keyboard shortcut overlay (press '?' to show help)
- [ ] Error handling (invalid files, audio context failures)
- [ ] User guide (README.md in tools/spectral_editor/)
- [ ] Example files (kick.wav → kick_procedural.cc)

**Deliverable:** Production-ready tool

---

## Design Decisions

### 1. Bezier Interpolation

- **Current:** Linear (simple, fast, small code)
- **Future:** Cubic Bezier, Catmull-Rom splines

### 2. Parameter Limits

- **Editor UI:** Soft ranges (reasonable defaults, not enforced)
- **Examples:**
  - Sigma: 1.0 - 100.0 (suggested range, user can type beyond)
  - Amplitude: 0.0 - 1.0 (suggested, can exceed for overdrive effects)

### 3. Random Number Generator

- **Implementation:** Home-brew deterministic RNG
- **Reason:** Independence, small code, repeatable results
- **Algorithm:** Simple LCG or xorshift

### 4. Code Generation

- **Current:** Single function per sound (e.g., `gen_kick_procedural()`)
- **Future:** Generic `gen_from_params(const PrimitiveData*)` with data tables

### 5. Initial UI Scope

- **Phase 1:** Bezier + Gaussian only
- **Rationale:** Validate workflow before adding complexity
- **Future:** Decaying sinusoid, noise, composite profiles

---

## Future Extensions

### Bezier Enhancements

- [ ] Cubic Bezier interpolation (smoother curves)
- [ ] Catmull-Rom splines (automatic tangent control)
- [ ] Bezier curve tangent handles (manual control)

### Additional Profiles

- [ ] Decaying sinusoid (metallic sounds)
- [ ] Noise (textured sounds)
- [ ] Composite profiles (add/subtract/multiply)
- [ ] User-defined profiles (custom formulas)

### Multi-Dimensional Bezier

- [ ] `{freq, amplitude, oscillator_freq, decay} = bezier(frame)`
- [ ] Per-parameter control curves

### Advanced Features

- [ ] Frequency snapping (snap to musical notes: C4, D4, E4, etc.)
- [ ] Amplitude envelope presets (ADSR)
- [ ] Profile library (save/load custom profiles)
- [ ] Batch processing (convert entire folder of .wav files)

### Code Generation

- [ ] Generic `gen_from_params()` with data tables
- [ ] Optimization: Merge similar curves
- [ ] Size estimation (preview bytes saved vs. original .spec)

---

## Testing Strategy

### C++ Runtime Tests

**File:** `src/tests/test_spectral_brush.cc`

**Test cases:**

- Linear Bezier interpolation (verify values at control points)
- Gaussian profile evaluation (verify falloff curve)
- Full `draw_bezier_curve()` (verify spectrogram output)
- Edge cases (0 control points, 1 control point, out-of-range frames)

### Editor Tests

**Manual testing workflow:**

1. Load example .wav (kick drum)
2. Place 3-4 control points to trace low-frequency punch
3. Adjust Gaussian sigma
4. Play procedural sound (should resemble original)
5. Save procedural_params.txt
6. Generate C++ code
7. Copy C++ code into demo, compile, verify runtime output matches editor

---

## Size Impact Estimate

**Example: Kick drum sample**

**Before (Binary .spec):**

- DCT size: 512
- Num frames: 100
- Size: 512 × 100 × 4 bytes = 204,800 bytes (200 KB uncompressed)
- Compressed (zlib): ~5-10 KB

**After (Procedural C++):**

```cpp
// 4 control points × 3 arrays × 4 bytes per float = 48 bytes of data
const float frames[] = {0.0f, 20.0f, 50.0f, 100.0f};
const float freqs[] = {200.0f, 80.0f, 60.0f, 50.0f};
const float amps[] = {0.9f, 0.7f, 0.3f, 0.0f};
draw_bezier_curve(...);  // ~20 bytes function call

// Total: ~100 bytes
```

**Compression ratio:** 50-100× smaller than the zlib-compressed `.spec` (~5-10 KB down to ~100 bytes).

**Trade-off:** The spectrogram is generated at runtime instead of being read from an asset, which costs CPU time, but this is acceptable for a 64k demo.

---

## References

- **Bezier curves:** https://en.wikipedia.org/wiki/B%C3%A9zier_curve
- **DCT/IDCT:** Existing implementation in `src/audio/dct.cc`
- **Web Audio API:** https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API
- **Spectrogram visualization:** Existing editor in `tools/editor/` (to be replaced)
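
For reference alongside the links above, the profile formulas and the "LCG or xorshift" RNG option can be sketched as follows. This is a hedged illustration rather than the project's code: the function names and `ProfileType` values come from `spectral_brush.h`, but the bodies, the choice of xorshift32, the way seed and distance are hashed together, and the mapping of `param1`/`param2` to sigma, decay, omega, amplitude and seed are assumptions.

```cpp
#include <cmath>
#include <cstdint>

enum ProfileType {
  PROFILE_GAUSSIAN = 0,
  PROFILE_DECAYING_SINUSOID = 1,
  PROFILE_NOISE = 2
};

// One xorshift32 step (shifts 13/17/5). A zero input is remapped to 1
// because xorshift maps 0 to 0 forever.
uint32_t spectral_brush_rand(uint32_t seed) {
  uint32_t x = seed ? seed : 1u;
  x ^= x << 13;
  x ^= x >> 17;
  x ^= x << 5;
  return x;
}

// distance is the offset in bins from the curve; its absolute value is used.
float evaluate_profile(ProfileType type, float distance,
                       float param1, float param2) {
  const float d = std::fabs(distance);
  switch (type) {
    case PROFILE_GAUSSIAN:  // param1 = sigma (must be non-zero)
      return std::exp(-(d * d) / (param1 * param1));
    case PROFILE_DECAYING_SINUSOID:  // param1 = decay, param2 = omega
      return std::exp(-param1 * d) * std::cos(param2 * d);
    case PROFILE_NOISE: {  // param1 = amplitude, param2 = seed
      // Assumption: combine seed and integer distance into one key, then
      // map the top 24 random bits to [0, 1).
      const uint32_t key = static_cast<uint32_t>(param2) * 2654435761u +
                           static_cast<uint32_t>(d);
      const uint32_t r = spectral_brush_rand(key);
      return param1 * (static_cast<float>(r >> 8) * (1.0f / 16777216.0f));
    }
  }
  return 0.0f;
}
```

Because the noise value depends only on the seed and the distance, repeated calls with the same arguments return the same value, which is the repeatability the design calls for.
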