| field | value | date |
|---|---|---|
| author | skal <pascal.massimino@gmail.com> | 2026-02-06 11:12:34 +0100 |
| committer | skal <pascal.massimino@gmail.com> | 2026-02-06 11:12:34 +0100 |
| commit | 5a1adde097e489c259bd052971546e95683c3596 | |
| tree | bf03cf8b803604638ad84ddd9cc26de64baea64f /doc/SPECTRAL_BRUSH_EDITOR.md | |
| parent | 83f34fb955524c09b7f3e124b97c3d4feef02a0c | |
feat(audio): Add Spectral Brush runtime (Phase 1 of Task #5)

Implement C++ runtime foundation for procedural audio tracing tool.

Changes:
- Created spectral_brush.h/cc with core API
  - Linear Bezier interpolation
  - Vertical profile evaluation (Gaussian, Decaying Sinusoid, Noise)
  - draw_bezier_curve() for spectrogram rendering
  - Home-brew deterministic RNG for noise profile
- Added comprehensive unit tests (test_spectral_brush.cc)
  - Tests Bezier interpolation, profiles, edge cases
  - Tests full spectrogram rendering pipeline
  - All 9 tests pass
- Integrated into CMake build system
- Fixed test_assets.cc include (asset_manager_utils.h)

Design:
- Spectral Brush = Central Curve (Bezier) + Vertical Profile
- Enables 50-100x compression (5KB .spec to 100 bytes C++ code)
- Future: Cubic Bezier, composite profiles, multi-dimensional curves

Documentation:
- Added doc/SPECTRAL_BRUSH_EDITOR.md (complete architecture)
- Updated TODO.md with Phase 1-4 implementation plan
- Updated PROJECT_CONTEXT.md to mark Task #5 in progress

Test results: 21/21 tests pass (100%)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Diffstat (limited to 'doc/SPECTRAL_BRUSH_EDITOR.md')
| -rw-r--r-- | doc/SPECTRAL_BRUSH_EDITOR.md | 497 |
1 files changed, 497 insertions, 0 deletions
diff --git a/doc/SPECTRAL_BRUSH_EDITOR.md b/doc/SPECTRAL_BRUSH_EDITOR.md
new file mode 100644
index 0000000..7ea0270
--- /dev/null
+++ b/doc/SPECTRAL_BRUSH_EDITOR.md
@@ -0,0 +1,497 @@
+# Spectral Brush Editor (Task #5)
+
+## Concept
+
+The **Spectral Brush Editor** is a web-based tool for creating procedural audio by tracing spectrograms with parametric curves. It enables compact representation of audio samples for 64k demos.
+
+### Goal
+
+Replace large `.spec` asset files with tiny procedural C++ code:
+- **Before:** 5 KB binary `.spec` file
+- **After:** ~100 bytes of C++ code calling `draw_bezier_curve()`
+
+### Workflow
+
+```
+.wav file → Load in editor → Trace with spectral brushes → Export params + C++ code
+ ↓
+ (later)
+procedural_params.txt → Load in editor → Adjust curves → Re-export
+```
+
+---
+
+## Core Primitive: "Spectral Brush"
+
+A spectral brush consists of two components:
+
+### 1. Central Curve (Bezier)
+Traces a path through time-frequency space:
+```
+{freq_bin, amplitude} = bezier(frame_number)
+```
+
+**Properties:**
+- Control points: `[(frame, freq_hz, amplitude), ...]`
+- Interpolation: Linear (between control points)
+- Future: Cubic Bezier, Catmull-Rom splines
+
+**Example:**
+```javascript
+control_points: [
+  {frame: 0, freq_hz: 200.0, amplitude: 0.9}, // Attack
+  {frame: 20, freq_hz: 80.0, amplitude: 0.7}, // Sustain
+  {frame: 100, freq_hz: 50.0, amplitude: 0.0} // Decay
+]
+```
+
+### 2. Vertical Profile
+At each frame, applies a shape **vertically** in frequency bins around the central curve.
+
+**Profile Types:**
+
+#### Gaussian (Smooth harmonic)
+```
+amplitude(dist) = exp(-(dist² / σ²))
+```
+- **σ (sigma)**: Width in frequency bins
+- Use case: Musical tones, bass notes, melodic lines
+
+#### Decaying Sinusoid (Textured/resonant)
+```
+amplitude(dist) = exp(-decay * dist) * cos(ω * dist)
+```
+- **decay**: Falloff rate
+- **ω (omega)**: Oscillation frequency
+- Use case: Metallic sounds, bells, resonant tones
+
+#### Noise (Textured/gritty)
+```
+amplitude(dist) = random(seed, dist) * noise_amplitude
+```
+- **seed**: Deterministic RNG seed
+- Use case: Hi-hats, cymbals, textured sounds
+
+#### Composite (Combinable)
+```javascript
+{
+  type: "composite",
+  operation: "add" | "subtract" | "multiply",
+  profiles: [
+    {type: "gaussian", sigma: 30.0},
+    {type: "noise", amplitude: 0.1, seed: 42}
+  ]
+}
+```
+
+---
+
+## Visual Model
+
+```
+Frequency (bins)
+    ^
+    |
+512 | * ← Gaussian profile (frame 80)
+    | * * *
+256 | **** * * * ← Gaussian profile (frame 50)
+    | ** ** * *
+128 | ** Curve * * ← Central Bezier curve
+    | ** *
+ 64 | ** *
+    | * *
+  0 +--*--*--*--*--*--*--*--*---→ Time (frames)
+    0  10 20 30 40 50 60 70 80
+
+ At each frame:
+ 1. Evaluate curve → get freq_bin_0 and amplitude
+ 2. Draw profile vertically at that frame
+```
+
+---
+
+## File Formats
+
+### A. `procedural_params.txt` (Human-readable, re-editable)
+
+```text
+# Kick drum spectral brush definition
+METADATA dct_size=512 num_frames=100 sample_rate=32000
+
+CURVE bezier
+  CONTROL_POINT 0 200.0 0.9 # frame, freq_hz, amplitude
+  CONTROL_POINT 20 80.0 0.7
+  CONTROL_POINT 50 60.0 0.3
+  CONTROL_POINT 100 50.0 0.0
+  PROFILE gaussian sigma=30.0
+  PROFILE_ADD noise amplitude=0.1 seed=42
+END_CURVE
+
+CURVE bezier
+  CONTROL_POINT 0 500.0 0.5
+  CONTROL_POINT 30 300.0 0.2
+  PROFILE decaying_sinusoid decay=0.15 frequency=0.8
+END_CURVE
+```
+
+**Purpose:**
+- Load back into editor for re-editing
+- Human-readable, version-control friendly
+- Can be hand-tweaked in text editor
+
+### B. C++ Code (Ready to compile)
+
+```cpp
+// Generated from procedural_params.txt
+// File: src/audio/gen_kick_procedural.cc
+
+#include "audio/spectral_brush.h"
+
+void gen_kick_procedural(float* spec, int dct_size, int num_frames) {
+  // Curve 0: Low-frequency punch with noise texture
+  {
+    const float frames[] = {0.0f, 20.0f, 50.0f, 100.0f};
+    const float freqs[] = {200.0f, 80.0f, 60.0f, 50.0f};
+    const float amps[] = {0.9f, 0.7f, 0.3f, 0.0f};
+
+    draw_bezier_curve(spec, dct_size, num_frames,
+                      frames, freqs, amps, 4,
+                      PROFILE_GAUSSIAN, 30.0f);
+
+    draw_bezier_curve_add(spec, dct_size, num_frames,
+                          frames, freqs, amps, 4,
+                          PROFILE_NOISE, 0.1f, 42.0f);
+  }
+
+  // Curve 1: High-frequency attack
+  {
+    const float frames[] = {0.0f, 30.0f};
+    const float freqs[] = {500.0f, 300.0f};
+    const float amps[] = {0.5f, 0.2f};
+
+    draw_bezier_curve(spec, dct_size, num_frames,
+                      frames, freqs, amps, 2,
+                      PROFILE_DECAYING_SINUSOID, 0.15f, 0.8f);
+  }
+}
+
+// Usage in demo_assets.txt:
+// KICK_PROC, PROC(gen_kick_procedural), NONE, "Procedural kick drum"
+```
+
+**Purpose:**
+- Copy-paste into `src/audio/procedural_samples.cc`
+- Compile directly into demo
+- Zero runtime parsing overhead
+
+---
+
+## C++ Runtime API
+
+### New Files
+
+#### `src/audio/spectral_brush.h`
+```cpp
+#pragma once
+#include <cstdint>
+
+enum ProfileType {
+  PROFILE_GAUSSIAN = 0,
+  PROFILE_DECAYING_SINUSOID = 1,
+  PROFILE_NOISE = 2
+};
+
+// Evaluate linear Bezier interpolation at frame t
+float evaluate_bezier_linear(const float* control_frames,
+                             const float* control_values,
+                             int n_points,
+                             float frame);
+
+// Draw spectral brush: Bezier curve with vertical profile
+void draw_bezier_curve(float* spectrogram, int dct_size, int num_frames,
+                       const float* control_frames,
+                       const float* control_freqs_hz,
+                       const float* control_amps,
+                       int n_control_points,
+                       ProfileType profile_type,
+                       float profile_param1,
+                       float profile_param2 = 0.0f);
+
+// Additive variant (for compositing profiles)
+void draw_bezier_curve_add(float* spectrogram, int dct_size, int num_frames,
+                           const float* control_frames,
+                           const float* control_freqs_hz,
+                           const float* control_amps,
+                           int n_control_points,
+                           ProfileType profile_type,
+                           float profile_param1,
+                           float profile_param2 = 0.0f);
+
+// Profile evaluation
+float evaluate_profile(ProfileType type, float distance,
+                       float param1, float param2);
+
+// Home-brew deterministic RNG (small, portable)
+uint32_t spectral_brush_rand(uint32_t seed);
+```
+
+**Future Extensions:**
+- Cubic Bezier interpolation: `evaluate_bezier_cubic()`
+- Generic loader: `gen_from_params(const PrimitiveData*)`
+
+---
+
+## Editor Architecture
+
+### Technology Stack
+- **HTML5 Canvas**: Spectrogram visualization
+- **Web Audio API**: Playback (IDCT → audio)
+- **Pure JavaScript**: No dependencies
+- **Reuse from existing editor**: `dct.js` (IDCT implementation)
+
+### Editor UI Layout
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│ Spectral Brush Editor [Load .wav] │
+├─────────────────────────────────────────────────────────────┤
+│ │
+│ ┌───────────────────────────────────────────────────┐ │
+│ │ │ │
+│ │ CANVAS (Spectrogram Display) │ T │
+│ │ │ O │
+│ │ • Background: Reference .wav (gray, transparent) │ O │
+│ │ • Foreground: Procedural curves (colored) │ L │
+│ │ • Bezier control points (draggable circles) │ S │
+│ │ │ │
+│ │ │ [+]│
+│ │ │ [-]│
+│ │ │ [x]│
+│ └───────────────────────────────────────────────────┘ │
+│ │
+├─────────────────────────────────────────────────────────────┤
+│ Profile: [Gaussian ▾] Sigma: [████████░░] 30.0 │
+│ Curves: [Curve 0 ▾] Amplitude: [██████░░░░] 0.8 │
+├─────────────────────────────────────────────────────────────┤
+│ [1] Play Procedural [2] Play Original [Space] Pause │
+│ [Save Params] [Generate C++] [Undo] [Redo] │
+└─────────────────────────────────────────────────────────────┘
+```
+
+### Features (Phase 1: Minimal Working Version)
+
+#### Editing
+- Click to place Bezier control points
+- Drag to adjust control points (frame, frequency, amplitude)
+- Delete control points (right-click or Delete key)
+- Undo/Redo support (action history)
+
+#### Visualization
+- Dual-layer canvas:
+  - **Background**: Reference spectrogram (semi-transparent gray)
+  - **Foreground**: Procedural spectrogram (colored)
+- Log-scale frequency axis (musical perception)
+- Control points: Draggable circles with labels
+
+#### Audio Playback
+- **Key '1'**: Play procedural sound
+- **Key '2'**: Play original .wav
+- **Space**: Play/pause toggle
+
+#### File I/O
+- Load .wav or .spec (reference sound)
+- Save procedural_params.txt (re-editable)
+- Generate C++ code (copy-paste ready)
+
+#### Undo/Redo
+- Action history with snapshots
+- Commands: Add control point, Move control point, Delete control point, Change profile
+- **Ctrl+Z**: Undo
+- **Ctrl+Shift+Z**: Redo
+
+### Keyboard Shortcuts
+
+| Key | Action |
+|-----|--------|
+| **1** | Play procedural sound |
+| **2** | Play original .wav |
+| **Space** | Play/pause |
+| **Delete** | Delete selected control point |
+| **Esc** | Deselect all |
+| **Ctrl+Z** | Undo |
+| **Ctrl+Shift+Z** | Redo |
+| **Ctrl+S** | Save procedural_params.txt |
+| **Ctrl+Shift+S** | Generate C++ code |
+
+---
+
+## Implementation Plan
+
+### Phase 1: C++ Runtime (Foundation)
+**Files:** `src/audio/spectral_brush.h`, `src/audio/spectral_brush.cc`
+
+**Tasks:**
+- [ ] Define API (`ProfileType`, `draw_bezier_curve()`, etc.)
+- [ ] Implement linear Bezier interpolation
+- [ ] Implement Gaussian profile evaluation
+- [ ] Implement home-brew RNG (for future noise support)
+- [ ] Add unit tests (`src/tests/test_spectral_brush.cc`)
+
+**Deliverable:** Compiles, tests pass
+
+---
+
+### Phase 2: Editor Core
+**Files:** `tools/spectral_editor/index.html`, `script.js`, `style.css`, `dct.js` (reuse)
+
+**Tasks:**
+- [ ] HTML structure (canvas, controls, file input)
+- [ ] Canvas rendering (dual-layer: reference + procedural)
+- [ ] Bezier curve editor (place/drag/delete control points)
+- [ ] Profile controls (Gaussian sigma slider)
+- [ ] Real-time spectrogram rendering
+- [ ] Audio playback (IDCT → Web Audio API)
+- [ ] Undo/Redo system
+
+**Deliverable:** Interactive editor, can trace .wav files
+
+---
+
+### Phase 3: File I/O
+**Tasks:**
+- [ ] Load .wav (decode, FFT/STFT → spectrogram)
+- [ ] Load .spec (binary format parser)
+- [ ] Save procedural_params.txt (text format writer)
+- [ ] Generate C++ code (code generation template)
+- [ ] Load procedural_params.txt (re-editing workflow)
+
+**Deliverable:** Full save/load cycle works
+
+---
+
+### Phase 4: Polish & Documentation
+**Tasks:**
+- [ ] UI refinements (tooltips, visual feedback)
+- [ ] Keyboard shortcut overlay (press '?' to show help)
+- [ ] Error handling (invalid files, audio context failures)
+- [ ] User guide (README.md in tools/spectral_editor/)
+- [ ] Example files (kick.wav → kick_procedural.cc)
+
+**Deliverable:** Production-ready tool
+
+---
+
+## Design Decisions
+
+### 1. Bezier Interpolation
+- **Current:** Linear (simple, fast, small code)
+- **Future:** Cubic Bezier, Catmull-Rom splines
+
+### 2. Parameter Limits
+- **Editor UI:** Soft ranges (reasonable defaults, not enforced)
+- **Examples:**
+  - Sigma: 1.0 - 100.0 (suggested range, user can type beyond)
+  - Amplitude: 0.0 - 1.0 (suggested, can exceed for overdrive effects)
+
+### 3. Random Number Generator
+- **Implementation:** Home-brew deterministic RNG
+- **Reason:** Independence, small code, repeatable results
+- **Algorithm:** Simple LCG or xorshift
+
+### 4. Code Generation
+- **Current:** Single function per sound (e.g., `gen_kick_procedural()`)
+- **Future:** Generic `gen_from_params(const PrimitiveData*)` with data tables
+
+### 5. Initial UI Scope
+- **Phase 1:** Bezier + Gaussian only
+- **Rationale:** Validate workflow before adding complexity
+- **Future:** Decaying sinusoid, noise, composite profiles
+
+---
+
+## Future Extensions
+
+### Bezier Enhancements
+- [ ] Cubic Bezier interpolation (smoother curves)
+- [ ] Catmull-Rom splines (automatic tangent control)
+- [ ] Bezier curve tangent handles (manual control)
+
+### Additional Profiles
+- [ ] Decaying sinusoid (metallic sounds)
+- [ ] Noise (textured sounds)
+- [ ] Composite profiles (add/subtract/multiply)
+- [ ] User-defined profiles (custom formulas)
+
+### Multi-Dimensional Bezier
+- [ ] `{freq, amplitude, oscillator_freq, decay} = bezier(frame)`
+- [ ] Per-parameter control curves
+
+### Advanced Features
+- [ ] Frequency snapping (snap to musical notes: C4, D4, E4, etc.)
+- [ ] Amplitude envelope presets (ADSR)
+- [ ] Profile library (save/load custom profiles)
+- [ ] Batch processing (convert entire folder of .wav files)
+
+### Code Generation
+- [ ] Generic `gen_from_params()` with data tables
+- [ ] Optimization: Merge similar curves
+- [ ] Size estimation (preview bytes saved vs. original .spec)
+
+---
+
+## Testing Strategy
+
+### C++ Runtime Tests
+**File:** `src/tests/test_spectral_brush.cc`
+
+**Test cases:**
+- Linear Bezier interpolation (verify values at control points)
+- Gaussian profile evaluation (verify falloff curve)
+- Full `draw_bezier_curve()` (verify spectrogram output)
+- Edge cases (0 control points, 1 control point, out-of-range frames)
+
+### Editor Tests
+**Manual testing workflow:**
+1. Load example .wav (kick drum)
+2. Place 3-4 control points to trace low-frequency punch
+3. Adjust Gaussian sigma
+4. Play procedural sound (should resemble original)
+5. Save procedural_params.txt
+6. Generate C++ code
+7. Copy C++ code into demo, compile, verify runtime output matches editor
+
+---
+
+## Size Impact Estimate
+
+**Example: Kick drum sample**
+
+**Before (Binary .spec):**
+- DCT size: 512
+- Num frames: 100
+- Size: 512 × 100 × 4 bytes = 200 KB (uncompressed)
+- Compressed (zlib): ~5-10 KB
+
+**After (Procedural C++):**
+```cpp
+// 4 control points × 3 arrays × 4 values = ~48 bytes of data
+const float frames[] = {0.0f, 20.0f, 50.0f, 100.0f};
+const float freqs[] = {200.0f, 80.0f, 60.0f, 50.0f};
+const float amps[] = {0.9f, 0.7f, 0.3f, 0.0f};
+draw_bezier_curve(...); // ~20 bytes function call
+
+// Total: ~100 bytes
+```
+
+**Compression ratio:** 50-100× reduction!
+
+**Trade-off:** Runtime CPU cost (generation vs. lookup), but acceptable for 64k demo.
+
+---
+
+## References
+
+- **Bezier curves:** https://en.wikipedia.org/wiki/B%C3%A9zier_curve
+- **DCT/IDCT:** Existing implementation in `src/audio/dct.cc`
+- **Web Audio API:** https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API
+- **Spectrogram visualization:** Existing editor in `tools/editor/` (to be replaced)
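
The per-frame procedure described under "Visual Model" in the added document (evaluate the curve, then draw the profile vertically) can be sketched roughly as follows. This is a minimal illustration, not the code in `spectral_brush.cc`, which this diff does not show. All `sketch_*` names are hypothetical, and two things are assumptions: the spectrogram is stored frame-major as `spec[frame * dct_size + bin]`, and bins span 0 to `sample_rate / 2` linearly.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not the project's evaluate_bezier_linear()):
// piecewise-linear interpolation through control points sorted by frame.
// Frames outside the control range clamp to the nearest end value.
float sketch_lerp_points(const float* frames, const float* values,
                         int n_points, float frame) {
  if (n_points <= 0) return 0.0f;
  if (frame <= frames[0]) return values[0];
  if (frame >= frames[n_points - 1]) return values[n_points - 1];
  for (int i = 0; i + 1 < n_points; ++i) {
    if (frame <= frames[i + 1]) {
      const float span = frames[i + 1] - frames[i];
      const float t = (span > 0.0f) ? (frame - frames[i]) / span : 0.0f;
      return values[i] + t * (values[i + 1] - values[i]);
    }
  }
  return values[n_points - 1];
}

// Gaussian profile from the document: exp(-(dist^2 / sigma^2)).
float sketch_gaussian(float dist, float sigma) {
  assert(sigma > 0.0f);
  return std::exp(-(dist * dist) / (sigma * sigma));
}

// Hypothetical draw loop. Assumptions: frame-major layout and a linear
// Hz-to-bin mapping over 0..sample_rate/2. Values are accumulated (+=).
void sketch_draw_gaussian_brush(float* spec, int dct_size, int num_frames,
                                const float* frames, const float* freqs_hz,
                                const float* amps, int n_points,
                                float sigma, float sample_rate) {
  const float bins_per_hz = dct_size / (0.5f * sample_rate);
  for (int f = 0; f < num_frames; ++f) {
    const float frame = static_cast<float>(f);
    // Step 1: evaluate the central curve at this frame.
    const float center =
        sketch_lerp_points(frames, freqs_hz, n_points, frame) * bins_per_hz;
    const float amp = sketch_lerp_points(frames, amps, n_points, frame);
    // Step 2: draw the profile vertically around the curve.
    for (int b = 0; b < dct_size; ++b) {
      spec[f * dct_size + b] += amp * sketch_gaussian(b - center, sigma);
    }
  }
}
```

With the kick-drum control points from the document, a call would look like `sketch_draw_gaussian_brush(spec, 512, 100, frames, freqs, amps, 4, 30.0f, 32000.0f)`. Clamping outside the control range is one possible policy for the "out-of-range frames" edge case listed under the test cases. The real runtime may handle it differently.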

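
The document leaves the RNG choice open ("Simple LCG or xorshift"). As one possibility, a noise profile built on a xorshift32 step might look like the sketch below. The function names, the multiplicative mixing constant, and the mapping to `[0, 1)` are assumptions for illustration. The actual `spectral_brush_rand()` may use a different algorithm.

```cpp
#include <cstdint>

// One xorshift32 step with the common 13/17/5 shift triple.
// A zero state would stay zero forever, so it is remapped first.
uint32_t sketch_xorshift32(uint32_t state) {
  if (state == 0u) state = 0x9E3779B9u;
  state ^= state << 13;
  state ^= state >> 17;
  state ^= state << 5;
  return state;
}

// Hypothetical noise profile: amplitude * random(seed, bin_offset).
// The seed is mixed with the integer bin offset so each bin gets its own
// repeatable value. The top 24 bits map to a float in [0, 1).
float sketch_noise_profile(uint32_t seed, int bin_offset, float amplitude) {
  const uint32_t mixed =
      seed ^ (static_cast<uint32_t>(bin_offset) * 2654435761u);
  const uint32_t r = sketch_xorshift32(mixed);
  return amplitude * (static_cast<float>(r >> 8) * (1.0f / 16777216.0f));
}
```

Because the value depends only on `seed` and `bin_offset`, the same parameters always produce the same spectrogram, which is what makes the editor preview comparable with the compiled demo output.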