Diffstat (limited to 'doc/SPECTRAL_BRUSH_EDITOR.md')
-rw-r--r-- doc/SPECTRAL_BRUSH_EDITOR.md | 474
1 files changed, 86 insertions, 388 deletions
diff --git a/doc/SPECTRAL_BRUSH_EDITOR.md b/doc/SPECTRAL_BRUSH_EDITOR.md
index 7ea0270..a7d0e3a 100644
--- a/doc/SPECTRAL_BRUSH_EDITOR.md
+++ b/doc/SPECTRAL_BRUSH_EDITOR.md
@@ -2,43 +2,26 @@
## Concept
-The **Spectral Brush Editor** is a web-based tool for creating procedural audio by tracing spectrograms with parametric curves. It enables compact representation of audio samples for 64k demos.
+Replace large `.spec` assets with procedural C++ code (50-100× compression).
-### Goal
-
-Replace large `.spec` asset files with tiny procedural C++ code:
-- **Before:** 5 KB binary `.spec` file
-- **After:** ~100 bytes of C++ code calling `draw_bezier_curve()`
-
-### Workflow
+**Before:** 5 KB binary `.spec` file
+**After:** ~100 bytes of C++ code calling `draw_bezier_curve()`
+**Workflow:**
```
-.wav file → Load in editor → Trace with spectral brushes → Export params + C++ code
- ↓
- (later)
-procedural_params.txt → Load in editor → Adjust curves → Re-export
+.wav → Load in editor → Trace with Bezier curves → Export procedural_params.txt + C++ code
```
---
## Core Primitive: "Spectral Brush"
-A spectral brush consists of two components:
-
### 1. Central Curve (Bezier)
-Traces a path through time-frequency space:
-```
-{freq_bin, amplitude} = bezier(frame_number)
-```
-
-**Properties:**
-- Control points: `[(frame, freq_hz, amplitude), ...]`
-- Interpolation: Linear (between control points)
-- Future: Cubic Bezier, Catmull-Rom splines
+Traces time-frequency path: `{freq_bin, amplitude} = bezier(frame_number)`
-**Example:**
+**Example control points:**
```javascript
-control_points: [
+[
{frame: 0, freq_hz: 200.0, amplitude: 0.9}, // Attack
{frame: 20, freq_hz: 80.0, amplitude: 0.7}, // Sustain
{frame: 100, freq_hz: 50.0, amplitude: 0.0} // Decay
@@ -46,168 +29,59 @@ control_points: [
```
### 2. Vertical Profile
-At each frame, applies a shape **vertically** in frequency bins around the central curve.
+Shapes "brush stroke" around curve at each frame.
**Profile Types:**
-
-#### Gaussian (Smooth harmonic)
-```
-amplitude(dist) = exp(-(dist² / σ²))
-```
-- **σ (sigma)**: Width in frequency bins
-- Use case: Musical tones, bass notes, melodic lines
-
-#### Decaying Sinusoid (Textured/resonant)
-```
-amplitude(dist) = exp(-decay * dist) * cos(ω * dist)
-```
-- **decay**: Falloff rate
-- **ω (omega)**: Oscillation frequency
-- Use case: Metallic sounds, bells, resonant tones
-
-#### Noise (Textured/gritty)
-```
-amplitude(dist) = random(seed, dist) * noise_amplitude
-```
-- **seed**: Deterministic RNG seed
-- Use case: Hi-hats, cymbals, textured sounds
-
-#### Composite (Combinable)
-```javascript
-{
- type: "composite",
- operation: "add" | "subtract" | "multiply",
- profiles: [
- {type: "gaussian", sigma: 30.0},
- {type: "noise", amplitude: 0.1, seed: 42}
- ]
-}
-```
-
----
-
-## Visual Model
-
-```
-Frequency (bins)
- ^
- |
-512 | * ← Gaussian profile (frame 80)
- | * * *
-256 | **** * * * ← Gaussian profile (frame 50)
- | ** ** * *
-128 | ** Curve * * ← Central Bezier curve
- | ** *
- 64 | ** *
- | * *
- 0 +--*--*--*--*--*--*--*--*---→ Time (frames)
- 0 10 20 30 40 50 60 70 80
-
- At each frame:
- 1. Evaluate curve → get freq_bin_0 and amplitude
- 2. Draw profile vertically at that frame
-```
+- **Gaussian**: `exp(-(dist² / σ²))` - Musical tones, bass
+- **Decaying Sinusoid**: `exp(-decay * dist) * cos(ω * dist)` - Metallic sounds
+- **Noise**: `random(seed, dist) * amplitude` - Hi-hats, cymbals
+- **Composite**: Combine multiple profiles (add/subtract/multiply)
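The profile formulas above could be evaluated along the following lines. This is a minimal sketch, not the project's implementation: the parameter mapping (Gaussian: `param1` = sigma; decaying sinusoid: `param1` = decay, `param2` = ω; noise: `param1` = amplitude, `param2` = seed) is inferred from the example values in this diff, `hash_to_unit` is a hypothetical stand-in for the home-brew RNG, and returning 0 for a non-positive sigma is an assumption. Composite profiles are not covered.

```cpp
#include <cmath>
#include <cstdint>

// Same enumerators as the ProfileType enum in the C++ Runtime API section.
enum ProfileType {
  PROFILE_GAUSSIAN,
  PROFILE_DECAYING_SINUSOID,
  PROFILE_NOISE
};

// Hypothetical integer hash mapped to [0, 1); NOT the project's RNG.
static float hash_to_unit(uint32_t x) {
  x ^= x >> 16;
  x *= 0x7feb352dU;
  x ^= x >> 15;
  x *= 0x846ca68bU;
  x ^= x >> 16;
  return static_cast<float>(x >> 8) * (1.0f / 16777216.0f);  // top 24 bits
}

// distance: offset in frequency bins from the central curve.
float evaluate_profile(ProfileType type, float distance,
                       float param1, float param2) {
  const float d = std::fabs(distance);
  switch (type) {
    case PROFILE_GAUSSIAN:  // param1 = sigma (assumed)
      if (param1 <= 0.0f) return 0.0f;  // assumption: degenerate width draws nothing
      return std::exp(-(d * d) / (param1 * param1));
    case PROFILE_DECAYING_SINUSOID:  // param1 = decay, param2 = omega (assumed)
      return std::exp(-param1 * d) * std::cos(param2 * d);
    case PROFILE_NOISE: {  // param1 = amplitude, param2 = seed (assumed)
      const uint32_t seed = static_cast<uint32_t>(param2);
      const uint32_t bin = static_cast<uint32_t>(d);
      return hash_to_unit(seed * 0x9e3779b9U + bin) * param1;
    }
  }
  return 0.0f;
}
```

With sigma = 30, the Gaussian returns 1 at the curve itself and exp(-1) (about 0.37) at 30 bins away, which gives a feel for how wide a stroke a given sigma produces.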
---
## File Formats
-### A. `procedural_params.txt` (Human-readable, re-editable)
-
+### A. `procedural_params.txt` (Human-readable)
```text
-# Kick drum spectral brush definition
METADATA dct_size=512 num_frames=100 sample_rate=32000
CURVE bezier
- CONTROL_POINT 0 200.0 0.9 # frame, freq_hz, amplitude
+ CONTROL_POINT 0 200.0 0.9
CONTROL_POINT 20 80.0 0.7
- CONTROL_POINT 50 60.0 0.3
CONTROL_POINT 100 50.0 0.0
PROFILE gaussian sigma=30.0
- PROFILE_ADD noise amplitude=0.1 seed=42
-END_CURVE
-
-CURVE bezier
- CONTROL_POINT 0 500.0 0.5
- CONTROL_POINT 30 300.0 0.2
- PROFILE decaying_sinusoid decay=0.15 frequency=0.8
END_CURVE
```
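For the re-editing workflow, a reader for this text format might look like the sketch below. It is only an illustration under stated assumptions: the `ControlPoint` and `Curve` structs and `parse_params` are hypothetical names, only the `sigma=` profile argument is understood, and `METADATA` and any unrecognized lines are skipped rather than validated.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory form of one CONTROL_POINT line.
struct ControlPoint {
  float frame;
  float freq_hz;
  float amplitude;
};

// Hypothetical container for one CURVE ... END_CURVE block.
struct Curve {
  std::vector<ControlPoint> points;
  std::string profile;  // e.g. "gaussian"
  float sigma = 0.0f;   // only meaningful for the Gaussian profile
};

std::vector<Curve> parse_params(const std::string& text) {
  std::vector<Curve> curves;
  Curve current;
  bool in_curve = false;
  std::istringstream lines(text);
  std::string line;
  while (std::getline(lines, line)) {
    std::istringstream words(line);
    std::string keyword;
    if (!(words >> keyword)) continue;  // blank line
    if (keyword == "CURVE") {
      current = Curve{};
      in_curve = true;
    } else if (keyword == "CONTROL_POINT" && in_curve) {
      ControlPoint p{};
      if (words >> p.frame >> p.freq_hz >> p.amplitude) {
        current.points.push_back(p);
      }
    } else if (keyword == "PROFILE" && in_curve) {
      words >> current.profile;
      std::string arg;
      while (words >> arg) {
        // std::stof throws on malformed numbers; a real loader would catch that.
        if (arg.rfind("sigma=", 0) == 0) current.sigma = std::stof(arg.substr(6));
      }
    } else if (keyword == "END_CURVE" && in_curve) {
      curves.push_back(current);
      in_curve = false;
    }
    // METADATA, comments and unknown keywords are ignored in this sketch.
  }
  return curves;
}
```

Fed the example file above, this yields one curve with three control points, profile `gaussian` and sigma 30.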
-**Purpose:**
-- Load back into editor for re-editing
-- Human-readable, version-control friendly
-- Can be hand-tweaked in text editor
-
### B. C++ Code (Ready to compile)
-
```cpp
-// Generated from procedural_params.txt
-// File: src/audio/gen_kick_procedural.cc
-
#include "audio/spectral_brush.h"
void gen_kick_procedural(float* spec, int dct_size, int num_frames) {
- // Curve 0: Low-frequency punch with noise texture
- {
- const float frames[] = {0.0f, 20.0f, 50.0f, 100.0f};
- const float freqs[] = {200.0f, 80.0f, 60.0f, 50.0f};
- const float amps[] = {0.9f, 0.7f, 0.3f, 0.0f};
-
- draw_bezier_curve(spec, dct_size, num_frames,
- frames, freqs, amps, 4,
- PROFILE_GAUSSIAN, 30.0f);
-
- draw_bezier_curve_add(spec, dct_size, num_frames,
- frames, freqs, amps, 4,
- PROFILE_NOISE, 0.1f, 42.0f);
- }
-
- // Curve 1: High-frequency attack
- {
- const float frames[] = {0.0f, 30.0f};
- const float freqs[] = {500.0f, 300.0f};
- const float amps[] = {0.5f, 0.2f};
+ const float frames[] = {0.0f, 20.0f, 100.0f};
+ const float freqs[] = {200.0f, 80.0f, 50.0f};
+ const float amps[] = {0.9f, 0.7f, 0.0f};
- draw_bezier_curve(spec, dct_size, num_frames,
- frames, freqs, amps, 2,
- PROFILE_DECAYING_SINUSOID, 0.15f, 0.8f);
- }
+ draw_bezier_curve(spec, dct_size, num_frames,
+ frames, freqs, amps, 3,
+ PROFILE_GAUSSIAN, 30.0f);
}
-
-// Usage in demo_assets.txt:
-// KICK_PROC, PROC(gen_kick_procedural), NONE, "Procedural kick drum"
```
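What `draw_bezier_curve()` does per frame can be pictured as "evaluate the curve, then stamp the profile into that frame's column". The sketch below shows only the stamping step for the Gaussian case. Both helper names are hypothetical, and two things are assumed rather than taken from the source: that DCT bins cover 0 to `sample_rate / 2` linearly, and that strokes are accumulated additively.

```cpp
#include <cmath>

// Assumption: bin index grows linearly with frequency up to sample_rate / 2.
float hz_to_bin(float freq_hz, int dct_size, float sample_rate) {
  return freq_hz * static_cast<float>(dct_size) / (0.5f * sample_rate);
}

// Adds a Gaussian bump centred on center_bin into one frame's column
// (dct_size floats). Hypothetical helper, not part of the documented API.
void stamp_gaussian_column(float* column, int dct_size,
                           float center_bin, float amplitude, float sigma) {
  if (sigma <= 0.0f) return;  // nothing sensible to draw
  for (int b = 0; b < dct_size; ++b) {
    const float dist = static_cast<float>(b) - center_bin;
    column[b] += amplitude * std::exp(-(dist * dist) / (sigma * sigma));
  }
}
```

Under these assumptions, the first control point of the kick (200 Hz at `dct_size=512`, `sample_rate=32000`) lands at bin 6.4, so with sigma = 30 the stroke is clipped at bin 0 on its low side.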
-**Purpose:**
-- Copy-paste into `src/audio/procedural_samples.cc`
-- Compile directly into demo
-- Zero runtime parsing overhead
-
---
## C++ Runtime API
-### New Files
+### Files: `src/audio/spectral_brush.{h,cc}`
-#### `src/audio/spectral_brush.h`
+**Key functions:**
```cpp
-#pragma once
-#include <cstdint>
-
enum ProfileType {
- PROFILE_GAUSSIAN = 0,
- PROFILE_DECAYING_SINUSOID = 1,
- PROFILE_NOISE = 2
+ PROFILE_GAUSSIAN,
+ PROFILE_DECAYING_SINUSOID,
+ PROFILE_NOISE
};
-// Evaluate linear Bezier interpolation at frame t
-float evaluate_bezier_linear(const float* control_frames,
- const float* control_values,
- int n_points,
- float frame);
-
-// Draw spectral brush: Bezier curve with vertical profile
void draw_bezier_curve(float* spectrogram, int dct_size, int num_frames,
const float* control_frames,
const float* control_freqs_hz,
@@ -217,281 +91,105 @@ void draw_bezier_curve(float* spectrogram, int dct_size, int num_frames,
float profile_param1,
float profile_param2 = 0.0f);
-// Additive variant (for compositing profiles)
-void draw_bezier_curve_add(float* spectrogram, int dct_size, int num_frames,
- const float* control_frames,
- const float* control_freqs_hz,
- const float* control_amps,
- int n_control_points,
- ProfileType profile_type,
- float profile_param1,
- float profile_param2 = 0.0f);
+float evaluate_bezier_linear(const float* control_frames,
+ const float* control_values,
+ int n_points,
+ float frame);
-// Profile evaluation
float evaluate_profile(ProfileType type, float distance,
float param1, float param2);
-
-// Home-brew deterministic RNG (small, portable)
-uint32_t spectral_brush_rand(uint32_t seed);
```
-**Future Extensions:**
-- Cubic Bezier interpolation: `evaluate_bezier_cubic()`
-- Generic loader: `gen_from_params(const PrimitiveData*)`
-
---
-## Editor Architecture
+## Editor UI
### Technology Stack
-- **HTML5 Canvas**: Spectrogram visualization
-- **Web Audio API**: Playback (IDCT → audio)
-- **Pure JavaScript**: No dependencies
-- **Reuse from existing editor**: `dct.js` (IDCT implementation)
-
-### Editor UI Layout
-
-```
-┌─────────────────────────────────────────────────────────────┐
-│ Spectral Brush Editor [Load .wav] │
-├─────────────────────────────────────────────────────────────┤
-│ │
-│ ┌───────────────────────────────────────────────────┐ │
-│ │ │ │
-│ │ CANVAS (Spectrogram Display) │ T │
-│ │ │ O │
-│ │ • Background: Reference .wav (gray, transparent) │ O │
-│ │ • Foreground: Procedural curves (colored) │ L │
-│ │ • Bezier control points (draggable circles) │ S │
-│ │ │ │
-│ │ │ [+]│
-│ │ │ [-]│
-│ │ │ [x]│
-│ └───────────────────────────────────────────────────┘ │
-│ │
-├─────────────────────────────────────────────────────────────┤
-│ Profile: [Gaussian ▾] Sigma: [████████░░] 30.0 │
-│ Curves: [Curve 0 ▾] Amplitude: [██████░░░░] 0.8 │
-├─────────────────────────────────────────────────────────────┤
-│ [1] Play Procedural [2] Play Original [Space] Pause │
-│ [Save Params] [Generate C++] [Undo] [Redo] │
-└─────────────────────────────────────────────────────────────┘
-```
+- HTML5 Canvas (visualization)
+- Web Audio API (playback)
+- Pure JavaScript (no dependencies)
+- Reuse `dct.js` from existing editor
-### Features (Phase 1: Minimal Working Version)
-
-#### Editing
-- Click to place Bezier control points
-- Drag to adjust control points (frame, frequency, amplitude)
-- Delete control points (right-click or Delete key)
-- Undo/Redo support (action history)
-
-#### Visualization
-- Dual-layer canvas:
- - **Background**: Reference spectrogram (semi-transparent gray)
- - **Foreground**: Procedural spectrogram (colored)
-- Log-scale frequency axis (musical perception)
-- Control points: Draggable circles with labels
-
-#### Audio Playback
-- **Key '1'**: Play procedural sound
-- **Key '2'**: Play original .wav
-- **Space**: Play/pause toggle
-
-#### File I/O
-- Load .wav or .spec (reference sound)
-- Save procedural_params.txt (re-editable)
-- Generate C++ code (copy-paste ready)
-
-#### Undo/Redo
-- Action history with snapshots
-- Commands: Add control point, Move control point, Delete control point, Change profile
-- **Ctrl+Z**: Undo
-- **Ctrl+Shift+Z**: Redo
+### Key Features
+- Dual-layer canvas (reference + procedural spectrograms)
+- Drag control points to adjust curves
+- Real-time spectrogram rendering
+- Audio playback (keys 1/2 for procedural/original)
+- Undo/Redo (Ctrl+Z, Ctrl+Shift+Z)
+- Load .wav/.spec, save params, generate C++ code
### Keyboard Shortcuts
-
| Key | Action |
|-----|--------|
-| **1** | Play procedural sound |
-| **2** | Play original .wav |
-| **Space** | Play/pause |
-| **Delete** | Delete selected control point |
-| **Esc** | Deselect all |
-| **Ctrl+Z** | Undo |
-| **Ctrl+Shift+Z** | Redo |
-| **Ctrl+S** | Save procedural_params.txt |
-| **Ctrl+Shift+S** | Generate C++ code |
+| 1 | Play procedural |
+| 2 | Play original |
+| Space | Play/pause |
+| Delete | Remove control point |
+| Ctrl+Z | Undo |
+| Ctrl+S | Save params |
---
-## Implementation Plan
+## Implementation Phases
-### Phase 1: C++ Runtime (Foundation)
-**Files:** `src/audio/spectral_brush.h`, `src/audio/spectral_brush.cc`
+### Phase 1: C++ Runtime
+**Files:** `src/audio/spectral_brush.{h,cc}`, `src/tests/test_spectral_brush.cc`
**Tasks:**
-- [ ] Define API (`ProfileType`, `draw_bezier_curve()`, etc.)
-- [ ] Implement linear Bezier interpolation
-- [ ] Implement Gaussian profile evaluation
-- [ ] Implement home-brew RNG (for future noise support)
-- [ ] Add unit tests (`src/tests/test_spectral_brush.cc`)
-
-**Deliverable:** Compiles, tests pass
-
----
+- Define API (ProfileType, draw_bezier_curve, evaluate_profile)
+- Implement linear Bezier interpolation
+- Implement Gaussian profile
+- Add unit tests
+- **Deliverable:** Compiles, tests pass
### Phase 2: Editor Core
-**Files:** `tools/spectral_editor/index.html`, `script.js`, `style.css`, `dct.js` (reuse)
+**Files:** `tools/spectral_editor/{index.html, script.js, style.css, dct.js}`
**Tasks:**
-- [ ] HTML structure (canvas, controls, file input)
-- [ ] Canvas rendering (dual-layer: reference + procedural)
-- [ ] Bezier curve editor (place/drag/delete control points)
-- [ ] Profile controls (Gaussian sigma slider)
-- [ ] Real-time spectrogram rendering
-- [ ] Audio playback (IDCT → Web Audio API)
-- [ ] Undo/Redo system
-
-**Deliverable:** Interactive editor, can trace .wav files
-
----
+- HTML structure (canvas, controls, file input)
+- Bezier curve editor (place/drag/delete control points)
+- Dual-layer canvas rendering
+- Real-time spectrogram generation
+- Audio playback (IDCT → Web Audio API)
+- Undo/Redo system
+- **Deliverable:** Interactive editor, can trace .wav files
### Phase 3: File I/O
**Tasks:**
-- [ ] Load .wav (decode, FFT/STFT → spectrogram)
-- [ ] Load .spec (binary format parser)
-- [ ] Save procedural_params.txt (text format writer)
-- [ ] Generate C++ code (code generation template)
-- [ ] Load procedural_params.txt (re-editing workflow)
-
-**Deliverable:** Full save/load cycle works
-
----
-
-### Phase 4: Polish & Documentation
-**Tasks:**
-- [ ] UI refinements (tooltips, visual feedback)
-- [ ] Keyboard shortcut overlay (press '?' to show help)
-- [ ] Error handling (invalid files, audio context failures)
-- [ ] User guide (README.md in tools/spectral_editor/)
-- [ ] Example files (kick.wav → kick_procedural.cc)
-
-**Deliverable:** Production-ready tool
+- Load .wav (decode, STFT → spectrogram)
+- Load .spec (binary parser)
+- Save procedural_params.txt (text format)
+- Generate C++ code (template)
+- Load procedural_params.txt (re-editing)
+- **Deliverable:** Full save/load cycle
---
## Design Decisions
-### 1. Bezier Interpolation
-- **Current:** Linear (simple, fast, small code)
-- **Future:** Cubic Bezier, Catmull-Rom splines
-
-### 2. Parameter Limits
-- **Editor UI:** Soft ranges (reasonable defaults, not enforced)
-- **Examples:**
- - Sigma: 1.0 - 100.0 (suggested range, user can type beyond)
- - Amplitude: 0.0 - 1.0 (suggested, can exceed for overdrive effects)
-
-### 3. Random Number Generator
-- **Implementation:** Home-brew deterministic RNG
-- **Reason:** Independence, small code, repeatable results
-- **Algorithm:** Simple LCG or xorshift
-
-### 4. Code Generation
-- **Current:** Single function per sound (e.g., `gen_kick_procedural()`)
-- **Future:** Generic `gen_from_params(const PrimitiveData*)` with data tables
-
-### 5. Initial UI Scope
-- **Phase 1:** Bezier + Gaussian only
-- **Rationale:** Validate workflow before adding complexity
-- **Future:** Decaying sinusoid, noise, composite profiles
+- **Bezier:** Linear interpolation (Phase 1), cubic later
+- **Profiles:** Gaussian only (Phase 1), others later
+- **Parameters:** Soft UI limits, no enforced bounds
+- **RNG:** Home-brew deterministic (small, repeatable)
+- **Code gen:** Single function per sound (generic loader later)
---
-## Future Extensions
-
-### Bezier Enhancements
-- [ ] Cubic Bezier interpolation (smoother curves)
-- [ ] Catmull-Rom splines (automatic tangent control)
-- [ ] Bezier curve tangent handles (manual control)
-
-### Additional Profiles
-- [ ] Decaying sinusoid (metallic sounds)
-- [ ] Noise (textured sounds)
-- [ ] Composite profiles (add/subtract/multiply)
-- [ ] User-defined profiles (custom formulas)
+## Size Impact
-### Multi-Dimensional Bezier
-- [ ] `{freq, amplitude, oscillator_freq, decay} = bezier(frame)`
-- [ ] Per-parameter control curves
+**Example: Kick drum**
-### Advanced Features
-- [ ] Frequency snapping (snap to musical notes: C4, D4, E4, etc.)
-- [ ] Amplitude envelope presets (ADSR)
-- [ ] Profile library (save/load custom profiles)
-- [ ] Batch processing (convert entire folder of .wav files)
+**Before (Binary):**
+- 512 bins × 100 frames × 4 bytes = 200 KB uncompressed
+- ~5 KB compressed (zlib)
-### Code Generation
-- [ ] Generic `gen_from_params()` with data tables
-- [ ] Optimization: Merge similar curves
-- [ ] Size estimation (preview bytes saved vs. original .spec)
+**After (Procedural):**
+- 4 control points × 3 arrays × 4 bytes per float = 48 bytes of data
+- Function call overhead = ~20 bytes
+- **Total: ~100 bytes** (50-100× reduction)
----
-
-## Testing Strategy
-
-### C++ Runtime Tests
-**File:** `src/tests/test_spectral_brush.cc`
-
-**Test cases:**
-- Linear Bezier interpolation (verify values at control points)
-- Gaussian profile evaluation (verify falloff curve)
-- Full `draw_bezier_curve()` (verify spectrogram output)
-- Edge cases (0 control points, 1 control point, out-of-range frames)
-
-### Editor Tests
-**Manual testing workflow:**
-1. Load example .wav (kick drum)
-2. Place 3-4 control points to trace low-frequency punch
-3. Adjust Gaussian sigma
-4. Play procedural sound (should resemble original)
-5. Save procedural_params.txt
-6. Generate C++ code
-7. Copy C++ code into demo, compile, verify runtime output matches editor
+**Trade-off:** Runtime CPU cost, acceptable for 64k demo.
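The byte counts above can be checked with a few lines of arithmetic. The helpers below are hypothetical and assume a 4-byte `float`; they cover only the raw data sizes, not the generated code or the compressed `.spec` size, which remain the document's estimates.

```cpp
#include <cstddef>

// Raw size of a spectrogram stored as one float per (bin, frame) cell.
constexpr std::size_t spec_bytes(std::size_t bins, std::size_t frames) {
  return bins * frames * sizeof(float);
}

// Size of the three parallel control-point tables: frames[], freqs[], amps[].
constexpr std::size_t control_table_bytes(std::size_t points) {
  return points * 3 * sizeof(float);
}
```

`spec_bytes(512, 100)` gives 204,800 bytes (200 KiB), and `control_table_bytes(4)` gives 48 bytes; the three-point kick shown in the C++ example needs 36.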
---
-## Size Impact Estimate
-
-**Example: Kick drum sample**
-
-**Before (Binary .spec):**
-- DCT size: 512
-- Num frames: 100
-- Size: 512 × 100 × 4 bytes = 200 KB (uncompressed)
-- Compressed (zlib): ~5-10 KB
-
-**After (Procedural C++):**
-```cpp
-// 4 control points × 3 arrays × 4 values = ~48 bytes of data
-const float frames[] = {0.0f, 20.0f, 50.0f, 100.0f};
-const float freqs[] = {200.0f, 80.0f, 60.0f, 50.0f};
-const float amps[] = {0.9f, 0.7f, 0.3f, 0.0f};
-draw_bezier_curve(...); // ~20 bytes function call
-
-// Total: ~100 bytes
-```
-
-**Compression ratio:** 50-100× reduction!
-
-**Trade-off:** Runtime CPU cost (generation vs. lookup), but acceptable for 64k demo.
-
----
-
-## References
-
-- **Bezier curves:** https://en.wikipedia.org/wiki/B%C3%A9zier_curve
-- **DCT/IDCT:** Existing implementation in `src/audio/dct.cc`
-- **Web Audio API:** https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API
-- **Spectrogram visualization:** Existing editor in `tools/editor/` (to be replaced)
+*See TODO.md for detailed implementation tasks.*