# Spectral Brush Editor v2: MQ-Based Sinusoidal Synthesis

**Status:** Phase 1 (web extraction) implemented; Phases 2-6 not started
**Target:** Procedural audio compression for short samples (drums, piano, impacts)
**Replaces:** Spectrogram-based synthesis (poor audio quality)

---

## Overview

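This design uses McAulay-Quatieri (MQ) sinusoidal modeling for audio compression. The editor extracts frequency and amplitude trajectories from a sample, fits each one with a cubic bezier curve, and applies "style" through replicas (harmonics, spread, jitter). The runtime then synthesizes the result into baked PCM buffers.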

**Key Features:**
- **50-100× compression (target):** WAV → bezier curves + replica params → C++ structs
- **Web-based editor:** Real-time MQ extraction, curve editing, synthesis preview
- **Procedural synthesis:** Bandwidth-enhanced oscillators with phase jitter and frequency spread
- **Tracker integration:** MQ samples triggered as assets, future pitch/amp modulation

---

## Architecture

### Data Flow

```
┌─────────────────────────────────────────────────────┐
│ Web Editor (tools/mq_editor/)                       │
├─────────────────────────────────────────────────────┤
│ Input: WAV or saved .txt params                     │
│   ↓                                                  │
│ MQ Extraction: FFT → Peak Tracking → Bezier Fitting │
│   ↓                                                  │
│ Editing: Drag control points, adjust replicas       │
│   ↓                                                  │
│ JS Synthesizer: Preview original vs. synthesized    │
│   ↓                                                  │
│ Export: .txt params + generated .cc code            │
└─────────────────────────────────────────────────────┘
                         ↓
┌─────────────────────────────────────────────────────┐
│ C++ Demo (src/audio/)                               │
├─────────────────────────────────────────────────────┤
│ Build: .txt → generated .cc (MQSample structs)      │
│   ↓                                                  │
│ Synthesis: Bake PCM at init (CPU, future GPU)       │
│   ↓                                                  │
│ AudioEngine: Register as sample asset               │
│   ↓                                                  │
│ Tracker: Trigger via patterns (future modulation)   │
└─────────────────────────────────────────────────────┘
```

---

## Data Model

### Per-Partial Representation

Each sinusoidal partial stores:

```
Partial {
  freq_curve: CubicBezier    // Frequency trajectory (Hz vs. seconds)
  amp_curve: CubicBezier     // Amplitude envelope (0-1 vs. seconds)
  replicas: ReplicaConfig    // Harmonic/inharmonic copies
}

CubicBezier {
  (t0, v0), (t1, v1), (t2, v2), (t3, v3)  // 4 control points
}

ReplicaConfig {
  offsets: [ratio1, ratio2, ...]          // Frequency ratios relative to f₀ (1.0, 2.01, 0.5, ...)
  decay_alpha: float                      // Amplitude decay: exp(-α·|f-f₀|), f in Hz
  jitter: float [0-1]                     // Phase randomization amount (1.0 = full 2π range)
  spread_above: float [0-1]               // Max upward spread, fraction of replica frequency (0.03 = +3%)
  spread_below: float [0-1]               // Max downward spread, fraction of replica frequency (0.01 = -1%)
  bandwidth: float [0-1]                  // Noise bandwidth, fraction of replica frequency (0.02 = ±2%)
}
```

### Text Format (.txt)

Stored in `workspaces/main/mq_samples/`:

```
# MQ Sample: drum_kick.txt
sample_rate 32000
duration 1.5

# Global defaults (optional, can override per partial)
replica_defaults
  decay_alpha 0.1
  jitter 0.05
  spread_above 0.02
  spread_below 0.02
  bandwidth 0.01
end

# Partial 0: fundamental
partial
  # Frequency bezier (seconds, Hz): t0 f0 t1 f1 t2 f2 t3 f3
  freq_curve 0.0 60.0 0.2 58.0 0.8 55.0 1.5 50.0

  # Amplitude bezier (seconds, 0-1): t0 a0 t1 a1 t2 a2 t3 a3
  amp_curve 0.0 0.0 0.05 1.0 0.5 0.3 1.5 0.0

  # Replica frequency ratios
  replicas 1.0 2.01 3.03

  # Override defaults (optional)
  decay_alpha 0.15
  jitter 0.08
  spread_above 0.03
  spread_below 0.01
  bandwidth 0.02
end

# Partial 1: overtone
partial
  freq_curve 0.0 180.0 0.2 178.0 0.8 175.0 1.5 170.0
  amp_curve 0.0 0.0 0.05 0.6 0.5 0.2 1.5 0.0
  replicas 1.0 1.99
end
```
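
The format above is line-oriented, so a reader for it can stay small. The following is a minimal sketch of such a parser in Python, not the project's actual loader: the function name `parse_mq_text` and the dict layout it returns are assumptions made for illustration. It assumes `replica_defaults` appears before the partials that rely on it.

```python
PARAM_KEYS = ("decay_alpha", "jitter", "spread_above", "spread_below", "bandwidth")

def parse_mq_text(text):
    """Parse the .txt format into a plain dict (hypothetical layout)."""
    sample = {"sample_rate": 0, "duration": 0.0, "partials": []}
    defaults = {}
    block = None  # dict currently being filled, or None at top level
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, *args = line.split()
        if key == "replica_defaults":
            block = defaults
        elif key == "partial":
            block = dict(defaults)  # start from a copy of the global defaults
            sample["partials"].append(block)
        elif key == "end":
            block = None
        elif key == "sample_rate":
            sample["sample_rate"] = int(args[0])
        elif key == "duration":
            sample["duration"] = float(args[0])
        elif block is None:
            raise ValueError(f"{key} is only valid inside a block")
        elif key in ("freq_curve", "amp_curve"):
            if len(args) != 8:
                raise ValueError(f"{key} needs 8 numbers, got {len(args)}")
            block[key] = [float(a) for a in args]
        elif key == "replicas":
            block[key] = [float(a) for a in args]
        elif key in PARAM_KEYS:
            block[key] = float(args[0])
        else:
            raise ValueError(f"unknown keyword: {key}")
    return sample
```

Because each partial starts as a copy of the defaults, a partial with no overrides (like partial 1 above) ends up with the global values, which matches the numbers in the generated C++.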

### Generated C++ Code

Stored in `src/generated/mq_<name>.cc`:

```cpp
// Auto-generated from mq_samples/drum_kick.txt
// DO NOT EDIT

struct MQBezier {
  float t0, v0, t1, v1, t2, v2, t3, v3;
};

struct MQPartial {
  MQBezier freq;
  MQBezier amp;
  const float* replicas;
  int num_replicas;
  float decay_alpha;
  float jitter;
  float spread_above;
  float spread_below;
  float bandwidth;
};

static const float drum_kick_replicas_0[] = {1.0f, 2.01f, 3.03f};
static const float drum_kick_replicas_1[] = {1.0f, 1.99f};

static const MQPartial drum_kick_partials[] = {
  {
    {0.0f, 60.0f, 0.2f, 58.0f, 0.8f, 55.0f, 1.5f, 50.0f},
    {0.0f, 0.0f, 0.05f, 1.0f, 0.5f, 0.3f, 1.5f, 0.0f},
    drum_kick_replicas_0, 3,
    0.15f, 0.08f, 0.03f, 0.01f, 0.02f
  },
  {
    {0.0f, 180.0f, 0.2f, 178.0f, 0.8f, 175.0f, 1.5f, 170.0f},
    {0.0f, 0.0f, 0.05f, 0.6f, 0.5f, 0.2f, 1.5f, 0.0f},
    drum_kick_replicas_1, 2,
    0.1f, 0.05f, 0.02f, 0.02f, 0.01f
  }
};

struct MQSample {
  int sample_rate;
  float duration;
  const MQPartial* partials;
  int num_partials;
};

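// `extern` gives the constant external linkage so other translation units can
// reference it (namespace-scope const objects have internal linkage by default).
extern const MQSample ASSET_MQ_DRUM_KICK = {
  32000, 1.5f, drum_kick_partials, 2
};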
```
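
To see how the struct layout relates to the compression goal, here is a rough size estimate in Python. It is a back-of-the-envelope sketch: the 16-bit mono PCM baseline and the choice to count only float payload (ignoring the pointer, the replica count, and padding) are assumptions, and the 12-partial sample is hypothetical.

```python
BYTES_PER_FLOAT = 4
CURVE_FLOATS = 8    # MQBezier: t0, v0, ..., t3, v3
PARAM_FLOATS = 5    # decay_alpha, jitter, spread_above, spread_below, bandwidth

def pcm_bytes(sample_rate, duration_s, bytes_per_sample=2):
    """Uncompressed mono PCM size; 16-bit samples are an assumption."""
    return int(sample_rate * duration_s) * bytes_per_sample

def partial_bytes(num_replicas):
    """Float payload of one MQPartial, ignoring pointer, count, and padding."""
    return (2 * CURVE_FLOATS + PARAM_FLOATS + num_replicas) * BYTES_PER_FLOAT

def compression_ratio(sample_rate, duration_s, replicas_per_partial):
    """PCM size divided by the summed payload of all partials."""
    model_bytes = sum(partial_bytes(n) for n in replicas_per_partial)
    return pcm_bytes(sample_rate, duration_s) / model_bytes

# The two-partial kick above versus a hypothetical 12-partial sample.
kick_ratio = compression_ratio(32000, 1.5, [3, 2])
dense_ratio = compression_ratio(32000, 1.5, [3] * 12)
print(f"kick: {kick_ratio:.0f}x, 12 partials: {dense_ratio:.0f}x")
```

Under these assumptions the ratio is driven almost entirely by the number of partials: a 1.5 s sample lands in the 50-100× range with roughly 10 to 20 partials, and sparser models compress further.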

---

## McAulay-Quatieri Algorithm

### Phase 1: Peak Detection

STFT with overlapping windows:

```
For each frame (hop = 512 samples):
  1. FFT (size = 2048)
  2. Magnitude spectrum |X[k]|
  3. Detect peaks: local maxima above threshold
  4. Extract (frequency, amplitude, phase) via parabolic interpolation
```

**Parameters:**
- `fft_size`: 2048 (adjustable 1024-4096)
- `hop_size`: 512 (75% overlap)
- `peak_threshold`: -60 dB (adjustable)
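
Steps 3 and 4 can be sketched as follows. This is an illustrative Python version, not the editor's JavaScript code: the function name `find_peaks` and its signature are assumptions, the input is a magnitude spectrum already converted to dB, and phase extraction is left out.

```python
def find_peaks(mag_db, sample_rate, fft_size, threshold_db=-60.0):
    """Return (freq_hz, amp_db) for each local maximum above threshold_db.

    mag_db holds the magnitude spectrum in dB for bins 0..fft_size/2.
    """
    peaks = []
    bin_hz = sample_rate / fft_size
    for k in range(1, len(mag_db) - 1):
        a, b, c = mag_db[k - 1], mag_db[k], mag_db[k + 1]
        if b < threshold_db or not (b > a and b >= c):
            continue
        # Vertex of the parabola through the three dB values around bin k.
        denom = a - 2.0 * b + c
        offset = 0.5 * (a - c) / denom if denom != 0.0 else 0.0
        freq_hz = (k + offset) * bin_hz
        amp_db = b - 0.25 * (a - c) * offset
        peaks.append((freq_hz, amp_db))
    return peaks
```

The interpolated offset lies within half a bin of the peak bin, so with `fft_size = 2048` at 32 kHz (15.625 Hz per bin) the frequency estimate can be considerably finer than the bin spacing.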

### Phase 2: Trajectory Tracking

Link peaks across frames into continuous partials:

```
Birth/Death/Continuation model:
  - Match peak to existing partial if |f_new - f_old| < threshold
  - Birth new partial if unmatched peak persists 2+ frames
  - Death partial if no match for 2+ frames
```

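**Tracking threshold:** 50 Hz (adjustable) in this design. The Phase 1 implementation listed under Status uses a frequency-dependent tolerance (5%) instead.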
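
A greedy version of the birth/death/continuation model might look like the sketch below. It is a simplified illustration in Python: the name `track_partials`, the dict-based track state, and the nearest-peak matching order are assumptions, and it uses the fixed 50 Hz threshold from this section.

```python
def track_partials(frames, max_jump_hz=50.0, max_misses=2, min_length=2):
    """Link per-frame peak frequencies into trajectories.

    frames: list of lists of peak frequencies (Hz), one list per STFT frame.
    Returns a list of tracks, each a list of (frame_index, freq_hz).
    """
    active, finished = [], []
    for n, peaks in enumerate(frames):
        unmatched = sorted(peaks)
        for track in active:
            last_freq = track["points"][-1][1]
            best = min(unmatched, key=lambda f: abs(f - last_freq), default=None)
            if best is not None and abs(best - last_freq) < max_jump_hz:
                track["points"].append((n, best))  # continuation
                track["misses"] = 0
                unmatched.remove(best)
            else:
                track["misses"] += 1
        # Death: retire tracks that went unmatched for max_misses frames.
        still_active = []
        for track in active:
            if track["misses"] >= max_misses:
                finished.append(track)
            else:
                still_active.append(track)
        active = still_active
        # Birth: every leftover peak starts a candidate track.
        active += [{"points": [(n, f)], "misses": 0} for f in unmatched]
    finished += active
    # Keep only candidates that persisted for at least min_length frames.
    return [t["points"] for t in finished if len(t["points"]) >= min_length]
```

Greedy matching depends on the order in which tracks claim peaks; a more careful matcher would resolve conflicts when two tracks compete for the same peak.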

### Phase 3: Bezier Curve Fitting

Fit cubic bezier to each partial's trajectory:

```
Input: N trajectory samples [(s1, y1), (s2, y2), ..., (sN, yN)]
Output: 4 control points minimizing least-squares error

Algorithm:
  1. Fix endpoints: (t0, v0) = first sample, (t3, v3) = last sample
  2. Solve for the interior control points (t1, v1), (t2, v2) via linear least squares
  3. Repeat for amplitude trajectory
```

**Error threshold:** Auto-fit to minimize control points (future: user-adjustable simplification)
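
With the endpoints pinned, the fit reduces to a 2×2 linear system for the two interior control values. The sketch below shows one way to solve it in Python. It assumes the curve parameter is normalized time (as in the `eval_bezier` function later in this document), so only the values v1 and v2 are solved for; the name `fit_cubic_values` is hypothetical.

```python
def fit_cubic_values(times, values):
    """Least-squares fit of the two interior control values (v1, v2).

    Endpoints are pinned to the first and last samples; the curve parameter
    is time normalized to [0, 1] over the trajectory.
    """
    t0, t3 = times[0], times[-1]
    if t3 <= t0:
        raise ValueError("trajectory needs at least two distinct times")
    v0, v3 = values[0], values[-1]
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t, y in zip(times, values):
        u = (t - t0) / (t3 - t0)
        b0, b3 = (1 - u) ** 3, u ** 3
        b1, b2 = 3 * (1 - u) ** 2 * u, 3 * (1 - u) * u ** 2
        resid = y - b0 * v0 - b3 * v3  # part the interior points must explain
        s11 += b1 * b1
        s12 += b1 * b2
        s22 += b2 * b2
        r1 += b1 * resid
        r2 += b2 * resid
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        # Too few interior samples: fall back to a straight line.
        return v0 + (v3 - v0) / 3, v0 + 2 * (v3 - v0) / 3
    # Cramer's rule on the 2x2 normal equations.
    return (r1 * s22 - r2 * s12) / det, (r2 * s11 - r1 * s12) / det
```

Sampling a known cubic and fitting it again should return the original interior values, which is a convenient sanity check when porting this to the editor.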

---

## Synthesis Model

### Replica Oscillator Bank

For each partial at time `t`:

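```python
# Per-sample step for one partial. `phase_acc` holds one running phase per
# replica and persists across samples; `sample_rate` comes from the MQ sample.
f0 = eval_bezier(partial.freq_curve, t)
A0 = eval_bezier(partial.amp_curve, t)

# For each replica offset ratio
for r, ratio in enumerate(partial.replicas):
    # Frequency spread (asymmetric randomization)
    spread = random.uniform(-partial.spread_below, +partial.spread_above)
    f = f0 * ratio * (1.0 + spread)

    # Amplitude decay with distance (in Hz) from the partial's base frequency
    A = A0 * exp(-partial.decay_alpha * abs(f - f0))

    # Phase is the running integral of frequency, not 2*pi*f*t: f varies over
    # time, so f*t would add a spurious t*df/dt term to the perceived pitch.
    # Jitter is drawn from a generator seeded by the frame counter.
    phase_acc[r] += 2*pi*f / sample_rate
    phase = phase_acc[r] + partial.jitter * random.uniform(0, 2*pi)

    # Base sinusoid
    sample += A * sin(phase)

    # Bandwidth-enhanced noise (optional; bandlimited_noise is a placeholder)
    if partial.bandwidth > 0:
        noise_bw = f * partial.bandwidth
        sample += A * bandlimited_noise(f - noise_bw, f + noise_bw)
```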
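
When the frequency curve moves, the oscillator phase is the running integral of frequency, and evaluating `2*pi*f(t)*t` directly gives a different pitch. The short Python experiment below illustrates the difference on a linear sweep; the helper names and the 100-200 Hz sweep are assumptions chosen for the example.

```python
import math

def accumulate_phase(freq_fn, sample_rate, num_samples):
    """Running integral of frequency: each step adds 2*pi*f/sample_rate."""
    phase, out = 0.0, []
    for n in range(num_samples):
        out.append(phase)
        phase += 2.0 * math.pi * freq_fn(n / sample_rate) / sample_rate
    return out

def instantaneous_hz(phases, sample_rate, n):
    """Frequency implied by the phase step between samples n and n+1."""
    return (phases[n + 1] - phases[n]) * sample_rate / (2.0 * math.pi)

def sweep(t):
    """Hypothetical trajectory: 100 Hz rising linearly to 200 Hz over 1 s."""
    return 100.0 + 100.0 * t

sr = 32000
accumulated = accumulate_phase(sweep, sr, sr + 1)
naive = [2.0 * math.pi * sweep(n / sr) * (n / sr) for n in range(sr + 1)]

print(instantaneous_hz(accumulated, sr, sr - 1))  # close to 200 Hz
print(instantaneous_hz(naive, sr, sr - 1))        # close to 300 Hz
```

At the end of the sweep the accumulated phase advances at the intended 200 Hz, while the direct product advances at about 300 Hz because the derivative of `f(t)*t` includes the extra term `t*df/dt`.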

### Bezier Evaluation (Cubic)

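Direct evaluation of the cubic Bernstein polynomial (mathematically equivalent to De Casteljau's algorithm) over normalized time:

```cpp
#include <algorithm>  // std::clamp (C++17)

float eval_bezier(const MQBezier& b, float t) {
  // Degenerate curve (zero or negative duration): return the start value.
  const float span = b.t3 - b.t0;
  if (span <= 0.0f) return b.v0;

  // Normalize t to [0, 1]. The curve is treated as value vs. normalized time,
  // so the interior time coordinates b.t1 and b.t2 are not used here.
  const float u = std::clamp((t - b.t0) / span, 0.0f, 1.0f);

  // Cubic Bernstein basis
  const float u1 = 1.0f - u;
  return u1*u1*u1 * b.v0 +
         3*u1*u1*u * b.v1 +
         3*u1*u*u * b.v2 +
         u*u*u * b.v3;
}
```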

### Baking Process (C++)

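```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// At audio_init() time.
// rand_float(seed, lo, hi) is a deterministic seeded helper, assumed to be
// defined elsewhere.
void synth_bake_mq(const MQSample& sample, std::vector<float>& pcm_out) {
  const float kTwoPi = 6.2831853f;
  const int num_samples = static_cast<int>(sample.sample_rate * sample.duration);
  pcm_out.resize(num_samples);

  // One running phase per (partial, replica) oscillator. Phase is accumulated
  // sample by sample because the frequency changes over time; 2*pi*f*t is
  // only correct for a constant f.
  std::vector<std::vector<float>> phases(sample.num_partials);
  for (int p = 0; p < sample.num_partials; ++p) {
    phases[p].assign(sample.partials[p].num_replicas, 0.0f);
  }

  for (int i = 0; i < num_samples; ++i) {
    const float t = static_cast<float>(i) / sample.sample_rate;
    float sample_val = 0.0f;

    for (int p = 0; p < sample.num_partials; ++p) {
      const MQPartial& partial = sample.partials[p];
      const float f0 = eval_bezier(partial.freq, t);
      const float A0 = eval_bezier(partial.amp, t);

      for (int r = 0; r < partial.num_replicas; ++r) {
        const float ratio = partial.replicas[r];

        // Even seeds drive spread and odd seeds drive jitter, so the two
        // never reuse a seed. Unsigned math avoids signed overflow.
        const uint32_t seed =
            2u * (uint32_t(i) * 12345u + uint32_t(p) * 67890u + uint32_t(r));

        // Frequency spread
        const float spread =
            rand_float(seed, -partial.spread_below, partial.spread_above);
        const float f = f0 * ratio * (1.0f + spread);

        // Amplitude decay
        const float A = A0 * expf(-partial.decay_alpha * fabsf(f - f0));

        // Phase jitter on top of the accumulated phase
        const float jitter = rand_float(seed + 1u, 0.0f, 1.0f) * partial.jitter;
        sample_val += A * sinf(phases[p][r] + jitter * kTwoPi);

        // Advance and wrap the oscillator phase
        phases[p][r] =
            fmodf(phases[p][r] + kTwoPi * f / sample.sample_rate, kTwoPi);

        // TODO: bandwidth-enhanced noise
      }
    }

    pcm_out[i] = sample_val;
  }
}
```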
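
The baking code calls `rand_float(seed, lo, hi)` without defining it. One possible shape for such a helper is a stateless integer hash, which keeps every bake reproducible for the same inputs. The Python sketch below illustrates the idea; the mixing constants and the name `hash_u32` are illustrative choices, not the project's implementation, and a C++ port would use `uint32_t` arithmetic in place of the explicit masks.

```python
def hash_u32(x):
    """Mix a 32-bit integer with xor-shifts and odd multiplies."""
    x &= 0xFFFFFFFF
    x ^= x >> 16
    x = (x * 0x7FEB352D) & 0xFFFFFFFF
    x ^= x >> 15
    x = (x * 0x846CA68B) & 0xFFFFFFFF
    x ^= x >> 16
    return x

def rand_float(seed, lo, hi):
    """Deterministic float in [lo, hi) derived from an integer seed."""
    return lo + (hi - lo) * (hash_u32(seed) / 4294967296.0)

# Same seed, same value: baking twice yields identical PCM.
print(rand_float(42, -0.01, 0.03) == rand_float(42, -0.01, 0.03))
```

Because the value depends only on the seed, the web preview and the C++ bake could produce matching output if both use the same hash and the same seed layout.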

---

## Web Editor

### UI Layout

```
┌─────────────────────────────────────────────────────┐
│ [Load WAV] [Load .txt] [Save .txt] [Export C++]    │
├─────────────────────────────────────────────────────┤
│ MQ Extraction Params:                               │
│   FFT Size: [2048▼]  Hop: [512]  Threshold: [-60dB]│
│   [Extract Partials] [Re-extract]                   │
├─────────────────────────────────────────────────────┤
│ ┌─────────────────────────────────────────────────┐ │
│ │                                                 │ │
│ │  Time-Frequency Canvas                          │ │
│ │  - Spectrogram background                       │ │
│ │  - Bezier curves (colored per partial)          │ │
│ │  - Draggable control points (circles)           │ │
│ │                                                 │ │
│ └─────────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────────┤
│ Selected Partial: [0▼]  [Add Point] [Remove Point] │
│   Replicas: [1.0, 2.01, 3.03] [Edit]               │
│   Decay α: [0.15]  Jitter: [0.08]                  │
│   Spread+: [3%]  Spread-: [1%]  Bandwidth: [2%]    │
├─────────────────────────────────────────────────────┤
│ Playback: [▶ Original] [▶ Synthesized] [▶ Both]    │
│ Time: [━━━━━━━━━━━━━━━━━━━━━━━] 0.0s / 1.5s        │
└─────────────────────────────────────────────────────┘
```

### Features

**Phase 1 (Extraction):**
- Load WAV, run MQ algorithm, visualize partials
- Real-time parameter adjustment (FFT size, threshold, tracking)

**Phase 2 (Synthesis Preview):**
- JS implementation of full synthesis pipeline
- Playback original vs. synthesized audio (Web Audio API)

**Phase 3 (Editing):**
- Drag control points to adjust curves
- Add/remove control points (future: auto-simplification)
- Per-partial replica configuration

**Phase 4 (Export):**
- Save `.txt` format (human-readable)
- Generate C++ code (copy-paste or auto-commit)

---

## C++ Integration

### File Organization

```
workspaces/main/
  mq_samples/
    drum_kick.txt
    piano_c4.txt
    synth_pad.txt

src/generated/
  mq_drum_kick.cc    # Auto-generated
  mq_piano_c4.cc
  mq_synth_pad.cc

src/audio/
  mq_synth.h         # Bezier eval, baking API
  mq_synth.cc
```

### Asset Registration

Add to `workspaces/main/assets.txt`:

```
MQ_DRUM_KICK, NONE, mq_samples/drum_kick.txt, "MQ kick drum"
```

Build system:
1. Detect `.txt` changes → trigger code generator
2. Compile generated `.cc` → link into demo
3. `ASSET_MQ_DRUM_KICK` available in code
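
The code generator in step 1 mostly formats numbers into C++ initializers. A minimal sketch of the per-partial emitter is shown below, assuming the partial has already been parsed into a dict with the keys used in the `.txt` format; the function names `fmt` and `emit_partial` are hypothetical and this is not the contents of `tools/mq_codegen.py`.

```python
PARAM_ORDER = ("decay_alpha", "jitter", "spread_above", "spread_below", "bandwidth")

def fmt(x):
    """Format a number as a C++ float literal, e.g. 60 becomes '60.0f'."""
    return f"{float(x)!r}f"

def emit_partial(name, index, partial):
    """Return (replica_array_decl, initializer) strings for one partial."""
    array = f"{name}_replicas_{index}"
    ratios = ", ".join(fmt(r) for r in partial["replicas"])
    decl = f"static const float {array}[] = {{{ratios}}};"
    init = "\n".join([
        "  {",
        "    {" + ", ".join(fmt(v) for v in partial["freq_curve"]) + "},",
        "    {" + ", ".join(fmt(v) for v in partial["amp_curve"]) + "},",
        f"    {array}, {len(partial['replicas'])},",
        "    " + ", ".join(fmt(partial[k]) for k in PARAM_ORDER),
        "  }",
    ])
    return decl, init
```

Emitting the parameters in a fixed order matters here because the generated code uses positional aggregate initialization, so the order has to match the field order of `MQPartial`.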

### Tracker Integration

```cpp
// Register MQ samples at init
void audio_init() {
  synth_register_mq_sample(SAMPLE_ID_KICK, &ASSET_MQ_DRUM_KICK);
  synth_register_mq_sample(SAMPLE_ID_PIANO, &ASSET_MQ_PIANO_C4);
}

// Trigger from pattern
void pattern_callback(int sample_id, float volume) {
  synth_trigger_mq(sample_id, volume);
  // Future: pitch modulation, time stretch
}
```

---

## Implementation Roadmap

### Phase 1: MQ Extraction (Web)
**Goal:** Load WAV → Extract partials → Visualize trajectories
**Deliverables:**
- `tools/mq_editor/index.html` (basic UI)
- `tools/mq_editor/mq_extract.js` (FFT + peak tracking + bezier fitting)
- `tools/mq_editor/render.js` (canvas visualization)

**Timeline:** 1-2 weeks

### Phase 2: JS Synthesizer
**Goal:** Preview synthesized audio in browser
**Deliverables:**
- `tools/mq_editor/mq_synth.js` (replica oscillator bank)
- Web Audio API integration (playback comparison)

**Timeline:** 1 week

### Phase 3: Web Editor UI
**Goal:** Full editing workflow
**Deliverables:**
- Draggable control points (canvas interaction)
- Per-partial replica sliders
- Save/load `.txt` format

**Timeline:** 1-2 weeks

### Phase 4: C++ Code Generator
**Goal:** `.txt` → generated `.cc` code
**Deliverables:**
- `tools/mq_codegen.py` (parser + C++ emitter)
- Build system integration (CMake hook)

**Timeline:** 3-5 days

### Phase 5: C++ Synthesis
**Goal:** Bake PCM at demo init
**Deliverables:**
- `src/audio/mq_synth.{h,cc}` (bezier eval, oscillator bank)
- Integration with AudioEngine/tracker

**Timeline:** 1 week

### Phase 6: Optimization
**Goal:** GPU baking, quantization, size reduction
**Deliverables:**
- Compute shader for parallel synthesis
- Quantized bezier control points (f16 or i16)
- Curve simplification algorithm

**Timeline:** 2-3 weeks (future work)

---

## Future Enhancements

### Short-Term (Post-MVP)
- **Pitch modulation:** `synth_trigger_mq(sample_id, volume, pitch_ratio)`
- **Time stretch:** Adjust bezier time domain dynamically
- **Amplitude modulation:** LFO/envelope override

### Medium-Term
- **GPU synthesis:** Compute shader for baked PCM (parallel oscillators)
- **Curve simplification:** Iterative control point reduction (error tolerance)
- **Quantization:** f32 → f16/i16 control points (roughly halves control-point storage)
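
As an illustration of the quantization item, the sketch below maps a control-point value to a signed 16-bit code against a fixed, shared range, so no per-curve scale has to be stored. The function names and the 0-16000 Hz frequency range (Nyquist at 32 kHz) are assumptions for the example.

```python
def quantize_i16(value, lo, hi):
    """Map a float in [lo, hi] to a signed 16-bit code (clamping outliers)."""
    x = min(max(value, lo), hi)
    return round((x - lo) / (hi - lo) * 65535.0) - 32768

def dequantize_i16(code, lo, hi):
    """Inverse mapping from a 16-bit code back to a float in [lo, hi]."""
    return lo + (code + 32768) / 65535.0 * (hi - lo)

# Round-trip the kick's frequency control values over 0..16000 Hz.
for f in (60.0, 58.0, 55.0, 50.0):
    code = quantize_i16(f, 0.0, 16000.0)
    print(f, code, dequantize_i16(code, 0.0, 16000.0))
```

With a linear 0-16000 Hz range the step is about 0.24 Hz, which is coarse relative to a 50-60 Hz fundamental; a logarithmic frequency mapping or a narrower range would spend the 16 bits more evenly.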

### Long-Term
- **Hybrid synthesis:** MQ partials + noise residual (stochastic component)
- **Real-time synthesis:** Per-chunk fillBuffer() instead of baked PCM
- **Segmented beziers:** Multi-segment curves for complex trajectories

---

## References

- McAulay, R. J., & Quatieri, T. F. (1986). "Speech analysis/synthesis based on a sinusoidal representation." IEEE Transactions on Acoustics, Speech, and Signal Processing, 34(4), 744-754.
- Serra, X., & Smith, J. O. (1990). "Spectral modeling synthesis: A sound analysis/synthesis system based on a deterministic plus stochastic decomposition." Computer Music Journal, 14(4), 12-24.
- De Casteljau's algorithm: https://en.wikipedia.org/wiki/De_Casteljau%27s_algorithm

---

## Status

- [x] Design document
- [x] Phase 1: MQ extraction (Web)
  - [x] FFT-based peak detection with parabolic interpolation
  - [x] Frequency-dependent trajectory tracking (5% tolerance, candidate system)
  - [x] Cubic bezier curve fitting for freq/amp trajectories
  - [x] Spectrogram visualization with zoom/scroll/playhead
  - [x] Original WAV playback
- [ ] Phase 2: JS synthesizer
- [ ] Phase 3: Web editor UI
- [ ] Phase 4: C++ code generator
- [ ] Phase 5: C++ synthesis + integration
- [ ] Phase 6: GPU optimization