| author | skal <pascal.massimino@gmail.com> | 2026-02-18 13:24:10 +0100 |
|---|---|---|
| committer | skal <pascal.massimino@gmail.com> | 2026-02-18 13:24:10 +0100 |
| commit | c3c1011cb6bf9bca28736b89049d76875a031ebe | |
| tree | 062ed40059840418c2c64bc6fc44ea8e5673467b | |
| parent | 362f862da4fb5d9c666c8ca7b0dc329d4b8d1f7e | |
feat(mq_editor): implement MQ extraction improvements
- Implement Predictive Kinematic Tracking to improve partial tracking during fast glissandos and vibrato.
- Add Peak Prominence Pruning to filter out insignificant local maxima.
- Replace heuristic Bezier fitting with a Least-Squares solver for more accurate trajectories.
- Update UI to include a Prominence parameter input.
- Archive MQ_EXTRACTION_IMPROVEMENTS.md design document.
handoff(Gemini): implemented MQ extraction improvements (kinematic tracking, prominence pruning, least-squares bezier)
| -rw-r--r-- | doc/COMPLETED.md | 12 |
| -rw-r--r-- | doc/archive/MQ_EXTRACTION_IMPROVEMENTS.md | 42 |
| -rw-r--r-- | tools/mq_editor/README.md | 7 |
| -rw-r--r-- | tools/mq_editor/index.html | 5 |
| -rw-r--r-- | tools/mq_editor/mq_extract.js | 94 |
5 files changed, 146 insertions, 14 deletions
diff --git a/doc/COMPLETED.md b/doc/COMPLETED.md
index 67724a1..3e02c40 100644
--- a/doc/COMPLETED.md
+++ b/doc/COMPLETED.md
@@ -29,6 +29,18 @@ Detailed historical documents have been moved to `doc/archive/` for reference:
 
 Use `read @doc/archive/FILENAME.md` to access archived documents.
 
+## Recently Completed (February 18, 2026)
+
+- [x] **MQ Spectral Editor Improvements**
+  - **Goal**: Improve tracking accuracy and Bezier curve fitting for sinusoidal analysis.
+  - **Implementation**:
+    - **Predictive Kinematic Tracking**: Added velocity tracking to `mq_extract.js`. Partials now predict their next frequency (`freq + velocity`) during the search phase, improving tracking for fast glissandos and vibrato.
+    - **Peak Prominence Pruning**: Added `prominence` parameter (default 1.0 dB) to filtering. Discards peaks that don't stand out sufficiently from their surrounding "valley floor," reducing noise.
+    - **Least-Squares Bezier Fitting**: Replaced heuristic 1/3-2/3 control point placement with a proper least-squares solver for cubic Bezier curves. Minimizes global error across the entire partial trajectory.
+    - **UI Update**: Wired up the "Prominence" input in `index.html` to pass the value to the extraction engine.
+    - **Documentation**: Updated `tools/mq_editor/README.md` with new parameters and algorithm details.
+  - **Files**: `tools/mq_editor/mq_extract.js`, `tools/mq_editor/index.html`, `tools/mq_editor/README.md`
+
 ## Recently Completed (February 17, 2026)
 
 - [x] **MQ Spectral Editor Phase 2: JS Synthesizer**
diff --git a/doc/archive/MQ_EXTRACTION_IMPROVEMENTS.md b/doc/archive/MQ_EXTRACTION_IMPROVEMENTS.md
new file mode 100644
index 0000000..7bc5c5d
--- /dev/null
+++ b/doc/archive/MQ_EXTRACTION_IMPROVEMENTS.md
@@ -0,0 +1,42 @@
+# MQ Extraction Improvements
+
+This document outlines three enhancements to the partial extraction algorithm (`mq_extract.js`) to improve tracking accuracy and the quality of the resulting sinusoidal model.
+
+## 1. Proposal: Predictive Kinematic Tracking
+
+- **Problem**: The original tracking algorithm assumes a partial's frequency is relatively static between frames. It can fail to track partials with significant, rapid frequency changes (e.g., fast vibrato or glissando) if the change exceeds the fixed frequency tolerance (`trackingRatio`).
+
+- **Solution**: A simple kinematic model has been added to the tracking logic.
+  - For each active partial, we now estimate its frequency "velocity" (the rate of change between the last two frames).
+  - When searching for a matching peak in the next frame, the search is centered around a *predicted* frequency: `predicted_freq = last_freq + velocity`.
+  - This allows the tracker to anticipate movement and maintain lock on partials that are undergoing rapid, continuous change.
+
+- **Implementation**:
+  - The `trackPartials` function now stores a `velocity` property on active partial objects.
+  - This velocity is updated each time a new peak is added to the partial.
+  - The core matching logic now uses the predicted frequency as its reference.
+
+## 2. Proposal: Peak Prominence Pruning
+
+- **Problem**: The original peak detection algorithm identified any local maximum within a 5-bin window. This could lead to the detection of many small, noisy, or spurious peaks that are not musically significant, creating a large number of short, irrelevant partials for the tracker to process.
+
+- **Solution**: The `detectPeaks` function has been enhanced with a "prominence" filter.
+  - A peak's prominence is its height in decibels relative to the lowest "valley" between it and the next higher peak on either side. This measures how much a peak "stands out" from the surrounding spectral landscape.
+  - A new **Prominence (dB)** parameter is available in the UI. Only peaks that exceed this prominence threshold are passed to the tracking stage.
+
+- **Implementation**:
+  - After a local maximum is found, a new algorithm searches left and right to find the lowest point (the valley floor) before encountering a bin with a higher magnitude than the peak itself.
+  - The prominence is calculated as `peak_magnitude - valley_floor_magnitude`.
+  - If this value is below the user-defined threshold, the peak is discarded. This effectively prunes insignificant peaks, cleaning the data for the tracker.
+
+## 3. Proposal: Least-Squares Bezier Fitting
+
+- **Problem**: The original `fitBezier` function used a simple heuristic. It forced the Bezier curve to pass exactly through four points (start, end, and two internal points at 1/3 and 2/3 of the duration). For noisy or complex partials, this could result in a curve that did not accurately represent the partial's overall trajectory.
+
+- **Solution**: The heuristic has been replaced with a proper **least-squares fitting algorithm**.
+  - This method calculates the cubic Bezier curve that minimizes the overall error across *all* points in the partial's trajectory.
+  - The start and end points of the curve are fixed to match the partial's birth and death, but the two intermediate control points are mathematically optimized to produce the best possible fit to the data.
+
+- **Implementation**:
+  - A new `fitBezier` function implements the normal equations to solve the 2x2 linear system for the optimal `v1` and `v2` control point values.
+  - This results in a smoother, more representative curve that is less sensitive to individual noisy points within the partial.
diff --git a/tools/mq_editor/README.md b/tools/mq_editor/README.md
index bde7e54..c1f2732 100644
--- a/tools/mq_editor/README.md
+++ b/tools/mq_editor/README.md
@@ -27,6 +27,7 @@ open tools/mq_editor/index.html
 
 - **Hop Size:** 64–1024 samples (default 256)
 - **Threshold:** dB floor for peak detection (default −60 dB)
+- **Prominence:** Min dB height of a peak above its surrounding "valley floor" (default 1.0 dB). Filters out insignificant local maxima.
 - **f·Power:** checkbox — weight spectrum by frequency (`f·FFT_Power(f)`) before peak detection, accentuating high-frequency peaks
 - **Keep %:** slider to limit how many partials are shown/synthesized
 
@@ -121,10 +122,10 @@ For a partial at 440 Hz with `spread = 0.02`: `BW ≈ 8.8 Hz`, `r ≈ exp(−π
 ## Algorithm
 
 1. **STFT:** Overlapping Hann windows, radix-2 FFT
-2. **Peak Detection:** Local maxima above threshold + parabolic interpolation; optional `f·Power(f)` frequency weighting to accentuate high-frequency peaks
-3. **Forward Tracking:** Birth/death/continuation with frequency-dependent tolerance, candidate persistence
+2. **Peak Detection:** Local maxima above threshold + parabolic interpolation. Includes **Prominence Filtering** (rejects peaks not significantly higher than surroundings). Optional `f·Power(f)` weighting.
+3. **Forward Tracking:** Birth/death/continuation with frequency-dependent tolerance. Includes **Predictive Kinematic Tracking** (uses velocity to track rapidly moving partials).
 4. **Backward Expansion:** Second pass extends each partial leftward to recover onset frames
-5. **Bezier Fitting:** Cubic curves with control points at t/3 and 2t/3
+5. **Bezier Fitting:** Cubic curves optimized via **Least-Squares** (minimizes error across all points).
 
 ## Implementation Status
 
diff --git a/tools/mq_editor/index.html b/tools/mq_editor/index.html
index c663c69..a2daff5 100644
--- a/tools/mq_editor/index.html
+++ b/tools/mq_editor/index.html
@@ -272,6 +272,9 @@
   <label>Threshold (dB):</label>
   <input type="number" id="threshold" value="-20" step="any">
 
+  <label>Prominence (dB):</label>
+  <input type="number" id="prominence" value="1.0" step="0.1" min="0">
+
   <label style="margin-left:16px;" title="Weight spectrum by frequency before peak detection (f * FFT_Power(f)), accentuates high-frequency peaks">
     <input type="checkbox" id="freqWeight"> f·Power
   </label>
@@ -446,6 +449,7 @@
 
     const hopSize = document.getElementById('hopSize');
     const threshold = document.getElementById('threshold');
+    const prominence = document.getElementById('prominence');
     const freqWeightCb = document.getElementById('freqWeight');
     const keepPct = document.getElementById('keepPct');
     const keepPctLabel = document.getElementById('keepPctLabel');
@@ -564,6 +568,7 @@
       fftSize: fftSize,
       hopSize: parseInt(hopSize.value),
       threshold: parseFloat(threshold.value),
+      prominence: parseFloat(prominence.value),
       freqWeight: freqWeightCb.checked,
       sampleRate: audioBuffer.sampleRate
     };
diff --git a/tools/mq_editor/mq_extract.js b/tools/mq_editor/mq_extract.js
index c03e869..97fbb00 100644
--- a/tools/mq_editor/mq_extract.js
+++ b/tools/mq_editor/mq_extract.js
@@ -3,14 +3,14 @@
 
 // Extract partials from audio buffer
 function extractPartials(params, stftCache) {
-  const {fftSize, threshold, sampleRate, freqWeight} = params;
+  const {fftSize, threshold, sampleRate, freqWeight, prominence} = params;
   const numFrames = stftCache.getNumFrames();
 
   const frames = [];
   for (let i = 0; i < numFrames; ++i) {
     const cachedFrame = stftCache.getFrameAtIndex(i);
     const squaredAmp = stftCache.getSquaredAmplitude(cachedFrame.time);
-    const peaks = detectPeaks(squaredAmp, fftSize, sampleRate, threshold, freqWeight);
+    const peaks = detectPeaks(squaredAmp, fftSize, sampleRate, threshold, freqWeight, prominence);
     frames.push({time: cachedFrame.time, peaks});
   }
 
@@ -30,7 +30,7 @@ function extractPartials(params, stftCache) {
 // Detect spectral peaks via local maxima + parabolic interpolation
 // squaredAmp: pre-computed re*re+im*im per bin
 // freqWeight: if true, weight by f before peak detection (f * Power(f))
-function detectPeaks(squaredAmp, fftSize, sampleRate, thresholdDB, freqWeight) {
+function detectPeaks(squaredAmp, fftSize, sampleRate, thresholdDB, freqWeight, prominenceDB = 0) {
   const mag = new Float32Array(fftSize / 2);
   const binHz = sampleRate / fftSize;
   for (let i = 0; i < fftSize / 2; ++i) {
@@ -44,6 +44,24 @@ function detectPeaks(squaredAmp, fftSize, sampleRate, thresholdDB, freqWeight) {
         mag[i] > mag[i-1] && mag[i] > mag[i-2] &&
         mag[i] > mag[i+1] && mag[i] > mag[i+2]) {
 
+      // Check prominence if requested
+      if (prominenceDB > 0) {
+        let minLeft = mag[i];
+        for (let k = i - 1; k >= 0; --k) {
+          if (mag[k] > mag[i]) break; // Found higher peak
+          if (mag[k] < minLeft) minLeft = mag[k];
+        }
+
+        let minRight = mag[i];
+        for (let k = i + 1; k < mag.length; ++k) {
+          if (mag[k] > mag[i]) break; // Found higher peak
+          if (mag[k] < minRight) minRight = mag[k];
+        }
+
+        const valley = Math.max(minLeft, minRight);
+        if (mag[i] - valley < prominenceDB) continue;
+      }
+
       // Parabolic interpolation for sub-bin accuracy
       const alpha = mag[i-1];
       const beta = mag[i];
@@ -77,12 +95,15 @@ function trackPartials(frames) {
     // Continue active partials
     for (const partial of activePartials) {
       const lastFreq = partial.freqs[partial.freqs.length - 1];
+      const velocity = partial.velocity || 0;
+      const predicted = lastFreq + velocity;
+
       const tol = Math.max(lastFreq * trackingRatio, minTrackingHz);
 
       let bestIdx = -1, bestDist = Infinity;
       for (let i = 0; i < frame.peaks.length; ++i) {
         if (matched.has(i)) continue;
-        const dist = Math.abs(frame.peaks[i].freq - lastFreq);
+        const dist = Math.abs(frame.peaks[i].freq - predicted);
         if (dist < tol && dist < bestDist) { bestDist = dist; bestIdx = i; }
       }
 
@@ -92,6 +113,7 @@ function trackPartials(frames) {
         partial.freqs.push(pk.freq);
         partial.amps.push(pk.amp);
         partial.age = 0;
+        partial.velocity = pk.freq - lastFreq;
         matched.add(bestIdx);
       } else {
         partial.age++;
@@ -102,12 +124,15 @@ function trackPartials(frames) {
     for (let i = candidates.length - 1; i >= 0; --i) {
       const cand = candidates[i];
       const lastFreq = cand.freqs[cand.freqs.length - 1];
+      const velocity = cand.velocity || 0;
+      const predicted = lastFreq + velocity;
+
       const tol = Math.max(lastFreq * trackingRatio, minTrackingHz);
 
       let bestIdx = -1, bestDist = Infinity;
       for (let j = 0; j < frame.peaks.length; ++j) {
         if (matched.has(j)) continue;
-        const dist = Math.abs(frame.peaks[j].freq - lastFreq);
+        const dist = Math.abs(frame.peaks[j].freq - predicted);
         if (dist < tol && dist < bestDist) { bestDist = dist; bestIdx = j; }
       }
 
@@ -116,6 +141,7 @@ function trackPartials(frames) {
         cand.times.push(frame.time);
         cand.freqs.push(pk.freq);
         cand.amps.push(pk.amp);
+        cand.velocity = pk.freq - lastFreq;
         matched.add(bestIdx);
         if (cand.times.length >= birthPersistence) {
           activePartials.push(cand);
@@ -130,7 +156,13 @@ function trackPartials(frames) {
     for (let i = 0; i < frame.peaks.length; ++i) {
       if (matched.has(i)) continue;
       const pk = frame.peaks[i];
-      candidates.push({times: [frame.time], freqs: [pk.freq], amps: [pk.amp], age: 0});
+      candidates.push({
+        times: [frame.time],
+        freqs: [pk.freq],
+        amps: [pk.amp],
+        age: 0,
+        velocity: 0
+      });
     }
 
     // Kill aged-out partials
@@ -251,19 +283,59 @@ function autodetectSpread(partial, stftCache, fftSize, sampleRate) {
   return {spread_above, spread_below};
 }
 
-// Fit cubic bezier to trajectory using samples at ~1/3 and ~2/3 as control points
+// Fit cubic bezier to trajectory using least-squares for inner control points
 function fitBezier(times, values) {
   const n = times.length - 1;
   const t0 = times[0], v0 = values[0];
   const t3 = times[n], v3 = values[n];
   const dt = t3 - t0;
 
-  if (dt <= 0 || n === 0) {
-    return {t0, v0, t1: t0, v1: v0, t2: t3, v2: v3, t3, v3};
+  if (dt <= 1e-9 || n < 2) {
+    // Linear fallback for too few points or zero duration
+    return {t0, v0, t1: t0 + dt / 3, v1: v0 + (v3 - v0) / 3, t2: t0 + 2 * dt / 3, v2: v0 + 2 * (v3 - v0) / 3, t3, v3};
   }
 
-  const v1 = values[Math.round(n / 3)];
-  const v2 = values[Math.round(2 * n / 3)];
+  // Least squares solve for v1, v2
+  // Bezier: B(u) = (1-u)^3*v0 + 3(1-u)^2*u*v1 + 3(1-u)*u^2*v2 + u^3*v3
+  // Target_i = val_i - (1-u)^3*v0 - u^3*v3
+  // Model_i = A_i*v1 + B_i*v2
+  // A_i = 3(1-u)^2*u
+  // B_i = 3(1-u)*u^2
+
+  let sA2 = 0, sB2 = 0, sAB = 0, sAT = 0, sBT = 0;
+
+  for (let i = 0; i <= n; ++i) {
+    const u = (times[i] - t0) / dt;
+    const u2 = u * u;
+    const u3 = u2 * u;
+    const invU = 1.0 - u;
+    const invU2 = invU * invU;
+    const invU3 = invU2 * invU;
+
+    const A = 3 * invU2 * u;
+    const B = 3 * invU * u2;
+    const target = values[i] - (invU3 * v0 + u3 * v3);
+
+    sA2 += A * A;
+    sB2 += B * B;
+    sAB += A * B;
+    sAT += A * target;
+    sBT += B * target;
+  }
+
+  const det = sA2 * sB2 - sAB * sAB;
+  let v1, v2;
+
+  if (Math.abs(det) < 1e-9) {
+    // Fallback to simple 1/3, 2/3 heuristic if matrix is singular
+    const idx1 = Math.round(n / 3);
+    const idx2 = Math.round(2 * n / 3);
+    v1 = values[idx1];
+    v2 = values[idx2];
+  } else {
+    v1 = (sB2 * sAT - sAB * sBT) / det;
+    v2 = (sA2 * sBT - sAB * sAT) / det;
+  }
 
   return {t0, v0, t1: t0 + dt / 3, v1, t2: t0 + 2 * dt / 3, v2, t3, v3};
 }
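To illustrate why the `trackPartials` hunks above match against `lastFreq + velocity` rather than `lastFreq`, here is a minimal standalone sketch. It is not code from `mq_extract.js`: `matchPeak` and `followPartial` are hypothetical helper names, the tolerance is a fixed number instead of `Math.max(lastFreq * trackingRatio, minTrackingHz)`, and the track is dropped at the first miss instead of being aged out.

```javascript
// Hypothetical helper: index of the peak closest to `reference` within `tol` Hz, or -1.
function matchPeak(peakFreqs, reference, tol) {
  let bestIdx = -1, bestDist = Infinity;
  for (let i = 0; i < peakFreqs.length; ++i) {
    const dist = Math.abs(peakFreqs[i] - reference);
    if (dist < tol && dist < bestDist) { bestDist = dist; bestIdx = i; }
  }
  return bestIdx;
}

// Hypothetical helper: follow one partial through `frames` (arrays of peak
// frequencies in Hz) and return how many frames were matched before losing it.
function followPartial(frames, tol, usePrediction) {
  let lastFreq = frames[0][0];
  let velocity = 0; // Hz per frame, same convention as the diff
  let matchedCount = 1;
  for (let f = 1; f < frames.length; ++f) {
    const reference = usePrediction ? lastFreq + velocity : lastFreq;
    const idx = matchPeak(frames[f], reference, tol);
    if (idx < 0) break; // simplification: no aging, the track simply ends
    velocity = frames[f][idx] - lastFreq;
    lastFreq = frames[f][idx];
    ++matchedCount;
  }
  return matchedCount;
}

// Made-up glissando that speeds up from 15 Hz/frame to 30 Hz/frame.
const glissando = [[1000], [1015], [1045], [1075], [1105]];

console.log(followPartial(glissando, 20, false)); // 2: the 30 Hz jump exceeds the 20 Hz tolerance
console.log(followPartial(glissando, 20, true));  // 5: prediction keeps the residual within tolerance
```

With prediction, the tolerance only has to cover the change in velocity between frames, not the full frequency step, which is the effect the commit message describes for fast glissandos and vibrato.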

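As a sanity check on the normal equations used in the new `fitBezier`, the sketch below samples a known cubic Bezier and confirms that the same 2x2 solve recovers its inner control values. `solveInnerControlValues` and `bezierValue` are hypothetical names introduced for this example, and the sample data is made up; the fallbacks from the diff (too few points, near-singular determinant) are deliberately left out. Because the control times sit at `t0 + dt/3` and `t0 + 2*dt/3`, the time coordinate of the curve is linear in `u`, so `u = (t - t0) / dt` is exact for this parameterization.

```javascript
// Hypothetical helper: evaluate the value coordinate of a cubic Bezier at u in [0, 1].
function bezierValue(u, v0, v1, v2, v3) {
  const w = 1 - u;
  return w * w * w * v0 + 3 * w * w * u * v1 + 3 * w * u * u * v2 + u * u * u * v3;
}

// Hypothetical helper: least-squares solve for v1, v2 with the endpoints pinned
// to the first and last samples, mirroring the sums accumulated in the diff.
function solveInnerControlValues(times, values) {
  const n = times.length - 1;
  const t0 = times[0], v0 = values[0];
  const dt = times[n] - t0, v3 = values[n];

  let sA2 = 0, sB2 = 0, sAB = 0, sAT = 0, sBT = 0;
  for (let i = 0; i <= n; ++i) {
    const u = (times[i] - t0) / dt;
    const w = 1 - u;
    const A = 3 * w * w * u;   // basis weight of v1
    const B = 3 * w * u * u;   // basis weight of v2
    const target = values[i] - (w * w * w * v0 + u * u * u * v3);
    sA2 += A * A;
    sB2 += B * B;
    sAB += A * B;
    sAT += A * target;
    sBT += B * target;
  }

  // Normal equations: [sA2 sAB; sAB sB2] * [v1; v2] = [sAT; sBT], solved by Cramer's rule.
  const det = sA2 * sB2 - sAB * sAB;
  return {
    v1: (sB2 * sAT - sAB * sBT) / det,
    v2: (sA2 * sBT - sAB * sAT) / det
  };
}

// Sample a curve with known control values (0, 3, -1, 2) at 11 evenly spaced times.
const times = [], values = [];
for (let i = 0; i <= 10; ++i) {
  const u = i / 10;
  times.push(2.0 + 0.5 * u);
  values.push(bezierValue(u, 0, 3, -1, 2));
}

const fit = solveInnerControlValues(times, values);
console.log(fit); // v1 close to 3 and v2 close to -1, up to floating-point rounding
```

On noise-free samples the fit reproduces the generating curve; on noisy trajectories the same solve spreads the error over all samples instead of trusting the two samples at 1/3 and 2/3, which is the motivation given in the archived design document.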