diff options
| author | skal <pascal.massimino@gmail.com> | 2026-03-05 21:50:53 +0100 |
|---|---|---|
| committer | skal <pascal.massimino@gmail.com> | 2026-03-05 21:50:53 +0100 |
| commit | 2f8926f433248af28081497e8371e02abe61d6ff (patch) | |
| tree | 30e480325e2b7f01947a5ca2f8b3865e600d8bb7 /src/audio/fft.cc | |
| parent | e2c3c3e95b6a9e53b4631b271640bb9914f8c95e (diff) | |
feat(spectool): add --wav decode, IMDCT, and roundtrip test
- spectool --wav <input.spec> <output.wav>: decodes .spec to mono 16-bit
WAV at 32 kHz using IDCT-OLA synthesis (no synthesis window).
The analysis Hann window at 50% overlap satisfies w[n]+w[n+H]=1,
so the synthesis window must be rectangular for perfect reconstruction.
- Add imdct_512 / imdct_fft to audio lib (fft.cc, fft.h, idct.cc, dct.h)
for future MDCT-based synthesis.
- test_wav_roundtrip: in-process OLA analyze+decode SNR test (≥30 dB).
Currently measures 53 dB on a 440 Hz sine.
- Fix stale test_spectool.cc: version assertion updated from 1 to
SPEC_VERSION_V2_OLA (was always wrong since OLA fix landed).
- Docs: TOOLS_REFERENCE.md removes dead specview, documents --wav /
--normalize / test_gen. HOWTO.md adds decode section. TRACKER.md
notes spec v2 OLA format and decode command.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Diffstat (limited to 'src/audio/fft.cc')
| -rw-r--r-- | src/audio/fft.cc | 43 |
1 files changed, 43 insertions, 0 deletions
diff --git a/src/audio/fft.cc b/src/audio/fft.cc index 64d7b1a..6029454 100644 --- a/src/audio/fft.cc +++ b/src/audio/fft.cc @@ -140,6 +140,49 @@ void dct_fft(const float* input, float* output, size_t N) { delete[] imag; } +// IMDCT via FFT +// Produces 2N time-domain samples from N MDCT coefficients. +// Formula: x[n] = (2/N) * sum_{k=0}^{N-1} X[k] * cos(pi*(2n+1+N)*(2k+1)/(2N)) +// When windowed (Hann, length 2N) and OLA'd with hop N, gives perfect reconstruction. +void imdct_fft(const float* input, float* output, size_t N) { + const float PI = 3.14159265358979323846f; + const size_t M = 2 * N; // output length + + float* real = new float[M]; + float* imag = new float[M]; + + // Pre-multiply X[k] by exp(-j*pi*(2k+1)/(4N)), build 2N complex FFT input + // via standard IMDCT-via-FFT algorithm (N-point complex FFT) + for (size_t k = 0; k < N; k++) { + const float angle = -PI * (2.0f * k + 1.0f) / (4.0f * N); + real[k] = input[k] * cosf(angle); + imag[k] = input[k] * sinf(angle); + } + for (size_t k = N; k < M; k++) { + real[k] = 0.0f; + imag[k] = 0.0f; + } + + // Inverse FFT of length 2N + bit_reverse_permute(real, imag, M); + fft_radix2(real, imag, M, -1); + const float scale = 1.0f / (float)M; + for (size_t i = 0; i < M; i++) { + real[i] *= scale; + imag[i] *= scale; + } + + // Post-multiply by 2N * exp(-j*pi*(2n+1)/(4N)) and take real part, scale by 2/N + const float gain = 2.0f; + for (size_t n = 0; n < M; n++) { + const float angle = -PI * (2.0f * n + 1.0f) / (4.0f * N); + output[n] = gain * (real[n] * cosf(angle) - imag[n] * sinf(angle)); + } + + delete[] real; + delete[] imag; +} + // IDCT (DCT-III) via FFT - inverse of the DCT-II reordering method // Reference: Numerical Recipes Chapter 12.3, Press et al. void idct_fft(const float* input, float* output, size_t N) { |
