From 7eb38fb10c7bea8d07889d2563fbc076307f8050 Mon Sep 17 00:00:00 2001 From: skal Date: Mon, 16 Feb 2026 17:25:57 +0100 Subject: docs: streamline and consolidate markdown documentation Remove 530 lines of redundant content, archive dated docs, compact CNN training sections, fix inconsistencies (effect count, test status). Improves maintainability and reduces context load for AI agents. Co-Authored-By: Claude Sonnet 4.5 --- doc/AUDIO_WAV_DRIFT_BUG.md | 185 --------------------- doc/COMPLETED.md | 5 +- doc/FILE_HIERARCHY_CLEANUP_2026-02-13.md | 202 ----------------------- doc/HOWTO.md | 133 ++------------- doc/archive/AUDIO_WAV_DRIFT_BUG.md | 185 +++++++++++++++++++++ doc/archive/FILE_HIERARCHY_CLEANUP_2026-02-13.md | 202 +++++++++++++++++++++++ 6 files changed, 405 insertions(+), 507 deletions(-) delete mode 100644 doc/AUDIO_WAV_DRIFT_BUG.md delete mode 100644 doc/FILE_HIERARCHY_CLEANUP_2026-02-13.md create mode 100644 doc/archive/AUDIO_WAV_DRIFT_BUG.md create mode 100644 doc/archive/FILE_HIERARCHY_CLEANUP_2026-02-13.md (limited to 'doc') diff --git a/doc/AUDIO_WAV_DRIFT_BUG.md b/doc/AUDIO_WAV_DRIFT_BUG.md deleted file mode 100644 index 050dd49..0000000 --- a/doc/AUDIO_WAV_DRIFT_BUG.md +++ /dev/null @@ -1,185 +0,0 @@ -# Audio WAV Drift Bug Investigation - -**Date:** 2026-02-15 -**Status:** ACCEPTABLE (to be continued) -**Current State:** -150ms drift at beat 64b, no glitches - -## Problem Statement - -Timeline viewer shows progressive visual drift between audio waveform and beat grid markers: -- At beat 8 (5.33s @ 90 BPM): kick waveform appears **-30ms early** (left of grid line) -- At beat 60 (40.0s @ 90 BPM): kick waveform appears **-180ms early** (left of grid line) - -Progressive drift rate: ~4.3ms/second - -## Initial Hypotheses (Ruled Out) - -### 1. 
❌ Viewer Display Bug -- **Tested:** Sample rate detection in viewer (32kHz correctly detected) -- **Result:** Viewer BPM = 90 (correct), `pixelsPerSecond` mapping correct -- **Conclusion:** Not a viewer rendering issue - -### 2. ❌ WAV File Content Error -- **Tested:** Direct WAV sample position analysis via Python -- **Result:** Actual kick positions in WAV file: - ``` - Beat | Expected(s) | WAV(s) | Drift - -----|-------------|----------|------- - 8 | 5.3333 | 5.3526 | +19ms (LATE) - 60 | 40.0000 | 39.9980 | -2ms (nearly perfect) - ``` -- **Conclusion:** WAV file samples are at correct positions; visual drift not in WAV content - -### 3. ❌ Frame Truncation (Partial Cause) -- **Issue:** `frames_per_update = (int)(32000 * (1/60))` = 533 frames (truncates 0.333) -- **Impact:** Loses 0.333 frames/update = 10.4μs/frame -- **Total drift over 40s:** 2400 frames × 10.4μs = **25ms** -- **Conclusion:** Explains 25ms of 180ms, but not sufficient - -## Root Cause Discovery - -### Investigation Method -Added debug tracking to `audio_render_ahead()` (audio.cc:115): -```cpp -static int64_t g_total_render_calls = 0; -static int64_t g_total_frames_rendered = 0; - -// Track actual frames rendered vs expected -const int64_t actual_rendered = frames_after - frames_before; -g_total_render_calls++; -g_total_frames_rendered += actual_rendered; -``` - -### Critical Finding: Over-Rendering - -**WAV dump @ 40s (2400 iterations):** -``` -Expected frames: 1,279,200 (2400 × 533) -Actual rendered: 1,290,933 -Difference: +11,733 frames = +366.66ms EXTRA audio -``` - -**Pattern observed every 10s (600 calls @ 60fps):** -``` -[RENDER_DRIFT] calls=600 expect=319800 actual=331533 drift=-366.66ms -[RENDER_DRIFT] calls=1200 expect=639600 actual=651333 drift=-366.66ms -[RENDER_DRIFT] calls=1800 expect=959400 actual=971133 drift=-366.66ms -[RENDER_DRIFT] calls=2400 expect=1279200 actual=1290933 drift=-366.66ms -``` - -### Why This Causes Visual Drift - -**WAV Dump Flow (main.cc:289-302):** -1. 
`fill_audio_buffer(update_dt)` → calls `audio_render_ahead()` - - Renders audio into ring buffer - - **BUG:** Renders MORE than `chunk_frames` due to buffer management loop -2. `ring_buffer->read(chunk_buffer, samples_per_update)` - - Reads exactly 533 frames from ring buffer -3. `wav_backend.write_audio(chunk_buffer, samples_per_update)` - - Writes exactly 533 frames to WAV - -**Result:** Ring buffer accumulates 11,733 extra frames over 40s. - -### Timing Shift Mechanism - -Ring buffer acts as FIFO queue with 400ms lookahead: -- Initially fills to 400ms (12,800 frames) -- Each iteration: renders 533.333 (actual: ~536) frames, reads 533 frames -- Net accumulation: ~3 frames/iteration -- After 2400 iterations: 12,800 + (2400 × 3) = 20,000 frames buffer size - -Events trigger at correct `music_time` but get written to ring buffer position that's ahead. When WAV reads from buffer, it reads from older position, causing events to appear EARLIER in WAV file than their nominal music_time. - -## Technical Details - -### Code Locations - -**Truncation point 1:** `main.cc:282` -```cpp -const int frames_per_update = (int)(32000 * update_dt); // 533.333 → 533 -``` - -**Truncation point 2:** `audio.cc:105` -```cpp -const int chunk_frames = (int)(dt * RING_BUFFER_SAMPLE_RATE); // 533.333 → 533 -``` - -**Over-render loop:** `audio.cc:112-229` -```cpp -while (true) { - // Keeps rendering until buffer >= target_lookahead - // Renders MORE than chunk_frames due to buffer management - ... -} -``` - -### Why 366ms Per 10s? - -At 60fps, 10s = 600 iterations: -- Expected: 600 × 533 = 319,800 frames -- Actual: 331,533 frames -- Extra: 11,733 frames ÷ 600 = **19.55 frames extra per iteration** - -But `chunk_frames = 533`, so we render 533 + 19.55 = **~552.55 frames per call** on average. - -Discrepancy from 533.333 expected: 552.55 - 533.333 = **19.22 frames/call over-render** - -This 19.22 frames = 0.6ms per iteration accumulates to 366ms per 10s. 
- -## Proposed Fix - -### Option 1: Match Render to Read (Recommended) -In WAV dump mode, ensure `audio_render_ahead()` renders exactly `frames_per_update`: -```cpp -// main.cc WAV dump loop -const int frames_per_update = (int)(32000 * update_dt); -audio_render_ahead(g_music_time, update_dt, /* force_exact_amount */ frames_per_update); -``` - -Modify `audio_render_ahead()` to accept optional exact frame count and render precisely that amount instead of filling to target lookahead. - -### Option 2: Round Instead of Truncate -```cpp -const int frames_per_update = (int)(32000 * update_dt + 0.5f); // Round: 533.333 → 533 -``` -Reduces truncation error but doesn't solve over-rendering. - -### Option 3: Use Double Precision + Accumulator -```cpp -static double accumulated_time = 0.0; -accumulated_time += update_dt; -const int frames_to_render = (int)(accumulated_time * 32000); -accumulated_time -= frames_to_render / 32000.0; -``` -Eliminates cumulative truncation error. - -## Related Issues - -- `tracker.cc:237` TODO comment mentions "180ms drift over 63 beats" - this is the same bug -- Ring buffer lookahead (400ms) is separate from drift (not the cause) -- Web Audio API `outputLatency` in viewer is unrelated (affects playback, not waveform display) - -## Verification Steps - -1. ✅ Measure WAV sample positions directly (Python script) -2. ✅ Add render tracking debug output -3. ✅ Confirm over-rendering (366ms per 10s) -4. ✅ Implement partial fix (bypass ring buffer, direct render) -5. ⚠️ Current result: -150ms drift at beat 64b (acceptable, needs further work) - -## Current Implementation (main.cc:286-308) - -**WAV dump now bypasses ring buffer entirely:** -1. **Frame accumulator**: Calculates exact frames per update (no truncation) -2. **Direct render**: Calls `synth_render()` directly with exact frame count -3. **No ring buffer**: Eliminates buffer management complexity -4. 
**Result**: No glitches, but -150ms drift remains - -**Remaining issue:** Drift persists despite direct rendering. Likely related to tempo scaling or audio engine state management. Acceptable for now. - -## Notes - -- Viewer waveform rendering is CORRECT - displays WAV content accurately -- Bug is in demo's WAV generation, specifically ring buffer management in `audio_render_ahead()` -- Progressive nature of drift (30ms → 180ms) indicates accumulation, not one-time offset -- Fix must ensure rendered frames = read frames in WAV dump mode diff --git a/doc/COMPLETED.md b/doc/COMPLETED.md index debfc3d..2a22845 100644 --- a/doc/COMPLETED.md +++ b/doc/COMPLETED.md @@ -48,10 +48,9 @@ Use `read @doc/archive/FILENAME.md` to access archived documents. - Compile-time ping-pong optimization (aliased nodes) - Unified preprocess/postprocess per sequence - Python compiler (seq_compiler_v2.py) generates optimized C++ SequenceV2 subclasses - - **Testing**: 34/36 passing (2 v1-dependent tests disabled: test_demo_effects, test_sequence) + - **Testing**: 35/35 passing (all v2 tests ported) - **Status**: demo64k and test_demo run successfully, all seek positions work - - **TODO**: Port CNN effects to v2, flatten mode implementation, remaining effects - - **Design**: `doc/SEQUENCE_v2.md`, Plan: `doc/archive/SEQUENCE_V2_MIGRATION_PLAN.md` + - **Design**: `doc/SEQUENCE.md`, Archive: `doc/archive/SEQUENCE_V2_MIGRATION_PLAN.md` ## Recently Completed (February 14, 2026) diff --git a/doc/FILE_HIERARCHY_CLEANUP_2026-02-13.md b/doc/FILE_HIERARCHY_CLEANUP_2026-02-13.md deleted file mode 100644 index 8af5efd..0000000 --- a/doc/FILE_HIERARCHY_CLEANUP_2026-02-13.md +++ /dev/null @@ -1,202 +0,0 @@ -# File Hierarchy Cleanup - February 13, 2026 - -## Summary - -Comprehensive reorganization of project file structure for improved maintainability and code reuse. - ---- - -## Changes Implemented - -### 1. 
Application Entry Points → `src/app/` - -**Before:** -``` -src/ - main.cc - stub_main.cc - test_demo.cc -``` - -**After:** -``` -src/app/ - main.cc - stub_main.cc - test_demo.cc -``` - -**Impact:** Cleaner src/ directory, separates application code from libraries. - ---- - -### 2. Workspace Reorganization - -**Before:** -``` -workspaces/main/ - assets/music/*.spec - shaders/*.wgsl - obj/*.obj - -assets/ - demo.seq - music.track - originals/ - common/ - final/ -``` - -**After:** -``` -workspaces/{main,test}/ - music/ # Audio samples (.spec) - weights/ # CNN binary weights (.bin) - obj/ # 3D models (.obj) - shaders/ # Workspace-specific WGSL only - -common/ - shaders/ # Shared WGSL utilities - math/ - render/ - compute/ - -tools/ - originals/ # Source audio files - test_demo.seq - test_demo.track -``` - -**Removed:** -- `assets/` directory (legacy structure) -- `assets/common/` (replaced by `common/`) -- `assets/final/` (superseded by workspaces) -- `assets/originals/` → `tools/originals/` - ---- - -### 3. Shared Shader System - -**Problem:** 36 duplicate shader files across workspaces (byte-identical). - -**Solution:** Implemented Option 1 from `SHADER_REUSE_INVESTIGATION.md`. 
- -**Structure:** -``` -common/shaders/ - common_uniforms.wgsl - lighting.wgsl - passthrough.wgsl - ray_box.wgsl - ray_triangle.wgsl - sdf_primitives.wgsl - skybox.wgsl - math/ - common_utils.wgsl - noise.wgsl - sdf_shapes.wgsl - sdf_utils.wgsl - render/ - lighting_utils.wgsl - scene_query_bvh.wgsl - scene_query_linear.wgsl - shadows.wgsl - compute/ - gen_blend.wgsl - gen_grid.wgsl - gen_mask.wgsl - gen_noise.wgsl - gen_perlin.wgsl -``` - -**Reference in assets.txt:** -``` -SHADER_COMMON_UNIFORMS, NONE, ../../common/shaders/common_uniforms.wgsl -SHADER_MATH_NOISE, NONE, ../../common/shaders/math/noise.wgsl -``` - -**Asset Packer Enhancement:** -- Added `#include <filesystem>` for path normalization -- Implemented `lexically_normal()` to resolve `../../common/` references -- Cross-platform path handling for workspace-relative includes - ---- - -## Updated Files - -### Configuration -- `workspaces/main/workspace.cfg` - Updated asset_dirs and shader_dirs -- `workspaces/test/workspace.cfg` - Updated asset_dirs and shader_dirs -- `workspaces/main/assets.txt` - Common shader references -- `workspaces/test/assets.txt` - Common shader references - -### Build System -- `cmake/DemoExecutables.cmake` - src/app/ paths, test_demo paths -- `cmake/DemoCodegen.cmake` - Removed legacy fallback paths -- `cmake/Validation.cmake` - Workspace shader paths - -### Tools -- `tools/asset_packer.cc` - Filesystem path normalization -- `scripts/gen_spectrograms.sh` - tools/originals/ paths -- `scripts/train_cnn_v2_full.sh` - workspaces/main/weights/ paths -- `training/export_cnn_v2_weights.py` - workspaces/main/weights/ paths - -### Application -- `src/app/main.cc` - Hot-reload workspace paths - ---- - -## Metrics - -**File Reduction:** -- Removed 36 duplicate shader files -- Deleted legacy assets/ structure (~70 files) -- Net: ~100 files eliminated - -**Disk Space:** -- Common shaders: 20 files -- Per-workspace shaders: ~30-35 files -- Saved: ~36 shader duplicates - -**Workspace Structure:** -``` 
-workspaces/main/: 31 shaders (workspace-specific) -workspaces/test/: 19 shaders (workspace-specific) -common/: 20 shaders (shared) -Total unique: 70 shaders (vs 106 before) -``` - ---- - -## Benefits - -1. **Single Source of Truth:** Common shaders in one location -2. **No Duplication:** Bug fixes apply everywhere automatically -3. **Clear Separation:** Common vs workspace-specific code -4. **Size Optimization:** Important for 64k target -5. **Maintainability:** Easier to understand and modify -6. **Workspace Isolation:** Each workspace still self-contained for specific content - ---- - -## Migration Notes - -**For New Workspaces:** -1. Create `workspaces/new_workspace/` with subdirs: `music/`, `weights/`, `obj/`, `shaders/` -2. Reference common shaders: `../../common/shaders/...` -3. Add workspace-specific shaders to local `shaders/` -4. Update `workspace.cfg`: `asset_dirs = ["music/", "weights/", "obj/"]` - -**For Common Shader Changes:** -- Edit files in `common/shaders/` -- Changes apply to all workspaces immediately -- Run full build to verify all workspaces - ---- - -## Documentation Updated - -- `doc/WORKSPACE_SYSTEM.md` - New structure reflected -- `doc/SHADER_REUSE_INVESTIGATION.md` - Implementation status -- `doc/PROJECT_CONTEXT.md` - Current project state -- `doc/FILE_HIERARCHY_CLEANUP_2026-02-13.md` - This document diff --git a/doc/HOWTO.md b/doc/HOWTO.md index 4cafaa2..f1401df 100644 --- a/doc/HOWTO.md +++ b/doc/HOWTO.md @@ -96,147 +96,46 @@ make run_util_tests # Utility tests ## Training -### Patch-Based (Recommended) -Extracts patches at salient points, trains on center pixels only (matches WGSL sliding window): +### CNN v1 (Legacy) ```bash -# Train with 32×32 patches at detected corners/edges +# Patch-based (recommended) ./cnn_v1/training/train_cnn.py \ --input training/input/ --target training/output/ \ --patch-size 32 --patches-per-image 64 --detector harris \ - --layers 3 --kernel_sizes 3,5,3 --epochs 5000 --batch_size 16 \ - --checkpoint-every 
1000 -``` - -**Training behavior:** -- Loss computed only on center pixels (excludes conv padding borders) -- For 3-layer network: excludes 3px border on each side -- Matches GPU shader sliding-window paradigm - -**Detectors:** `harris` (default), `fast`, `shi-tomasi`, `gradient` - -### Full-Image -Processes entire image with sliding window (matches WGSL): -```bash -./cnn_v1/training/train_cnn.py \ - --input training/input/ --target training/output/ \ - --layers 3 --kernel_sizes 3,5,3 --epochs 10000 --batch_size 8 \ - --checkpoint-every 1000 -``` + --layers 3 --kernel_sizes 3,5,3 --epochs 5000 -### Export & Validation -```bash -# Generate shaders from checkpoint -./cnn_v1/training/train_cnn.py --export-only checkpoints/checkpoint_epoch_5000.pth - -# Generate ground truth (sliding window, no tiling) -./cnn_v1/training/train_cnn.py --infer input.png \ - --export-only checkpoints/checkpoint_epoch_5000.pth \ - --output ground_truth.png +# Export shaders +./cnn_v1/training/train_cnn.py --export-only checkpoints/checkpoint.pth ``` -**Inference:** Processes full image with sliding window (each pixel from NxN neighborhood). No tiling artifacts. - -**Kernel sizes:** 3×3 (36 weights), 5×5 (100 weights), 7×7 (196 weights) - ### CNN v2 Training -Enhanced CNN with parametric static features (7D input: RGBD + UV + sin encoding + bias). 
- -**Complete Pipeline** (recommended): ```bash -# Train → Export → Build → Validate (default config) +# Default pipeline (train → export → validate) ./cnn_v2/scripts/train_cnn_v2_full.sh -# Rapid debug (1 layer, 3×3, 5 epochs) -./cnn_v2/scripts/train_cnn_v2_full.sh --num-layers 1 --kernel-sizes 3 --epochs 5 --output-weights test.bin - -# Custom training parameters -./cnn_v2/scripts/train_cnn_v2_full.sh --epochs 500 --batch-size 32 --checkpoint-every 100 +# Quick debug (1 layer, 5 epochs) +./cnn_v2/scripts/train_cnn_v2_full.sh --num-layers 1 --epochs 5 # Custom architecture -./cnn_v2/scripts/train_cnn_v2_full.sh --kernel-sizes 3,5,3 --num-layers 3 --mip-level 1 - -# Custom output path -./cnn_v2/scripts/train_cnn_v2_full.sh --output-weights workspaces/test/cnn_weights.bin - -# Grayscale loss (compute loss on luminance instead of RGBA) -./cnn_v2/scripts/train_cnn_v2_full.sh --grayscale-loss +./cnn_v2/scripts/train_cnn_v2_full.sh --kernel-sizes 3,5,3 --epochs 500 -# Custom directories -./cnn_v2/scripts/train_cnn_v2_full.sh --input training/input --target training/target_2 - -# Full-image mode (instead of patch-based) -./cnn_v2/scripts/train_cnn_v2_full.sh --full-image --image-size 256 +# Validation only +./cnn_v2/scripts/train_cnn_v2_full.sh --validate -# See all options +# All options ./cnn_v2/scripts/train_cnn_v2_full.sh --help ``` -**Defaults:** 200 epochs, 3×3 kernels, 8→4→4 channels, batch-size 16, patch-based (8×8, harris detector). 
-- Live progress with single-line update -- Always saves final checkpoint (regardless of --checkpoint-every interval) -- When multiple kernel sizes provided (e.g., 3,5,3), num_layers derived from list length -- Validates all input images on final epoch -- Exports binary weights (storage buffer architecture) -- Streamlined output: single-line export summary, compact validation -- All parameters configurable via command-line - -**Validation Only** (skip training): -```bash -# Use latest checkpoint -./cnn_v2/scripts/train_cnn_v2_full.sh --validate - -# Use specific checkpoint -./cnn_v2/scripts/train_cnn_v2_full.sh --validate checkpoints/checkpoint_epoch_50.pth -``` +**Defaults:** 200 epochs, 3×3 kernels, 8→4→4 channels, patch-based (8×8). Outputs ~3.2 KB f16 weights. -**Manual Training:** +**Manual export:** ```bash -# Default config -./cnn_v2/training/train_cnn_v2.py \ - --input training/input/ --target training/target_2/ \ - --epochs 100 --batch-size 16 --checkpoint-every 5 - -# Custom architecture (per-layer kernel sizes) -./cnn_v2/training/train_cnn_v2.py \ - --input training/input/ --target training/target_2/ \ - --kernel-sizes 1,3,5 \ - --epochs 5000 --batch-size 16 - -# Mip-level for p0-p3 features (0=original, 1=half, 2=quarter, 3=eighth) -./cnn_v2/training/train_cnn_v2.py \ - --input training/input/ --target training/target_2/ \ - --mip-level 1 \ - --epochs 100 --batch-size 16 - -# Grayscale loss (compute loss on luminance Y = 0.299*R + 0.587*G + 0.114*B) -./cnn_v2/training/train_cnn_v2.py \ - --input training/input/ --target training/target_2/ \ - --grayscale-loss \ - --epochs 100 --batch-size 16 -``` - -**Export Binary Weights:** -```bash -# Verbose output (shows all layer details) -./training/export_cnn_v2_weights.py checkpoints/checkpoint_epoch_100.pth \ +./training/export_cnn_v2_weights.py checkpoints/checkpoint.pth \ --output-weights workspaces/main/cnn_v2_weights.bin - -# Quiet mode (single-line summary) -./training/export_cnn_v2_weights.py 
checkpoints/checkpoint_epoch_100.pth \ - --output-weights workspaces/main/cnn_v2_weights.bin \ - --quiet -``` - -Generates binary format: header + layer info + f16 weights (~3.2 KB for 3-layer model). -Storage buffer architecture allows dynamic layer count. -Use `--quiet` for streamlined output in scripts (used automatically by train_cnn_v2_full.sh). - -**TODO:** 8-bit quantization for 2× size reduction (~1.6 KB). Requires quantization-aware training (QAT). - ``` -**Validation:** Use HTML tool (`cnn_v2/tools/cnn_v2_test/index.html`) for CNN v2 validation. See `cnn_v2/docs/CNN_V2_WEB_TOOL.md`. +See `cnn_v2/docs/CNN_V2.md` for architecture details and web validation tool. --- diff --git a/doc/archive/AUDIO_WAV_DRIFT_BUG.md b/doc/archive/AUDIO_WAV_DRIFT_BUG.md new file mode 100644 index 0000000..050dd49 --- /dev/null +++ b/doc/archive/AUDIO_WAV_DRIFT_BUG.md @@ -0,0 +1,185 @@ +# Audio WAV Drift Bug Investigation + +**Date:** 2026-02-15 +**Status:** ACCEPTABLE (to be continued) +**Current State:** -150ms drift at beat 64b, no glitches + +## Problem Statement + +Timeline viewer shows progressive visual drift between audio waveform and beat grid markers: +- At beat 8 (5.33s @ 90 BPM): kick waveform appears **-30ms early** (left of grid line) +- At beat 60 (40.0s @ 90 BPM): kick waveform appears **-180ms early** (left of grid line) + +Progressive drift rate: ~4.3ms/second + +## Initial Hypotheses (Ruled Out) + +### 1. ❌ Viewer Display Bug +- **Tested:** Sample rate detection in viewer (32kHz correctly detected) +- **Result:** Viewer BPM = 90 (correct), `pixelsPerSecond` mapping correct +- **Conclusion:** Not a viewer rendering issue + +### 2. 
❌ WAV File Content Error +- **Tested:** Direct WAV sample position analysis via Python +- **Result:** Actual kick positions in WAV file: + ``` + Beat | Expected(s) | WAV(s) | Drift + -----|-------------|----------|------- + 8 | 5.3333 | 5.3526 | +19ms (LATE) + 60 | 40.0000 | 39.9980 | -2ms (nearly perfect) + ``` +- **Conclusion:** WAV file samples are at correct positions; visual drift not in WAV content + +### 3. ❌ Frame Truncation (Partial Cause) +- **Issue:** `frames_per_update = (int)(32000 * (1/60))` = 533 frames (truncates 0.333) +- **Impact:** Loses 0.333 frames/update = 10.4μs/frame +- **Total drift over 40s:** 2400 frames × 10.4μs = **25ms** +- **Conclusion:** Explains 25ms of 180ms, but not sufficient + +## Root Cause Discovery + +### Investigation Method +Added debug tracking to `audio_render_ahead()` (audio.cc:115): +```cpp +static int64_t g_total_render_calls = 0; +static int64_t g_total_frames_rendered = 0; + +// Track actual frames rendered vs expected +const int64_t actual_rendered = frames_after - frames_before; +g_total_render_calls++; +g_total_frames_rendered += actual_rendered; +``` + +### Critical Finding: Over-Rendering + +**WAV dump @ 40s (2400 iterations):** +``` +Expected frames: 1,279,200 (2400 × 533) +Actual rendered: 1,290,933 +Difference: +11,733 frames = +366.66ms EXTRA audio +``` + +**Pattern observed every 10s (600 calls @ 60fps):** +``` +[RENDER_DRIFT] calls=600 expect=319800 actual=331533 drift=-366.66ms +[RENDER_DRIFT] calls=1200 expect=639600 actual=651333 drift=-366.66ms +[RENDER_DRIFT] calls=1800 expect=959400 actual=971133 drift=-366.66ms +[RENDER_DRIFT] calls=2400 expect=1279200 actual=1290933 drift=-366.66ms +``` + +### Why This Causes Visual Drift + +**WAV Dump Flow (main.cc:289-302):** +1. `fill_audio_buffer(update_dt)` → calls `audio_render_ahead()` + - Renders audio into ring buffer + - **BUG:** Renders MORE than `chunk_frames` due to buffer management loop +2. 
`ring_buffer->read(chunk_buffer, samples_per_update)` + - Reads exactly 533 frames from ring buffer +3. `wav_backend.write_audio(chunk_buffer, samples_per_update)` + - Writes exactly 533 frames to WAV + +**Result:** Ring buffer accumulates 11,733 extra frames over 40s. + +### Timing Shift Mechanism + +Ring buffer acts as FIFO queue with 400ms lookahead: +- Initially fills to 400ms (12,800 frames) +- Each iteration: renders 533.333 (actual: ~536) frames, reads 533 frames +- Net accumulation: ~3 frames/iteration +- After 2400 iterations: 12,800 + (2400 × 3) = 20,000 frames buffer size + +Events trigger at correct `music_time` but get written to ring buffer position that's ahead. When WAV reads from buffer, it reads from older position, causing events to appear EARLIER in WAV file than their nominal music_time. + +## Technical Details + +### Code Locations + +**Truncation point 1:** `main.cc:282` +```cpp +const int frames_per_update = (int)(32000 * update_dt); // 533.333 → 533 +``` + +**Truncation point 2:** `audio.cc:105` +```cpp +const int chunk_frames = (int)(dt * RING_BUFFER_SAMPLE_RATE); // 533.333 → 533 +``` + +**Over-render loop:** `audio.cc:112-229` +```cpp +while (true) { + // Keeps rendering until buffer >= target_lookahead + // Renders MORE than chunk_frames due to buffer management + ... +} +``` + +### Why 366ms Per 10s? + +At 60fps, 10s = 600 iterations: +- Expected: 600 × 533 = 319,800 frames +- Actual: 331,533 frames +- Extra: 11,733 frames ÷ 600 = **19.55 frames extra per iteration** + +But `chunk_frames = 533`, so we render 533 + 19.55 = **~552.55 frames per call** on average. + +Discrepancy from 533.333 expected: 552.55 - 533.333 = **19.22 frames/call over-render** + +This 19.22 frames = 0.6ms per iteration accumulates to 366ms per 10s. 
+ +## Proposed Fix + +### Option 1: Match Render to Read (Recommended) +In WAV dump mode, ensure `audio_render_ahead()` renders exactly `frames_per_update`: +```cpp +// main.cc WAV dump loop +const int frames_per_update = (int)(32000 * update_dt); +audio_render_ahead(g_music_time, update_dt, /* force_exact_amount */ frames_per_update); +``` + +Modify `audio_render_ahead()` to accept optional exact frame count and render precisely that amount instead of filling to target lookahead. + +### Option 2: Round Instead of Truncate +```cpp +const int frames_per_update = (int)(32000 * update_dt + 0.5f); // Round: 533.333 → 533 +``` +Reduces truncation error but doesn't solve over-rendering. + +### Option 3: Use Double Precision + Accumulator +```cpp +static double accumulated_time = 0.0; +accumulated_time += update_dt; +const int frames_to_render = (int)(accumulated_time * 32000); +accumulated_time -= frames_to_render / 32000.0; +``` +Eliminates cumulative truncation error. + +## Related Issues + +- `tracker.cc:237` TODO comment mentions "180ms drift over 63 beats" - this is the same bug +- Ring buffer lookahead (400ms) is separate from drift (not the cause) +- Web Audio API `outputLatency` in viewer is unrelated (affects playback, not waveform display) + +## Verification Steps + +1. ✅ Measure WAV sample positions directly (Python script) +2. ✅ Add render tracking debug output +3. ✅ Confirm over-rendering (366ms per 10s) +4. ✅ Implement partial fix (bypass ring buffer, direct render) +5. ⚠️ Current result: -150ms drift at beat 64b (acceptable, needs further work) + +## Current Implementation (main.cc:286-308) + +**WAV dump now bypasses ring buffer entirely:** +1. **Frame accumulator**: Calculates exact frames per update (no truncation) +2. **Direct render**: Calls `synth_render()` directly with exact frame count +3. **No ring buffer**: Eliminates buffer management complexity +4. 
**Result**: No glitches, but -150ms drift remains + +**Remaining issue:** Drift persists despite direct rendering. Likely related to tempo scaling or audio engine state management. Acceptable for now. + +## Notes + +- Viewer waveform rendering is CORRECT - displays WAV content accurately +- Bug is in demo's WAV generation, specifically ring buffer management in `audio_render_ahead()` +- Progressive nature of drift (30ms → 180ms) indicates accumulation, not one-time offset +- Fix must ensure rendered frames = read frames in WAV dump mode diff --git a/doc/archive/FILE_HIERARCHY_CLEANUP_2026-02-13.md b/doc/archive/FILE_HIERARCHY_CLEANUP_2026-02-13.md new file mode 100644 index 0000000..8af5efd --- /dev/null +++ b/doc/archive/FILE_HIERARCHY_CLEANUP_2026-02-13.md @@ -0,0 +1,202 @@ +# File Hierarchy Cleanup - February 13, 2026 + +## Summary + +Comprehensive reorganization of project file structure for improved maintainability and code reuse. + +--- + +## Changes Implemented + +### 1. Application Entry Points → `src/app/` + +**Before:** +``` +src/ + main.cc + stub_main.cc + test_demo.cc +``` + +**After:** +``` +src/app/ + main.cc + stub_main.cc + test_demo.cc +``` + +**Impact:** Cleaner src/ directory, separates application code from libraries. + +--- + +### 2. 
Workspace Reorganization + +**Before:** +``` +workspaces/main/ + assets/music/*.spec + shaders/*.wgsl + obj/*.obj + +assets/ + demo.seq + music.track + originals/ + common/ + final/ +``` + +**After:** +``` +workspaces/{main,test}/ + music/ # Audio samples (.spec) + weights/ # CNN binary weights (.bin) + obj/ # 3D models (.obj) + shaders/ # Workspace-specific WGSL only + +common/ + shaders/ # Shared WGSL utilities + math/ + render/ + compute/ + +tools/ + originals/ # Source audio files + test_demo.seq + test_demo.track +``` + +**Removed:** +- `assets/` directory (legacy structure) +- `assets/common/` (replaced by `common/`) +- `assets/final/` (superseded by workspaces) +- `assets/originals/` → `tools/originals/` + +--- + +### 3. Shared Shader System + +**Problem:** 36 duplicate shader files across workspaces (byte-identical). + +**Solution:** Implemented Option 1 from `SHADER_REUSE_INVESTIGATION.md`. + +**Structure:** +``` +common/shaders/ + common_uniforms.wgsl + lighting.wgsl + passthrough.wgsl + ray_box.wgsl + ray_triangle.wgsl + sdf_primitives.wgsl + skybox.wgsl + math/ + common_utils.wgsl + noise.wgsl + sdf_shapes.wgsl + sdf_utils.wgsl + render/ + lighting_utils.wgsl + scene_query_bvh.wgsl + scene_query_linear.wgsl + shadows.wgsl + compute/ + gen_blend.wgsl + gen_grid.wgsl + gen_mask.wgsl + gen_noise.wgsl + gen_perlin.wgsl +``` + +**Reference in assets.txt:** +``` +SHADER_COMMON_UNIFORMS, NONE, ../../common/shaders/common_uniforms.wgsl +SHADER_MATH_NOISE, NONE, ../../common/shaders/math/noise.wgsl +``` + +**Asset Packer Enhancement:** +- Added `#include <filesystem>` for path normalization +- Implemented `lexically_normal()` to resolve `../../common/` references +- Cross-platform path handling for workspace-relative includes + +--- + +## Updated Files + +### Configuration +- `workspaces/main/workspace.cfg` - Updated asset_dirs and shader_dirs +- `workspaces/test/workspace.cfg` - Updated asset_dirs and shader_dirs +- `workspaces/main/assets.txt` - Common shader references 
+- `workspaces/test/assets.txt` - Common shader references + +### Build System +- `cmake/DemoExecutables.cmake` - src/app/ paths, test_demo paths +- `cmake/DemoCodegen.cmake` - Removed legacy fallback paths +- `cmake/Validation.cmake` - Workspace shader paths + +### Tools +- `tools/asset_packer.cc` - Filesystem path normalization +- `scripts/gen_spectrograms.sh` - tools/originals/ paths +- `scripts/train_cnn_v2_full.sh` - workspaces/main/weights/ paths +- `training/export_cnn_v2_weights.py` - workspaces/main/weights/ paths + +### Application +- `src/app/main.cc` - Hot-reload workspace paths + +--- + +## Metrics + +**File Reduction:** +- Removed 36 duplicate shader files +- Deleted legacy assets/ structure (~70 files) +- Net: ~100 files eliminated + +**Disk Space:** +- Common shaders: 20 files +- Per-workspace shaders: ~30-35 files +- Saved: ~36 shader duplicates + +**Workspace Structure:** +``` +workspaces/main/: 31 shaders (workspace-specific) +workspaces/test/: 19 shaders (workspace-specific) +common/: 20 shaders (shared) +Total unique: 70 shaders (vs 106 before) +``` + +--- + +## Benefits + +1. **Single Source of Truth:** Common shaders in one location +2. **No Duplication:** Bug fixes apply everywhere automatically +3. **Clear Separation:** Common vs workspace-specific code +4. **Size Optimization:** Important for 64k target +5. **Maintainability:** Easier to understand and modify +6. **Workspace Isolation:** Each workspace still self-contained for specific content + +--- + +## Migration Notes + +**For New Workspaces:** +1. Create `workspaces/new_workspace/` with subdirs: `music/`, `weights/`, `obj/`, `shaders/` +2. Reference common shaders: `../../common/shaders/...` +3. Add workspace-specific shaders to local `shaders/` +4. 
Update `workspace.cfg`: `asset_dirs = ["music/", "weights/", "obj/"]` + +**For Common Shader Changes:** +- Edit files in `common/shaders/` +- Changes apply to all workspaces immediately +- Run full build to verify all workspaces + +--- + +## Documentation Updated + +- `doc/WORKSPACE_SYSTEM.md` - New structure reflected +- `doc/SHADER_REUSE_INVESTIGATION.md` - Implementation status +- `doc/PROJECT_CONTEXT.md` - Current project state +- `doc/FILE_HIERARCHY_CLEANUP_2026-02-13.md` - This document -- cgit v1.2.3
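
Note on the archived drift investigation: the frame-accumulator idea (Option 3 in `AUDIO_WAV_DRIFT_BUG.md`, also listed under "Current Implementation") can be sketched in isolation as below. This is only an illustrative sketch, not the code in `main.cc`. `FrameAccumulator` and `total_frames` are hypothetical names introduced here, and the 32 kHz sample rate and 1/60 s update step are taken from the figures quoted in the document.

```cpp
#include <cstdint>

// Hypothetical helper (not from the repository): turns a stream of time
// steps into whole frame counts while carrying the fractional remainder,
// so truncation error does not accumulate across updates.
class FrameAccumulator {
 public:
  explicit FrameAccumulator(double sample_rate) : sample_rate_(sample_rate) {}

  // Returns the number of whole frames to render for this update and
  // keeps the leftover fraction of a frame for the next call.
  int64_t advance(double dt) {
    pending_frames_ += dt * sample_rate_;
    const int64_t whole = static_cast<int64_t>(pending_frames_);
    pending_frames_ -= static_cast<double>(whole);
    return whole;
  }

 private:
  double sample_rate_;
  double pending_frames_ = 0.0;
};

// Sums the frames produced over `updates` calls at a fixed time step.
int64_t total_frames(double sample_rate, double dt, int updates) {
  FrameAccumulator acc(sample_rate);
  int64_t total = 0;
  for (int i = 0; i < updates; ++i) total += acc.advance(dt);
  return total;
}
```

For 2400 updates at dt = 1/60 s, per-update truncation gives 2400 × 533 = 1,279,200 frames, while carrying the remainder stays within one frame of the ideal 1,280,000. The 800-frame gap is 800 / 32000 = 25 ms, which matches the truncation share of the drift reported in the investigation. It does not address the remaining -150ms drift, which the document attributes to other causes.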