# 64k Demo Project

Goal:
- Produce a <=64k native demo binary
- Same C++ codebase for Windows, macOS, Linux

Graphics:
- WebGPU via wgpu-native
- WGSL shaders
- Single fullscreen pass initially

Audio:
- 32 kHz, 16-bit mono
- Procedurally generated samples
- No decoding, no assets

Constraints:
- Size-sensitive
- Minimal dependencies
- Explicit control over all allocations

Style:
- Demoscene
- No engine abstractions

---

## Project Roadmap

### Next Up

- **Task #8: Implement Final Build Stripping**
  - [ ] Define and document a consistent set of rules for code stripping under the `STRIP_ALL` macro.
  - [ ] Example sub-tasks: remove unused functions, strip debug fields from structs, simplify code paths where possible.

### Future Goals

- **Task #5: Implement Spectrogram Editor**
  - [ ] Develop a web-based tool (`tools/editor`) for creating and editing `.spec` files visually.
  - [ ] The tool should support generating `.spec` files from elementary shapes (lines, curves) for extreme compression.
- **Task #18: 3D System Enhancements**
  - [ ] **Visual Debug Mode**: Implement a debug overlay (removable with `STRIP_ALL`) to render wireframe bounding volumes, object trajectories, and light source representations.
  - [ ] **Blender Exporter**: Create a tool to convert simple Blender scenes into the demo's internal asset format.
  - [ ] **GPU BVH & Shadows**: Implement a GPU-based Bounding Volume Hierarchy (BVH) to optimize scene queries (shadows, AO) from the shader, replacing the current O(N) loop.
  - [ ] **Texture and binding groups**: Currently only one texture can be bound. Support an arbitrary number of textures/bindings in a sane way. Avoid immediate 'hacks' and 'fixes'.
- **Phase 2: Advanced Size Optimization**
  - [x] PC+Windows (.exe binary) via MinGW
  - [ ] Task #4a: Linux Cross-Compilation
  - [ ] Replace GLFW with a minimal native Windows API layer.
  - [ ] Investigate and implement advanced asset compression techniques.
  - [ ] Explore replacing the standard C/C++ runtime with a more lightweight alternative.

### Recently Completed

- **Task #4b: Create `check_all.sh` script** to build and test all platform targets.
- **Task #10: Optimized `spectool`** to trim leading/trailing silent frames from `.spec` files.

---

*For a detailed list of all completed tasks, see the git history.*

## Architectural Overview

### Sequence & Effect System

- **Core Idea**: The demo is orchestrated by a powerful sequence and effect system.
- **`Effect`**: An abstract base class for any visual element (e.g., 3D scene, post-processing). Effects are managed within a timeline.
- **`Sequence`**: A timeline that manages the start and end times of multiple `Effect`s.
- **`MainSequence`**: The top-level manager that holds core WebGPU resources and renders the active `Sequence`s for each frame.
- **`seq_compiler`**: A custom tool that transpiles a simple text file (`assets/demo.seq`) into a C++ timeline (`timeline.cc`), allowing for rapid iteration on the demo's choreography without recompiling the engine.

### Asset & Build System

- **`asset_packer`**: A tool that packs binary assets (like `.spec` files) into C++ arrays.
- **Runtime Manager**: A highly efficient, array-based lookup system (`GetAsset`) provides O(1) access to all packed assets.
- **Procedural Assets**: The system supports `PROC(function, params...)` syntax in asset lists, enabling runtime generation of assets like noise textures. The `asset_packer` stores the function name, and a runtime dispatcher executes the corresponding C++ function.
- **Automation**: The entire build, asset generation, and packaging process is automated via CMake and helper scripts (`gen_assets.sh`, `build_win.sh`).

### Audio Engine

- **Real-time Synthesis**: The engine uses an additive synthesizer that generates audio in real-time from spectrograms (`.spec` files) via an Inverse Discrete Cosine Transform (IDCT).
- **Dynamic Updates**: Employs a thread-safe double-buffer mechanism to allow spectrogram data to be modified live for dynamic and evolving soundscapes.
- **Procedural Generation**: Includes a library for generating note spectrograms at runtime, complete with spectral filters for effects like noise and phasing.

### Platform & Windowing

- **Stateless Refactor**: The entire platform layer (`platform.h`, `platform.cc`) is stateless. All window and input state is encapsulated in a `PlatformState` struct, which is passed to all platform functions.
- **High-DPI Aware**: The platform layer correctly queries framebuffer dimensions in pixels, resolving rendering issues on high-DPI (Retina) displays.
- **Cross-Platform Surface**: Uses `glfw3webgpu` to create a WebGPU-compatible surface on Windows, macOS, and Linux.

### Procedural Textures

- **TextureManager**: Handles CPU generation -> GPU upload.
- **API**: Supports `WGPUTexelCopyTextureInfo` (new API) with fallback macros for older headers.

### Sequence & Effect System

- **Architecture**: Implemented a hierarchical sequencing system.
  - **Effect**: Abstract base for visual elements. Supports `compute` (physics) and `render` (draw) phases. Idempotent `init` for shared asset loading.
  - **Sequence**: Manages a timeline of effects with `start_time` and `end_time`. Handles activation/deactivation and sorting by priority.
  - **MainSequence**: The top-level coordinator. Holds the WebGPU device/queue/surface. Manages multiple overlapping `Sequence` layers (sorted by priority).
- **Sequence Compiler**: Implemented a C++ tool (`seq_compiler`) that transpiles a textual timeline description (`assets/demo.seq`) into a generated C++ file (`timeline.cc`).
  - **Workflow**: Allows rapid experimentation with timing, layering, and effect parameters without touching engine code.
  - **Flexibility**: Supports passing arbitrary constructor arguments from the script directly to C++ classes.
- **Implementation**:
  - `src/gpu/effect.h/cc`: Core logic.
  - `src/gpu/demo_effects.h/cc`: Concrete implementations of `HeptagonEffect` and `ParticlesEffect`.
  - `src/gpu/gpu.cc`: Simplified to be a thin wrapper initializing and driving `MainSequence`.
- **Integration**: The main loop now calculates fractional beats and passes them along with `time` and `aspect_ratio` to the rendering system.

### Debugging Features

- **Fast Forward / Seek**: Implemented `--seek ` CLI option.
  - **Mechanism**: Simulates logic, audio (silent render), and GPU compute (physics) frame-by-frame from `t=0` to `target_time` before starting real-time playback.
  - **Audio Refactor**: Split `audio_init()` (resource allocation) from `audio_start()` (device callback start) to allow manual `audio_render_silent()` during the seek phase.

### Asset Management System:

- **Architecture**: Implemented a C++ tool (`asset_packer`) that bundles external files into hex-encoded C arrays.
- **Lookup**: Uses a highly efficient array-based lookup (AssetRecord) for O(1) retrieval of raw byte data at runtime.
- **Descriptors**: Assets are defined in `assets/final/demo_assets.txt` (for the demo) and `assets/final/test_assets_list.txt` (for tests).
- **Organization**: Common retrieval logic is decoupled and located in `src/util/asset_manager.cc`, ensuring generated data files only contain raw binary blobs.
- **Lazy Decompression**: Scaffolding implemented for future on-demand decompression support.

### Build System:

- **Production Pipeline**: Automated the entire assembly process via a `final` CMake target (`make final`).
  - **Automation**: This target builds the tools, runs the `gen_assets.sh` script to re-analyze audio and regenerate sources, and then performs final binary stripping and crunching (using `strip` and `gzexe` on macOS).
- **Windows Cross-Compilation**: Implemented a full pipeline from macOS to Windows x86_64 using MinGW.
  - `scripts/fetch_win_deps.sh`: Downloads pre-compiled GLFW and `wgpu-native` binaries.
  - `scripts/build_win.sh`: Cross-compiles the demo, bundles MinGW DLLs, and crunches the binary.
  - `scripts/run_win.sh`: Executes the resulting `.exe` using Wine.
  - `scripts/crunch_win.sh`: Strips and packs the Windows binary using UPX (LZMA).
  - `scripts/analyze_win_bloat.sh`: Reports section sizes and top symbols for size optimization.

### Audio Engine (Synth):

- **Architecture**: Real-time additive synthesis from spectrograms using Inverse Discrete Cosine Transform (IDCT).
- **Dynamic Updates**: Implemented a double-buffering (flip-flop) mechanism for thread-safe, real-time updates of spectrogram data.
- **Peak Detection**: Real-time output peak detection with exponential decay for smooth visual synchronization.
- **Procedural Melody**: Implemented a shared library (`src/audio/gen.cc`) for generating note spectrograms at runtime. Supports melody pasting with overlapping frames and spectral post-processing.
- **Spectral Filters**: Implemented runtime spectral effects including noise (grit), lowpass filtering, and comb filtering (phaser/flanger effects).
- **Timing System**: Implemented a beat-based timing system (BPM) used for both audio generation and visual synchronization.

### WebGPU Integration:

- **Resource Management**: Introduced `GpuBuffer`, `RenderPass`, and `ComputePass` abstractions to group pipeline and bind group resources.
- **Compute Shaders**: Implemented a high-performance particle system (10,000 particles) using compute shaders for physics and audio-reactive updates.
- **Visuals**: Pulsating heptagon and a compute-driven particle field synchronized with audio peaks and global time.

### WebGPU Integration Fixes:

Several critical issues were resolved to ensure stable WebGPU operation across platforms:

- **Surface Creation (macOS)**: Fixed a `g_surface` assertion failure by adding platform-specific compile definitions (`-DGLFW_EXPOSE_NATIVE_COCOA`) to `CMakeLists.txt`.
  This allows `glfw3webgpu` to access the native window handles required for surface creation.
- **Texture Usage**: Resolved a validation error (`RENDER_ATTACHMENT` usage missing) by explicitly setting `g_config.usage = WGPUTextureUsage_RenderAttachment` in the surface configuration.
- **Render Pass Validation**: Fixed a "Depth slice provided but view is not 3D" error by ensuring `WGPURenderPassColorAttachment` is correctly initialized, specifically setting `resolveTarget = nullptr` and `depthSlice = WGPU_DEPTH_SLICE_UNDEFINED`.

### Optimizations:

- **Audio Decoding**: Disabled FLAC, WAV, MP3, and all encoding features in `miniaudio` for the runtime demo build (via `MA_NO_FLAC`, `MA_NO_WAV`, etc.). This reduced the packed Windows binary size by ~100KB (461KB -> 356KB). `spectool` retains full decoding capabilities.
- **Build Stripping**: Implemented `DEMO_STRIP_ALL` CMake option to remove command-line parsing, debug info, and non-essential error handling strings.

### Future Optimizations (Phase 2)

- **Task #4a: Linux Cross-Compilation**: Implement Linux x86_64 cross-compilation from macOS.
- **Windows Platform Layer**: Replace the static GLFW library with a minimal, native Windows API implementation (`CreateWindow`, `PeekMessage`) to significantly reduce binary size.
- **Asset Compression**: Implement logarithmic frequency storage and quantization for `.spec` files.
- **CRT Replacement**: Investigate minimal C runtime alternatives.

### WebGPU Portability Layer:

To maintain a single codebase while supporting different `wgpu-native` versions (Native macOS/Linux headers vs. Windows/MinGW v0.19.4.1), a portability layer was implemented in `src/gpu/gpu.cc`:

- **Header Mapping**: Conditional inclusion of `` for Windows vs. `` for native builds.
- **Type Shims**: Implementation of `WGPUStringView` as a simple `const char*` for older APIs.
- **Callback Signatures**: Handles the difference between 4-argument (Windows/Old) and 5-argument (macOS/New) callback signatures for `wgpuInstanceRequestAdapter` and `wgpuAdapterRequestDevice`, including the use of callback info structs on newer APIs.
- **API Lifecycle**:
  - **Wait Mechanism**: Abstraction of `wgpuInstanceWaitAny` (new) vs. `wgpuInstanceProcessEvents` (old).
- **Struct Differences**:
  - **Color Attachments**: Conditional removal of the `depthSlice` member in `WGPURenderPassColorAttachment`, which is not present in v0.19.
- **Error Handling**: Abstracted `wgpuDeviceSetUncapturedErrorCallback` usage.
- **Surface Creation**: Custom logic in `glfw3webgpu.c` to handle `WGPUSurfaceSourceWindowsHWND` (new) vs. `WGPUSurfaceDescriptorFromWindowsHWND` (old).

### Coding Style:

- **Standard**: Strictly enforced project-specific rules in `CONTRIBUTING.md`.
- **Cleanup**: Automated removal of trailing whitespace and addition of missing newlines at EOF across all source files.
- **Constraints**: No `auto`, no C++ style casts (`static_cast`, etc.), mandatory 3-line headers.

### Design Decision: Spectrogram Data Representation

* **Current State:** Spectrogram frequency bins are stored linearly in `.spec` files and processed as such by the core audio engine (using standard DCT/IDCT). The JavaScript spectrogram editor maps this linear data to a logarithmic scale for visualization and interaction.
* **Future Optimization (TODO):** The `.spec` file format will be revisited to:
  * Store frequencies logarithmically.
  * Use `uint16_t` instead of `float` for spectral values.
* **Impact:** This aims to achieve better compression while retaining fine frequency resolution relevant to human perception. It will primarily affect the code responsible for saving to and reading from `.spec` files, requiring conversions between the new format and the linear float format used internally by the audio engine.
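
The conversion step described above could look roughly like the following. This is only a sketch of one possible scheme, not the project's actual `.spec` format: the helper names (`spec_quantize`, `spec_dequantize`, `spec_log_bin_to_linear`), the idea of storing a per-file value range `[lo, hi]`, and the choice to skip the DC bin in the logarithmic mapping are all assumptions made for illustration.

```cpp
#include <assert.h>
#include <math.h>
#include <stdint.h>

// Hypothetical helper: maps a spectral value in [lo, hi] to a 16-bit code.
// Values outside the range are clamped. The range itself would have to be
// stored alongside the data (an assumption, not part of the current format).
static uint16_t spec_quantize(float value, float lo, float hi) {
  assert(hi > lo);
  float normalized = (value - lo) / (hi - lo);
  if (normalized < 0.0f) normalized = 0.0f;
  if (normalized > 1.0f) normalized = 1.0f;
  return (uint16_t)(normalized * 65535.0f + 0.5f);  // round to nearest code
}

// Hypothetical inverse: recovers the float used internally by the engine.
static float spec_dequantize(uint16_t code, float lo, float hi) {
  return lo + ((float)code / 65535.0f) * (hi - lo);
}

// Hypothetical mapping from stored logarithmic bin `i` (0..log_count-1) to a
// fractional position among the engine's linear bins. Bin 0 (DC) is skipped
// because a logarithmic axis cannot start at zero; the result spans
// [1, linear_count - 1].
static float spec_log_bin_to_linear(int i, int log_count, int linear_count) {
  assert(log_count > 1 && linear_count > 2);
  float t = (float)i / (float)(log_count - 1);
  return powf((float)(linear_count - 1), t);
}
```

With a range of `[-1, 1]`, the quantization step is about 3e-5, so a round trip through `uint16_t` changes each value by at most half of that. A reader would interpolate between the two linear bins nearest to the position returned by `spec_log_bin_to_linear` when expanding a file back to the linear layout.
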
### Development Workflow:

- **Testing**: Comprehensive test suite including `AssetManagerTest`, `SynthEngineTest`, `HammingWindowTest`, and `SpectoolEndToEndTest`. All tests are verified before committing.

### Platform & Windowing

- **High-DPI Fix**: Resolved a long-standing issue on high-DPI (Retina) displays where the viewport would be 'squished' in a corner. The platform layer now correctly queries framebuffer dimensions in pixels instead of window dimensions in points.
- **Stateless Refactor**: The entire platform layer (`platform.h`, `platform.cc`) was refactored to be stateless. All global variables were encapsulated into a `PlatformState` struct, which is now passed to all platform functions (`platform_init`, `platform_poll`, etc.). This improves modularity and removes scattered global state.
- **Custom Resolution**: Added a `--resolution WxH` command-line option to allow specifying a custom window size at startup (e.g., `./build/demo64k --resolution 1024x768`). This feature is disabled in `STRIP_ALL` builds to save space.
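
As an illustration of the `--resolution WxH` option and its `STRIP_ALL` guard, a parser along these lines would be enough. It is a minimal sketch under stated assumptions: `ResolutionArg`, `parse_resolution`, and `apply_cli_resolution` are hypothetical names, and the real code presumably writes into `PlatformState` rather than a separate struct.

```cpp
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical stand-in for the window size fields; the real state lives in
// PlatformState, whose layout is not shown in this document.
struct ResolutionArg {
  int width;
  int height;
};

#if !defined(STRIP_ALL)
// Parses "WxH" (e.g. "1024x768"). Returns true and fills `out` on success;
// leaves `out` untouched on malformed or non-positive input.
static bool parse_resolution(const char* text, ResolutionArg* out) {
  assert(text != NULL && out != NULL);
  int w = 0;
  int h = 0;
  char trailing = 0;
  // Exactly two conversions must succeed; a third means trailing garbage.
  if (sscanf(text, "%dx%d%c", &w, &h, &trailing) != 2) return false;
  if (w <= 0 || h <= 0) return false;
  out->width = w;
  out->height = h;
  return true;
}

// Scans the command line for "--resolution WxH" and overrides the defaults.
static void apply_cli_resolution(int argc, char** argv, ResolutionArg* size) {
  for (int i = 1; i + 1 < argc; ++i) {
    if (strcmp(argv[i], "--resolution") == 0) {
      parse_resolution(argv[i + 1], size);  // defaults survive bad input
    }
  }
}
#else
// Stripped builds compile the option out entirely and keep the defaults.
static void apply_cli_resolution(int, char**, ResolutionArg*) {}
#endif
```

Keeping both the parser and the argument scan inside the `#if` block means a `STRIP_ALL` build carries neither the `sscanf` call nor the option string, which is the point of disabling the feature there.
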