# 64k Demo Project Goal: - Produce a <=64k native demo binary - Same C++ codebase for Windows, macOS, Linux Graphics: - WebGPU via wgpu-native - WGSL shaders - Single fullscreen pass initially Audio: - 32 kHz, 16-bit mono - Procedurally generated samples - No decoding, no assets Constraints: - Size-sensitive - Minimal dependencies - Explicit control over all allocations Style: - Demoscene - No engine abstractions Incoming tasks in no particular order: - [x] 1. add a fullscreen mode (as command line option) - [x] 2. parse the keyboard key. Exit the demo when 'esc' is pressed. Toggle full-screen when 'f' is pressed. - [x] 3. add binary crunchers for all platforms - [ ] 4. add cross-compilation for PC+linux (x86_64) and PC+Windows (.exe binary) - [x] PC+Windows (.exe binary) via MinGW - [ ] PC+linux (x86_64) - 5. implement a spectrogram editor for representing .spec with elementary shapes (bezier curves, lines, random noise, rectangles...) as a mean of compression / spec generation - [x] 6. add a scripting tool to edit the demo (compiled into the binary at the end) - [x] 7. compile wgpu-native in optimized mode (not unoptimized) - [x] 8. add a #define STRIP_ALL to remove all unnecessary code for the final build (for instance, command-line args parsing, or unnecessary options, constant parameters to function calls, etc.) - [x] 9. work on the compact in-line and off-line asset system (@ASSET_SYSTEM.md) - 10. optimize spectool to remove trailing zeroes from a spec file before saving it - [x] 11. implement a general [time / timer / beat] system in the demo, for effects timing - [x] 12. Implement 'Sequence' and 'Effect' system: - `Effect` interface: `init`, `start`, `compute`, `render`, `end`, `priority`. - `Sequence` manager: Handles a list of temporally sorted `SequenceItem`s. - `MainSequence`: Manages global resources (`device`, `queue`) and orchestrates the frame render. - Refactored `gpu.cc` to use `MainSequence`. - Moved concrete effects to `src/gpu/demo_effects.cc`. - 13. Implement Procedural Texture system (Task #3 -> #1): - `src/procedural/` module (Done). - Integration with WebGPU: Generate textures at runtime and upload to GPU (Done). - `src/gpu/texture_manager` implemented and tested. - 14. Implement 3D system (Task #1 -> #2): - `src/3d/` module (math, camera, scene, renderer) (Done). - `Renderer3D` implemented with basic cube rendering (Done). - `test_3d_render` mini-demo created (Debugging crash in `wgpuInstanceWaitAny`). - 15. Shader Logic (Task #2 -> #3): - Ray-object intersection & SDF rendering in WGSL (Done). - SDF Box, Sphere, Torus implemented. - Hybrid normal calculation (Analytical + Numerical) (Done). - Bump mapping with procedural noise (Done). - Periodic texture generation (Done). - 16. Integrate 3D Renderer into Main Demo: - Update `main.cc` / `gpu.cc` to use `Renderer3D`. - Apply Gaussian Blur and Chromatic Aberration post-processing. * **17. Implement Asset Manager Caching & Procedural Generation** * 17a. **(Task a)** Implemented array-based caching in `asset_manager.cc` for `GetAsset` (Done). * 17b. **(Task b)** Modify `asset_packer` to parse a new `PROC(function_name)` compression method. This will generate a record mapping an Asset ID to the procedural function's name. * 17c. **(Task b)** Implement a runtime dispatcher in `GetAsset`. When a `PROC` asset is requested, the dispatcher will look up the function by its name and execute it. The result will then be cached. * 17d. **(Task c)** Update the asset management test suite to include a test case with `PROC(function_name)` in `test_assets_list.txt` to verify the generation and caching. * 17e. **(Task d)** Integrate into `demo64k` by adding a procedural noise texture to `demo_assets.txt` and using this new mechanism in the `Hybrid3DEffect` for bump mapping. ## Session Decisions and Current State ### 3D Renderer Implementation - **Renderer3D**: Encapsulates pipeline creation and render pass recording. - **Shader**: Currently uses a simple vertex/fragment shader for cubes. - **Storage**: Objects are stored in a `StorageBuffer` (instance data). - **Test Tool**: `test_3d_render` created to isolate 3D development. - **Issue**: `test_3d_render` crashes with "not implemented" in `wgpuInstanceWaitAny` on macOS. ### Procedural Textures - **TextureManager**: Handles CPU generation -> GPU upload. - **API**: Supports `WGPUTexelCopyTextureInfo` (new API) with fallback macros for older headers. ### Sequence & Effect System - **Architecture**: Implemented a hierarchical sequencing system. - **Effect**: Abstract base for visual elements. Supports `compute` (physics) and `render` (draw) phases. Idempotent `init` for shared asset loading. - **Sequence**: Manages a timeline of effects with `start_time` and `end_time`. Handles activation/deactivation and sorting by priority. - **MainSequence**: The top-level coordinator. Holds the WebGPU device/queue/surface. Manages multiple overlapping `Sequence` layers (sorted by priority). - **Sequence Compiler**: Implemented a C++ tool (`seq_compiler`) that transpiles a textual timeline description (`assets/demo.seq`) into a generated C++ file (`timeline.cc`). - **Workflow**: Allows rapid experimentation with timing, layering, and effect parameters without touching engine code. - **Flexibility**: Supports passing arbitrary constructor arguments from the script directly to C++ classes. - **Implementation**: - `src/gpu/effect.h/cc`: Core logic. - `src/gpu/demo_effects.h/cc`: Concrete implementations of `HeptagonEffect` and `ParticlesEffect`. - `src/gpu/gpu.cc`: Simplified to be a thin wrapper initializing and driving `MainSequence`. - **Integration**: The main loop now calculates fractional beats and passes them along with `time` and `aspect_ratio` to the rendering system. ### Debugging Features - **Fast Forward / Seek**: Implemented `--seek ` CLI option. - **Mechanism**: Simulates logic, audio (silent render), and GPU compute (physics) frame-by-frame from `t=0` to `target_time` before starting real-time playback. - **Audio Refactor**: Split `audio_init()` (resource allocation) from `audio_start()` (device callback start) to allow manual `audio_render_silent()` during the seek phase. ### Asset Management System: - **Architecture**: Implemented a C++ tool (`asset_packer`) that bundles external files into hex-encoded C arrays. - **Lookup**: Uses a highly efficient array-based lookup (AssetRecord) for O(1) retrieval of raw byte data at runtime. - **Descriptors**: Assets are defined in `assets/final/demo_assets.txt` (for the demo) and `assets/final/test_assets_list.txt` (for tests). - **Organization**: Common retrieval logic is decoupled and located in `src/util/asset_manager.cc`, ensuring generated data files only contain raw binary blobs. - **Lazy Decompression**: Scaffolding implemented for future on-demand decompression support. ### Build System: - **Production Pipeline**: Automated the entire assembly process via a `final` CMake target (`make final`). - **Automation**: This target builds the tools, runs the `gen_assets.sh` script to re-analyze audio and regenerate sources, and then performs final binary stripping and crunching (using `strip` and `gzexe` on macOS). - **Windows Cross-Compilation**: Implemented a full pipeline from macOS to Windows x86_64 using MinGW. - `scripts/fetch_win_deps.sh`: Downloads pre-compiled GLFW and `wgpu-native` binaries. - `scripts/build_win.sh`: Cross-compiles the demo, bundles MinGW DLLs, and crunches the binary. - `scripts/run_win.sh`: Executes the resulting `.exe` using Wine. - `scripts/crunch_win.sh`: Strips and packs the Windows binary using UPX (LZMA). - `scripts/analyze_win_bloat.sh`: Reports section sizes and top symbols for size optimization. ### Audio Engine (Synth): - **Architecture**: Real-time additive synthesis from spectrograms using Inverse Discrete Cosine Transform (IDCT). - **Dynamic Updates**: Implemented a double-buffering (flip-flop) mechanism for thread-safe, real-time updates of spectrogram data. - **Peak Detection**: Real-time output peak detection with exponential decay for smooth visual synchronization. - **Procedural Melody**: Implemented a shared library (`src/audio/gen.cc`) for generating note spectrograms at runtime. Supports melody pasting with overlapping frames and spectral post-processing. - **Spectral Filters**: Implemented runtime spectral effects including noise (grit), lowpass filtering, and comb filtering (phaser/flanger effects). - **Timing System**: Implemented a beat-based timing system (BPM) used for both audio generation and visual synchronization. ### WebGPU Integration: - **Resource Management**: Introduced `GpuBuffer`, `RenderPass`, and `ComputePass` abstractions to group pipeline and bind group resources. - **Compute Shaders**: Implemented a high-performance particle system (10,000 particles) using compute shaders for physics and audio-reactive updates. - **Visuals**: Pulsating heptagon and a compute-driven particle field synchronized with audio peaks and global time. ### WebGPU Integration Fixes: Several critical issues were resolved to ensure stable WebGPU operation across platforms: - **Surface Creation (macOS)**: Fixed a `g_surface` assertion failure by adding platform-specific compile definitions (`-DGLFW_EXPOSE_NATIVE_COCOA`) to `CMakeLists.txt`. This allows `glfw3webgpu` to access the native window handles required for surface creation. - **Texture Usage**: Resolved a validation error (`RENDER_ATTACHMENT` usage missing) by explicitly setting `g_config.usage = WGPUTextureUsage_RenderAttachment` in the surface configuration. - **Render Pass Validation**: Fixed a "Depth slice provided but view is not 3D" error by ensuring `WGPURenderPassColorAttachment` is correctly initialized, specifically setting `resolveTarget = nullptr` and `depthSlice = WGPU_DEPTH_SLICE_UNDEFINED`. ### Optimizations: - **Audio Decoding**: Disabled FLAC, WAV, MP3, and all encoding features in `miniaudio` for the runtime demo build (via `MA_NO_FLAC`, `MA_NO_WAV`, etc.). This reduced the packed Windows binary size by ~100KB (461KB -> 356KB). `spectool` retains full decoding capabilities. - **Build Stripping**: Implemented `DEMO_STRIP_ALL` CMake option to remove command-line parsing, debug info, and non-essential error handling strings. ### Future Optimizations (Phase 2): - **Windows Platform Layer**: Replace the static GLFW library with a minimal, native Windows API implementation (`CreateWindow`, `PeekMessage`) to significantly reduce binary size. - **Asset Compression**: Implement logarithmic frequency storage and quantization for `.spec` files. - **CRT Replacement**: investigate minimal C runtime alternatives. ### WebGPU Portability Layer: To maintain a single codebase while supporting different `wgpu-native` versions (Native macOS/Linux headers vs. Windows/MinGW v0.19.4.1), a portability layer was implemented in `src/gpu/gpu.cc`: - **Header Mapping**: Conditional inclusion of `` for Windows vs. `` for native builds. - **Type Shims**: Implementation of `WGPUStringView` as a simple `const char*` for older APIs. - **Callback Signatures**: Handles the difference between 4-argument (Windows/Old) and 5-argument (macOS/New) callback signatures for `wgpuInstanceRequestAdapter` and `wgpuAdapterRequestDevice`, including the use of callback info structs on newer APIs. - **API Lifecycle**: - **Wait Mechanism**: Abstraction of `wgpuInstanceWaitAny` (new) vs. `wgpuInstanceProcessEvents` (old). - **Struct Differences**: - **Color Attachments**: Conditional removal of the `depthSlice` member in `WGPURenderPassColorAttachment`, which is not present in v0.19. - **Error Handling**: Abstracted `wgpuDeviceSetUncapturedErrorCallback` usage. - **Surface Creation**: Custom logic in `glfw3webgpu.c` to handle `WGPUSurfaceSourceWindowsHWND` (new) vs. `WGPUSurfaceDescriptorFromWindowsHWND` (old). ### Coding Style: - **Standard**: Strictly enforced project-specific rules in `CONTRIBUTING.md`. - **Cleanup**: Automate removal of trailing whitespaces and addition of missing newlines at EOF across all source files. - **Constraints**: No `auto`, no C++ style casts (`static_cast`, etc.), mandatory 3-line headers. ### Design Decision: Spectrogram Data Representation * **Current State:** Spectrogram frequency bins are stored linearly in `.spec` files and processed as such by the core audio engine (using standard DCT/IDCT). The JavaScript spectrogram editor maps this linear data to a logarithmic scale for visualization and interaction. * **Future Optimization (TODO):** The `.spec` file format will be revisited to: * Store frequencies logarithmically. * Use `uint16_t` instead of `float` for spectral values. * **Impact:** This aims to achieve better compression while retaining fine frequency resolution relevant to human perception. It will primarily affect the code responsible for saving to and reading from `.spec` files, requiring conversions between the new format and the linear float format used internally by the audio engine. ### Development Workflow: - **Testing**: Comprehensive test suite including `AssetManagerTest`, `SynthEngineTest`, `HammingWindowTest`, and `SpectoolEndToEndTest`. All tests are verified before committing.