# 64k Demo Project
Goal:
- Produce a <=64k native demo binary
- Same C++ codebase for Windows, macOS, Linux
Graphics:
- WebGPU via wgpu-native
- WGSL shaders
- Single fullscreen pass initially
Audio:
- 32 kHz, 16-bit mono
- Procedurally generated samples
- No decoding, no assets
Constraints:
- Size-sensitive
- Minimal dependencies
- Explicit control over all allocations
Style:
- Demoscene
- No engine abstractions
---
## Project Roadmap
### Next Up
- **Task #8: Implement Final Build Stripping**
  - [ ] Define and document a consistent set of rules for code stripping under the `STRIP_ALL` macro.
  - [ ] Example sub-tasks: remove unused functions, strip debug fields from structs, simplify code paths where possible.
### Future Goals
- **Task #5: Implement Spectrogram Editor**
  - [ ] Develop a web-based tool (`tools/editor`) for creating and editing `.spec` files visually.
  - [ ] The tool should support generating `.spec` files from elementary shapes (lines, curves) for extreme compression.
- **Task #18: 3D System Enhancements**
  - [ ] **Visual Debug Mode**: Implement a debug overlay (removable with `STRIP_ALL`) to render wireframe bounding volumes, object trajectories, and light source representations.
  - [ ] **Blender Exporter**: Create a tool to convert simple Blender scenes into the demo's internal asset format.
  - [ ] **GPU BVH & Shadows**: Implement a GPU-based Bounding Volume Hierarchy (BVH) to optimize scene queries (shadows, AO) from the shader, replacing the current O(N) loop.
  - [ ] **Textures and bind groups**: Only one texture can be bound at present. Support an arbitrary number of textures and bindings through a clean, general design rather than ad hoc hacks and fixes.
- **Phase 2: Advanced Size Optimization**
  - [x] PC+Windows (.exe binary) via MinGW
  - [ ] Task #4a: Linux Cross-Compilation
  - [ ] Replace GLFW with a minimal native Windows API layer.
  - [ ] Investigate and implement advanced asset compression techniques.
  - [ ] Explore replacing the standard C/C++ runtime with a more lightweight alternative.
### Recently Completed
- **Task #4b: Create `check_all.sh` script** to build and test all platform targets.
- **Task #10: Optimized `spectool`** to trim leading/trailing silent frames from `.spec` files.
---
*For a detailed list of all completed tasks, see the git history.*
## Architectural Overview
### Sequence & Effect System
- **Core Idea**: The demo is orchestrated by a sequence and effect system that schedules visual effects on a timeline.
- **`Effect`**: An abstract base class for any visual element (e.g., 3D scene, post-processing). Effects are managed within a timeline.
- **`Sequence`**: A timeline that manages the start and end times of multiple `Effect`s.
- **`MainSequence`**: The top-level manager that holds core WebGPU resources and renders the active `Sequence`s for each frame.
- **`seq_compiler`**: A custom tool that transpiles a simple text file (`assets/demo.seq`) into a C++ timeline (`timeline.cc`), allowing for rapid iteration on the demo's choreography without recompiling the engine.
### Asset & Build System
- **`asset_packer`**: A tool that packs binary assets (like `.spec` files) into C++ arrays.
- **Runtime Manager**: A highly efficient, array-based lookup system (`GetAsset`) provides O(1) access to all packed assets.
- **Procedural Assets**: The system supports `PROC(function, params...)` syntax in asset lists, enabling runtime generation of assets like noise textures. The `asset_packer` stores the function name, and a runtime dispatcher executes the corresponding C++ function.
- **Automation**: The entire build, asset generation, and packaging process is automated via CMake and helper scripts (`gen_assets.sh`, `build_win.sh`).
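
The `PROC(function, params...)` dispatch described above can be sketched as a small name-to-function table. This is an illustration only: `ProcFn`, `RunProc`, `gen_solid`, and `gen_noise` are hypothetical names, and the real dispatcher and its generator signatures may look different.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical generator signature: fill `out` using numeric parameters.
typedef void (*ProcFn)(uint8_t* out, size_t size, const float* params,
                       int num_params);

// Fills the buffer with a constant value taken from params[0] (0 if absent).
static void gen_solid(uint8_t* out, size_t size, const float* params,
                      int num_params) {
  const uint8_t value = (num_params > 0) ? (uint8_t)params[0] : (uint8_t)0;
  for (size_t i = 0; i < size; ++i) out[i] = value;
}

// Fills the buffer with deterministic LCG noise. params[0] is a non-negative
// seed (1 if absent).
static void gen_noise(uint8_t* out, size_t size, const float* params,
                      int num_params) {
  uint32_t state = (num_params > 0) ? (uint32_t)params[0] : 1u;
  for (size_t i = 0; i < size; ++i) {
    state = state * 1664525u + 1013904223u;
    out[i] = (uint8_t)(state >> 24);
  }
}

struct ProcEntry {
  const char* name;
  ProcFn fn;
};

// The packer would store only the name; this table resolves it at runtime.
static const ProcEntry kProcTable[] = {
    {"gen_solid", gen_solid},
    {"gen_noise", gen_noise},
};

// Looks up `name` and runs the generator. Returns false if it is unknown.
bool RunProc(const char* name, uint8_t* out, size_t size, const float* params,
             int num_params) {
  assert(name != nullptr && out != nullptr);
  const size_t count = sizeof(kProcTable) / sizeof(kProcTable[0]);
  for (size_t i = 0; i < count; ++i) {
    if (strcmp(kProcTable[i].name, name) == 0) {
      kProcTable[i].fn(out, size, params, num_params);
      return true;
    }
  }
  return false;
}
```

A linear scan over names is fine for a handful of generators. A size-sensitive build could instead have the packer emit an integer index, which removes both the strings and the `strcmp` calls.
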
### Audio Engine
- **Real-time Synthesis**: The engine uses an additive synthesizer that generates audio in real-time from spectrograms (`.spec` files) via an Inverse Discrete Cosine Transform (IDCT).
- **Dynamic Updates**: Employs a thread-safe double-buffer mechanism to allow spectrogram data to be modified live for dynamic and evolving soundscapes.
- **Procedural Generation**: Includes a library for generating note spectrograms at runtime, complete with spectral filters for effects like noise and phasing.
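
As a rough illustration of the IDCT step, the sketch below turns one spectrogram frame into time-domain samples with a naive O(N²) inverse DCT and converts a sample to 16-bit PCM. The function names and the scaling convention (unnormalized DCT-III) are assumptions. The actual engine presumably uses its own normalization plus windowing and frame overlap, which are left out here.

```cpp
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

// Naive inverse DCT (unnormalized DCT-III) of one frame:
//   out[n] = spectrum[0] / 2
//          + sum_{k=1}^{N-1} spectrum[k] * cos(pi / N * (n + 0.5) * k)
void idct_frame(const float* spectrum, float* out, size_t n_bins) {
  assert(spectrum != nullptr && out != nullptr && n_bins > 0);
  const float kPi = 3.14159265358979f;
  for (size_t n = 0; n < n_bins; ++n) {
    float sum = 0.5f * spectrum[0];
    for (size_t k = 1; k < n_bins; ++k) {
      sum += spectrum[k] *
             cosf(kPi / (float)n_bins * ((float)n + 0.5f) * (float)k);
    }
    out[n] = sum;
  }
}

// Converts a sample to 16-bit PCM, clamping to [-1, 1] first.
int16_t to_pcm16(float sample) {
  if (sample > 1.0f) sample = 1.0f;
  if (sample < -1.0f) sample = -1.0f;
  return (int16_t)(sample * 32767.0f);
}
```

A frame with energy only in bin 0 produces a constant (DC) signal, while energy in bin `k` produces a cosine with `k` half-periods across the frame, which is what makes each bin behave like one partial of an additive synth.
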
### Platform & Windowing
- **Stateless Refactor**: The entire platform layer (`platform.h`, `platform.cc`) is stateless. All window and input state is encapsulated in a `PlatformState` struct, which is passed to all platform functions.
- **High-DPI Aware**: The platform layer correctly queries framebuffer dimensions in pixels, resolving rendering issues on high-DPI (Retina) displays.
- **Cross-Platform Surface**: Uses `glfw3webgpu` to create a WebGPU-compatible surface on Windows, macOS, and Linux.
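
A minimal sketch of the stateless pattern, with all window state living in a caller-owned struct. The fields and the two helper functions are hypothetical, and `content_scale` stands in for whatever the windowing library reports. In a GLFW-based layer the pixel size would normally be queried directly (for example with `glfwGetFramebufferSize`) rather than computed.

```cpp
#include <assert.h>

// Hypothetical state struct: nothing here is global, so every platform
// function receives the state it operates on.
struct PlatformState {
  int window_width;   // window size in points
  int window_height;
  int fb_width;       // framebuffer size in pixels
  int fb_height;
  bool should_close;
};

// Initializes the state. On a Retina display content_scale would be 2.0,
// so the framebuffer is twice the window size in each dimension.
void platform_state_init(PlatformState* state, int width, int height,
                         float content_scale) {
  assert(state != nullptr && width > 0 && height > 0 && content_scale > 0.0f);
  state->window_width = width;
  state->window_height = height;
  state->fb_width = (int)((float)width * content_scale + 0.5f);
  state->fb_height = (int)((float)height * content_scale + 0.5f);
  state->should_close = false;
}

// Aspect ratio must come from the framebuffer size, not the window size.
float platform_aspect_ratio(const PlatformState* state) {
  assert(state != nullptr && state->fb_height > 0);
  return (float)state->fb_width / (float)state->fb_height;
}
```

Sizing the viewport from `fb_width` and `fb_height` rather than the window size in points is what keeps the image from covering only part of the surface on high-DPI displays.
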
### Procedural Textures
- **TextureManager**: Handles CPU generation -> GPU upload.
- **API**: Supports `WGPUTexelCopyTextureInfo` (new API) with fallback macros for older headers.
### Sequence & Effect System
- **Architecture**: Implemented a hierarchical sequencing system.
- **Effect**: Abstract base for visual elements. Supports `compute` (physics) and `render` (draw) phases. Idempotent `init` for shared asset loading.
- **Sequence**: Manages a timeline of effects with `start_time` and `end_time`. Handles activation/deactivation and sorting by priority.
- **MainSequence**: The top-level coordinator. Holds the WebGPU device/queue/surface. Manages multiple overlapping `Sequence` layers (sorted by priority).
- **Sequence Compiler**: Implemented a C++ tool (`seq_compiler`) that transpiles a textual timeline description (`assets/demo.seq`) into a generated C++ file (`timeline.cc`).
- **Workflow**: Allows rapid experimentation with timing, layering, and effect parameters without touching engine code.
- **Flexibility**: Supports passing arbitrary constructor arguments from the script directly to C++ classes.
- **Implementation**:
  - `src/gpu/effect.h/cc`: Core logic.
  - `src/gpu/demo_effects.h/cc`: Concrete implementations of `HeptagonEffect` and `ParticlesEffect`.
  - `src/gpu/gpu.cc`: Simplified to be a thin wrapper initializing and driving `MainSequence`.
- **Integration**: The main loop now calculates fractional beats and passes them along with `time` and `aspect_ratio` to the rendering system.
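
The relationship between `Effect` and `Sequence` can be sketched as follows. This is a simplified stand-in, not the code in `src/gpu/effect.h`: WebGPU objects are omitted, and the member names, the fixed-capacity array, and `CountingEffect` are assumptions made for illustration.

```cpp
#include <assert.h>
#include <stddef.h>

// Simplified stand-in for the abstract effect base class.
class Effect {
 public:
  virtual ~Effect() {}
  virtual void compute(float time, float beat) = 0;  // e.g. physics
  virtual void render(float time, float beat, float aspect_ratio) = 0;
};

// One timeline entry: which effect runs, when, and in what order.
struct SequenceItem {
  Effect* effect;
  float start_time;
  float end_time;
  int priority;
};

class Sequence {
 public:
  Sequence() : count_(0) {}

  // Insertion sort keeps items ordered by ascending priority.
  void add(Effect* effect, float start_time, float end_time, int priority) {
    assert(effect != nullptr && count_ < kMaxItems && start_time <= end_time);
    size_t i = count_;
    while (i > 0 && items_[i - 1].priority > priority) {
      items_[i] = items_[i - 1];
      --i;
    }
    items_[i].effect = effect;
    items_[i].start_time = start_time;
    items_[i].end_time = end_time;
    items_[i].priority = priority;
    ++count_;
  }

  // Runs both phases of every effect active at `time`, in priority order.
  // Returns the number of effects that ran.
  int update(float time, float beat, float aspect_ratio) {
    int active = 0;
    for (size_t i = 0; i < count_; ++i) {
      const SequenceItem& item = items_[i];
      if (time >= item.start_time && time < item.end_time) {
        item.effect->compute(time, beat);
        item.effect->render(time, beat, aspect_ratio);
        ++active;
      }
    }
    return active;
  }

 private:
  static const size_t kMaxItems = 16;
  SequenceItem items_[kMaxItems];
  size_t count_;
};

// Trivial effect that only counts calls, used to exercise the sketch.
class CountingEffect : public Effect {
 public:
  CountingEffect() : computes(0), renders(0) {}
  void compute(float, float) { ++computes; }
  void render(float, float, float) { ++renders; }
  int computes;
  int renders;
};
```

A fixed-capacity array avoids heap allocation, in keeping with the project's goal of explicit control over allocations. A generated `timeline.cc` would then amount to a list of `add` calls.
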
### Debugging Features
- **Fast Forward / Seek**: Implemented `--seek <seconds>` CLI option.
- **Mechanism**: Simulates logic, audio (silent render), and GPU compute (physics) frame-by-frame from `t=0` to `target_time` before starting real-time playback.
- **Audio Refactor**: Split `audio_init()` (resource allocation) from `audio_start()` (device callback start) to allow manual `audio_render_silent()` during the seek phase.
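
The seek mechanism amounts to running the normal per-frame update in a tight loop until the target time is reached. The sketch below assumes a fixed simulation rate and uses hypothetical names (`DemoState`, `simulate_frame`, `seek_to`). The real loop would call the project's logic, silent audio render, and GPU compute steps instead of bumping counters.

```cpp
#include <assert.h>

static const int kSampleRate = 32000;  // 32 kHz, per the project goals

// Hypothetical bookkeeping for the simulated part of the demo.
struct DemoState {
  int frame;           // simulated frames so far
  long audio_samples;  // samples rendered silently so far
  double time;         // simulation time in seconds
};

// Stand-in for one frame of logic, physics, and silent audio rendering.
static void simulate_frame(DemoState* s, int fps) {
  s->frame += 1;
  s->audio_samples += kSampleRate / fps;  // integer division, remainder ignored
  s->time = (double)s->frame / (double)fps;
}

// Steps the simulation frame by frame until target_time is reached.
// Returns the number of frames simulated by this call.
int seek_to(DemoState* s, double target_time, int fps) {
  assert(s != nullptr && fps > 0 && target_time >= 0.0);
  const int target_frame = (int)(target_time * (double)fps);
  int steps = 0;
  while (s->frame < target_frame) {
    simulate_frame(s, fps);
    ++steps;
  }
  return steps;
}
```

Deriving `time` from the integer frame counter, rather than accumulating a floating-point `dt`, keeps the number of simulated frames exact regardless of the target.
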
### Asset Management System:
- **Architecture**: Implemented a C++ tool (`asset_packer`) that bundles external files into hex-encoded C arrays.
- **Lookup**: Uses a highly efficient array-based lookup (AssetRecord) for O(1) retrieval of raw byte data at runtime.
- **Descriptors**: Assets are defined in `assets/final/demo_assets.txt` (for the demo) and `assets/final/test_assets_list.txt` (for tests).
- **Organization**: Common retrieval logic is decoupled and located in `src/util/asset_manager.cc`, ensuring generated data files only contain raw binary blobs.
- **Lazy Decompression**: Scaffolding implemented for future on-demand decompression support.
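
A sketch of what the array-based lookup might look like. The `AssetRecord` layout, the enum values, the byte arrays, and the `GetAsset` signature shown here are all assumptions. The generated files in the project may differ.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical ids; in practice these would be generated by asset_packer.
enum AssetId { ASSET_KICK_SPEC = 0, ASSET_SNARE_SPEC = 1, ASSET_COUNT = 2 };

struct AssetRecord {
  const uint8_t* data;
  size_t size;
};

// Stand-ins for the hex-encoded arrays the packer would emit.
static const uint8_t kKickData[] = {0x53, 0x50, 0x45, 0x43};
static const uint8_t kSnareData[] = {0x01, 0x02};

static const AssetRecord kAssets[ASSET_COUNT] = {
    {kKickData, sizeof(kKickData)},
    {kSnareData, sizeof(kSnareData)},
};

// O(1) lookup: the asset id is the array index.
// Returns nullptr and a size of 0 for an out-of-range id.
const uint8_t* GetAsset(int id, size_t* out_size) {
  assert(out_size != nullptr);
  if (id < 0 || id >= ASSET_COUNT) {
    *out_size = 0;
    return nullptr;
  }
  *out_size = kAssets[id].size;
  return kAssets[id].data;
}
```

Because ids are dense integers known at build time, no hashing or string comparison is needed at runtime, and the table itself costs only a pointer and a size per asset.
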
### Build System:
- **Production Pipeline**: Automated the entire assembly process via a `final` CMake target (`make final`).
- **Automation**: This target builds the tools, runs the `gen_assets.sh` script to re-analyze audio and regenerate sources, and then performs final binary stripping and crunching (using `strip` and `gzexe` on macOS).
- **Windows Cross-Compilation**: Implemented a full pipeline from macOS to Windows x86_64 using MinGW.
  - `scripts/fetch_win_deps.sh`: Downloads pre-compiled GLFW and `wgpu-native` binaries.
  - `scripts/build_win.sh`: Cross-compiles the demo, bundles MinGW DLLs, and crunches the binary.
  - `scripts/run_win.sh`: Executes the resulting `.exe` using Wine.
  - `scripts/crunch_win.sh`: Strips and packs the Windows binary using UPX (LZMA).
  - `scripts/analyze_win_bloat.sh`: Reports section sizes and top symbols for size optimization.
### Audio Engine (Synth):
- **Architecture**: Real-time additive synthesis from spectrograms using Inverse Discrete Cosine Transform (IDCT).
- **Dynamic Updates**: Implemented a double-buffering (flip-flop) mechanism for thread-safe, real-time updates of spectrogram data.
- **Peak Detection**: Real-time output peak detection with exponential decay for smooth visual synchronization.
- **Procedural Melody**: Implemented a shared library (`src/audio/gen.cc`) for generating note spectrograms at runtime. Supports melody pasting with overlapping frames and spectral post-processing.
- **Spectral Filters**: Implemented runtime spectral effects including noise (grit), lowpass filtering, and comb filtering (phaser/flanger effects).
- **Timing System**: Implemented a beat-based timing system (BPM) used for both audio generation and visual synchronization.
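
Peak detection with exponential decay and the beat-based clock are both small enough to sketch in a few lines. The names `PeakFollower`, `peak_update`, and `beats_from_seconds` are hypothetical, and samples are assumed to be floats in [-1, 1] before conversion to 16-bit output.

```cpp
#include <assert.h>
#include <math.h>
#include <stddef.h>

// Tracks the output level. `decay` is a per-sample factor in (0, 1):
// values close to 1 give a slow, smooth release.
struct PeakFollower {
  float peak;
  float decay;
};

// Feeds a block of samples through the follower and returns the new peak.
// The peak decays on every sample and jumps up when a louder one arrives.
float peak_update(PeakFollower* p, const float* samples, size_t count) {
  assert(p != nullptr && samples != nullptr);
  for (size_t i = 0; i < count; ++i) {
    const float level = fabsf(samples[i]);
    p->peak *= p->decay;
    if (level > p->peak) p->peak = level;
  }
  return p->peak;
}

// Converts elapsed seconds to (fractional) beats at the given tempo.
float beats_from_seconds(float seconds, float bpm) {
  assert(bpm > 0.0f);
  return seconds * bpm / 60.0f;
}
```

The instant attack and gradual release mean a visual parameter driven by `peak` reacts immediately to a hit and then relaxes smoothly instead of flickering with the waveform.
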
### WebGPU Integration:
- **Resource Management**: Introduced `GpuBuffer`, `RenderPass`, and `ComputePass` abstractions to group pipeline and bind group resources.
- **Compute Shaders**: Implemented a high-performance particle system (10,000 particles) using compute shaders for physics and audio-reactive updates.
- **Visuals**: Pulsating heptagon and a compute-driven particle field synchronized with audio peaks and global time.
### WebGPU Integration Fixes:
Several critical issues were resolved to ensure stable WebGPU operation across platforms:
- **Surface Creation (macOS)**: Fixed a `g_surface` assertion failure by adding platform-specific compile definitions (`-DGLFW_EXPOSE_NATIVE_COCOA`) to `CMakeLists.txt`. This allows `glfw3webgpu` to access the native window handles required for surface creation.
- **Texture Usage**: Resolved a validation error (`RENDER_ATTACHMENT` usage missing) by explicitly setting `g_config.usage = WGPUTextureUsage_RenderAttachment` in the surface configuration.
- **Render Pass Validation**: Fixed a "Depth slice provided but view is not 3D" error by ensuring `WGPURenderPassColorAttachment` is correctly initialized, specifically setting `resolveTarget = nullptr` and `depthSlice = WGPU_DEPTH_SLICE_UNDEFINED`.
### Optimizations:
- **Audio Decoding**: Disabled FLAC, WAV, MP3, and all encoding features in `miniaudio` for the runtime demo build (via `MA_NO_FLAC`, `MA_NO_WAV`, etc.). This reduced the packed Windows binary size by ~100KB (461KB -> 356KB). `spectool` retains full decoding capabilities.
- **Build Stripping**: Implemented `DEMO_STRIP_ALL` CMake option to remove command-line parsing, debug info, and non-essential error handling strings.
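
The kind of stripping described above can be sketched with the preprocessor. `EffectInfo`, `DEMO_LOG`, and `effect_init` are hypothetical names used only for illustration. The snippet compiles both with and without `-DSTRIP_ALL`.

```cpp
#include <assert.h>
#include <stdio.h>

// Debug-only struct fields disappear entirely in stripped builds.
struct EffectInfo {
  int id;
#if !defined(STRIP_ALL)
  const char* debug_name;
#endif
};

// Error strings are the main cost of logging, so the stripped variant
// expands to nothing and the string literals never reach the binary.
#if !defined(STRIP_ALL)
#define DEMO_LOG(msg) fprintf(stderr, "%s\n", (msg))
#else
#define DEMO_LOG(msg) ((void)0)
#endif

// Returns 1 on success and 0 on failure. Only the message is stripped,
// the check itself stays in both variants.
int effect_init(const EffectInfo* info) {
  assert(info != nullptr);
  if (info->id < 0) {
    DEMO_LOG("effect_init: invalid id");
    return 0;
  }
  return 1;
}
```

Keeping the control flow identical in both variants, and stripping only data and diagnostics, makes it less likely that the stripped build behaves differently from the one that was tested.
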
### Future Optimizations (Phase 2)
- **Task #4a: Linux Cross-Compilation**: Implement Linux x86_64 cross-compilation from macOS.
- **Windows Platform Layer**: Replace the static GLFW library with a minimal, native Windows API implementation (`CreateWindow`, `PeekMessage`) to significantly reduce binary size.
- **Asset Compression**: Implement logarithmic frequency storage and quantization for `.spec` files.
- **CRT Replacement**: Investigate minimal C runtime alternatives.
### WebGPU Portability Layer:
To maintain a single codebase while supporting different `wgpu-native` versions (Native macOS/Linux headers vs. Windows/MinGW v0.19.4.1), a portability layer was implemented in `src/gpu/gpu.cc`:
- **Header Mapping**: Conditional inclusion of `<webgpu/webgpu.h>` for Windows vs. `<webgpu.h>` for native builds.
- **Type Shims**: Implementation of `WGPUStringView` as a simple `const char*` for older APIs.
- **Callback Signatures**: Handles the difference between 4-argument (Windows/Old) and 5-argument (macOS/New) callback signatures for `wgpuInstanceRequestAdapter` and `wgpuAdapterRequestDevice`, including the use of callback info structs on newer APIs.
- **API Lifecycle**:
  - **Wait Mechanism**: Abstraction of `wgpuInstanceWaitAny` (new) vs. `wgpuInstanceProcessEvents` (old).
- **Struct Differences**:
  - **Color Attachments**: Conditional removal of the `depthSlice` member in `WGPURenderPassColorAttachment`, which is not present in v0.19.
- **Error Handling**: Abstracted `wgpuDeviceSetUncapturedErrorCallback` usage.
- **Surface Creation**: Custom logic in `glfw3webgpu.c` to handle `WGPUSurfaceSourceWindowsHWND` (new) vs. `WGPUSurfaceDescriptorFromWindowsHWND` (old).
### Coding Style:
- **Standard**: Strictly enforced project-specific rules in `CONTRIBUTING.md`.
- **Cleanup**: Automated removal of trailing whitespace and addition of missing newlines at EOF across all source files.
- **Constraints**: No `auto`, no C++ style casts (`static_cast`, etc.), mandatory 3-line headers.
### Design Decision: Spectrogram Data Representation
* **Current State:** Spectrogram frequency bins are stored linearly in `.spec` files and processed as such by the core audio engine (using standard DCT/IDCT). The JavaScript spectrogram editor maps this linear data to a logarithmic scale for visualization and interaction.
* **Future Optimization (TODO):** The `.spec` file format will be revisited to:
  * Store frequencies logarithmically.
  * Use `uint16_t` instead of `float` for spectral values.
* **Impact:** This aims to achieve better compression while retaining fine frequency resolution relevant to human perception. It will primarily affect the code responsible for saving to and reading from `.spec` files, requiring conversions between the new format and the linear float format used internally by the audio engine.
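
To make the proposed conversions concrete, here is a sketch of both pieces: mapping a logarithmic bin index to a frequency, and quantizing a float spectral value to `uint16_t` and back. The function names, the symmetric `[-max_abs, max_abs]` range, and the 20 Hz lower bound used in the test are assumptions. The eventual `.spec` format may choose different ranges or a non-linear quantizer.

```cpp
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

// Center frequency of logarithmically spaced bin `bin` out of `n_bins`,
// spanning f_min to f_max inclusive.
float log_bin_frequency(size_t bin, size_t n_bins, float f_min, float f_max) {
  assert(n_bins > 1 && bin < n_bins && f_min > 0.0f && f_max > f_min);
  const float t = (float)bin / (float)(n_bins - 1);
  return f_min * powf(f_max / f_min, t);
}

// Maps a value in [-max_abs, max_abs] to [0, 65535], clamping out-of-range
// input and rounding to the nearest step.
uint16_t quantize_u16(float value, float max_abs) {
  assert(max_abs > 0.0f);
  float t = (value / max_abs + 1.0f) * 0.5f;
  if (t < 0.0f) t = 0.0f;
  if (t > 1.0f) t = 1.0f;
  return (uint16_t)(t * 65535.0f + 0.5f);
}

// Inverse of quantize_u16, up to one quantization step of error.
float dequantize_u16(uint16_t q, float max_abs) {
  assert(max_abs > 0.0f);
  return ((float)q / 65535.0f * 2.0f - 1.0f) * max_abs;
}
```

With this linear quantizer the round-trip error is at most `max_abs / 65535` per value, and storage per spectral value drops from 4 bytes to 2 before any further compression.
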
### Development Workflow:
- **Testing**: Comprehensive test suite including `AssetManagerTest`, `SynthEngineTest`, `HammingWindowTest`, and `SpectoolEndToEndTest`. All tests are verified before committing.
### Platform & Windowing
- **High-DPI Fix**: Resolved a long-standing issue on high-DPI (Retina) displays where the viewport would be 'squished' in a corner. The platform layer now correctly queries framebuffer dimensions in pixels instead of window dimensions in points.
- **Stateless Refactor**: The entire platform layer (`platform.h`, `platform.cc`) was refactored to be stateless. All global variables were encapsulated into a `PlatformState` struct, which is now passed to all platform functions (`platform_init`, `platform_poll`, etc.). This improves modularity and removes scattered global state.
- **Custom Resolution**: Added a `--resolution WxH` command-line option to allow specifying a custom window size at startup (e.g., `./build/demo64k --resolution 1024x768`). This feature is disabled in `STRIP_ALL` builds to save space.
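
Parsing the `WxH` argument can be done with a single `sscanf` call. The sketch below is one possible approach, with `parse_resolution` as a hypothetical helper name. The project's actual parser is not shown in this document.

```cpp
#include <assert.h>
#include <stdio.h>

// Parses "WIDTHxHEIGHT" (e.g. "1024x768") into two positive integers.
// Returns false and leaves the outputs untouched on malformed input.
bool parse_resolution(const char* arg, int* width, int* height) {
  assert(arg != nullptr && width != nullptr && height != nullptr);
  int w = 0;
  int h = 0;
  char trailing = 0;
  // The extra %c only matches if junk follows the height, in which case
  // sscanf reports 3 conversions instead of 2.
  const int matched = sscanf(arg, "%dx%d%c", &w, &h, &trailing);
  if (matched != 2 || w <= 0 || h <= 0) {
    return false;
  }
  *width = w;
  *height = h;
  return true;
}
```

In a `STRIP_ALL` build the whole function and its call site would be compiled out together with the rest of the command-line handling.
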