| -rw-r--r-- | PROJECT_CONTEXT.md | 178 |
| -rw-r--r-- | TODO.md | 91 |
2 files changed, 64 insertions, 205 deletions
diff --git a/PROJECT_CONTEXT.md b/PROJECT_CONTEXT.md
index 59c7ee7..5dab2b3 100644
--- a/PROJECT_CONTEXT.md
+++ b/PROJECT_CONTEXT.md
@@ -7,12 +7,12 @@ Goal:
 Graphics:
 - WebGPU via wgpu-native
 - WGSL shaders
-- Single fullscreen pass initially
+- Hybrid rendering: Rasterized proxy geometry + SDF raymarching
 
 Audio:
 - 32 kHz, 16-bit mono
 - Procedurally generated samples
-- No decoding, no assets
+- Real-time additive synthesis from spectrograms (IDCT)
 
 Constraints:
 - Size-sensitive
@@ -29,161 +29,57 @@ Style:
 ### Next Up
 - **Task #8: Implement Final Build Stripping**
   - [ ] Define and document a consistent set of rules for code stripping under the `STRIP_ALL` macro.
-  - [ ] Example sub-tasks: remove unused functions, strip debug fields from structs, simplify code paths where possible.
-  - [ ] Verify that there's no useless printf() or std::cout in the final code (in particular during error trapping, which should be reduced to minimal code)
+  - [ ] Remove unused functions, strip debug fields from structs, simplify code paths.
+  - [ ] Verify no useless printf() or std::cout in final code.
+
+- **Task #20: Code & Platform Hygiene**
+  - [ ] Gather all cross-compile and platform-specific conditional code into `platform.h`.
+  - [ ] Refactor `platform_init()` to return `PlatformState` directly.
+  - [ ] Consolidate WebGPU header inclusions.
+
+- **Task #21: Shader Optimization**
+  - [ ] Use macros or code generation to factorize common WGSL code (normals, bump, lighting).
+  - [ ] Implement tight ray-marching bounds (min/max t) derived from proxy hull hits.
 
 ### Future Goals
 - **Task #5: Implement Spectrogram Editor**
   - [ ] Develop a web-based tool (`tools/editor`) for creating and editing `.spec` files visually.
-  - [ ] The tool should support generating `.spec` files from elementary shapes (lines, curves) for extreme compression.
 - **Task #18: 3D System Enhancements**
-- **Visual Debug Mode**: Implement a debug overlay (removable with `STRIP_ALL`) that includes:
-  - Wireframe bounding volumes (boxes, spheres, etc.)
-  - Object and camera trajectories
-  - Collision points visualization
-  - Interactive ray/object intersection visualization
-  - Light sources (direction, cone) and their shadow maps (3D and on-screen 2D).
-  - [ ] **Blender Exporter**: Create a tool to convert simple Blender scenes into the demo's internal asset format.
-  - [ ] **GPU BVH & Shadows**: Implement a GPU-based Bounding Volume Hierarchy (BVH) to optimize scene queries (shadows, AO) from the shader, replacing the current O(N) loop.
-  - [ ] **Texture and binding groups**: currently we can only bind one texture. Support arbitrary number of textures / binding in a sane way. Avoid immediate 'hacks' and 'fixes'
+  - [ ] **Blender Exporter**: Convert Blender scenes to internal asset format.
+  - [ ] **GPU BVH & Shadows**: Optimize scene queries with a GPU-based BVH.
 - **Phase 2: Advanced Size Optimization**
-  - [x] PC+Windows (.exe binary) via MinGW
-  - [ ] Task #4a: Linux Cross-Compilation
-  - [ ] Replace GLFW with a minimal native Windows API layer.
-  - [ ] Investigate and implement advanced asset compression techniques.
-  - [ ] Explore replacing the standard C/C++ runtime with a more lightweight alternative.
+  - [ ] Replace GLFW with minimal native Windows API.
+  - [ ] Quantize spectrograms to logarithmic frequency and uint16_t.
+  - [ ] CRT replacement investigation.
 
 ### Recently Completed
-- **Task #4b: Create `check_all.sh` script** to build and test all platform targets.
-- **Task #10: Optimized `spectool`** to trim leading/trailing silent frames from `.spec` files.
+- **High-DPI Fix**: Resolved viewport "squishing" via dynamic resolution uniforms and explicit viewports.
+- **Unified 3D Shadows**: Implemented robust SDF shadows across all objects using `inv_model` transforms.
+- **Tight Proxy Hulls**: Optimized Torus proxy geometry and debug wireframes.
+- **Procedural Textures**: Restored floor grid and SDF bump mapping.
 
 ---
 *For a detailed list of all completed tasks, see the git history.*
 
 ## Architectural Overview
 
+### Hybrid 3D Renderer
+- **Core Idea**: Uses standard rasterization to draw proxy hulls (boxes), then raymarches inside the fragment shader to find the exact SDF surface.
+- **Transforms**: Uses `inv_model` matrices to perform all raymarching in local object space, handling rotation and non-uniform scaling correctly.
+- **Shadows**: Instance-based shadow casting with self-shadowing prevention (`skip_idx`).
+
 ### Sequence & Effect System
-- **Core Idea**: The demo is orchestrated by a powerful sequence and effect system.
-- **`Effect`**: An abstract base class for any visual element (e.g., 3D scene, post-processing). Effects are managed within a timeline.
-- **`Sequence`**: A timeline that manages the start and end times of multiple `Effect`s.
-- **`MainSequence`**: The top-level manager that holds core WebGPU resources and renders the active `Sequence`s for each frame.
-- **`seq_compiler`**: A custom tool that transpiles a simple text file (`assets/demo.seq`) into a C++ timeline (`timeline.cc`), allowing for rapid iteration on the demo's choreography without recompiling the engine.
+- **Effect**: Abstract base for visual elements. Supports `compute` and `render` phases.
+- **Sequence**: Timeline of effects with start/end times.
+- **MainSequence**: Top-level coordinator and framebuffer manager.
+- **seq_compiler**: Transpiles `assets/demo.seq` into C++ `timeline.cc`.
 
 ### Asset & Build System
-- **`asset_packer`**: A tool that packs binary assets (like `.spec` files) into C++ arrays.
-- **Runtime Manager**: A highly efficient, array-based lookup system (`GetAsset`) provides O(1) access to all packed assets.
-- **Procedural Assets**: The system supports `PROC(function, params...)` syntax in asset lists, enabling runtime generation of assets like noise textures. The `asset_packer` stores the function name, and a runtime dispatcher executes the corresponding C++ function.
-- **Automation**: The entire build, asset generation, and packaging process is automated via CMake and helper scripts (`gen_assets.sh`, `build_win.sh`).
+- **asset_packer**: Embeds binary assets (like `.spec` files) into C++ arrays.
+- **Runtime Manager**: O(1) retrieval with lazy procedural generation support.
+- **Automation**: `gen_assets.sh`, `build_win.sh`, and `check_all.sh` for multi-platform validation.
 
 ### Audio Engine
-- **Real-time Synthesis**: The engine uses an additive synthesizer that generates audio in real-time from spectrograms (`.spec` files) via an Inverse Discrete Cosine Transform (IDCT).
-- **Dynamic Updates**: Employs a thread-safe double-buffer mechanism to allow spectrogram data to be modified live for dynamic and evolving soundscapes.
-- **Procedural Generation**: Includes a library for generating note spectrograms at runtime, complete with spectral filters for effects like noise and phasing.
-
-### Platform & Windowing
-- **Stateless Refactor**: The entire platform layer (`platform.h`, `platform.cc`) is stateless. All window and input state is encapsulated in a `PlatformState` struct, which is passed to all platform functions.
-- **High-DPI Aware**: The platform layer correctly queries framebuffer dimensions in pixels, resolving rendering issues on high-DPI (Retina) displays.
-- **Cross-Platform Surface**: Uses `glfw3webgpu` to create a WebGPU-compatible surface on Windows, macOS, and Linux.
-
-
-### Procedural Textures
-- **TextureManager**: Handles CPU generation -> GPU upload.
-- **API**: Supports `WGPUTexelCopyTextureInfo` (new API) with fallback macros for older headers.
-
-### Sequence & Effect System
-- **Architecture**: Implemented a hierarchical sequencing system.
-  - **Effect**: Abstract base for visual elements. Supports `compute` (physics) and `render` (draw) phases. Idempotent `init` for shared asset loading.
-  - **Sequence**: Manages a timeline of effects with `start_time` and `end_time`. Handles activation/deactivation and sorting by priority.
-  - **MainSequence**: The top-level coordinator. Holds the WebGPU device/queue/surface. Manages multiple overlapping `Sequence` layers (sorted by priority).
-- **Sequence Compiler**: Implemented a C++ tool (`seq_compiler`) that transpiles a textual timeline description (`assets/demo.seq`) into a generated C++ file (`timeline.cc`).
-  - **Workflow**: Allows rapid experimentation with timing, layering, and effect parameters without touching engine code.
-  - **Flexibility**: Supports passing arbitrary constructor arguments from the script directly to C++ classes.
-- **Implementation**:
-  - `src/gpu/effect.h/cc`: Core logic.
-  - `src/gpu/demo_effects.h/cc`: Concrete implementations of `HeptagonEffect` and `ParticlesEffect`.
-  - `src/gpu/gpu.cc`: Simplified to be a thin wrapper initializing and driving `MainSequence`.
-- **Integration**: The main loop now calculates fractional beats and passes them along with `time` and `aspect_ratio` to the rendering system.
-
-### Debugging Features
-- **Fast Forward / Seek**: Implemented `--seek <seconds>` CLI option.
-  - **Mechanism**: Simulates logic, audio (silent render), and GPU compute (physics) frame-by-frame from `t=0` to `target_time` before starting real-time playback.
-  - **Audio Refactor**: Split `audio_init()` (resource allocation) from `audio_start()` (device callback start) to allow manual `audio_render_silent()` during the seek phase.
-
-### Asset Management System:
-- **Architecture**: Implemented a C++ tool (`asset_packer`) that bundles external files into hex-encoded C arrays.
-- **Lookup**: Uses a highly efficient array-based lookup (AssetRecord) for O(1) retrieval of raw byte data at runtime.
-- **Descriptors**: Assets are defined in `assets/final/demo_assets.txt` (for the demo) and `assets/final/test_assets_list.txt` (for tests).
-- **Organization**: Common retrieval logic is decoupled and located in `src/util/asset_manager.cc`, ensuring generated data files only contain raw binary blobs.
-- **Lazy Decompression**: Scaffolding implemented for future on-demand decompression support.
-
-### Build System:
-- **Production Pipeline**: Automated the entire assembly process via a `final` CMake target (`make final`).
-- **Automation**: This target builds the tools, runs the `gen_assets.sh` script to re-analyze audio and regenerate sources, and then performs final binary stripping and crunching (using `strip` and `gzexe` on macOS).
-- **Windows Cross-Compilation**: Implemented a full pipeline from macOS to Windows x86_64 using MinGW.
-  - `scripts/fetch_win_deps.sh`: Downloads pre-compiled GLFW and `wgpu-native` binaries.
-  - `scripts/build_win.sh`: Cross-compiles the demo, bundles MinGW DLLs, and crunches the binary.
-  - `scripts/run_win.sh`: Executes the resulting `.exe` using Wine.
-  - `scripts/crunch_win.sh`: Strips and packs the Windows binary using UPX (LZMA).
-  - `scripts/analyze_win_bloat.sh`: Reports section sizes and top symbols for size optimization.
-
-### Audio Engine (Synth):
-- **Architecture**: Real-time additive synthesis from spectrograms using Inverse Discrete Cosine Transform (IDCT).
-- **Dynamic Updates**: Implemented a double-buffering (flip-flop) mechanism for thread-safe, real-time updates of spectrogram data.
-- **Peak Detection**: Real-time output peak detection with exponential decay for smooth visual synchronization.
-- **Procedural Melody**: Implemented a shared library (`src/audio/gen.cc`) for generating note spectrograms at runtime. Supports melody pasting with overlapping frames and spectral post-processing.
-- **Spectral Filters**: Implemented runtime spectral effects including noise (grit), lowpass filtering, and comb filtering (phaser/flanger effects).
-- **Timing System**: Implemented a beat-based timing system (BPM) used for both audio generation and visual synchronization.
-
-### WebGPU Integration:
-- **Resource Management**: Introduced `GpuBuffer`, `RenderPass`, and `ComputePass` abstractions to group pipeline and bind group resources.
-- **Compute Shaders**: Implemented a high-performance particle system (10,000 particles) using compute shaders for physics and audio-reactive updates.
-- **Visuals**: Pulsating heptagon and a compute-driven particle field synchronized with audio peaks and global time.
-
-### WebGPU Integration Fixes:
-Several critical issues were resolved to ensure stable WebGPU operation across platforms:
-- **Surface Creation (macOS)**: Fixed a `g_surface` assertion failure by adding platform-specific compile definitions (`-DGLFW_EXPOSE_NATIVE_COCOA`) to `CMakeLists.txt`. This allows `glfw3webgpu` to access the native window handles required for surface creation.
-- **Texture Usage**: Resolved a validation error (`RENDER_ATTACHMENT` usage missing) by explicitly setting `g_config.usage = WGPUTextureUsage_RenderAttachment` in the surface configuration.
-- **Render Pass Validation**: Fixed a "Depth slice provided but view is not 3D" error by ensuring `WGPURenderPassColorAttachment` is correctly initialized, specifically setting `resolveTarget = nullptr` and `depthSlice = WGPU_DEPTH_SLICE_UNDEFINED`.
-
-### Optimizations:
-- **Audio Decoding**: Disabled FLAC, WAV, MP3, and all encoding features in `miniaudio` for the runtime demo build (via `MA_NO_FLAC`, `MA_NO_WAV`, etc.). This reduced the packed Windows binary size by ~100KB (461KB -> 356KB). `spectool` retains full decoding capabilities.
-- **Build Stripping**: Implemented `DEMO_STRIP_ALL` CMake option to remove command-line parsing, debug info, and non-essential error handling strings.
-
-### Future Optimizations (Phase 2)
-- **Task #4a: Linux Cross-Compilation**: Implement Linux x86_64 cross-compilation from macOS.
-- **Windows Platform Layer**: Replace the static GLFW library with a minimal, native Windows API implementation (`CreateWindow`, `PeekMessage`) to significantly reduce binary size.
-- **Asset Compression**: Implement logarithmic frequency storage and quantization for `.spec` files.
-- **CRT Replacement**: investigate minimal C runtime alternatives.
-
-### WebGPU Portability Layer:
-To maintain a single codebase while supporting different `wgpu-native` versions (Native macOS/Linux headers vs. Windows/MinGW v0.19.4.1), a portability layer was implemented in `src/gpu/gpu.cc`:
-- **Header Mapping**: Conditional inclusion of `<webgpu/webgpu.h>` for Windows vs. `<webgpu.h>` for native builds.
-- **Type Shims**: Implementation of `WGPUStringView` as a simple `const char*` for older APIs.
-- **Callback Signatures**: Handles the difference between 4-argument (Windows/Old) and 5-argument (macOS/New) callback signatures for `wgpuInstanceRequestAdapter` and `wgpuAdapterRequestDevice`, including the use of callback info structs on newer APIs.
-- **API Lifecycle**:
-  - **Wait Mechanism**: Abstraction of `wgpuInstanceWaitAny` (new) vs. `wgpuInstanceProcessEvents` (old).
-- **Struct Differences**:
-  - **Color Attachments**: Conditional removal of the `depthSlice` member in `WGPURenderPassColorAttachment`, which is not present in v0.19.
-  - **Error Handling**: Abstracted `wgpuDeviceSetUncapturedErrorCallback` usage.
-- **Surface Creation**: Custom logic in `glfw3webgpu.c` to handle `WGPUSurfaceSourceWindowsHWND` (new) vs. `WGPUSurfaceDescriptorFromWindowsHWND` (old).
-
-### Coding Style:
-- **Standard**: Strictly enforced project-specific rules in `CONTRIBUTING.md`.
-- **Cleanup**: Automate removal of trailing whitespaces and addition of missing newlines at EOF across all source files.
-- **Constraints**: No `auto`, no C++ style casts (`static_cast`, etc.), mandatory 3-line headers.
-
-### Design Decision: Spectrogram Data Representation
-
-* **Current State:** Spectrogram frequency bins are stored linearly in `.spec` files and processed as such by the core audio engine (using standard DCT/IDCT). The JavaScript spectrogram editor maps this linear data to a logarithmic scale for visualization and interaction.
-* **Future Optimization (TODO):** The `.spec` file format will be revisited to:
-    * Store frequencies logarithmically.
-    * Use `uint16_t` instead of `float` for spectral values.
-    * **Impact:** This aims to achieve better compression while retaining fine frequency resolution relevant to human perception. It will primarily affect the code responsible for saving to and reading from `.spec` files, requiring conversions between the new format and the linear float format used internally by the audio engine.
-
-### Development Workflow:
-- **Testing**: Comprehensive test suite including `AssetManagerTest`, `SynthEngineTest`, `HammingWindowTest`, and `SpectoolEndToEndTest`. All tests are verified before committing.
-
-### Platform & Windowing
-- **High-DPI Fix**: Resolved a long-standing issue on high-DPI (Retina) displays where the viewport would be 'squished' in a corner. The platform layer now correctly queries framebuffer dimensions in pixels instead of window dimensions in points.
-- **Stateless Refactor**: The entire platform layer (`platform.h`, `platform.cc`) was refactored to be stateless. All global variables were encapsulated into a `PlatformState` struct, which is now passed to all platform functions (`platform_init`, `platform_poll`, etc.). This improves modularity and removes scattered global state.
-- **Custom Resolution**: Added a `--resolution WxH` command-line option to allow specifying a custom window size at startup (e.g., `./build/demo64k --resolution 1024x768`). This feature is disabled in `STRIP_ALL` builds to save space.
\ No newline at end of file
+- **Synthesis**: Real-time additive synthesis from spectrograms via IDCT.
+- **Dynamic Updates**: Double-buffered spectrograms for live thread-safe updates.
+- **Procedural Library**: Melodies and spectral filters (noise, comb) generated at runtime.
diff --git a/TODO.md b/TODO.md
--- a/TODO.md
+++ b/TODO.md
@@ -1,72 +1,35 @@
 # To-Do List
 
-This file tracks the next set of immediate, actionable tasks for the project.
+This file tracks prioritized tasks with detailed attack plans.
 
-## Next Up
+## Priority 1: Final Build Stripping (Task #8)
+**Goal:** Reduce binary size by removing all non-essential code under the `STRIP_ALL` macro.
+- [ ] **Attack Plan - Rules:** Document stripping rules in `doc/STRIPPING.md`.
+- [ ] **Attack Plan - Error Strings:** Wrap all non-critical `printf`, `std::cerr`, and error strings in `STRIP_ALL`.
+- [ ] **Attack Plan - CLI Parsing:** Disable all non-essential CLI arguments (seek, resolution, debug) in `STRIP_ALL`.
+- [ ] **Attack Plan - Struct Hygiene:** Remove debug-only fields (like `label` or `name`) from core structs.
 
-- **Task #8: Implement Final Build Stripping**
-  - [ ] Define and document a consistent set of rules for code stripping under the `STRIP_ALL` macro.
-  - [ ] Example sub-tasks: remove unused functions, strip debug fields from structs, simplify code paths where possible.
-
-- **Task #19: Update README.md with Quick Start**
-  - [ ] Add a top-level "Quick Start" section to `README.md` with brief build and run instructions for `demo64k`.
-
-- ** Task #?: scripts/build_win.sh is always copying MinGW DLLs file ("Copy MinGW DLLs"). Add a check on date or file size to prevent useless systematic copy.
-
-- ** Task #?: Code hygiene
-  - [ ] make a pass on the code and make sure all useless code is protected by STRIP_ALL #ifdef's
-  - [ ] analyze the #include and check if some standard header inclusion could be removed (<algorithm>, etc.)
-  - [ ] see if all usage of std structs, container, etc. are appropriate:
-    == is this std::map<> needed?
-    == could we remove this std::set<>?
-    == Can this std::vector<T> be replaced by a simple C-like "const T*" array?
-    == are these std::string needed or can they be replaced by some 'const char*' ?
-    == do we need these std::cout, std::cerr, etc. instead of printf()'s?
-    == etc.
-  - [ ] the inclusion of gpu.h (either "gpu.h" or <webgpu/gpu.h>) seems to be a recurring compilation and portability issue. Can we have a single inclusion of gpu.h in some platform header instead of scattered inclusion in .cc files? This would reduce the single-point-of-compilation failures during compilation and portability checks.
-
-- ** Task #?: platform-specific code hygiene
-  There's several platform-specific code scattered over several source files (.cc and .h)
-  For instance:
-```cpp
-#if defined(DEMO_CROSS_COMPILE_WIN32)
-#include <webgpu/webgpu.h>
-#else
-#include <webgpu.h>
-#endif /* defined(DEMO_CROSS_COMPILE_WIN32) */
-```
-  This sort of code should be gathered at once single place (platform.h?) once for all.
-  - [ ] => Make all cross-compile and platform-specific conditional code sit in one single header file.
-
-- ** Task #?: platform_init() should return a PlatformState directly instead of taking a PlatformState& parameter to write into
-  - [ ] maybe incorporate the platform time (platform_get_time()) into the PlatformState directly during platform_poll() call?
-  - [ ] same with aspect_ratio (platform_get_aspect_ratio()) unless it's not an invariant and the function needs to be called each time
-
-- ** Task #?: shader code factorization with macros
-  The shader code is rapidly growing big and hairy. We can probably use macros to factorize most common code (normal calc? bump mapping? sampling?) to reduce the code boilerplate.
-
-- ** Task #?: the SDF calculation should be passed min-distance and max-distance parameters derived from the bounding box hit. This is to narrow the ray-marching and reduce the number of iterations if possible.
+## Priority 2: Platform & Code Hygiene (Task #20)
+**Goal:** Clean up the codebase for easier cross-platform maintenance and CRT replacement.
+- [ ] **Attack Plan - Header Consolidation:** Move all `#ifdef` logic for WebGPU headers and platform-specific shims into `src/platform.h`.
+- [ ] **Attack Plan - Refactor platform_init:** Change `void platform_init(PlatformState* state, ...)` to `PlatformState platform_init(...)`.
+- [ ] **Attack Plan - Unified Poll:** Incorporate `platform_get_time()` and `platform_get_aspect_ratio()` updates into `platform_poll()`.
+- [ ] **Attack Plan - Standard Container Removal:** Replace `std::map`, `std::string`, and `std::vector` in performance-critical or size-sensitive paths with simpler C-style alternatives.
 
+## Priority 3: Shader Optimization (Task #21)
+**Goal:** Improve GPU performance and reduce shader source bloat.
+- [ ] **Attack Plan - Normal Factorization:** Create a standard WGSL helper for normal calculation to avoid duplicate code in every effect.
+- [ ] **Attack Plan - Ray Bounds:** Derive `t_min` and `t_max` for ray-marching from the proxy hull entry/exit points to minimize iterations.
+- [ ] **Attack Plan - SDF Macros:** Use `#define` macros in WGSL to simplify sampling and bump mapping logic.
 
 ## Future Goals
+- [ ] **Task #5: Spectrogram Editor**: Web-based visual tool for extreme audio compression.
+- [ ] **Task #18: 3D System Enhancements**: Blender exporter and GPU-based BVH for complex scenes.
+- [ ] **Task #22: Windows Native Platform**: Replace GLFW with direct Win32 API calls for the final 64k push.
 
-- **Task #5: Implement Spectrogram Editor**
-  - [ ] Develop a web-based tool (`tools/editor`) for creating and editing `.spec` files visually.
-  - [ ] The tool should support generating `.spec` files from elementary shapes (lines, curves) for extreme compression.
-- **Phase 2: Advanced Size Optimization**
-  - [ ] Replace GLFW with a minimal native Windows API layer.
-  - [ ] Investigate and implement advanced asset compression techniques (e.g., logarithmic frequency, quantization).
-  - [ ] Explore replacing the standard C/C++ runtime with a more lightweight alternative.
-
-## Past Tasks
-
-- Centralize generated files into `src/generated`.
-- Vertically compact C++ source code.
-- Create top-level `README.md`.
-- Move non-essential documentation to `doc/`.
-- **Bug Fixes:**
-  - Resolved high-DPI "squished" rendering by implementing dynamic resolution uniforms and explicit viewport settings.
-  - Fixed missing 3D shadows by unifying all objects (including the floor) under the SDF raymarching path, using 'inv_model' for accurate world-to-local transformations, and implementing robust instance-based self-shadowing prevention.
-- **Code Hygiene:** Completed a project-wide code formatting pass with `clang-format`.
-- **Task #4b:** Create `scripts/check_all.sh` to build and test all platform targets (macOS, Windows, Linux) to ensure stability before commits.
-- **Task #10:** Modify `spectool` to trim leading and trailing silent frames from `.spec` files to reduce asset size.
\ No newline at end of file
+## Recently Completed
+- [x] **High-DPI Fix**: Resolved viewport "squishing" via dynamic resolution uniforms and explicit viewports.
+- [x] **Unified 3D Shadows**: Implemented robust SDF shadows across all objects using `inv_model` transforms.
+- [x] **Tight Proxy Hulls**: Optimized Torus proxy geometry and debug wireframes.
+- [x] **Procedural Textures**: Restored floor grid and SDF bump mapping.
+- [x] **Code Hygiene**: Completed a project-wide code formatting pass with `clang-format`.
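
The "Ray Bounds" item of Task #21 (deriving `t_min` and `t_max` from the proxy hull entry/exit points) could be prototyped on the CPU before porting to WGSL. The following is only a minimal sketch under stated assumptions: the proxy hull is an axis-aligned box in local object space, and `Vec3`, `RayBounds`, and `ray_box_bounds` are hypothetical names that do not come from the project. It uses the standard slab test, which may differ from what the renderer ends up doing.

```cpp
#include <algorithm>
#include <limits>
#include <utility>

// Hypothetical helper types for illustration; not taken from the project.
struct Vec3 { float x, y, z; };
struct RayBounds { bool hit; float t_min; float t_max; };

// Slab test: intersect a ray with an axis-aligned box expressed in the same
// (local object) space as the ray. On a hit, [t_min, t_max] is the parametric
// interval of the ray that lies inside the box.
static RayBounds ray_box_bounds(Vec3 origin, Vec3 dir, Vec3 box_min, Vec3 box_max) {
  const float o[3] = {origin.x, origin.y, origin.z};
  const float d[3] = {dir.x, dir.y, dir.z};
  const float lo[3] = {box_min.x, box_min.y, box_min.z};
  const float hi[3] = {box_max.x, box_max.y, box_max.z};
  float t_min = 0.0f;  // never march behind the ray origin
  float t_max = std::numeric_limits<float>::infinity();
  for (int i = 0; i < 3; ++i) {
    if (d[i] == 0.0f) {
      // Ray is parallel to this slab: it misses if the origin lies outside.
      if (o[i] < lo[i] || o[i] > hi[i]) return {false, 0.0f, 0.0f};
      continue;
    }
    float t0 = (lo[i] - o[i]) / d[i];
    float t1 = (hi[i] - o[i]) / d[i];
    if (t0 > t1) std::swap(t0, t1);
    t_min = std::max(t_min, t0);  // latest entry across all slabs
    t_max = std::min(t_max, t1);  // earliest exit across all slabs
    if (t_min > t_max) return {false, 0.0f, 0.0f};
  }
  return {true, t_min, t_max};
}
```

For a unit cube centred at the origin and a ray starting at `(0, 0, -3)` along `+z`, this yields the interval `[2, 4]`. A raymarch loop would then start at `t_min` and stop once `t` exceeds `t_max`, rather than stepping through a fixed range.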
