| -rw-r--r-- | PHASE2_COMPRESSION.md | 18 |
| -rw-r--r-- | PROJECT_CONTEXT.md | 15 |
| -rwxr-xr-x | scripts/crunch_win.sh | 4 |
| -rw-r--r-- | src/main.cc | 17 |
4 files changed, 41 insertions, 13 deletions
diff --git a/PHASE2_COMPRESSION.md b/PHASE2_COMPRESSION.md
index a2d19d3..3c83fa4 100644
--- a/PHASE2_COMPRESSION.md
+++ b/PHASE2_COMPRESSION.md
@@ -1,4 +1,18 @@
 # Phase 2 – Compression & Size Reduction
 
-See conversation description for full intent.
-Executable and shader compression deferred until visuals/audio stabilize.
+This document tracks ideas and strategies for the final optimization phase to reach the <=64k goal.
+
+## Executable Size
+
+### Windows
+- **Replace GLFW**: For the final build, replace the statically linked GLFW library with a minimal "tiny" implementation using native Windows API (`CreateWindow`, `PeekMessage`, etc.). This is expected to yield significant savings.
+  - *Status*: Deferred until feature completion.
+- **CRT Replacement**: Consider replacing the standard C runtime (CRT) with a minimal startup code (e.g., `tiny_crt` or similar) to avoid linking heavy standard libraries.
+- **Import Minimization**: Dynamically load functions via `GetProcAddress` hash lookup to reduce the Import Address Table (IAT) size.
+
+### General
+- **Shader Compression**: Minify WGSL shaders (remove whitespace, rename variables).
+- **Asset Compression**:
+  - Store spectrograms with logarithmic frequency bins.
+  - Quantize spectral values to `uint16_t` or `uint8_t`.
+  - Use a custom packer/compressor for the asset blob.
\ No newline at end of file
diff --git a/PROJECT_CONTEXT.md b/PROJECT_CONTEXT.md
index 0e47b33..ddde463 100644
--- a/PROJECT_CONTEXT.md
+++ b/PROJECT_CONTEXT.md
@@ -76,16 +76,25 @@ Several critical issues were resolved to ensure stable WebGPU operation across p
 - **Texture Usage**: Resolved a validation error (`RENDER_ATTACHMENT` usage missing) by explicitly setting `g_config.usage = WGPUTextureUsage_RenderAttachment` in the surface configuration.
 - **Render Pass Validation**: Fixed a "Depth slice provided but view is not 3D" error by ensuring `WGPURenderPassColorAttachment` is correctly initialized, specifically setting `resolveTarget = nullptr` and `depthSlice = WGPU_DEPTH_SLICE_UNDEFINED`.
 
+### Optimizations:
+- **Audio Decoding**: Disabled FLAC, WAV, MP3, and all encoding features in `miniaudio` for the runtime demo build (via `MA_NO_FLAC`, `MA_NO_WAV`, etc.). This reduced the packed Windows binary size by ~100KB (461KB -> 356KB). `spectool` retains full decoding capabilities.
+- **Build Stripping**: Implemented `DEMO_STRIP_ALL` CMake option to remove command-line parsing, debug info, and non-essential error handling strings.
+
+### Future Optimizations (Phase 2):
+- **Windows Platform Layer**: Replace the static GLFW library with a minimal, native Windows API implementation (`CreateWindow`, `PeekMessage`) to significantly reduce binary size.
+- **Asset Compression**: Implement logarithmic frequency storage and quantization for `.spec` files.
+- **CRT Replacement**: investigate minimal C runtime alternatives.
+
 ### WebGPU Portability Layer:
 To maintain a single codebase while supporting different `wgpu-native` versions (Native macOS/Linux headers vs. Windows/MinGW v0.19.4.1), a portability layer was implemented in `src/gpu/gpu.cc`:
 - **Header Mapping**: Conditional inclusion of `<webgpu/webgpu.h>` for Windows vs. `<webgpu.h>` for native builds.
-- **Type Shims**: Implementation of `WGPUStringView` as a simple `const char*` for older APIs, with `str_view()` and `label_view()` helpers to abstract the transition from raw strings to view structs.
+- **Type Shims**: Implementation of `WGPUStringView` as a simple `const char*` for older APIs.
+- **Callback Signatures**: Handles the difference between 4-argument (Windows/Old) and 5-argument (macOS/New) callback signatures for `wgpuInstanceRequestAdapter` and `wgpuAdapterRequestDevice`, including the use of callback info structs on newer APIs.
 - **API Lifecycle**:
   - **Wait Mechanism**: Abstraction of `wgpuInstanceWaitAny` (new) vs. `wgpuInstanceProcessEvents` (old).
-  - **Request Methods**: Handling of callback signatures for `wgpuInstanceRequestAdapter` and `wgpuAdapterRequestDevice` which changed from struct-based callbacks to direct function pointers.
 - **Struct Differences**:
   - **Color Attachments**: Conditional removal of the `depthSlice` member in `WGPURenderPassColorAttachment`, which is not present in v0.19.
-  - **Error Handling**: Abstracted `wgpuDeviceSetUncapturedErrorCallback` via a `set_error_callback` helper to account for its relocation into the device descriptor in newer versions.
+  - **Error Handling**: Abstracted `wgpuDeviceSetUncapturedErrorCallback` usage.
 - **Surface Creation**: Custom logic in `glfw3webgpu.c` to handle `WGPUSurfaceSourceWindowsHWND` (new) vs. `WGPUSurfaceDescriptorFromWindowsHWND` (old).
 
 ### Coding Style:
diff --git a/scripts/crunch_win.sh b/scripts/crunch_win.sh
index 59d7889..c9d8513 100755
--- a/scripts/crunch_win.sh
+++ b/scripts/crunch_win.sh
@@ -31,6 +31,6 @@ ls -lh "$INPUT_EXE"
 ls -lh "$STRIPPED_EXE"
 ls -lh "$PACKED_EXE"
 echo "------------------------------------------------"
-echo "Top 10 Largest Symbols (from unstripped):"
-x86_64-w64-mingw32-nm --print-size --size-sort --radix=d "$INPUT_EXE" | tail -n 10
+echo "Top 20 Largest Symbols (from unstripped):"
+x86_64-w64-mingw32-nm --print-size --size-sort --radix=d "$INPUT_EXE" | grep -v debug_ | tail -n 20
 echo "------------------------------------------------"
diff --git a/src/main.cc b/src/main.cc
index 9d7c41b..4f99230 100644
--- a/src/main.cc
+++ b/src/main.cc
@@ -43,8 +43,8 @@ int register_spec_asset(AssetId id) {
   return synth_register_spectrogram(&spec);
 }
 
-static float g_spec_buffer_a[SPEC_FRAMES * DCT_SIZE];
-static float g_spec_buffer_b[SPEC_FRAMES * DCT_SIZE];
+static float *g_spec_buffer_a[SPEC_FRAMES * DCT_SIZE] = { 0 };
+static float *g_spec_buffer_b[SPEC_FRAMES * DCT_SIZE] = { 0 };
 
 // Global storage for the melody to ensure it persists
 std::vector<float> g_melody_data;
@@ -110,8 +110,12 @@ int generate_melody() {
   return synth_register_spectrogram(&spec);
 }
 
-void generate_tone(float *buffer, float freq) {
-  memset(buffer, 0, SPEC_FRAMES * DCT_SIZE * sizeof(float));
+float* generate_tone(float *buffer, float freq) {
+  if (buffer == nullptr) {
+    buffer = (float*)calloc(SPEC_FRAMES * DCT_SIZE, sizeof(float));
+  } else {
+    memset(buffer, 0, SPEC_FRAMES * DCT_SIZE * sizeof(float));
+  }
   for (int frame = 0; frame < SPEC_FRAMES; ++frame) {
     float *spec_frame = buffer + frame * DCT_SIZE;
     float amplitude = 1000. * powf(1.0f - (float)frame / SPEC_FRAMES, 2.0f);
@@ -121,6 +125,7 @@ void generate_tone(float *buffer, float freq) {
       spec_frame[bin] = amplitude;
     }
   }
+  return buffer;
 }
 
 int main(int argc, char **argv) {
@@ -149,8 +154,8 @@ int main(int argc, char **argv) {
   int hihat_id = register_spec_asset(AssetId::ASSET_HIHAT_1);
 
   // Still keep the dynamic tone for bass
-  generate_tone(g_spec_buffer_a, 110.0f); // A2
-  generate_tone(g_spec_buffer_b, 110.0f);
+  const float* g_spec_buffer_a = generate_tone(nullptr, 110.0f); // A2
+  const float* g_spec_buffer_b = generate_tone(nullptr, 110.0f);
   const Spectrogram bass_spec = {g_spec_buffer_a, g_spec_buffer_b, SPEC_FRAMES};
   int bass_id = synth_register_spectrogram(&bass_spec);
 
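The `src/main.cc` hunks above change `generate_tone` so that it allocates when passed `nullptr` and returns the buffer it filled. The following is a minimal standalone sketch of that allocate-or-reuse pattern, not the project's actual code. The values of `SPEC_FRAMES` and `DCT_SIZE`, the `SAMPLE_RATE` constant, and the frequency-to-bin mapping are assumptions made for illustration, since the diff does not show them. The allocation-failure check is also an addition.

```cpp
#include <math.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Assumed values for illustration; the real constants live elsewhere in the project.
constexpr int SPEC_FRAMES = 16;
constexpr int DCT_SIZE = 512;
constexpr float SAMPLE_RATE = 32000.0f;  // hypothetical, not taken from the diff

// Fills `buffer` with a decaying single-bin tone. If `buffer` is nullptr,
// a zeroed buffer is allocated with calloc and ownership passes to the caller.
float* generate_tone(float* buffer, float freq) {
  const size_t count = (size_t)SPEC_FRAMES * DCT_SIZE;
  if (buffer == nullptr) {
    buffer = static_cast<float*>(calloc(count, sizeof(float)));
    if (buffer == nullptr) return nullptr;  // allocation failed
  } else {
    memset(buffer, 0, count * sizeof(float));
  }
  for (int frame = 0; frame < SPEC_FRAMES; ++frame) {
    float* spec_frame = buffer + frame * DCT_SIZE;
    const float amplitude =
        1000.0f * powf(1.0f - (float)frame / SPEC_FRAMES, 2.0f);
    // Hypothetical bin mapping: the diff elides the lines that compute `bin`.
    const int bin = (int)(freq / (SAMPLE_RATE / 2.0f) * DCT_SIZE);
    if (bin > 0 && bin < DCT_SIZE) {
      spec_frame[bin] = amplitude;
    }
  }
  return buffer;
}
```

With this shape, a caller that passes `nullptr` is responsible for calling `free` on the result, while a caller that passes its own storage gets the same pointer back. Note that the file-scope `g_spec_buffer_a` and `g_spec_buffer_b` in the diff are declared as arrays of `float*` and are shadowed by the locals in `main`, so they no longer hold any spectrogram data.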

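The "Asset Compression" bullets in `PHASE2_COMPRESSION.md` propose quantizing spectral values to `uint8_t` or `uint16_t`. One possible reading of that idea is sketched below as a symmetric linear 8-bit quantizer with a single per-block scale. The `QuantizedSpec` struct and both function names are hypothetical, and the `.spec` format may well end up using a different scheme (for example a logarithmic mapping).

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container: one float scale plus one byte per spectral value.
struct QuantizedSpec {
  float scale;                // largest absolute value in the block
  std::vector<uint8_t> data;  // values mapped from [-scale, scale] to [0, 255]
};

// Maps each value linearly from [-scale, scale] onto the 256 available codes.
QuantizedSpec quantize_u8(const std::vector<float>& values) {
  float max_abs = 0.0f;
  for (float v : values) max_abs = std::max(max_abs, std::fabs(v));
  QuantizedSpec q{max_abs, {}};
  q.data.reserve(values.size());
  for (float v : values) {
    const float n = max_abs > 0.0f ? v / max_abs : 0.0f;  // in [-1, 1]
    q.data.push_back(
        static_cast<uint8_t>(std::lround((n * 0.5f + 0.5f) * 255.0f)));
  }
  return q;
}

// Inverse mapping; the result differs from the input by roughly scale / 255 at most.
float dequantize_u8(const QuantizedSpec& q, std::size_t i) {
  return (q.data[i] / 255.0f * 2.0f - 1.0f) * q.scale;
}
```

This stores one byte per value instead of four, at the cost of a reconstruction error of about `scale / 255`. Zero does not round-trip exactly under this mapping, which may matter for sparse spectrograms; an unsigned or logarithmic mapping would trade that off differently.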