author skal <pascal.massimino@gmail.com> 2026-01-31 14:34:59 +0100
committer skal <pascal.massimino@gmail.com> 2026-01-31 14:34:59 +0100
commit 815c4032813b5aafce09cf4e3731f4b7dfda106a
tree 438a9ecd17500f0dc7d53163a3fb0d3198ed48f7
parent 2d760dee6751981db1eac1a22111e597f6bdbbee
update session with mix fixes
-rw-r--r-- PHASE2_COMPRESSION.md 18
-rw-r--r-- PROJECT_CONTEXT.md 15
-rwxr-xr-x scripts/crunch_win.sh 4
-rw-r--r-- src/main.cc 17
4 files changed, 41 insertions, 13 deletions
diff --git a/PHASE2_COMPRESSION.md b/PHASE2_COMPRESSION.md
index a2d19d3..3c83fa4 100644
--- a/PHASE2_COMPRESSION.md
+++ b/PHASE2_COMPRESSION.md
@@ -1,4 +1,18 @@
# Phase 2 – Compression & Size Reduction
-See conversation description for full intent.
-Executable and shader compression deferred until visuals/audio stabilize.
+This document tracks ideas and strategies for the final optimization phase to reach the <=64k goal.
+
+## Executable Size
+
+### Windows
+- **Replace GLFW**: For the final build, replace the statically linked GLFW library with a minimal "tiny" implementation using native Windows API (`CreateWindow`, `PeekMessage`, etc.). This is expected to yield significant savings.
+ - *Status*: Deferred until feature completion.
+- **CRT Replacement**: Consider replacing the standard C runtime (CRT) with a minimal startup code (e.g., `tiny_crt` or similar) to avoid linking heavy standard libraries.
+- **Import Minimization**: Dynamically load functions via `GetProcAddress` hash lookup to reduce the Import Address Table (IAT) size.
+
+### General
+- **Shader Compression**: Minify WGSL shaders (remove whitespace, rename variables).
+- **Asset Compression**:
+ - Store spectrograms with logarithmic frequency bins.
+ - Quantize spectral values to `uint16_t` or `uint8_t`.
+ - Use a custom packer/compressor for the asset blob.
\ No newline at end of file
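Review note on the "Quantize spectral values" bullet in PHASE2_COMPRESSION.md: a minimal sketch of what 8-bit quantization of a spectrogram buffer could look like. This is not code from the commit. The names `QuantizedSpec`, `quantize_u8`, and `dequantize_u8` are hypothetical, and the affine min/max mapping is an assumption chosen because DCT coefficients can be negative; the document does not say which mapping the project intends to use.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container (not in the project): the quantized bytes plus
// the value range needed to decode them again.
struct QuantizedSpec {
  float lo = 0.0f;  // smallest input value
  float hi = 0.0f;  // largest input value
  std::vector<uint8_t> data;
};

// Maps each float linearly from [lo, hi] onto [0, 255].
// Assumption: one global range per buffer; a per-frame range would cost
// more header bytes but preserve quiet frames better.
QuantizedSpec quantize_u8(const std::vector<float>& values) {
  QuantizedSpec out;
  if (values.empty()) return out;
  const auto mm = std::minmax_element(values.begin(), values.end());
  out.lo = *mm.first;
  out.hi = *mm.second;
  const float range = out.hi - out.lo;
  out.data.reserve(values.size());
  for (float v : values) {
    // A constant buffer has zero range; encode every sample as 0 then.
    const float t = range > 0.0f ? (v - out.lo) / range : 0.0f;
    out.data.push_back(static_cast<uint8_t>(std::lround(t * 255.0f)));
  }
  return out;
}

// Reconstructs an approximation of the original sample at index i.
float dequantize_u8(const QuantizedSpec& q, std::size_t i) {
  const float range = q.hi - q.lo;
  return q.lo + (static_cast<float>(q.data[i]) / 255.0f) * range;
}
```

With this mapping the worst-case reconstruction error is half a quantization step, i.e. `(hi - lo) / 510`. Whether that is audible for the synth's spectrograms would need to be checked by ear; `uint16_t` reduces the error by a factor of about 257 at twice the storage.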
diff --git a/PROJECT_CONTEXT.md b/PROJECT_CONTEXT.md
index 0e47b33..ddde463 100644
--- a/PROJECT_CONTEXT.md
+++ b/PROJECT_CONTEXT.md
@@ -76,16 +76,25 @@ Several critical issues were resolved to ensure stable WebGPU operation across p
- **Texture Usage**: Resolved a validation error (`RENDER_ATTACHMENT` usage missing) by explicitly setting `g_config.usage = WGPUTextureUsage_RenderAttachment` in the surface configuration.
- **Render Pass Validation**: Fixed a "Depth slice provided but view is not 3D" error by ensuring `WGPURenderPassColorAttachment` is correctly initialized, specifically setting `resolveTarget = nullptr` and `depthSlice = WGPU_DEPTH_SLICE_UNDEFINED`.
+### Optimizations:
+- **Audio Decoding**: Disabled FLAC, WAV, MP3, and all encoding features in `miniaudio` for the runtime demo build (via `MA_NO_FLAC`, `MA_NO_WAV`, etc.). This reduced the packed Windows binary size by ~100KB (461KB -> 356KB). `spectool` retains full decoding capabilities.
+- **Build Stripping**: Implemented `DEMO_STRIP_ALL` CMake option to remove command-line parsing, debug info, and non-essential error handling strings.
+
+### Future Optimizations (Phase 2):
+- **Windows Platform Layer**: Replace the static GLFW library with a minimal, native Windows API implementation (`CreateWindow`, `PeekMessage`) to significantly reduce binary size.
+- **Asset Compression**: Implement logarithmic frequency storage and quantization for `.spec` files.
+- **CRT Replacement**: investigate minimal C runtime alternatives.
+
### WebGPU Portability Layer:
To maintain a single codebase while supporting different `wgpu-native` versions (Native macOS/Linux headers vs. Windows/MinGW v0.19.4.1), a portability layer was implemented in `src/gpu/gpu.cc`:
- **Header Mapping**: Conditional inclusion of `<webgpu/webgpu.h>` for Windows vs. `<webgpu.h>` for native builds.
-- **Type Shims**: Implementation of `WGPUStringView` as a simple `const char*` for older APIs, with `str_view()` and `label_view()` helpers to abstract the transition from raw strings to view structs.
+- **Type Shims**: Implementation of `WGPUStringView` as a simple `const char*` for older APIs.
+- **Callback Signatures**: Handles the difference between 4-argument (Windows/Old) and 5-argument (macOS/New) callback signatures for `wgpuInstanceRequestAdapter` and `wgpuAdapterRequestDevice`, including the use of callback info structs on newer APIs.
- **API Lifecycle**:
- **Wait Mechanism**: Abstraction of `wgpuInstanceWaitAny` (new) vs. `wgpuInstanceProcessEvents` (old).
- - **Request Methods**: Handling of callback signatures for `wgpuInstanceRequestAdapter` and `wgpuAdapterRequestDevice` which changed from struct-based callbacks to direct function pointers.
- **Struct Differences**:
- **Color Attachments**: Conditional removal of the `depthSlice` member in `WGPURenderPassColorAttachment`, which is not present in v0.19.
- - **Error Handling**: Abstracted `wgpuDeviceSetUncapturedErrorCallback` via a `set_error_callback` helper to account for its relocation into the device descriptor in newer versions.
+ - **Error Handling**: Abstracted `wgpuDeviceSetUncapturedErrorCallback` usage.
- **Surface Creation**: Custom logic in `glfw3webgpu.c` to handle `WGPUSurfaceSourceWindowsHWND` (new) vs. `WGPUSurfaceDescriptorFromWindowsHWND` (old).
### Coding Style:
diff --git a/scripts/crunch_win.sh b/scripts/crunch_win.sh
index 59d7889..c9d8513 100755
--- a/scripts/crunch_win.sh
+++ b/scripts/crunch_win.sh
@@ -31,6 +31,6 @@ ls -lh "$INPUT_EXE"
ls -lh "$STRIPPED_EXE"
ls -lh "$PACKED_EXE"
echo "------------------------------------------------"
-echo "Top 10 Largest Symbols (from unstripped):"
-x86_64-w64-mingw32-nm --print-size --size-sort --radix=d "$INPUT_EXE" | tail -n 10
+echo "Top 20 Largest Symbols (from unstripped):"
+x86_64-w64-mingw32-nm --print-size --size-sort --radix=d "$INPUT_EXE" | grep -v debug_ | tail -n 20
echo "------------------------------------------------"
diff --git a/src/main.cc b/src/main.cc
index 9d7c41b..4f99230 100644
--- a/src/main.cc
+++ b/src/main.cc
@@ -43,8 +43,8 @@ int register_spec_asset(AssetId id) {
return synth_register_spectrogram(&spec);
}
-static float g_spec_buffer_a[SPEC_FRAMES * DCT_SIZE];
-static float g_spec_buffer_b[SPEC_FRAMES * DCT_SIZE];
+static float *g_spec_buffer_a[SPEC_FRAMES * DCT_SIZE] = { 0 };
+static float *g_spec_buffer_b[SPEC_FRAMES * DCT_SIZE] = { 0 };
// Global storage for the melody to ensure it persists
std::vector<float> g_melody_data;
@@ -110,8 +110,12 @@ int generate_melody() {
return synth_register_spectrogram(&spec);
}
-void generate_tone(float *buffer, float freq) {
- memset(buffer, 0, SPEC_FRAMES * DCT_SIZE * sizeof(float));
+float* generate_tone(float *buffer, float freq) {
+ if (buffer == nullptr) {
+ buffer = (float*)calloc(SPEC_FRAMES * DCT_SIZE, sizeof(float));
+ } else {
+ memset(buffer, 0, SPEC_FRAMES * DCT_SIZE * sizeof(float));
+ }
for (int frame = 0; frame < SPEC_FRAMES; ++frame) {
float *spec_frame = buffer + frame * DCT_SIZE;
float amplitude = 1000. * powf(1.0f - (float)frame / SPEC_FRAMES, 2.0f);
@@ -121,6 +125,7 @@ void generate_tone(float *buffer, float freq) {
spec_frame[bin] = amplitude;
}
}
+ return buffer;
}
int main(int argc, char **argv) {
@@ -149,8 +154,8 @@ int main(int argc, char **argv) {
int hihat_id = register_spec_asset(AssetId::ASSET_HIHAT_1);
// Still keep the dynamic tone for bass
- generate_tone(g_spec_buffer_a, 110.0f); // A2
- generate_tone(g_spec_buffer_b, 110.0f);
+ const float* g_spec_buffer_a = generate_tone(nullptr, 110.0f); // A2
+ const float* g_spec_buffer_b = generate_tone(nullptr, 110.0f);
const Spectrogram bass_spec = {g_spec_buffer_a, g_spec_buffer_b, SPEC_FRAMES};
int bass_id = synth_register_spectrogram(&bass_spec);