| field | value | date |
|---|---|---|
| author | skal <pascal.massimino@gmail.com> | 2026-05-14 19:09:39 +0200 |
| committer | skal <pascal.massimino@gmail.com> | 2026-05-14 19:11:28 +0200 |
| commit | 6ef8f578817ee0134fd5867ca3b80590e3eb2368 | |
| tree | 5550607e5c4a16ca237bfa4430ac1ef1f5d80c5d /src/util/asset_manager.h | |
| parent | 4bcbe13dab5ffb64d93cc61956f07ee5168a84c9 | |
ans: order-0 rANS coder + WGSL asset compression

Adds src/util/ans.{h,cc}, a per-chunk-adaptive order-0 rANS entropy
coder. Decoder is always built; encoder is gated on ANS_ENABLE_ENCODER
(tools only). Both sides take an optional 256-entry initial_counts
table to seed the adaptive model.

The per-chunk initial state is (1 << kBits). Higher initial states
(e.g. with a signature packed into the upper bits) force a renorm-emit
at iter 0 that the decoder never consumes, corrupting multi-chunk
streams once stats become skewed.

Asset pipeline:
- AssetRecord gains 'compression' and 'uncompressed_size' fields.
- asset_packer scans every WGSL file to build a corpus-wide byte
  histogram, then ANS-encodes each shader using that histogram as the
  seed. Histogram and accessor are emitted alongside the asset table.
  Round-trip verification runs at pack time for every compressed
  asset; failures fall back to uncompressed storage.
- asset_manager decompresses on first GetAsset(), caches the
  heap-allocated buffer, and DropAsset / ReloadAssetsFromFile free it
  along with the procedural cache.
- Disk-load (dev) builds are unchanged: WGSL paths stay as filenames.

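The pack-time steps listed above (corpus-wide histogram, then encode with round-trip verification and fallback) could look roughly like the sketch below. This is not the asset_packer code: `BuildCorpusHistogram`, `PackedAsset` and `PackWithFallback` are hypothetical names, and the encoder and decoder are passed in as callables because the real signatures in src/util/ans.h are not shown in this commit view.

```cpp
#include <array>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: accumulates one 256-entry byte histogram over every
// shader source in the corpus. The result plays the role of the seed table.
std::array<uint32_t, 256> BuildCorpusHistogram(
    const std::vector<std::string>& shader_sources) {
  std::array<uint32_t, 256> counts{};  // zero-initialized
  for (const std::string& src : shader_sources) {
    for (unsigned char byte : src) {
      ++counts[byte];
    }
  }
  return counts;
}

// Hypothetical result type: the bytes to store and whether they are encoded.
struct PackedAsset {
  std::vector<uint8_t> bytes;
  bool compressed;
};

// Encodes 'raw', decodes the result again, and keeps the encoded form only
// if the round trip reproduces the input exactly. Any mismatch falls back to
// storing the raw bytes. 'encode' and 'decode' are assumed to map a byte
// vector to a byte vector.
template <typename EncodeFn, typename DecodeFn>
PackedAsset PackWithFallback(const std::vector<uint8_t>& raw, EncodeFn encode,
                             DecodeFn decode) {
  const std::vector<uint8_t> packed = encode(raw);
  if (decode(packed) == raw) {
    return PackedAsset{packed, true};
  }
  return PackedAsset{raw, false};
}
```

A packer built this way never ships a stream it could not decode itself, at the cost of one extra decode per asset at pack time.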
Tests:
- src/tests/util/test_ans.cc: roundtrip variants (empty, single byte,
  single-symbol run, all-zeros, random uniform/skewed, repeated ASCII),
  seeded-vs-uniform compression, rejection of mismatched counts /
  corruption / truncation, PeekUncompressedSize.
- 37/37 dev, 36/36 STRIP_ALL.

Compression observed: WGSL shaders shrink to ~0.62-0.71x in the main
workspace (81 of 105 assets qualify).
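For context on these ratios: a ratio of 0.62-0.71x corresponds to roughly 5.0-5.7 bits per input byte, ignoring any header overhead. The order-0 Shannon entropy of a byte histogram gives a rough reference point for a static order-0 model. The sketch below computes it. `Order0BitsPerByte` is a hypothetical helper, not part of ans.{h,cc}, and the adaptive coder described in this commit will not match the figure exactly.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

// Order-0 Shannon entropy of a 256-entry byte histogram, in bits per byte.
// Dividing the result by 8 gives an idealized compressed/original ratio for
// a static order-0 model. Returns 0 for an empty histogram.
double Order0BitsPerByte(const std::array<uint32_t, 256>& counts) {
  uint64_t total = 0;
  for (uint32_t c : counts) total += c;
  if (total == 0) return 0.0;

  double bits = 0.0;
  for (uint32_t c : counts) {
    if (c == 0) continue;  // zero-probability symbols contribute nothing
    const double p = static_cast<double>(c) / static_cast<double>(total);
    bits -= p * std::log2(p);
  }
  return bits;
}
```

Feeding it the corpus-wide histogram gives a quick sanity check on whether an observed ratio is in the range an order-0 model can plausibly reach.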

Docs:
- doc/ANS.md (new): algorithm, bitstream, API, asset pipeline
  integration, compression numbers, limitations, tests.
- doc/ASSET_SYSTEM.md: new Compression section + updated technical
  guarantees for compressed assets.
- doc/COMPLETED.md: May 2026 entry.
- PROJECT_CONTEXT.md: Build status line mentions WGSL ANS compression.
- CLAUDE.md, GEMINI.md: tier-3 build doc list includes ANS.md.

Diffstat (limited to 'src/util/asset_manager.h')
| -rw-r--r-- | src/util/asset_manager.h | 18 |
1 files changed, 17 insertions, 1 deletions
diff --git a/src/util/asset_manager.h b/src/util/asset_manager.h
index 786a8db..f6ad244 100644
--- a/src/util/asset_manager.h
+++ b/src/util/asset_manager.h
@@ -16,10 +16,21 @@ enum class AssetType : uint8_t {
   PROC_GPU,
 };
 
+// Compression scheme applied to the packed bytes (only relevant for static
+// embedded assets; disk-load and procedural assets are always NONE).
+enum class AssetCompression : uint8_t {
+  NONE = 0,
+  // Order-0 rANS, model seeded from a corpus-wide histogram embedded in the
+  // generated assets blob (see GetAnsAsciiHistogram). Used for WGSL.
+  ANS_ASCII = 1,
+};
+
 struct AssetRecord {
   const uint8_t* data; // Pointer to asset data (static or dynamic)
-  size_t size; // Size of the asset data
+  size_t size; // Size of 'data' in bytes (compressed size if any)
   AssetType type;
+  AssetCompression compression; // How 'data' is encoded; NONE = raw bytes
+  size_t uncompressed_size; // Size after decompression (== size if NONE)
   const char* proc_func_name_str; // Name of procedural generation function
                                   // (string literal)
   const float* proc_params; // Parameters for procedural generation (static,
@@ -27,6 +38,11 @@ struct AssetRecord {
   int num_proc_params; // Number of procedural parameters
 };
 
+// Initial-state histogram (256 entries) used to seed ANS_ASCII compression.
+// Defined in the generated assets blob; computed at pack time over the WGSL
+// corpus.
+const uint32_t* GetAnsAsciiHistogram();
+
 // Generic interface
 // Retrieves a pointer to the asset data.
 // - Static assets are guaranteed to be 16-byte aligned.
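To illustrate how the two new fields fit the "decompress on first GetAsset(), cache, free on drop" behavior from the commit message, here is a minimal sketch. It is not the asset_manager implementation: `RecordSketch` is a trimmed stand-in for AssetRecord, `LazyAssetSketch` is a hypothetical wrapper, and `DecodeFn` is an assumed decoder signature (the real one lives in src/util/ans.h and is not shown in this diff).

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <utility>

enum class AssetCompression : uint8_t { NONE = 0, ANS_ASCII = 1 };

// Trimmed-down stand-in for AssetRecord (only the fields used here).
struct RecordSketch {
  const uint8_t* data;
  size_t size;               // compressed size when compression != NONE
  AssetCompression compression;
  size_t uncompressed_size;  // == size when compression == NONE
};

// Assumed decoder shape: fills 'out' with exactly 'out_size' bytes and
// returns false on corrupt or truncated input.
using DecodeFn = std::function<bool(const uint8_t* in, size_t in_size,
                                    uint8_t* out, size_t out_size)>;

class LazyAssetSketch {
 public:
  LazyAssetSketch(const RecordSketch& rec, DecodeFn decode)
      : rec_(rec), decode_(std::move(decode)) {}

  // Returns the raw bytes. Uncompressed records are returned as-is;
  // compressed ones are decoded once and served from the cache afterwards.
  const uint8_t* Get(size_t* out_size) {
    if (rec_.compression == AssetCompression::NONE) {
      *out_size = rec_.size;
      return rec_.data;
    }
    if (!cache_) {
      std::unique_ptr<uint8_t[]> buf(new uint8_t[rec_.uncompressed_size]);
      if (!decode_(rec_.data, rec_.size, buf.get(), rec_.uncompressed_size)) {
        *out_size = 0;
        return nullptr;  // decode failure: nothing is cached
      }
      cache_ = std::move(buf);
    }
    *out_size = rec_.uncompressed_size;
    return cache_.get();
  }

  // Frees the decoded buffer; the next Get() decodes again.
  void Drop() { cache_.reset(); }

 private:
  RecordSketch rec_;
  DecodeFn decode_;
  std::unique_ptr<uint8_t[]> cache_;
};
```

Storing `uncompressed_size` in the record lets the caller allocate the output buffer before decoding, without first parsing the compressed stream.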
