# ANS Compression

Order-0 rANS entropy coder used to compress shader assets at build time and
decompress them on first access at runtime.

**Source:** `src/util/ans.{h,cc}`.

---

## Algorithm

Per-chunk adaptive order-0 byte coder.

| Parameter        | Value                                  |
|------------------|----------------------------------------|
| Precision        | 16 bits (`kBits = 16`)                 |
| State range      | `[1 << 16, 1 << 32)` (`uint32_t`)      |
| Renorm I/O width | 16 bits (big-endian)                   |
| Chunk size       | 1024 bytes                             |
| Symbols          | 256 (bytes)                            |
| Initial state    | `1 << 16` (`kInitState`)               |

The encoder iterates each chunk in reverse; the decoder iterates forward.
Symbol counts are mutated on the fly during encode/decode and re-normalized
at each chunk boundary so the cumulative table sums to `1 << 16`.
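
The re-normalization rule itself is not spelled out here, so the following is only a sketch of one way to scale a 256-entry count table so that it sums to `1 << 16`. `NormalizeCounts` and `kTotal` are hypothetical names, and pushing the rounding drift onto the most frequent symbol is an assumption, not necessarily what `src/util/ans.cc` does.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr uint32_t kBits = 16;
constexpr uint32_t kTotal = 1u << kBits;  // target sum of all frequencies

// Hypothetical helper: scales `counts` so the result sums to exactly kTotal.
// Every symbol with a nonzero count keeps a frequency of at least 1.
std::array<uint32_t, 256> NormalizeCounts(
    const std::array<uint32_t, 256>& counts) {
  uint64_t sum = 0;
  for (uint32_t c : counts) sum += c;

  std::array<uint32_t, 256> freq{};
  if (sum == 0) return freq;  // empty model: nothing to scale

  uint32_t assigned = 0;
  size_t largest = 0;
  for (size_t i = 0; i < counts.size(); ++i) {
    if (counts[i] == 0) continue;
    uint64_t scaled = static_cast<uint64_t>(counts[i]) * kTotal / sum;
    freq[i] = scaled == 0 ? 1u : static_cast<uint32_t>(scaled);
    assigned += freq[i];
    if (freq[i] > freq[largest]) largest = i;
  }

  // Flooring loses a little mass and the clamp to 1 adds a little; absorb
  // the difference (assumption) in the most frequent symbol.
  int64_t drift = static_cast<int64_t>(kTotal) - static_cast<int64_t>(assigned);
  freq[largest] =
      static_cast<uint32_t>(static_cast<int64_t>(freq[largest]) + drift);
  assert(freq[largest] > 0);
  return freq;
}
```

With an all-ones table this yields 256 for every symbol, which matches the uniform default described under API.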

After a valid chunk has been decoded, the decoder's state equals
`kInitState`; the decoder rejects the stream if it doesn't. That single
check catches both bit-level corruption and decoder/encoder model
divergence (e.g. a wrong initial histogram).

The per-chunk initial state must be exactly `1 << kBits`. A higher value
(e.g. with a "signature" packed into the upper bits) forces a renorm emit
at iteration 0 that the decoder never consumes. This is harmless for a
single chunk, but it corrupts any stream of two or more chunks once the
per-chunk statistics become skewed.
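
To make the reverse-encode / forward-decode relationship and the chunk-end check concrete, here is a minimal single-chunk rANS round trip using the parameters from the table above. It is a simplified sketch, not the code in `src/util/ans.cc`: the model is a fixed two-symbol table rather than an adaptive 256-symbol one, and `Sym`, `Lookup`, `Chunk`, `EncodeChunk`, and `DecodeChunk` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint32_t kBits = 16;
constexpr uint32_t kInitState = 1u << kBits;  // lower bound of the state range
constexpr uint32_t kMask = kInitState - 1;
constexpr uint32_t kFreqA = 49152;            // toy model: 'a' gets 3/4,
constexpr uint32_t kFreqB = kInitState - kFreqA;  // 'b' gets 1/4

struct Sym { uint32_t start, freq; };

// Toy static model (assumption); inputs may only contain 'a' and 'b'.
inline Sym Lookup(uint8_t s) {
  return s == 'a' ? Sym{0, kFreqA} : Sym{kFreqA, kFreqB};
}

struct Chunk {
  uint32_t final_state;
  std::vector<uint16_t> words;  // in the order the decoder reads them
};

Chunk EncodeChunk(const std::vector<uint8_t>& in) {
  uint32_t x = kInitState;
  std::vector<uint16_t> words;
  for (size_t i = in.size(); i-- > 0;) {  // encoder walks the chunk backwards
    assert(in[i] == 'a' || in[i] == 'b');
    Sym m = Lookup(in[i]);
    uint64_t x_max = static_cast<uint64_t>(m.freq) << 16;
    while (x >= x_max) {  // renormalize: emit the low 16 bits
      words.push_back(static_cast<uint16_t>(x & 0xFFFF));
      x >>= 16;
    }
    x = ((x / m.freq) << kBits) + (x % m.freq) + m.start;
  }
  std::reverse(words.begin(), words.end());  // decoder consumes last-emitted first
  return Chunk{x, words};
}

std::vector<uint8_t> DecodeChunk(const Chunk& c, size_t n, uint32_t* end_state) {
  uint32_t x = c.final_state;
  size_t next = 0;
  std::vector<uint8_t> out;
  for (size_t i = 0; i < n; ++i) {  // decoder walks forwards
    uint32_t slot = x & kMask;
    uint8_t s = slot < kFreqA ? 'a' : 'b';
    Sym m = Lookup(s);
    x = m.freq * (x >> kBits) + slot - m.start;
    while (x <= kMask && next < c.words.size()) {  // pull a 16-bit word
      x = (x << 16) | c.words[next++];
    }
    out.push_back(s);
  }
  *end_state = x;  // equals kInitState for an intact stream
  return out;
}
```

Because every decode step exactly inverts the matching encode step, the decoder finishes on the state the encoder started from, which is what makes a single comparison against `kInitState` usable as an integrity check.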

---

## Bitstream Format

Big-endian throughout.

```
[u32  uncompressed_size]            // 4 bytes, header
per chunk (uncompressed_size > 0):
  [u32  final_state]                // 4 bytes
  [u16  emitted_words]*             // variable, in stream order
```

The number of emitted words per chunk is implicit: the decoder pulls a word
whenever its state drops to or below `kMask = (1 << kBits) - 1`.
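
As an illustration of the big-endian layout, the header and the 16-bit words can be read with helpers like the ones below. `ReadU32BE`, `ReadU16BE`, and `PeekSizeSketch` are hypothetical names; in particular, returning 0 for a too-short buffer is an assumption and may differ from what `ans::PeekUncompressedSize` does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reads a big-endian u32 (most significant byte first).
inline uint32_t ReadU32BE(const uint8_t* p) {
  assert(p != nullptr);
  return (static_cast<uint32_t>(p[0]) << 24) |
         (static_cast<uint32_t>(p[1]) << 16) |
         (static_cast<uint32_t>(p[2]) << 8) |
         static_cast<uint32_t>(p[3]);
}

// Reads a big-endian u16, the width of one renormalization word.
inline uint16_t ReadU16BE(const uint8_t* p) {
  assert(p != nullptr);
  return static_cast<uint16_t>((p[0] << 8) | p[1]);
}

// Hypothetical stand-in for a header peek: the first four bytes of the
// stream are the uncompressed size.
inline uint32_t PeekSizeSketch(const uint8_t* src, size_t src_size) {
  if (src == nullptr || src_size < 4) return 0;  // assumption: 0 on short input
  return ReadU32BE(src);
}
```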

---

## API

```cpp
#include "util/ans.h"

// Always built.
bool ans::Decode(const uint8_t* src, size_t src_size,
                 uint8_t* dst, size_t dst_capacity,
                 size_t* out_size,
                 const uint32_t* initial_counts = nullptr);

uint32_t ans::PeekUncompressedSize(const uint8_t* src, size_t src_size);

// Gated on ANS_ENABLE_ENCODER (tools only).
bool ans::Encode(const uint8_t* src, size_t size,
                 std::vector<uint8_t>* dst,
                 const uint32_t* initial_counts = nullptr);

void ans::Histogram(const uint8_t* src, size_t size, uint32_t* out_counts);
```

`initial_counts` is a 256-entry table that seeds the adaptive model. Both
encoder and decoder must use the same seed — a mismatch trips the chunk-end
state check immediately. Pass `nullptr` for a uniform default (all-ones).

---

## Asset Pipeline Integration

`AssetRecord` carries two extra fields:

```cpp
enum class AssetCompression : uint8_t {
  NONE = 0,
  ANS_ASCII = 1,  // seeded from GetAnsAsciiHistogram()
};

struct AssetRecord {
  ...
  AssetCompression compression;
  size_t uncompressed_size;  // == size if compression == NONE
};
```

### Build time (`tools/asset_packer.cc`)

Embedded (non-disk-load) builds only:

1. Scan every `WGSL` asset to build a corpus-wide 256-entry byte histogram.
2. Emit it as `static const uint32_t kAnsAsciiHistogram[256]` plus a
   `GetAnsAsciiHistogram()` accessor in `assets_data.cc`.
3. For each `WGSL` asset, call `TryAnsCompress()`:
   `ans::Encode(...)` → reject if it's not smaller than the raw input →
   round-trip verify with `ans::Decode(...)` → only then mark the asset
   `ANS_ASCII`.
4. Other asset types (SPEC, TEXTURE, MESH, BINARY, MP3, PROC*) pass
   through uncompressed.
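
Steps 1 and 3 can be sketched as follows. This is an illustration of the accept/reject logic only: `AccumulateHistogram`, `TryCompressSketch`, `PackEncodeFn`, and `PackDecodeFn` are hypothetical names, and the codec is passed in as callables so the block stands alone without `util/ans.h`. The sketch simply returns `false` on a round-trip mismatch and leaves any build-abort policy to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Hypothetical hooks standing in for ans::Encode / ans::Decode with the
// corpus histogram already bound.
using PackEncodeFn = std::function<bool(const Bytes& raw, Bytes* packed)>;
using PackDecodeFn = std::function<bool(const Bytes& packed, Bytes* raw)>;

// Step 1: add one asset's byte counts into the corpus-wide 256-entry table.
void AccumulateHistogram(const Bytes& asset, uint32_t counts[256]) {
  assert(counts != nullptr);
  for (uint8_t b : asset) ++counts[b];
}

// Step 3: accept the compressed form only if encoding succeeds, the result
// is strictly smaller than the input, and it decodes back to the input.
bool TryCompressSketch(const Bytes& raw, const PackEncodeFn& encode,
                       const PackDecodeFn& decode, Bytes* packed) {
  assert(packed != nullptr);
  Bytes candidate;
  if (!encode(raw, &candidate)) return false;         // encoder gave up
  if (candidate.size() >= raw.size()) return false;   // not smaller: keep raw
  Bytes check;
  if (!decode(candidate, &check) || check != raw) return false;  // round trip
  *packed = std::move(candidate);
  return true;
}
```

A caller would mark the asset `ANS_ASCII` only when `TryCompressSketch` returns `true`, and store the raw bytes otherwise.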

Disk-load (dev) builds skip the encoder entirely: for WGSL assets the
embedded data is the file path, never the file contents.

### Runtime (`src/util/asset_manager.cc`)

`GetAsset()` checks `compression` on a cache miss:

- `NONE` → return the static pointer (or hit the existing PROC / disk-load
  branch).
- `ANS_ASCII` → allocate `uncompressed_size + 1` bytes, call
  `ans::Decode(..., GetAnsAsciiHistogram())`, NUL-terminate the result, and
  cache it.

`DropAsset()` and `ReloadAssetsFromFile()` free the heap-allocated buffer
when `compression != NONE`, alongside the existing procedural cleanup.
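
The `ANS_ASCII` branch might look roughly like the sketch below. `InflateAssetSketch` and `RuntimeDecodeFn` are hypothetical names; the decoder is injected as a callable with the same shape as `ans::Decode` minus the seed argument, and the use of `std::unique_ptr` for ownership is an assumption, since the ownership scheme in `asset_manager.cc` is not described here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>

// Hypothetical decoder hook: same parameters as ans::Decode without the
// initial_counts argument (assumed to be bound by the caller).
using RuntimeDecodeFn =
    std::function<bool(const uint8_t* src, size_t src_size, uint8_t* dst,
                       size_t dst_capacity, size_t* out_size)>;

// Decodes `packed` into a fresh heap buffer of uncompressed_size + 1 bytes
// and NUL-terminates it. Returns nullptr if decoding fails or the decoded
// size does not match the recorded size.
std::unique_ptr<uint8_t[]> InflateAssetSketch(const uint8_t* packed,
                                              size_t packed_size,
                                              size_t uncompressed_size,
                                              const RuntimeDecodeFn& decode) {
  assert(packed != nullptr);
  std::unique_ptr<uint8_t[]> buf(new uint8_t[uncompressed_size + 1]);
  size_t out_size = 0;
  if (!decode(packed, packed_size, buf.get(), uncompressed_size, &out_size) ||
      out_size != uncompressed_size) {
    return nullptr;
  }
  buf[uncompressed_size] = 0;  // shader source is consumed as a C string
  return buf;
}
```

The extra byte exists only for the terminator; the decoder itself is given a capacity of exactly `uncompressed_size`.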

---

## Observed Compression

`workspaces/main`, STRIP_ALL build: WGSL shaders compress to **0.62×–0.71×**
their raw size (81 of 105 assets qualify). Round-trip verification runs
at pack time for every compressed asset; failures abort the build.

---

## Limitations

The encoder returns `false` if it cannot produce a final state above
`kMask` for some chunk. With the corpus-derived ASCII histogram this never
trips on the demo's WGSL corpus, but inputs whose byte distribution is
near-monolithic (dominated by a single value) can fail. Such assets fall
back to uncompressed storage.

---

## Tests

`src/tests/util/test_ans.cc` (run via `make run_util_tests` or
`./build/test_ans`):

- Roundtrip variants: empty, single byte, single-symbol run, all-zeros,
  random uniform, random skewed, repeated ASCII.
- Seeded-vs-uniform: a corpus-matched histogram compresses at least as
  well as a uniform seed.
- Rejection: mismatched seed model, payload bit-flip, truncated stream.
- `PeekUncompressedSize` returns the header value.

---

## See Also

- `doc/ASSET_SYSTEM.md` — overall asset pipeline.
- `src/util/ans.h` — public API.
- `tools/asset_packer.cc` — corpus scan and per-asset compression.
- `src/util/asset_manager.cc` — runtime decompression.