1 files changed, 229 insertions, 0 deletions
diff --git a/doc/GEOM_BUFFER.md b/doc/GEOM_BUFFER.md
new file mode 100644
index 0000000..0188125
--- /dev/null
+++ b/doc/GEOM_BUFFER.md
@@ -0,0 +1,229 @@
+# Geometry Buffer Design [IN PROGRESS]
+
+**Status:** Ideation phase
+**Goal:** Efficient G-buffer for deferred rendering in 64k demo
+
+---
+
+## Overview
+
+Replace direct rendering with accumulation into a geometry buffer, so that lighting and post-processing run as separate deferred passes.
+
+**Target:** 8-10 bytes/pixel, 16-bit precision
+
+---
+
+## Buffer Elements
+
+### Core Attributes
+
+| Attribute | Channels | Precision | Source |
+|-----------|----------|-----------|--------|
+| Albedo (RGB) | 3 | f16 | Material/procedural |
+| Roughness | 1 | u8/u16 | PBR material property |
+| Metallic | 1 | u8/u16 | PBR material property |
+| Normal (XYZ) | 2 | f16 | Octahedral encoding |
+| Depth | 1 | f16/f32 | 1/z for precision |
+| Object/Material ID | 1 | u16 | Rasterization/SDF |
+| Transparency | 1 | u8/u16 | Alpha channel |
+
+### Optional/Derived
+
+| Attribute | Storage | Notes |
+|-----------|---------|-------|
+| Depth gradient | On-demand | Compute from depth (central differences or Sobel) |
+| Laplacian | On-demand | Second derivative of depth |
+| Motion vectors | 2×f16 | Screen-space XY |
+| AO | 1×f16 | Ambient occlusion |
+
+**Key insight:** Depth derivatives are cheaper to compute on demand than to store (saves 2-4 bytes/pixel).
+
+---
+
+## Packing Strategies
+
+### Traditional Multi-Render-Target (MRT)
+
+```
+RT0 (RGBA16): Albedo.rgb + Roughness (packed with metallic)
+RT1 (RG16):   Octahedral normal (2 channels encode XYZ)
+RT2 (R32F):   1/z depth (or use hardware depth buffer)
+RT3 (RG16):   Motion vectors XY
+RT4 (R16UI):  Object/Material ID
+```
+
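+**Total:** 5 render targets = 22 bytes/pixel (8 + 4 + 4 + 4 + 2), or 14 with the hardware depth buffer and no motion vectors; both exceed the 8-10 byte target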
+
+### Compute Shader + Storage Buffer (RECOMMENDED)
+
+**Advantages:**
+- Custom bit-packing (not bound to RGBA formats)
+- Compute derivatives in-pass (depth gradient, Laplacian)
+- Cache-optimized tiling (Morton order)
+- No MRT limits (store 20+ attributes)
+
+**Tradeoffs:**
+- No hardware depth/early-Z during G-buffer generation
+- Manual atomics needed where multiple writes hit the same pixel (overdraw)
+- Lose ROP hardware optimizations (fixed-function depth test and blending)
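+
+The Morton-order tiling listed under Advantages can be sketched as a bit-interleaving index function. This is a minimal illustration, not the demo's actual addressing scheme; `part1by1` and `morton2d` are hypothetical helper names, and the sketch assumes coordinates below 65536.
+
+```cpp
+#include <cstdint>
+
+// Spread the low 16 bits of v so that bit i moves to bit 2*i.
+inline uint32_t part1by1(uint32_t v) {
+  v &= 0x0000FFFFu;
+  v = (v | (v << 8)) & 0x00FF00FFu;
+  v = (v | (v << 4)) & 0x0F0F0F0Fu;
+  v = (v | (v << 2)) & 0x33333333u;
+  v = (v | (v << 1)) & 0x55555555u;
+  return v;
+}
+
+// Morton (Z-order) index: x occupies the even bits, y the odd bits.
+inline uint32_t morton2d(uint32_t x, uint32_t y) {
+  return part1by1(x) | (part1by1(y) << 1);
+}
+```
+
+A full-screen Morton index only stays dense for power-of-two square dimensions, so for 1920×1080 one option is to apply it within small tiles (for example 8×8) and keep the tiles themselves in row-major order.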
+
+**Struct Layout:**
+```cpp
+struct GBufferPixel {
+  u32 packed_normal;      // Octahedral 16+16
+  u32 rgba_rough;         // Albedo RGB8 + Roughness8 (32 bits used); metallic does not fit here
+  f16 inv_z;              // 1/z depth
+  u16 material_id;        // Object/material
+  // Total: 12 bytes/pixel
+};
+
+// Compressed variant (8 bytes):
+struct CompactGBuffer {
+  u32 normal_depth;       // Oct16 normal + u16 quantized depth
+  u32 rgba_params;        // RGB565 + Rough4 + Metal4 + Flags4
+};
+```
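+
+To make the `rgba_params` packing concrete, here is a minimal sketch of one possible bit layout. The bit positions, the `Material` struct, and the helper names are assumptions for illustration; the document only fixes the field widths (RGB565 + Rough4 + Metal4 + Flags4, 28 bits used).
+
+```cpp
+#include <algorithm>
+#include <cmath>
+#include <cstdint>
+
+// Assumed layout: bits 0-4 R | 5-10 G | 11-15 B | 16-19 roughness |
+// 20-23 metallic | 24-27 flags | 28-31 spare.
+struct Material {
+  float r, g, b;     // albedo in [0, 1]
+  float roughness;   // [0, 1]
+  float metallic;    // [0, 1]
+  uint32_t flags;    // only the low 4 bits are stored
+};
+
+// Quantize a [0, 1] value to `bits` bits, rounding to nearest.
+inline uint32_t quantize(float v, int bits) {
+  const float max_value = float((1u << bits) - 1u);
+  const float clamped = std::min(std::max(v, 0.0f), 1.0f);
+  return uint32_t(std::lround(clamped * max_value));
+}
+
+inline float dequantize(uint32_t q, int bits) {
+  return float(q) / float((1u << bits) - 1u);
+}
+
+inline uint32_t pack_rgba_params(const Material& m) {
+  return quantize(m.r, 5) | (quantize(m.g, 6) << 5) | (quantize(m.b, 5) << 11) |
+         (quantize(m.roughness, 4) << 16) | (quantize(m.metallic, 4) << 20) |
+         ((m.flags & 0xFu) << 24);
+}
+
+inline Material unpack_rgba_params(uint32_t p) {
+  return Material{dequantize(p & 0x1Fu, 5), dequantize((p >> 5) & 0x3Fu, 6),
+                  dequantize((p >> 11) & 0x1Fu, 5), dequantize((p >> 16) & 0xFu, 4),
+                  dequantize((p >> 20) & 0xFu, 4), (p >> 24) & 0xFu};
+}
+```
+
+With 4 bits, roughness and metallic each get 16 levels, which may band visibly on smooth gradients; whether that is acceptable is something the prototype would need to confirm.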
+
+**Access Pattern:**
+```wgsl
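+// WGSL has no 16-bit integer type, so 1/z (as f16 bits) and the material ID share one u32.
+struct GBufferPixel {
+  packed_normal: u32,  // octahedral 16+16
+  rgba_rough: u32,     // albedo RGB8 in bits 0-23, roughness in bits 24-31
+  depth_id: u32,       // f16 1/z in bits 0-15, material ID in bits 16-31
+}
+@group(0) @binding(0) var<storage, read_write> g_buffer: array<GBufferPixel>;
+
+// SurfaceData (albedo: vec3f, id: u32, ...) and pack_octahedral are assumed to be defined elsewhere.
+fn write_gbuffer(pixel_id: u32, data: SurfaceData) {
+  g_buffer[pixel_id].packed_normal = pack_octahedral(data.normal);
+  g_buffer[pixel_id].rgba_rough = pack4x8unorm(vec4f(data.albedo, data.roughness));
+  let inv_z_bits = pack2x16float(vec2f(1.0 / data.depth, 0.0)) & 0xffffu;
+  g_buffer[pixel_id].depth_id = inv_z_bits | ((data.id & 0xffffu) << 16u);
+}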
+```
+
+---
+
+## Normal Encoding
+
+**Octahedral mapping** (most efficient for 2-channel storage):
+- Encodes unit sphere normal to 2D square
+- 16-bit per channel = good precision
+- Fast encode/decode (no trig)
+
+```cpp
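+// sign() returns 0 for a zero component, which would misplace lower-hemisphere
+// normals with x == 0 or y == 0, so the fold uses a sign that is never zero.
+vec2 sign_not_zero(vec2 v) {
+  return vec2(v.x >= 0.0 ? 1.0 : -1.0, v.y >= 0.0 ? 1.0 : -1.0);
+}
+
+vec2 octahedral_encode(vec3 n) {
+  n /= (abs(n.x) + abs(n.y) + abs(n.z));
+  vec2 p = n.z >= 0.0 ? n.xy : (1.0 - abs(n.yx)) * sign_not_zero(n.xy);
+  return p * 0.5 + 0.5; // [0, 1]
+}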
+
+vec3 octahedral_decode(vec2 p) {
+  p = p * 2.0 - 1.0; // [-1, 1]
+  vec3 n = vec3(p.x, p.y, 1.0 - abs(p.x) - abs(p.y));
+  float t = max(-n.z, 0.0);
+  n.x += n.x >= 0.0 ? -t : t;
+  n.y += n.y >= 0.0 ? -t : t;
+  return normalize(n);
+}
+```
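+
+The shader-style functions above return floats; the `packed_normal` field additionally needs 16+16-bit quantization. The following CPU-side sketch shows one way that could look, mainly as a reference for validating precision. `Vec3`, `to_unorm16`, `from_unorm16`, and `unpack_octahedral` are hypothetical names, and placing x in the low half is an assumption.
+
+```cpp
+#include <algorithm>
+#include <cmath>
+#include <cstdint>
+
+struct Vec3 { float x, y, z; };
+
+inline float sign_not_zero(float v) { return v >= 0.0f ? 1.0f : -1.0f; }
+
+// Map a [-1, 1] coordinate to a 16-bit unsigned value, rounding to nearest.
+inline uint32_t to_unorm16(float v) {
+  const float u = std::min(std::max(v * 0.5f + 0.5f, 0.0f), 1.0f);
+  return uint32_t(std::lround(u * 65535.0f));
+}
+
+inline float from_unorm16(uint32_t q) { return float(q) / 65535.0f * 2.0f - 1.0f; }
+
+// Encode a unit normal into 16+16 bits (x in the low half, y in the high half).
+inline uint32_t pack_octahedral(Vec3 n) {
+  const float inv_l1 = 1.0f / (std::fabs(n.x) + std::fabs(n.y) + std::fabs(n.z));
+  float px = n.x * inv_l1, py = n.y * inv_l1;
+  if (n.z < 0.0f) {  // fold the lower hemisphere over the diagonals
+    const float fx = (1.0f - std::fabs(py)) * sign_not_zero(px);
+    const float fy = (1.0f - std::fabs(px)) * sign_not_zero(py);
+    px = fx;
+    py = fy;
+  }
+  return to_unorm16(px) | (to_unorm16(py) << 16);
+}
+
+// Decode back to a unit normal.
+inline Vec3 unpack_octahedral(uint32_t packed) {
+  const float px = from_unorm16(packed & 0xFFFFu);
+  const float py = from_unorm16(packed >> 16);
+  const float nz = 1.0f - std::fabs(px) - std::fabs(py);
+  const float t = std::max(-nz, 0.0f);
+  const float nx = px + (px >= 0.0f ? -t : t);
+  const float ny = py + (py >= 0.0f ? -t : t);
+  const float inv_len = 1.0f / std::sqrt(nx * nx + ny * ny + nz * nz);
+  return Vec3{nx * inv_len, ny * inv_len, nz * inv_len};
+}
+```
+
+Running a round trip over a set of test normals and measuring the largest component error is a cheap way to check whether 16+16 bits is more than needed before trying the 8+8 variant used by `CompactGBuffer`.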
+
+---
+
+## Depth Storage
+
+**1/z (inverse depth):**
+- Better precision distribution (more bits near camera)
+- Linear in screen space
+- Matches perspective projection
+
+**Alternatives:**
+- Logarithmic depth (more uniform relative precision over a large depth range)
+- Hardware depth buffer (32-bit float, comes for free when rasterizing to render targets)
+
+---
+
+## Material Properties
+
+**Roughness/Metallic are NOT geometry:**
+- **Source:** Texture lookups, procedural noise, or constants
+- **Not bump-mapping:** Bump/normal maps perturb normals (geometry)
+- **PBR properties:** Control light interaction (roughness: 0 = smooth, 1 = rough; metallic: 0 = dielectric, 1 = metal)
+
+**Demoscene approach:** Procedural generation or baked constants (avoid textures).
+
+---
+
+## Post-Processing Derivatives
+
+**Compute on-demand** (cheaper than storing):
+
+```wgsl
+// Assumes `depth` is a texture_2d<f32> storing 1/z in .r; p is an integer texel coordinate.
+// Callers should keep p at least one texel inside the border.
+fn load_depth(p: vec2i) -> f32 {
+  return textureLoad(depth, p, 0).r;
+}
+
+// Depth gradient (central differences; a full Sobel kernel would also weight the diagonals)
+fn depth_gradient(p: vec2i) -> vec2f {
+  let dx = load_depth(p + vec2i(1, 0)) - load_depth(p - vec2i(1, 0));
+  let dy = load_depth(p + vec2i(0, 1)) - load_depth(p - vec2i(0, 1));
+  return vec2f(dx, dy) * 0.5;
+}
+
+// Laplacian (5-point stencil, for edge detection)
+fn laplacian(p: vec2i) -> f32 {
+  let c = load_depth(p);
+  let n = load_depth(p + vec2i(0, 1));
+  let s = load_depth(p - vec2i(0, 1));
+  let e = load_depth(p + vec2i(1, 0));
+  let w = load_depth(p - vec2i(1, 0));
+  return (n + s + e + w) - 4.0 * c;
+}
+```
+
+---
+
+## Integration with Hybrid Renderer
+
+**Current:** Hybrid SDF raymarching + rasterized proxy geometry
+**Future:** Both write to unified G-buffer
+
+```cpp
+// Rasterization pass
+void rasterize_geometry() {
+  // Vertex shader → fragment shader
+  // Write to G-buffer (compute or MRT)
+}
+
+// SDF raymarching pass (compute)
+void raymarch_sdf() {
+  // Per-pixel ray march
+  // Write to same G-buffer at hit points
+}
+
+// Deferred lighting pass
+void deferred_lighting() {
+  // Read G-buffer
+  // Apply PBR lighting, shadows, etc.
+}
+```
+
+**Atomics handling:** Where both passes can write the same pixel, resolve conflicts with a per-pixel depth test or tile-based sorting.
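+
+One way to realize the per-pixel depth test with atomics is to put the depth in the high bits of a single word and keep the maximum. The sketch below is a CPU-side illustration of that idea, not the planned GPU implementation; `write_nearest` is a hypothetical helper, and it assumes depth is a quantized 1/z where larger means closer.
+
+```cpp
+#include <atomic>
+#include <cstdint>
+
+// Nearest surface wins: with 1/z in the high 32 bits, the numerically largest
+// word belongs to the closest writer, so an atomic max keeps it.
+inline void write_nearest(std::atomic<uint64_t>& slot, uint32_t inv_z_bits, uint32_t payload) {
+  const uint64_t candidate = (uint64_t(inv_z_bits) << 32) | payload;
+  uint64_t current = slot.load(std::memory_order_relaxed);
+  // Retry until our value is stored or a closer surface is already present.
+  while (candidate > current &&
+         !slot.compare_exchange_weak(current, candidate, std::memory_order_relaxed)) {
+  }
+}
+```
+
+Core WGSL only provides 32-bit atomics, so on WebGPU this would likely become an `atomicMax` on a 32-bit depth word plus a second pass that writes attributes for the winning surface.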
+
+---
+
+## Size Budget
+
+**Target:** 1920×1080 @ 8 bytes/pixel ≈ **16.6 MB**
+**Compressed:** 1920×1080 @ 6 bytes/pixel ≈ **12.4 MB**
+
+**Acceptable for 64k demo:** The 64k limit applies to binary size, not runtime memory, so this footprint is fine.
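+
+The budget figures follow directly from width × height × bytes per pixel. A small compile-time check, with `gbuffer_bytes` as a hypothetical helper name, makes the arithmetic explicit and also covers the 12-byte `GBufferPixel` layout:
+
+```cpp
+#include <cstdint>
+
+// Total G-buffer size in bytes for a given resolution and per-pixel footprint.
+constexpr uint64_t gbuffer_bytes(uint64_t width, uint64_t height, uint64_t bytes_per_pixel) {
+  return width * height * bytes_per_pixel;
+}
+
+static_assert(gbuffer_bytes(1920, 1080, 8) == 16588800, "8 B/px at 1080p");
+static_assert(gbuffer_bytes(1920, 1080, 6) == 12441600, "6 B/px at 1080p");
+static_assert(gbuffer_bytes(1920, 1080, 12) == 24883200, "12 B/px GBufferPixel layout");
+```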
+
+---
+
+## Next Steps
+
+1. Prototype compute shader G-buffer writer
+2. Implement octahedral normal encoding
+3. Test SDF + raster unified writes
+4. Add deferred lighting pass
+5. Validate depth derivative quality (gradient/Laplacian)
+6. Optimize packing (aim for 6-8 bytes/pixel)
+
+---
+
+## References
+
+- Octahedral mapping: "A Survey of Efficient Representations for Independent Unit Vectors" (Cigolle et al., 2014)
+- PBR theory: "Physically Based Rendering" (Pharr, Jakob, Humphreys)
+- G-buffer design: "Deferred Rendering in Killzone 2" (Valient, 2007)