Diffstat (limited to 'doc/GEOM_BUFFER.md')
| -rw-r--r-- | doc/GEOM_BUFFER.md | 229 |
1 files changed, 229 insertions, 0 deletions
diff --git a/doc/GEOM_BUFFER.md b/doc/GEOM_BUFFER.md
new file mode 100644
index 0000000..0188125
--- /dev/null
+++ b/doc/GEOM_BUFFER.md
@@ -0,0 +1,229 @@
+# Geometry Buffer Design [IN PROGRESS]
+
+**Status:** Ideation phase
+**Goal:** Efficient G-buffer for deferred rendering in a 64k demo
+
+---
+
+## Overview
+
+Replace direct rendering with geometry buffer accumulation for advanced post-processing and lighting.
+
+**Target:** 8-10 bytes/pixel, 16-bit precision
+
+---
+
+## Buffer Elements
+
+### Core Attributes
+
+| Attribute | Channels | Precision | Source |
+|-----------|----------|-----------|--------|
+| Albedo (RGB) | 3 | f16 | Material/procedural |
+| Roughness | 1 | u8/u16 | PBR material property |
+| Metallic | 1 | u8/u16 | PBR material property |
+| Normal (XYZ) | 2 | f16 | Octahedral encoding |
+| Depth | 1 | f16/f32 | 1/z for precision |
+| Object/Material ID | 1 | u16 | Rasterization/SDF |
+| Transparency | 1 | u8/u16 | Alpha channel |
+
+### Optional/Derived
+
+| Attribute | Storage | Notes |
+|-----------|---------|-------|
+| Depth gradient | On-demand | Compute from depth (finite differences or Sobel) |
+| Laplacian | On-demand | Second derivative of depth |
+| Motion vectors | 2×f16 | Screen-space XY |
+| AO | 1×f16 | Ambient occlusion |
+
+**Key insight:** Depth derivatives are cheaper to compute than to store (2-4 bytes/pixel saved).
+
+---
+
+## Packing Strategies
+
+### Traditional Multi-Render-Target (MRT)
+
+```
+RT0 (RGBA16): Albedo.rgb + Roughness (packed with metallic)
+RT1 (RG16):   Octahedral normal (2 channels encode XYZ)
+RT2 (R32F):   1/z depth (or use hardware depth buffer)
+RT3 (RG16):   Motion vectors XY
+RT4 (R16UI):  Object/Material ID
+```
+
+**Total:** 4-5 render targets = 18-22 bytes/pixel as listed (8 + 4 + 4 + 4 + 2; 18 if RT2 is dropped in favor of the hardware depth buffer), which is above the 8-10 byte target
+
+### Compute Shader + Storage Buffer (RECOMMENDED)
+
+**Advantages:**
+- Custom bit-packing (not bound to RGBA formats)
+- Compute derivatives in-pass (depth gradient, Laplacian)
+- Cache-optimized tiling (Morton order)
+- No MRT limits (store 20+ attributes)
+
+**Tradeoffs:**
+- No hardware depth/early-Z during G-buffer generation
+- Manual atomics when several fragments write the same pixel (overdraw)
+- Loses ROP hardware optimizations
+
+**Struct Layout:**
+```cpp
+struct GBufferPixel {
+    u32 packed_normal; // Octahedral 16+16
+    u32 rgba_rough;    // RGB8 albedo + Roughness8 (32 bits; metallic needs a separate slot)
+    f16 inv_z;         // 1/z depth
+    u16 material_id;   // Object/material
+    // Total: 12 bytes/pixel
+};
+
+// Compressed variant (8 bytes):
+struct CompactGBuffer {
+    u32 normal_depth; // Oct16 normal (8+8) + u16 quantized depth
+    u32 rgba_params;  // RGB565 + Rough4 + Metal4 + Flags4 (28 bits used)
+};
+```
+
+**Access Pattern:**
+```wgsl
+@group(0) @binding(0) var<storage, read_write> g_buffer: array<GBufferPixel>;
+
+fn write_gbuffer(pixel_id: u32, data: SurfaceData) {
+    g_buffer[pixel_id].packed_normal = pack_octahedral(data.normal);
+    g_buffer[pixel_id].rgba_rough = (pack_rgba8(data.albedo) & 0x00FFFFFFu) | (u32(data.roughness * 255.0) << 24u); // roughness replaces the alpha byte
+    g_buffer[pixel_id].inv_z = f16(1.0 / data.depth); // f16 requires `enable f16;`
+    g_buffer[pixel_id].material_id = data.id; // WGSL has no u16: in practice inv_z and the ID share one u32
+}
+```
+
+---
+
+## Normal Encoding
+
+**Octahedral mapping** (most efficient for 2-channel storage):
+- Encodes unit sphere normal to 2D square
+- 16-bit per channel = good precision
+- Fast encode/decode (no trig)
+
+```cpp
+vec2 octahedral_encode(vec3 n) {
+    n /= (abs(n.x) + abs(n.y) + abs(n.z));
+    vec2 p = n.z >= 0.0 ? n.xy : (1.0 - abs(n.yx)) * vec2(n.x >= 0.0 ? 1.0 : -1.0, n.y >= 0.0 ? 1.0 : -1.0); // sign() would return 0 when x or y is 0
+    return p * 0.5 + 0.5; // [0, 1]
+}
+
+vec3 octahedral_decode(vec2 p) {
+    p = p * 2.0 - 1.0; // [-1, 1]
+    vec3 n = vec3(p.x, p.y, 1.0 - abs(p.x) - abs(p.y));
+    float t = max(-n.z, 0.0);
+    n.x += n.x >= 0.0 ? -t : t;
+    n.y += n.y >= 0.0 ? -t : t;
+    return normalize(n);
+}
+```
+
+---
+
+## Depth Storage
+
+**1/z (inverse depth):**
+- Better precision distribution (more bits near camera)
+- Linear in screen space
+- Matches perspective projection
+
+**Alternatives:**
+- Logarithmic depth (more uniform relative precision over large depth ranges)
+- Hardware depth buffer (32-bit float depth, already allocated when rasterizing)
+
+---
+
+## Material Properties
+
+**Roughness/Metallic are NOT geometry:**
+- **Source:** Texture lookups, procedural noise, or constants
+- **Not bump-mapping:** Bump/normal maps perturb normals (geometry)
+- **PBR properties:** Control light interaction (roughness: 0 = smooth, 1 = rough; metallic: 0 = dielectric, 1 = metal)
+
+**Demoscene approach:** Procedural generation or baked constants (avoid textures).
+
+---
+
+## Post-Processing Derivatives
+
+**Compute on-demand** (cheaper than storing):
+
+```wgsl
+// Depth gradient (central differences; a Sobel kernel would also weight the diagonal taps). Assumes `depth` is a texture_2d<f32>.
+fn depth_gradient(p: vec2i) -> vec2f {
+    let dx = textureLoad(depth, p + vec2i(1, 0), 0).r - textureLoad(depth, p - vec2i(1, 0), 0).r;
+    let dy = textureLoad(depth, p + vec2i(0, 1), 0).r - textureLoad(depth, p - vec2i(0, 1), 0).r;
+    return vec2f(dx, dy) * 0.5;
+}
+
+// Laplacian (edge detection), 5-point stencil on integer pixel coordinates
+fn laplacian(p: vec2i) -> f32 {
+    let c = textureLoad(depth, p, 0).r;
+    let n = textureLoad(depth, p + vec2i(0, 1), 0).r;
+    let s = textureLoad(depth, p - vec2i(0, 1), 0).r;
+    let e = textureLoad(depth, p + vec2i(1, 0), 0).r;
+    let w = textureLoad(depth, p - vec2i(1, 0), 0).r;
+    return (n + s + e + w) - 4.0 * c;
+}
+```
+
+---
+
+## Integration with Hybrid Renderer
+
+**Current:** Hybrid SDF raymarching + rasterized proxy geometry
+**Future:** Both write to unified G-buffer
+
+```cpp
+// Rasterization pass
+void rasterize_geometry() {
+    // Vertex shader → fragment shader
+    // Write to G-buffer (compute or MRT)
+}
+
+// SDF raymarching pass (compute)
+void raymarch_sdf() {
+    // Per-pixel ray march
+    // Write to same G-buffer at hit points
+}
+
+// Deferred lighting pass
+void deferred_lighting() {
+    // Read G-buffer
+    // Apply PBR lighting, shadows, etc.
+}
+```
+
+**Atomics handling:** Use depth test or tile-based sorting to avoid conflicts.
+
+---
+
+## Size Budget
+
+**Target:** 1920×1080 @ 8 bytes/pixel ≈ **16.6 MB**
+**Compressed:** 1920×1080 @ 6 bytes/pixel ≈ **12.4 MB**
+
+**Acceptable for 64k demo:** The 64k limit applies to binary size, not runtime RAM, so this footprint is fine.
+
+---
+
+## Next Steps
+
+1. Prototype compute shader G-buffer writer
+2. Implement octahedral normal encoding
+3. Test SDF + raster unified writes
+4. Add deferred lighting pass
+5. Validate depth derivative quality (gradient/Laplacian)
+6. Optimize packing (aim for 6-8 bytes/pixel)
+
+---
+
+## References
+
+- Octahedral mapping: "Survey of Efficient Representations for Independent Unit Vectors" (Cigolle et al., 2014)
+- PBR theory: "Physically Based Rendering" (Pharr, Jakob, Humphreys)
+- G-buffer design: "Deferred Rendering in Killzone 2" (Valient, 2007)
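The Access Pattern snippet in the diff calls `pack_octahedral` but never defines it. Below is a minimal CPU-side sketch in C++ of what such a helper might look like, assuming 16-bit UNORM per channel as the `packed_normal` comment suggests. `Vec3`, `sign_not_zero`, `quantize_unorm16`, and `unpack_octahedral` are hypothetical names introduced here for illustration; none of them come from the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

struct Vec3 { float x, y, z; };  // hypothetical stand-in for the shader vec3

// Sign that never returns 0, so normals with x == 0 or y == 0 still fold correctly.
static float sign_not_zero(float v) { return v >= 0.0f ? 1.0f : -1.0f; }

// Map [-1, 1] to a 16-bit UNORM value in [0, 65535].
static std::uint32_t quantize_unorm16(float v) {
    const float u = std::min(std::max(v * 0.5f + 0.5f, 0.0f), 1.0f);
    return static_cast<std::uint32_t>(std::lround(u * 65535.0f));
}

// Unit normal -> octahedral XY, 16 bits each, X in the low half of the word.
std::uint32_t pack_octahedral(Vec3 n) {
    const float l1 = std::fabs(n.x) + std::fabs(n.y) + std::fabs(n.z);
    assert(l1 > 0.0f);  // assumption: callers never pass a zero vector
    float px = n.x / l1;
    float py = n.y / l1;
    if (n.z < 0.0f) {  // fold the lower hemisphere over the diagonals
        const float fx = (1.0f - std::fabs(py)) * sign_not_zero(px);
        const float fy = (1.0f - std::fabs(px)) * sign_not_zero(py);
        px = fx;
        py = fy;
    }
    return quantize_unorm16(px) | (quantize_unorm16(py) << 16);
}

// Inverse of pack_octahedral, mirroring octahedral_decode from the design doc.
Vec3 unpack_octahedral(std::uint32_t packed) {
    const float px = static_cast<float>(packed & 0xFFFFu) / 65535.0f * 2.0f - 1.0f;
    const float py = static_cast<float>(packed >> 16) / 65535.0f * 2.0f - 1.0f;
    float nx = px;
    float ny = py;
    const float nz = 1.0f - std::fabs(px) - std::fabs(py);
    const float t = std::max(-nz, 0.0f);
    nx += nx >= 0.0f ? -t : t;
    ny += ny >= 0.0f ? -t : t;
    const float len = std::sqrt(nx * nx + ny * ny + nz * nz);
    return Vec3{nx / len, ny / len, nz / len};
}
```

A round trip such as `unpack_octahedral(pack_octahedral(Vec3{0.0f, 0.0f, -1.0f}))` is a useful first check, because the straight-down normal is exactly the case where a plain `sign()` (which returns 0 at 0) collapses onto the straight-up encoding.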
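The Size Budget figures can be cross-checked at compile time. The sketch below mirrors the two struct layouts from the design doc on the host and multiplies them out for 1920×1080. It assumes the `f16` field is kept as a raw 16-bit pattern on the CPU side; `inv_z_bits` and `gbuffer_bytes` are hypothetical names, and the struct sizes hold on common ABIs without any packing pragma.

```cpp
#include <cstdint>

// Host-side mirror of GBufferPixel: 4 + 4 + 2 + 2 bytes.
struct GBufferPixel {
    std::uint32_t packed_normal;
    std::uint32_t rgba_rough;
    std::uint16_t inv_z_bits;   // assumption: f16 stored as its bit pattern
    std::uint16_t material_id;
};

// Host-side mirror of the compressed variant: 4 + 4 bytes.
struct CompactGBuffer {
    std::uint32_t normal_depth;
    std::uint32_t rgba_params;
};

// Total buffer size in bytes for a given resolution and per-pixel footprint.
constexpr std::uint64_t gbuffer_bytes(std::uint32_t width, std::uint32_t height,
                                      std::uint32_t bytes_per_pixel) {
    return static_cast<std::uint64_t>(width) * height * bytes_per_pixel;
}

static_assert(sizeof(GBufferPixel) == 12, "full layout is 12 bytes/pixel");
static_assert(sizeof(CompactGBuffer) == 8, "compact layout is 8 bytes/pixel");
static_assert(gbuffer_bytes(1920, 1080, sizeof(CompactGBuffer)) == 16588800ULL,
              "8 bytes/pixel at 1080p is about 16.6 MB");
static_assert(gbuffer_bytes(1920, 1080, sizeof(GBufferPixel)) == 24883200ULL,
              "12 bytes/pixel at 1080p is about 24.9 MB");
```

Keeping these checks next to the struct definitions means a layout change that silently grows the per-pixel footprint fails the build instead of surfacing later as extra bandwidth.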
