From a160cc797afb4291d356bdc0cbcf0f110e3ef8a9 Mon Sep 17 00:00:00 2001
From: skal
Date: Thu, 19 Mar 2026 23:11:33 +0100
Subject: docs(cnn_v3): full design doc — U-Net + FiLM architecture plan
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- CNN_V3.md: complete design document
  - U-Net enc_channels=[4,8], ~5 KB f16 weights
  - FiLM conditioning (5D → γ/β per level, CPU-side MLP)
  - 20-channel feature buffer, 32 bytes/pixel: two rgba32uint textures
    - feat_tex0: albedo.rgb, normal.xy, depth, depth_grad.xy (f16)
    - feat_tex1: mat_id, prev.rgb, mip1.rgb, mip2.rgb, shadow, transp (u8)
  - 4-pass G-buffer: raster MRT + SDF compute + lighting + pack
  - Per-pixel parity framework: PyTorch / HTML WebGPU / C++ WebGPU (≤1/255)
  - Training pipelines: Blender full G-buffer + photo-only (channel dropout)
  - train_cnn_v3_full.sh spec (modelled on v2 script)
  - HTML tool adaptation plan from cnn_v2/tools/cnn_v2_test/index.html
  - Binary format v3 header spec
  - 8-phase ordered implementation checklist
- TODO.md: add CNN v3 U-Net+FiLM future task with phases
- cnn_v3/README.md: update status to design phase

handoff(Gemini): CNN v3 design complete. Phase 0 (stub G-buffer) unblocks
all other phases — one compute shader writing feat_tex0+feat_tex1 with
synthetic values from the current framebuffer. See cnn_v3/docs/CNN_V3.md
Implementation Checklist.
---
 TODO.md | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

(limited to 'TODO.md')

diff --git a/TODO.md b/TODO.md
index aca1257..0ced5e8 100644
--- a/TODO.md
+++ b/TODO.md
@@ -60,7 +60,23 @@ Ongoing shader code hygiene for granular, reusable snippets.
 
 ---
 
-## Future: CNN v3 8-bit Quantization
+## Future: CNN v3 — U-Net + FiLM
+
+U-Net architecture with FiLM conditioning. Runtime style control via beat/audio.
+Richer G-buffer input (normals, depth, material IDs). Per-pixel testability across
+PyTorch / HTML WebGPU / C++ WebGPU.
+
+**Prerequisites:** G-buffer implementation (GEOM_BUFFER.md)
+**Design:** `cnn_v3/docs/CNN_V3.md`
+
+**Phases:**
+1. G-buffer prerequisite
+2. Training infrastructure (Blender exporter + photo pipeline)
+3. WGSL shaders (enc/dec/bottleneck, deterministic ops)
+4. C++ effect class + FiLM uniform upload
+5. Parity validation (test vectors, ≤1/255 per pixel)
+
+## Future: CNN v2 8-bit Quantization
 
 Reduce weights from f16 (~3.2 KB) to i8 (~1.6 KB).
 
-- 
cgit v1.2.3
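
The 20-channel, 32 bytes/pixel figure in the commit message can be checked with a little arithmetic: 8 f16 channels fill one 16-byte rgba32uint texel, and 12 u8 channels occupy 12 of the 16 bytes in the second. The Python sketch below illustrates that layout under assumptions the commit does not state: the channel order within each texture, little-endian packing, and zero padding for the 4 spare bytes are guesses, and the helper names (`pack_tex0`, `pack_tex1`, `unpack_tex0`, `unpack_tex1`) are hypothetical. It is not the WGSL pack pass described in the design doc.

```python
import struct

# Channel names come from the commit message. Their order inside each
# texture is an assumption made for illustration only.
FEAT_TEX0_F16 = ["albedo.r", "albedo.g", "albedo.b", "normal.x", "normal.y",
                 "depth", "depth_grad.x", "depth_grad.y"]
FEAT_TEX1_U8 = ["mat_id", "prev.r", "prev.g", "prev.b",
                "mip1.r", "mip1.g", "mip1.b",
                "mip2.r", "mip2.g", "mip2.b", "shadow", "transp"]

TEXEL_BYTES = 16  # one rgba32uint texel = 4 x 32-bit words


def pack_tex0(values):
    """Pack 8 floats as f16 into the 4 u32 words of one texel (hypothetical helper)."""
    raw = struct.pack("<8e", *values)  # 8 channels x 2 bytes = 16 bytes
    return struct.unpack("<4I", raw)


def pack_tex1(values):
    """Pack 12 u8 values into one texel; the 4 spare bytes are zeroed (assumed padding)."""
    raw = struct.pack("<12B", *values) + b"\x00" * 4
    return struct.unpack("<4I", raw)


def unpack_tex0(words):
    """Recover the 8 f16 channels from 4 u32 words."""
    return struct.unpack("<8e", struct.pack("<4I", *words))


def unpack_tex1(words):
    """Recover the 12 u8 channels, ignoring the 4 padding bytes."""
    return struct.unpack("<12B", struct.pack("<4I", *words)[:12])


if __name__ == "__main__":
    channels = len(FEAT_TEX0_F16) + len(FEAT_TEX1_U8)
    print(channels, "channels,", 2 * TEXEL_BYTES, "bytes/pixel")
```

Under these assumptions the second texture leaves one 32-bit word unused per pixel.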