| author | skal <pascal.massimino@gmail.com> | 2026-03-20 09:22:18 +0100 |
|---|---|---|
| committer | skal <pascal.massimino@gmail.com> | 2026-03-20 09:22:18 +0100 |
| commit | a10cabbe3a5ae05730c2e76493e42554ee6037ba (patch) | |
| tree | 5c2fbb717f53dc701b1536b4f98685bc042820e1 /cnn_v3 | |
| parent | f74bcd843c631f82daefe543fca7741fb5bb71f4 (diff) | |
feat(cnn_v3): Phase 1 complete - GBufferEffect integrated + HOWTO playbook
- Wire GBufferEffect into demo build: assets.txt, DemoSourceLists.cmake,
demo_effects.h, shaders.h/cc. ShaderComposer::Compose() applied to
gbuf_raster.wgsl (resolves #include "common_uniforms").
- Add GBufferEffect construction test. 35/35 passing.
- Write cnn_v3/docs/HOWTO.md: G-buffer wiring, training data prep,
training plan, per-pixel validation workflow, phase status table,
troubleshooting guide.
- Add project hooks: remind to update HOWTO.md on cnn_v3/ edits;
warn on direct str_view(*_wgsl) usage bypassing ShaderComposer.
- Update PROJECT_CONTEXT.md and TODO.md: Phase 1 done,
Phase 3 (WGSL U-Net shaders) is next active.
handoff(Gemini): CNN v3 Phase 3 is next - WGSL enc/dec/bottleneck/FiLM
shaders in cnn_v3/shaders/. See cnn_v3/docs/CNN_V3.md Architecture
section and cnn_v3/docs/HOWTO.md section 3 for spec. GBufferEffect
outputs feat_tex0 + feat_tex1 (rgba32uint, 20ch, 32 bytes/pixel).
C++ CNNv3Effect (Phase 4) takes those as input nodes.
Diffstat (limited to 'cnn_v3')
| -rw-r--r-- | cnn_v3/README.md | 4 |
| -rw-r--r-- | cnn_v3/docs/HOWTO.md | 235 |
| -rw-r--r-- | cnn_v3/src/gbuffer_effect.cc | 10 |
3 files changed, 246 insertions, 3 deletions
diff --git a/cnn_v3/README.md b/cnn_v3/README.md
index a22d823..f161bf4 100644
--- a/cnn_v3/README.md
+++ b/cnn_v3/README.md
@@ -31,7 +31,9 @@ Add images directly to these directories and commit them.
 
 ## Status
 
-**Design phase.** Architecture defined, G-buffer prerequisite pending.
+**Phase 1 complete.** G-buffer integrated (raster + pack), 35/35 tests pass.
+Training infrastructure ready. U-Net WGSL shaders are next.
+See `cnn_v3/docs/HOWTO.md` for the practical playbook.
 
 See `cnn_v3/docs/CNN_V3.md` for full design.
 See `cnn_v2/` for reference implementation.
diff --git a/cnn_v3/docs/HOWTO.md b/cnn_v3/docs/HOWTO.md
new file mode 100644
index 0000000..88d4bbc
--- /dev/null
+++ b/cnn_v3/docs/HOWTO.md
@@ -0,0 +1,235 @@
+# CNN v3 How-To
+
+Practical playbook for the CNN v3 pipeline: G-buffer effect, training data,
+training the U-Net+FiLM network, and wiring everything into the demo.
+
+See `CNN_V3.md` for the full architecture design.
+
+---
+
+## 1. Using GBufferEffect in the Demo
+
+`GBufferEffect` is a full-class effect (Path B in `doc/EFFECT_WORKFLOW.md`).
+It rasterizes proxy geometry to MRT G-buffer textures and packs them into two
+`rgba32uint` feature textures (`feat_tex0`, `feat_tex1`) consumed by the CNN.
+
+### Registration (already done)
+
+- Shaders in `assets.txt`: `SHADER_GBUF_RASTER`, `SHADER_GBUF_PACK`
+- Source in `cmake/DemoSourceLists.cmake`: `cnn_v3/src/gbuffer_effect.cc`
+- Header included in `src/gpu/demo_effects.h`
+- Test in `src/tests/gpu/test_demo_effects.cc`
+
+### Adding to a Sequence
+
+`GBufferEffect` does not exist in `seq_compiler.py` as a named effect yet
+(no `.seq` syntax integration for Phase 1). Wire it directly in C++ alongside
+your scene code, or add it to the timeline when the full CNNv3Effect is ready.
+
+**C++ wiring example** (e.g. inside a Sequence or main.cc):
+
+```cpp
+#include "../../cnn_v3/src/gbuffer_effect.h"
+
+// Allocate once alongside your scene
+auto gbuf = std::make_shared<GBufferEffect>(
+    ctx, /*inputs=*/{"prev_cnn"},  // or any dummy node
+    /*outputs=*/{"gbuf_feat0", "gbuf_feat1"},
+    /*start=*/0.0f, /*end=*/60.0f);
+
+gbuf->set_scene(&my_scene, &my_camera);
+
+// In render loop, call before CNN pass:
+gbuf->render(encoder, params, nodes);
+```
+
+### Internal passes
+
+Each frame, `GBufferEffect::render()` executes:
+
+1. **Pass 1 — MRT rasterization** (`gbuf_raster.wgsl`)
+   - Proxy box (36 verts) × N objects, instanced
+   - MRT outputs: `gbuf_albedo` (rgba16float), `gbuf_normal_mat` (rgba16float)
+   - Depth test + write into `gbuf_depth` (depth32float)
+
+2. **Pass 2/3 — SDF + Lighting** — TODO (placeholder: shadow=1, transp=0)
+
+3. **Pass 4 — Pack compute** (`gbuf_pack.wgsl`)
+   - Reads all G-buffer textures + `prev_cnn` input
+   - Writes `feat_tex0` + `feat_tex1` (rgba32uint, 20 channels, 32 bytes/pixel)
+
+### Output node names
+
+By default the outputs are named from the `outputs` vector passed to the
+constructor. Use these names when binding the CNN effect input:
+
+```
+outputs[0] → feat_tex0 (rgba32uint: albedo.rgb, normal.xy, depth, depth_grad.xy)
+outputs[1] → feat_tex1 (rgba32uint: mat_id, prev.rgb, mip1.rgb, mip2.rgb, shadow, transp)
+```
+
+### Scene data
+
+Call `set_scene(scene, camera)` before the first render. The effect uploads
+`GlobalUniforms` (view-proj, camera pos, resolution) and `ObjectData` (model
+matrix, color) to GPU storage buffers each frame.
+
+---
+
+## 2. Preparing Training Data
+
+CNN v3 supports two data sources: Blender renders and real photos.
+
+### 2a. From Blender Renders
+
+```bash
+# 1. In Blender: run the export script (requires Blender 3.x+)
+blender --background scene.blend --python cnn_v3/training/blender_export.py \
+    -- --output /tmp/renders/ --frames 200
+
+# 2. Pack into sample directory
+python3 cnn_v3/training/pack_blender_sample.py \
+    --render-dir /tmp/renders/frame_0001/ \
+    --output dataset/blender/sample_0001/
+```
+
+Each sample directory contains:
+```
+sample_XXXX/
+  albedo.png — RGB uint8 (material color, pre-lighting)
+  normal.png — RG uint8 (oct-encoded XY, remap [0,1])
+  depth.png  — R uint16 (1/z normalized, 16-bit)
+  matid.png  — R uint8 (object index / 255)
+  shadow.png — R uint8 (0=dark, 255=lit)
+  transp.png — R uint8 (0=opaque, 255=transparent)
+  target.png — RGB/RGBA (stylized ground truth)
+```
+
+### 2b. From Real Photos
+
+Geometric channels are zeroed; the network degrades gracefully due to
+channel-dropout training.
+
+```bash
+python3 cnn_v3/training/pack_photo_sample.py \
+    --photo cnn_v3/training/input/photo1.jpg \
+    --output dataset/photos/sample_001/
+```
+
+The output `target.png` defaults to the input photo (no style). Copy in
+your stylized version as `target.png` before training.
+
+### Dataset layout
+
+```
+dataset/
+  blender/
+    sample_0001/ sample_0002/ ...
+  photos/
+    sample_001/ sample_002/ ...
+```
+
+Mix freely; the dataloader treats all sample directories uniformly.
+
+---
+
+## 3. Training
+
+*(Network not yet implemented — this section will be filled as Phase 3+ lands.)*
+
+**Planned command:**
+```bash
+python3 cnn_v3/training/train_cnn_v3.py \
+    --dataset dataset/ \
+    --epochs 500 \
+    --output cnn_v3/weights/cnn_v3_weights.bin
+```
+
+**FiLM conditioning** during training:
+- Beat/audio inputs are randomized per sample
+- Network learns to produce varied styles from same geometry
+
+**Validation:**
+```bash
+python3 cnn_v3/training/train_cnn_v3.py --validate \
+    --checkpoint cnn_v3/weights/cnn_v3_weights.bin \
+    --input test_frame.png
+```
+
+---
+
+## 4. Running the CNN v3 Effect (Future)
+
+Once the C++ CNNv3Effect exists:
+
+```seq
+# BPM 120
+SEQUENCE 0 0 "Scene with CNN v3"
+  EFFECT
+    GBufferEffect prev_cnn -> gbuf_feat0 gbuf_feat1 0 60
+  EFFECT
+    CNNv3Effect gbuf_feat0 gbuf_feat1 -> sink 0 60
+```
+
+FiLM parameters are uploaded via uniform each frame:
+```cpp
+cnn_v3_effect->set_film_params(
+    params.beat_phase, params.beat_time / 8.0f, params.audio_intensity,
+    style_p0, style_p1);
+```
+
+---
+
+## 5. Per-Pixel Validation
+
+The CNN v3 design requires exact parity between PyTorch, WGSL (HTML), and C++.
+
+*(Validation tooling not yet implemented.)*
+
+**Planned workflow:**
+1. Export test input + weights as JSON
+2. Run Python reference → save per-pixel output
+3. Run HTML WebGPU tool → compare against Python
+4. Run C++ `cnn_v3_test` tool → compare against Python
+5. All comparisons must pass at ≤ 1/255 per pixel
+
+---
+
+## 6. Phase Status
+
+| Phase | Status | Notes |
+|-------|--------|-------|
+| 1 — G-buffer (raster + pack) | ✅ Done | Integrated, 35/35 tests pass |
+| 1 — G-buffer (SDF + shadow passes) | TODO | Placeholder in place |
+| 2 — Training infrastructure | ✅ Done | blender_export.py, pack_*_sample.py |
+| 3 — WGSL U-Net shaders | TODO | enc/dec/bottleneck/FiLM |
+| 4 — C++ CNNv3Effect | TODO | FiLM uniform upload |
+| 5 — Parity validation | TODO | Test vectors, ≤1/255 |
+
+---
+
+## 7. Quick Troubleshooting
+
+**GBufferEffect renders nothing / albedo is black**
+- Check `set_scene()` was called before `render()`
+- Verify scene has at least one object
+- Check camera matrix is not degenerate (near/far, aspect)
+
+**Pack shader fails to compile**
+- `gbuf_pack.wgsl` uses no `#include`s; ShaderComposer compose is a no-op
+- Check `ASSET_SHADER_GBUF_PACK` resolves in assets.txt
+
+**Raster shader fails with `#include "common_uniforms"` error**
+- `ShaderComposer::Get().Compose({"common_uniforms"}, src)` must be called
+  before passing to `wgpuDeviceCreateShaderModule` — already done in effect.cc
+
+**G-buffer outputs wrong resolution**
+- `resize()` is not yet implemented in GBufferEffect; textures are fixed
+  at construction size. Will be added when resize support is needed.
+
+---
+
+## See Also
+
+- `cnn_v3/docs/CNN_V3.md` — Full architecture design (U-Net, FiLM, feature layout)
+- `doc/EFFECT_WORKFLOW.md` — General effect integration guide
+- `cnn_v2/docs/CNN_V2.md` — Reference implementation (simpler, operational)
+- `src/tests/gpu/test_demo_effects.cc` — GBufferEffect construction test
diff --git a/cnn_v3/src/gbuffer_effect.cc b/cnn_v3/src/gbuffer_effect.cc
index fb0146e..750188f 100644
--- a/cnn_v3/src/gbuffer_effect.cc
+++ b/cnn_v3/src/gbuffer_effect.cc
@@ -4,6 +4,7 @@
 #include "gbuffer_effect.h"
 #include "3d/object.h"
 #include "gpu/gpu.h"
+#include "gpu/shader_composer.h"
 #include "util/fatal_error.h"
 #include "util/mini_math.h"
 #include <cstring>
@@ -390,9 +391,12 @@ void GBufferEffect::create_raster_pipeline() {
     return;  // Asset not loaded yet; pipeline creation deferred.
   }
 
+  const std::string composed =
+      ShaderComposer::Get().Compose({"common_uniforms"}, src);
+
   WGPUShaderSourceWGSL wgsl_src = {};
   wgsl_src.chain.sType = WGPUSType_ShaderSourceWGSL;
-  wgsl_src.code = str_view(src);
+  wgsl_src.code = str_view(composed.c_str());
 
   WGPUShaderModuleDescriptor shader_desc = {};
   shader_desc.nextInChain = &wgsl_src.chain;
@@ -466,9 +470,11 @@ void GBufferEffect::create_pack_pipeline() {
     return;
   }
 
+  const std::string composed = ShaderComposer::Get().Compose({}, src);
+
   WGPUShaderSourceWGSL wgsl_src = {};
   wgsl_src.chain.sType = WGPUSType_ShaderSourceWGSL;
-  wgsl_src.code = str_view(src);
+  wgsl_src.code = str_view(composed.c_str());
 
   WGPUShaderModuleDescriptor shader_desc = {};
   shader_desc.nextInChain = &wgsl_src.chain;
