Diffstat (limited to 'cnn_v3')
-rw-r--r--  cnn_v3/README.md               4
-rw-r--r--  cnn_v3/docs/HOWTO.md         235
-rw-r--r--  cnn_v3/src/gbuffer_effect.cc  10
3 files changed, 246 insertions(+), 3 deletions(-)
diff --git a/cnn_v3/README.md b/cnn_v3/README.md
index a22d823..f161bf4 100644
--- a/cnn_v3/README.md
+++ b/cnn_v3/README.md
@@ -31,7 +31,9 @@ Add images directly to these directories and commit them.
## Status
-**Design phase.** Architecture defined, G-buffer prerequisite pending.
+**Phase 1 complete.** G-buffer integrated (raster + pack), 35/35 tests pass.
+Training infrastructure ready. U-Net WGSL shaders are next.
+See `cnn_v3/docs/HOWTO.md` for the practical playbook.
See `cnn_v3/docs/CNN_V3.md` for full design.
See `cnn_v2/` for reference implementation.
diff --git a/cnn_v3/docs/HOWTO.md b/cnn_v3/docs/HOWTO.md
new file mode 100644
index 0000000..88d4bbc
--- /dev/null
+++ b/cnn_v3/docs/HOWTO.md
@@ -0,0 +1,235 @@
+# CNN v3 How-To
+
+Practical playbook for the CNN v3 pipeline: G-buffer effect, training data,
+training the U-Net+FiLM network, and wiring everything into the demo.
+
+See `CNN_V3.md` for the full architecture design.
+
+---
+
+## 1. Using GBufferEffect in the Demo
+
+`GBufferEffect` is a full-class effect (Path B in `doc/EFFECT_WORKFLOW.md`).
+It rasterizes proxy geometry to MRT G-buffer textures and packs them into two
+`rgba32uint` feature textures (`feat_tex0`, `feat_tex1`) consumed by the CNN.
+
+### Registration (already done)
+
+- Shaders in `assets.txt`: `SHADER_GBUF_RASTER`, `SHADER_GBUF_PACK`
+- Source in `cmake/DemoSourceLists.cmake`: `cnn_v3/src/gbuffer_effect.cc`
+- Header included in `src/gpu/demo_effects.h`
+- Test in `src/tests/gpu/test_demo_effects.cc`
+
+### Adding to a Sequence
+
+`GBufferEffect` is not yet registered as a named effect in `seq_compiler.py`
+(no `.seq` syntax integration in Phase 1). Wire it up directly in C++ alongside
+your scene code, or add it to the timeline once the full CNNv3Effect lands.
+
+**C++ wiring example** (e.g. inside a Sequence or main.cc):
+
+```cpp
+#include "../../cnn_v3/src/gbuffer_effect.h"
+
+// Allocate once alongside your scene
+auto gbuf = std::make_shared<GBufferEffect>(
+ ctx, /*inputs=*/{"prev_cnn"}, // or any dummy node
+ /*outputs=*/{"gbuf_feat0", "gbuf_feat1"},
+ /*start=*/0.0f, /*end=*/60.0f);
+
+gbuf->set_scene(&my_scene, &my_camera);
+
+// In render loop, call before CNN pass:
+gbuf->render(encoder, params, nodes);
+```
+
+### Internal passes
+
+Each frame, `GBufferEffect::render()` executes:
+
+1. **Pass 1 — MRT rasterization** (`gbuf_raster.wgsl`)
+ - Proxy box (36 verts) × N objects, instanced
+ - MRT outputs: `gbuf_albedo` (rgba16float), `gbuf_normal_mat` (rgba16float)
+ - Depth test + write into `gbuf_depth` (depth32float)
+
+2. **Pass 2/3 — SDF + Lighting** — TODO (placeholder: shadow=1, transp=0)
+
+3. **Pass 4 — Pack compute** (`gbuf_pack.wgsl`)
+ - Reads all G-buffer textures + `prev_cnn` input
+ - Writes `feat_tex0` + `feat_tex1` (rgba32uint, 20 channels, 32 bytes/pixel)
+
+### Output node names
+
+By default the outputs are named from the `outputs` vector passed to the
+constructor. Use these names when binding the CNN effect input:
+
+```
+outputs[0] → feat_tex0 (rgba32uint: albedo.rgb, normal.xy, depth, depth_grad.xy)
+outputs[1] → feat_tex1 (rgba32uint: mat_id, prev.rgb, mip1.rgb, mip2.rgb, shadow, transp)
+```
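As a rough illustration of how two feature values can share one `rgba32uint` component, here is a Python sketch of `pack2x16float`-style half-float packing. This is an assumption for illustration only: the authoritative bit layout lives in `gbuf_pack.wgsl`, and feat_tex1's 12 channels cannot all be half floats (4 words hold only 8 fp16 values), so some of its channels must use lower precision.

```python
import numpy as np

def pack2x16float(a: float, b: float) -> int:
    # Mimics WGSL pack2x16float: two f32 -> one u32 (a lands in the low 16 bits).
    lo, hi = np.array([a, b], dtype=np.float16).view(np.uint16)
    return int(lo) | (int(hi) << 16)

def unpack2x16float(word: int) -> np.ndarray:
    # Inverse: one u32 -> two f32 recovered from the half-float halves.
    halves = np.array([word & 0xFFFF, (word >> 16) & 0xFFFF], dtype=np.uint16)
    return halves.view(np.float16).astype(np.float32)

# feat_tex0's 8 channels fit exactly into 4 u32 words as half-float pairs
# (channel order here is assumed, not taken from the shader):
albedo_r, albedo_g, albedo_b = 0.5, 0.25, 0.125
normal_x, normal_y = 0.75, 0.5
depth, grad_x, grad_y = 0.5, 0.0, 0.0
texel0 = [
    pack2x16float(albedo_r, albedo_g),
    pack2x16float(albedo_b, normal_x),
    pack2x16float(normal_y, depth),
    pack2x16float(grad_x, grad_y),
]
```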
+
+### Scene data
+
+Call `set_scene(scene, camera)` before the first render. The effect uploads
+`GlobalUniforms` (view-proj, camera pos, resolution) and `ObjectData` (model
+matrix, color) to GPU storage buffers each frame.
+
+---
+
+## 2. Preparing Training Data
+
+CNN v3 supports two data sources: Blender renders and real photos.
+
+### 2a. From Blender Renders
+
+```bash
+# 1. In Blender: run the export script (requires Blender 3.x+)
+blender --background scene.blend --python cnn_v3/training/blender_export.py \
+ -- --output /tmp/renders/ --frames 200
+
+# 2. Pack into sample directory
+python3 cnn_v3/training/pack_blender_sample.py \
+ --render-dir /tmp/renders/frame_0001/ \
+ --output dataset/blender/sample_0001/
+```
+
+Each sample directory contains:
+```
+sample_XXXX/
+ albedo.png — RGB uint8 (material color, pre-lighting)
+ normal.png — RG uint8 (oct-encoded XY, remap [0,1])
+ depth.png — R uint16 (1/z normalized, 16-bit)
+ matid.png — R uint8 (object index / 255)
+ shadow.png — R uint8 (0=dark, 255=lit)
+ transp.png — R uint8 (0=opaque, 255=transparent)
+ target.png — RGB/RGBA (stylized ground truth)
+```
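The oct encoding used for `normal.png` can be sketched as follows. This is a minimal reference assuming the standard octahedral mapping; the exact convention (axis order, hemisphere fold) is fixed by `blender_export.py`.

```python
import numpy as np

def oct_encode(n: np.ndarray) -> np.ndarray:
    # Unit normal -> octahedral XY, remapped to [0, 1] for the RG channels.
    n = n / np.sum(np.abs(n))          # project onto the L1 unit octahedron
    xy = n[:2]
    if n[2] < 0.0:                     # fold the lower hemisphere outward
        xy = (1.0 - np.abs(n[[1, 0]])) * np.where(n[:2] >= 0.0, 1.0, -1.0)
    return xy * 0.5 + 0.5

def oct_decode(e: np.ndarray) -> np.ndarray:
    # [0, 1] XY -> unit normal (inverse of oct_encode).
    f = e * 2.0 - 1.0
    n = np.array([f[0], f[1], 1.0 - abs(f[0]) - abs(f[1])])
    if n[2] < 0.0:                     # unfold the lower hemisphere
        n[:2] = (1.0 - np.abs(n[[1, 0]])) * np.where(n[:2] >= 0.0, 1.0, -1.0)
    return n / np.linalg.norm(n)
```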
+
+### 2b. From Real Photos
+
+Geometric channels are zeroed; the network degrades gracefully due to
+channel-dropout training.
+
+```bash
+python3 cnn_v3/training/pack_photo_sample.py \
+ --photo cnn_v3/training/input/photo1.jpg \
+ --output dataset/photos/sample_001/
+```
+
+The output `target.png` defaults to the input photo (no style). Copy in
+your stylized version as `target.png` before training.
+
+### Dataset layout
+
+```
+dataset/
+ blender/
+ sample_0001/ sample_0002/ ...
+ photos/
+ sample_001/ sample_002/ ...
+```
+
+Mix freely; the dataloader treats all sample directories uniformly.
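The discovery step of such a dataloader could look like this. A sketch only: `list_samples` and `missing_channels` are hypothetical names, since the actual loader ships with the not-yet-written `train_cnn_v3.py`. Channels absent on disk (all geometry, for photo samples) would be zero-filled at load time.

```python
from pathlib import Path

GEOM_CHANNELS = ["albedo", "normal", "depth", "matid", "shadow", "transp"]

def list_samples(root) -> list:
    # A sample is any directory that contains a target.png ground truth,
    # regardless of whether it lives under blender/ or photos/.
    return sorted(p.parent for p in Path(root).rglob("target.png"))

def missing_channels(sample_dir: Path) -> list:
    # Channels to zero-fill for this sample.
    return [c for c in GEOM_CHANNELS if not (sample_dir / f"{c}.png").exists()]
```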
+
+---
+
+## 3. Training
+
+*(Network not yet implemented — this section will be filled as Phase 3+ lands.)*
+
+**Planned command:**
+```bash
+python3 cnn_v3/training/train_cnn_v3.py \
+ --dataset dataset/ \
+ --epochs 500 \
+ --output cnn_v3/weights/cnn_v3_weights.bin
+```
+
+**FiLM conditioning** during training:
+- Beat/audio inputs are randomized per sample
+- Network learns to produce varied styles from same geometry
+
+**Validation:**
+```bash
+python3 cnn_v3/training/train_cnn_v3.py --validate \
+ --checkpoint cnn_v3/weights/cnn_v3_weights.bin \
+ --input test_frame.png
+```
+
+---
+
+## 4. Running the CNN v3 Effect (Future)
+
+Once the C++ CNNv3Effect exists:
+
+```seq
+# BPM 120
+SEQUENCE 0 0 "Scene with CNN v3"
+ EFFECT + GBufferEffect prev_cnn -> gbuf_feat0 gbuf_feat1 0 60
+ EFFECT + CNNv3Effect gbuf_feat0 gbuf_feat1 -> sink 0 60
+```
+
+FiLM parameters are uploaded via uniform each frame:
+```cpp
+cnn_v3_effect->set_film_params(
+ params.beat_phase, params.beat_time / 8.0f, params.audio_intensity,
+ style_p0, style_p1);
+```
+
+---
+
+## 5. Per-Pixel Validation
+
+The CNN v3 design requires exact parity between PyTorch, WGSL (HTML), and C++.
+
+*(Validation tooling not yet implemented.)*
+
+**Planned workflow:**
+1. Export test input + weights as JSON
+2. Run Python reference → save per-pixel output
+3. Run HTML WebGPU tool → compare against Python
+4. Run C++ `cnn_v3_test` tool → compare against Python
+5. All comparisons must pass at ≤ 1/255 per pixel
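The comparison itself reduces to a max-absolute-difference check. A sketch of what the checker would assert (the function name is hypothetical until the validation tooling lands):

```python
import numpy as np

TOL = 1.0 / 255.0  # one 8-bit quantization step

def parity_ok(reference: np.ndarray, candidate: np.ndarray) -> bool:
    # Per-pixel, per-channel max absolute difference between two float images.
    assert reference.shape == candidate.shape
    diff = np.abs(reference.astype(np.float64) - candidate.astype(np.float64))
    return float(diff.max()) <= TOL
```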
+
+---
+
+## 6. Phase Status
+
+| Phase | Status | Notes |
+|-------|--------|-------|
+| 1 — G-buffer (raster + pack) | ✅ Done | Integrated, 35/35 tests pass |
+| 1 — G-buffer (SDF + shadow passes) | TODO | Placeholder in place |
+| 2 — Training infrastructure | ✅ Done | blender_export.py, pack_*_sample.py |
+| 3 — WGSL U-Net shaders | TODO | enc/dec/bottleneck/FiLM |
+| 4 — C++ CNNv3Effect | TODO | FiLM uniform upload |
+| 5 — Parity validation | TODO | Test vectors, ≤1/255 |
+
+---
+
+## 7. Quick Troubleshooting
+
+**GBufferEffect renders nothing / albedo is black**
+- Check `set_scene()` was called before `render()`
+- Verify scene has at least one object
+- Check camera matrix is not degenerate (near/far, aspect)
+
+**Pack shader fails to compile**
+- `gbuf_pack.wgsl` uses no `#include`s; ShaderComposer compose is a no-op
+- Check `ASSET_SHADER_GBUF_PACK` resolves in assets.txt
+
+**Raster shader fails with `#include "common_uniforms"` error**
+- `ShaderComposer::Get().Compose({"common_uniforms"}, src)` must be called
+ before passing to `wgpuDeviceCreateShaderModule` — already done in effect.cc
+
+**G-buffer outputs wrong resolution**
+- `resize()` is not yet implemented in GBufferEffect; textures are fixed
+  at construction size. Support will be added when resizing is needed.
+
+---
+
+## See Also
+
+- `cnn_v3/docs/CNN_V3.md` — Full architecture design (U-Net, FiLM, feature layout)
+- `doc/EFFECT_WORKFLOW.md` — General effect integration guide
+- `cnn_v2/docs/CNN_V2.md` — Reference implementation (simpler, operational)
+- `src/tests/gpu/test_demo_effects.cc` — GBufferEffect construction test
diff --git a/cnn_v3/src/gbuffer_effect.cc b/cnn_v3/src/gbuffer_effect.cc
index fb0146e..750188f 100644
--- a/cnn_v3/src/gbuffer_effect.cc
+++ b/cnn_v3/src/gbuffer_effect.cc
@@ -4,6 +4,7 @@
#include "gbuffer_effect.h"
#include "3d/object.h"
#include "gpu/gpu.h"
+#include "gpu/shader_composer.h"
#include "util/fatal_error.h"
#include "util/mini_math.h"
#include <cstring>
@@ -390,9 +391,12 @@ void GBufferEffect::create_raster_pipeline() {
return; // Asset not loaded yet; pipeline creation deferred.
}
+ const std::string composed =
+ ShaderComposer::Get().Compose({"common_uniforms"}, src);
+
WGPUShaderSourceWGSL wgsl_src = {};
wgsl_src.chain.sType = WGPUSType_ShaderSourceWGSL;
- wgsl_src.code = str_view(src);
+ wgsl_src.code = str_view(composed.c_str());
WGPUShaderModuleDescriptor shader_desc = {};
shader_desc.nextInChain = &wgsl_src.chain;
@@ -466,9 +470,11 @@ void GBufferEffect::create_pack_pipeline() {
return;
}
+ const std::string composed = ShaderComposer::Get().Compose({}, src);
+
WGPUShaderSourceWGSL wgsl_src = {};
wgsl_src.chain.sType = WGPUSType_ShaderSourceWGSL;
- wgsl_src.code = str_view(src);
+ wgsl_src.code = str_view(composed.c_str());
WGPUShaderModuleDescriptor shader_desc = {};
shader_desc.nextInChain = &wgsl_src.chain;