# Auxiliary Texture Masking System

## Overview

The auxiliary texture masking system lets effects share named textures within a single frame. Primary use case: **screen-space partitioning**, where multiple effects render to complementary regions of the framebuffer.

## Use Case

**Problem:** Render two different 3D scenes to different regions of the screen (split-screen, portals, picture-in-picture).

**Solution:**
- Effect1 generates a mask (1 = Effect1's region, 0 = Effect2's region)
- Effect1 renders scene A where mask = 1
- Effect2 reuses the mask and renders scene B where mask = 0
- Both render to the same framebuffer in the same frame

## Architecture Choice: Mask Texture vs Stencil Buffer

### Option 1: Stencil Buffer (NOT CHOSEN)
**Pros:** Hardware-accelerated, fast early rejection

**Cons:** 8-bit limitation, complex pipeline configuration, hard to debug

### Option 2: Mask Texture (CHOSEN)
**Pros:**
- Flexible (soft edges, gradients, any format)
- Debuggable (visualize mask as texture)
- Reusable (multiple effects can read same mask)
- Simple pipeline setup

**Cons:**
- Requires auxiliary texture (~3.7 MB for 1280x720 RGBA8)
- Fragment shader discard (slightly slower than stencil)

**Verdict:** Mask texture flexibility and debuggability outweigh performance cost.

## Implementation

### MainSequence Auxiliary Texture Registry

```cpp
class MainSequence {
 public:
  // Register a named auxiliary texture (call once in effect init)
  void register_auxiliary_texture(const char* name, int width, int height);

  // Get texture view for reading/writing (call every frame)
  WGPUTextureView get_auxiliary_view(const char* name);

 private:
  struct AuxiliaryTexture {
    WGPUTexture texture;
    WGPUTextureView view;
    int width, height;
  };
  std::map<std::string, AuxiliaryTexture> auxiliary_textures_;
};
```
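
The registry behavior can be tried without a GPU. The following is a minimal sketch, not the actual `MainSequence` code: `FakeTextureView` and `AuxiliaryRegistrySketch` are hypothetical stand-ins, and the policies shown (repeat registration keeps the first texture, unknown names return a null view) are assumptions that the interface above does not specify.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for WGPUTextureView so the sketch compiles without WebGPU headers.
struct FakeTextureView {
  int id = 0;  // 0 means "no view"
};

class AuxiliaryRegistrySketch {
 public:
  // Assumed policy: registering an existing name keeps the first texture.
  // Returns true only when a new entry was created.
  bool register_auxiliary_texture(const char* name, int width, int height) {
    assert(name != nullptr && width > 0 && height > 0);
    auto [it, inserted] = textures_.try_emplace(std::string(name));
    if (inserted) {
      it->second = Entry{FakeTextureView{next_id_++}, width, height};
    }
    return inserted;
  }

  // Assumed policy: unknown names yield a null view and never insert an entry.
  FakeTextureView get_auxiliary_view(const char* name) const {
    auto it = textures_.find(std::string(name));
    return it != textures_.end() ? it->second.view : FakeTextureView{};
  }

 private:
  struct Entry {
    FakeTextureView view;
    int width = 0;
    int height = 0;
  };
  std::map<std::string, Entry> textures_;
  int next_id_ = 1;
};
```

Using `find` rather than `operator[]` in the getter matters here: `operator[]` would silently create an empty entry when an effect asks for a name that was never registered.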

### Effect Lifecycle

```
Init Phase (once):
  Effect1.init(): register_auxiliary_texture("mask_1", width, height)
  Effect2.init(): (no registration - reuses Effect1's mask)

Compute Phase (every frame):
  Effect1.compute(): Generate mask to get_auxiliary_view("mask_1")

Scene Pass (every frame, shared render pass):
  Effect1.render(): Sample mask, discard if < 0.5, render scene A
  Effect2.render(): Sample mask, discard if > 0.5, render scene B
```

### Render Flow Diagram

```
Frame N:
┌───────────────────────────────────────────────────────────┐
│ Compute Phase:                                            │
│   Effect1.compute()                                       │
│     └─ Generate mask → auxiliary_textures_["mask_1"]      │
│                                                            │
├───────────────────────────────────────────────────────────┤
│ Scene Pass (all effects share framebuffer A + depth):     │
│   Effect1.render() [priority 5]                           │
│     ├─ Sample auxiliary_textures_["mask_1"]               │
│     ├─ Discard fragments where mask < 0.5                 │
│     └─ Render 3D scene A → framebuffer A                  │
│                                                            │
│   Effect2.render() [priority 10]                          │
│     ├─ Sample auxiliary_textures_["mask_1"]               │
│     ├─ Discard fragments where mask > 0.5                 │
│     └─ Render 3D scene B → framebuffer A                  │
│                                                            │
│ Result: framebuffer A contains both scenes, partitioned   │
├───────────────────────────────────────────────────────────┤
│ Post-Process Chain:                                       │
│   A ⟷ B ⟷ Screen                                          │
└───────────────────────────────────────────────────────────┘
```
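
The scene pass in the diagram can be mimicked on the CPU to check that two effects with complementary mask tests fill a shared framebuffer without gaps. This is only an illustrative sketch: `EffectSketch` and `render_frame` are hypothetical names, scene colors are reduced to single characters, and it assumes that a lower priority value runs first, as the ordering in the diagram suggests.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct EffectSketch {
  int priority;      // assumed: lower values render first (5 before 10)
  char scene_color;  // stand-in for the rendered scene, e.g. 'A' or 'B'
  bool keep_inside;  // true: keep mask >= 0.5, false: keep mask < 0.5
};

// Renders all effects into one shared framebuffer, one mask texel per pixel.
std::vector<char> render_frame(const std::vector<float>& mask,
                               std::vector<EffectSketch> effects) {
  assert(!mask.empty());
  std::vector<char> framebuffer(mask.size(), '.');  // '.' marks a cleared pixel
  std::stable_sort(effects.begin(), effects.end(),
                   [](const EffectSketch& a, const EffectSketch& b) {
                     return a.priority < b.priority;
                   });
  for (const EffectSketch& effect : effects) {
    for (std::size_t i = 0; i < mask.size(); ++i) {
      const bool inside = mask[i] >= 0.5f;
      if (inside == effect.keep_inside) {
        framebuffer[i] = effect.scene_color;  // otherwise the fragment is "discarded"
      }
    }
  }
  return framebuffer;
}
```

With a mask of `{1, 1, 0, 0}` and effects `{5, 'A', true}` and `{10, 'B', false}`, the framebuffer ends up as `A A B B` regardless of the order in which the effects are passed in.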

## Example: Circular Portal Effect

### Effect1: Render Scene A (inside portal)

```cpp
class PortalSceneEffect : public Effect {
 public:
  PortalSceneEffect(const GpuContext& ctx) : Effect(ctx) {}

  void init(MainSequence* demo) override {
    demo_ = demo;
    demo->register_auxiliary_texture("portal_mask", width_, height_);
    // ... create pipelines
  }

  void compute(WGPUCommandEncoder encoder, ...) override {
    // Generate circular mask (portal region)
    WGPUTextureView mask_view = demo_->get_auxiliary_view("portal_mask");
    // ... render fullscreen quad with circular mask shader
  }

  void render(WGPURenderPassEncoder pass, ...) override {
    // Render 3D scene, discard outside portal
    WGPUTextureView mask_view = demo_->get_auxiliary_view("portal_mask");
    // ... bind mask, render scene with mask test
  }
};
```

### Effect2: Render Scene B (outside portal)

```cpp
class OutsideSceneEffect : public Effect {
 public:
  OutsideSceneEffect(const GpuContext& ctx) : Effect(ctx) {}

  void init(MainSequence* demo) override {
    demo_ = demo;
    // Don't register - reuse PortalSceneEffect's mask
  }

  void render(WGPURenderPassEncoder pass, ...) override {
    // Render 3D scene, discard inside portal
    WGPUTextureView mask_view = demo_->get_auxiliary_view("portal_mask");
    // ... bind mask, render scene with inverted mask test
  }
};
```

### Mask Generation Shader

```wgsl
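// portal_mask.wgsl
// Assumed layout: the definition of MaskUniforms is not shown in this document.
struct MaskUniforms {
    resolution: vec2<f32>,
};
@group(0) @binding(0) var<uniform> uniforms: MaskUniforms;

@fragment fn fs_main(@builtin(position) p: vec4<f32>) -> @location(0) vec4<f32> {
    let uv = p.xy / uniforms.resolution;
    let center = vec2<f32>(0.5, 0.5);
    let radius = 0.3; // in UV units, so the shape is an ellipse on non-square targets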

    let dist = length(uv - center);
    let mask = f32(dist < radius); // 1.0 inside circle, 0.0 outside

    return vec4<f32>(mask, mask, mask, 1.0);
}
```
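
A CPU reference for the mask shader is handy for unit-testing the math before touching the GPU. The sketch below mirrors the shader's distance test for a single pixel; `portal_mask_at` and its `correct_aspect` flag are hypothetical additions. The flag illustrates one possible way to get a true circle, since a radius measured in UV space stretches with the aspect ratio.

```cpp
#include <cassert>
#include <cmath>

// Returns 1.0f inside the portal and 0.0f outside, for a pixel-center
// coordinate (px, py) on a width x height target.
float portal_mask_at(float px, float py, float width, float height,
                     bool correct_aspect) {
  assert(width > 0.0f && height > 0.0f);
  float dx = px / width - 0.5f;   // offset from center in UV units
  float dy = py / height - 0.5f;
  if (correct_aspect) {
    dx *= width / height;  // measure both axes in units of the target height
  }
  const float radius = 0.3f;
  return std::hypot(dx, dy) < radius ? 1.0f : 0.0f;
}
```

On a 1280x720 target, a pixel 250 px to the right of the center is inside the uncorrected shape (horizontal radius 384 px) but outside the aspect-corrected circle (radius 216 px).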

### Scene Rendering with Mask

```wgsl
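// scene_with_mask.wgsl
// `uniforms`, `VertexOutput`, and `compute_scene_color` are assumed to be
// declared elsewhere in the shader (not shown here).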
@group(0) @binding(2) var mask_sampler: sampler;
@group(0) @binding(3) var mask_texture: texture_2d<f32>;

@fragment fn fs_main(in: VertexOutput) -> @location(0) vec4<f32> {
    // Sample mask
    let screen_uv = in.position.xy / uniforms.resolution;
    let mask_value = textureSample(mask_texture, mask_sampler, screen_uv).r;

    // Effect1: Discard outside portal (mask = 0)
    if (mask_value < 0.5) {
        discard;
    }

    // Effect2: Invert test - discard inside portal (mask = 1)
    // if (mask_value > 0.5) { discard; }

    // Render scene
    return compute_scene_color(in);
}
```
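
The two discard tests are complementary for a hard 0/1 mask, but a texel that is exactly 0.5 (possible with soft-edged or filtered masks) passes both. The sketch below restates the tests as C++ predicates to make that edge case visible. The function names are hypothetical, and `effect2_keeps_exclusive` is an assumed variant, not something the shaders above use.

```cpp
#include <cassert>
#include <initializer_list>

// Each predicate returns true when the fragment is kept (not discarded).
inline bool effect1_keeps(float mask_value) { return !(mask_value < 0.5f); }
inline bool effect2_keeps(float mask_value) { return !(mask_value > 0.5f); }

// Assumed variant: discarding on >= 0.5 makes the two regions disjoint.
inline bool effect2_keeps_exclusive(float mask_value) {
  return !(mask_value >= 0.5f);
}

// Checks that each sample value is kept by exactly one effect when the
// exclusive variant is used for Effect2.
inline void check_exact_partition() {
  for (float m : {0.0f, 0.25f, 0.5f, 0.75f, 1.0f}) {
    assert(effect1_keeps(m) != effect2_keeps_exclusive(m));
  }
}
```

With the original pair of tests, a texel at 0.5 is drawn by both effects, so the effect that renders later wins there. Whether that matters depends on how the mask is generated and sampled.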

## Memory Impact

Each auxiliary texture: **width × height × 4 bytes**
- 1280×720 RGBA8: ~3.7 MB
- 1920×1080 RGBA8: ~8.3 MB

For 2-3 masks: roughly 7-25 MB total, depending on resolution (acceptable overhead).
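
The per-texture figures follow directly from width × height × bytes per pixel. A small helper makes the arithmetic explicit; `aux_texture_bytes` is a hypothetical name, and the sketch ignores any row padding or alignment a driver might add.

```cpp
#include <cassert>
#include <cstdint>

// Size in bytes of one uncompressed texture without mipmaps.
// bytes_per_pixel is 4 for RGBA8 and 1 for R8.
inline std::uint64_t aux_texture_bytes(std::uint32_t width,
                                       std::uint32_t height,
                                       std::uint32_t bytes_per_pixel = 4) {
  assert(width > 0 && height > 0 && bytes_per_pixel > 0);
  return static_cast<std::uint64_t>(width) * height * bytes_per_pixel;
}

// aux_texture_bytes(1280, 720)     == 3686400  (~3.7 MB)
// aux_texture_bytes(1920, 1080)    == 8294400  (~8.3 MB)
// aux_texture_bytes(1920, 1080, 1) == 2073600  (~2.1 MB as R8)
```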

## Use Cases

1. **Split-screen**: Vertical/horizontal partition
2. **Portals**: Circular/arbitrary shape windows to other scenes
3. **Picture-in-picture**: Small viewport in corner
4. **Masked transitions**: Wipe effects between scenes
5. **Shadow maps**: Pre-generated in compute, used in render
6. **Reflection probes**: Generated once, reused by multiple objects

## Alternatives Considered

### Effect-Owned Texture (No MainSequence changes)
```cpp
auto effect1 = std::make_shared<Effect1>(...);
auto effect2 = std::make_shared<Effect2>(...);
effect2->set_mask_source(effect1->get_mask_view());
```

**Pros:** No MainSequence changes

**Cons:** Manual wiring, effects coupled, less flexible

**Verdict:** Not chosen. Registry approach is cleaner and more maintainable.

## Future Extensions

- **Multi-channel masks**: RGBA mask for 4 independent regions
- **Mipmap support**: For hierarchical queries
- **Compression**: Quantize masks to R8 (1 byte per pixel)
- **Enum-based lookup**: Replace string keys for size optimization
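
As an illustration of the enum-based lookup idea, the sketch below replaces the string-keyed map with a fixed-size array indexed by an enum. All names (`AuxTextureId`, `FakeView`, `EnumRegistrySketch`) are hypothetical, and the view type is a stand-in so the example compiles without WebGPU headers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// One enumerator per auxiliary texture; Count must stay last.
enum class AuxTextureId : std::size_t { PortalMask = 0, Count };

struct FakeView {
  int id = 0;  // 0 means "not registered"
};

class EnumRegistrySketch {
 public:
  void set(AuxTextureId id, FakeView view) { views_[index(id)] = view; }
  FakeView get(AuxTextureId id) const { return views_[index(id)]; }

 private:
  // Converts the enum to an array index and rejects out-of-range values.
  static std::size_t index(AuxTextureId id) {
    const std::size_t i = static_cast<std::size_t>(id);
    assert(i < static_cast<std::size_t>(AuxTextureId::Count));
    return i;
  }
  std::array<FakeView, static_cast<std::size_t>(AuxTextureId::Count)> views_{};
};
```

This drops `std::map` and `std::string` from the lookup path, at the cost of every mask name being known at compile time.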

## Size Impact

- MainSequence changes: ~100 lines (~500 bytes code)
- std::map usage: ~1 KB overhead (low priority for CRT removal)
- Runtime memory: 4-8 MB per mask (acceptable)

---

*Document created: February 8, 2026*