# GPU Procedural Texture Generation - Phase 4: Texture Composition
## Overview
Enable compute shaders to read existing procedural textures as input samplers, allowing multi-stage texture generation (blend, mask, modulate).
## Design
### Extended API
```cpp
struct GpuProceduralInputs {
  std::vector<std::string> input_texture_names;  // Names of existing textures
  std::vector<WGPUTextureView> input_views;      // Resolved views (internal)
};

void TextureManager::create_gpu_composite_texture(
    const std::string& name,
    const std::string& shader_func,
    const GpuProceduralParams& params,
    const GpuProceduralInputs& inputs);
```
### Shader Pattern
```wgsl
// gen_blend.wgsl - Blend two textures
@group(0) @binding(0) var output_tex: texture_storage_2d<rgba8unorm, write>;
@group(0) @binding(1) var<uniform> params: BlendParams;
@group(0) @binding(2) var input_a: texture_2d<f32>;
@group(0) @binding(3) var input_b: texture_2d<f32>;
@group(0) @binding(4) var tex_sampler: sampler;

struct BlendParams {
  width: u32,
  height: u32,
  blend_factor: f32,  // 0.0 = all A, 1.0 = all B
  _pad0: f32,
}

@compute @workgroup_size(8, 8, 1)
fn main(@builtin(global_invocation_id) id: vec3<u32>) {
  // Skip invocations that fall outside the output texture.
  if (id.x >= params.width || id.y >= params.height) { return; }
  // Use the texel center (+0.5); corner UVs would make a linear sampler
  // average neighboring texels.
  let uv = vec2<f32>((f32(id.x) + 0.5) / f32(params.width),
                     (f32(id.y) + 0.5) / f32(params.height));
  let color_a = textureSampleLevel(input_a, tex_sampler, uv, 0.0);
  let color_b = textureSampleLevel(input_b, tex_sampler, uv, 0.0);
  let blended = mix(color_a, color_b, params.blend_factor);
  textureStore(output_tex, id.xy, blended);
}
```
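On the host side, the uniform buffer has to match `BlendParams` byte for byte, and the dispatch has to cover the whole texture with 8x8 workgroups. The following is a minimal sketch of both; `BlendParamsHost` and `workgroup_count` are hypothetical names used only for illustration, not part of the existing code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical host-side mirror of the WGSL BlendParams struct.
struct BlendParamsHost {
  uint32_t width;
  uint32_t height;
  float blend_factor;  // 0.0 = all A, 1.0 = all B
  float pad0;          // mirrors _pad0; keeps the struct at 16 bytes
};
static_assert(sizeof(BlendParamsHost) == 16,
              "host struct must match the 16-byte WGSL uniform");
static_assert(offsetof(BlendParamsHost, blend_factor) == 8,
              "blend_factor must follow the two u32 fields");

// Number of workgroups needed along one axis (rounds up), assuming the
// shader's @workgroup_size(8, 8, 1).
constexpr uint32_t workgroup_count(uint32_t pixels, uint32_t group_size = 8) {
  return (pixels + group_size - 1) / group_size;
}
```

For a 256x256 target this gives 32x32 workgroups. Sizes that are not multiples of 8 round up, which is why the shader keeps its bounds check.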
### Bind Group Layout Changes
**Current (generators without texture inputs):**
- Binding 0: Storage texture (write)
- Binding 1: Uniform buffer
**New (multi-input generators):**
- Binding 0: Storage texture (write)
- Binding 1: Uniform buffer
- Binding 2+: Input textures (read, texture_2d<f32>)
- Binding N: Sampler (shared across all inputs), where N = 2 + `num_input_textures`
### Implementation
#### 1. Extend ComputePipelineInfo
```cpp
struct ComputePipelineInfo {
  WGPUComputePipeline pipeline;
  const char* shader_code;
  size_t uniform_size;
  int num_input_textures;  // NEW: 0 for gen_noise/perlin/grid, 2+ for composite
};
```
#### 2. Update get_or_create_compute_pipeline
```cpp
WGPUComputePipeline get_or_create_compute_pipeline(
    const std::string& func_name,
    const char* shader_code,
    size_t uniform_size,
    int num_input_textures = 0);  // Default 0 (backward compatible)
```
Dynamically create bind group layout based on `num_input_textures`:
```cpp
// Binding 0: output texture
// Binding 1: uniform buffer
// Binding 2 to (2 + num_input_textures - 1): input textures
// Binding (2 + num_input_textures): sampler
```
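The binding numbering above can be sketched without any WebGPU types. `BindingKind`, `BindingSlot`, and `plan_bindings` are hypothetical names; a real implementation would translate each slot into a bind group layout entry. The sketch assumes that generators with no inputs keep the original two-entry layout and therefore get no sampler.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, API-agnostic description of one bind group layout entry.
enum class BindingKind {
  StorageTextureWrite,
  UniformBuffer,
  SampledTexture,
  Sampler
};

struct BindingSlot {
  uint32_t binding;
  BindingKind kind;
};

// Builds the binding plan: output, uniforms, N input textures, then sampler.
std::vector<BindingSlot> plan_bindings(uint32_t num_input_textures) {
  std::vector<BindingSlot> slots = {
      {0, BindingKind::StorageTextureWrite},
      {1, BindingKind::UniformBuffer},
  };
  for (uint32_t i = 0; i < num_input_textures; ++i) {
    slots.push_back({2 + i, BindingKind::SampledTexture});
  }
  // Assumption: the sampler is only added when there is something to sample,
  // so Phase 1-3 layouts stay unchanged.
  if (num_input_textures > 0) {
    slots.push_back({2 + num_input_textures, BindingKind::Sampler});
  }
  return slots;
}
```

With two inputs this yields bindings 0 through 4, matching the `gen_blend.wgsl` declarations.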
#### 3. New dispatch_composite
```cpp
void dispatch_composite(const std::string& func_name,
                        WGPUTexture target,
                        const GpuProceduralParams& params,
                        const void* uniform_data,
                        size_t uniform_size,
                        const std::vector<WGPUTextureView>& input_views);
```
Create bind group with:
- Output storage texture (binding 0)
- Uniform buffer (binding 1)
- Input texture views (binding 2+)
- Linear sampler (binding N = 2 + number of input views)
#### 4. Convenience Wrapper
```cpp
void create_gpu_composite_texture(const std::string& name,
                                  const std::string& shader_func,
                                  const GpuProceduralParams& params,
                                  const std::vector<std::string>& input_names);
```
Resolve `input_names` → `WGPUTextureView[]` via `get_texture_view()`.
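The resolution step might look like the sketch below. It uses a stand-in `FakeTextureView` type and a plain map in place of `get_texture_view()` so that it compiles without WebGPU headers. Both are assumptions for illustration, as is the choice to fail the whole composite when any input is missing.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Stand-in for WGPUTextureView (assumption: views are opaque handles).
using FakeTextureView = const void*;

// Resolves every name to a view, preserving input order.
// Returns false and leaves *out untouched if any input is missing.
bool resolve_input_views(
    const std::unordered_map<std::string, FakeTextureView>& views_by_name,
    const std::vector<std::string>& input_names,
    std::vector<FakeTextureView>* out) {
  std::vector<FakeTextureView> resolved;
  resolved.reserve(input_names.size());
  for (const std::string& name : input_names) {
    auto it = views_by_name.find(name);
    if (it == views_by_name.end() || it->second == nullptr) {
      return false;  // Input texture was never generated.
    }
    resolved.push_back(it->second);
  }
  *out = resolved;
  return true;
}
```

Order matters here: the i-th resolved view ends up at binding 2 + i, so it must correspond to the i-th input declared in the shader.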
### Example Shaders
**gen_blend.wgsl** (~150 bytes)
- Blend two textures with lerp factor
**gen_mask.wgsl** (~180 bytes)
- Multiply texture A by texture B (use grid as mask)
**gen_modulate.wgsl** (~200 bytes)
- Multiply texture color by noise intensity
**gen_fbm_noise.wgsl** (~250 bytes)
- FBM using multiple octaves of pre-generated noise textures
### Usage Example
```cpp
// Generate base textures.
// Assumes GpuProceduralParams::params is a pointer to floats (as in the
// composite call below), so values are passed via named arrays.
float noise_vals[2] = {123.0f, 4.0f};
GpuProceduralParams noise_params = {256, 256, noise_vals, 2};
tex_mgr.create_gpu_noise_texture("noise_a", noise_params);
float grid_vals[2] = {32.0f, 2.0f};
GpuProceduralParams grid_params = {256, 256, grid_vals, 2};
tex_mgr.create_gpu_grid_texture("grid", grid_params);

// Composite: Apply grid as mask to noise
float blend_vals[1] = {0.5f};
GpuProceduralParams composite = {256, 256, blend_vals, 1};
std::vector<std::string> inputs = {"noise_a", "grid"};
tex_mgr.create_gpu_composite_texture("masked_noise", "gen_mask", composite, inputs);
```
### Asset Packer Syntax
```
# Phase 1-3: Generators without texture inputs
NOISE_GPU, PROC_GPU(gen_noise, 1234, 16), _, "GPU noise"
# Phase 4: Multi-input composites
MASKED_NOISE, PROC_GPU(gen_mask, NOISE_GPU, GRID_GPU), _, "Masked noise"
```
**Syntax:** `PROC_GPU(shader_func, input_asset_1, input_asset_2, ...)`
- First arg: Shader function name
- Remaining args: Asset IDs of input textures. An argument with no uppercase letters (e.g. `1234`) is treated as a scalar parameter instead.
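The uppercase rule for telling asset IDs from scalar parameters could be implemented roughly as follows. `ProcGpuArgs` and `classify_proc_gpu_args` are hypothetical names, and the sketch assumes the argument list has already been split on commas and trimmed.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical result of parsing one PROC_GPU(...) argument list.
struct ProcGpuArgs {
  std::string shader_func;
  std::vector<std::string> input_assets;  // e.g. NOISE_GPU
  std::vector<float> scalar_params;       // e.g. 1234, 16
};

bool has_uppercase(const std::string& s) {
  return std::any_of(s.begin(), s.end(),
                     [](unsigned char c) { return std::isupper(c) != 0; });
}

// args[0] is the shader function; the rest are asset IDs or scalars.
ProcGpuArgs classify_proc_gpu_args(const std::vector<std::string>& args) {
  ProcGpuArgs out;
  if (args.empty()) return out;
  out.shader_func = args[0];
  for (std::size_t i = 1; i < args.size(); ++i) {
    if (has_uppercase(args[i])) {
      out.input_assets.push_back(args[i]);
    } else {
      out.scalar_params.push_back(std::strtof(args[i].c_str(), nullptr));
    }
  }
  return out;
}
```

One consequence of this rule is that a scalar written with an uppercase exponent, such as `1E5`, would be classified as an asset ID, so scalars need to be written without uppercase letters.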
**asset_packer changes:**
1. Parse input asset dependencies
2. Set `depends_on` field in AssetRecord
3. Generate init-time ordering (topological sort)
4. Pass input texture names to create_gpu_composite_texture
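Step 3 can be sketched with Kahn's algorithm, which also detects cycles. The index-based `depends_on` representation and the `topo_order` name are assumptions; the real `AssetRecord` layout is not shown here.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// depends_on[i] lists the indices of the assets that asset i reads as inputs.
// On success, *order holds a valid generation order (inputs before users).
// Returns false if the dependency graph contains a cycle.
bool topo_order(const std::vector<std::vector<int>>& depends_on,
                std::vector<int>* order) {
  const std::size_t n = depends_on.size();
  std::vector<int> pending(n, 0);              // unresolved inputs per asset
  std::vector<std::vector<int>> dependents(n); // reverse edges
  for (std::size_t i = 0; i < n; ++i) {
    pending[i] = static_cast<int>(depends_on[i].size());
    for (int dep : depends_on[i]) {
      dependents[dep].push_back(static_cast<int>(i));
    }
  }
  std::queue<int> ready;
  for (std::size_t i = 0; i < n; ++i) {
    if (pending[i] == 0) ready.push(static_cast<int>(i));
  }
  order->clear();
  while (!ready.empty()) {
    const int asset = ready.front();
    ready.pop();
    order->push_back(asset);
    for (int user : dependents[asset]) {
      if (--pending[user] == 0) ready.push(user);
    }
  }
  // Assets left with pending inputs are part of (or behind) a cycle.
  return order->size() == n;
}
```

Assets with no dependencies are emitted first in their original order, so existing Phase 1-3 assets keep their relative ordering.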
### Size Impact
**Code additions:**
- Extended dispatch_composite: ~250 bytes
- Dynamic bind group layout: ~150 bytes
- create_gpu_composite_texture: ~100 bytes
- gen_blend.wgsl shader: ~150 bytes
- gen_mask.wgsl shader: ~180 bytes
**Total Phase 4:** ~830 bytes for 2 composite shaders
**Benefits:**
- Eliminate CPU-side texture compositing
- No CPU-side intermediate buffers (inputs stay on the GPU as textures)
- Enables complex multi-stage effects (FBM, domain warping)
### Testing
**Unit test:**
```cpp
// Create base textures.
// Assumes GpuProceduralParams::params is a const float*, so values are
// passed via named arrays rather than nested braces.
float noise_vals[2] = {1.0f, 4.0f};
float grid_vals[2] = {32.0f, 2.0f};
tex_mgr.create_gpu_noise_texture("noise_a", {256, 256, noise_vals, 2});
tex_mgr.create_gpu_grid_texture("grid", {256, 256, grid_vals, 2});

// Composite
std::vector<std::string> inputs = {"noise_a", "grid"};
tex_mgr.create_gpu_composite_texture("masked", "gen_mask", {256, 256, nullptr, 0}, inputs);

// Verify
WGPUTextureView view = tex_mgr.get_texture_view("masked");
assert(view != nullptr);
```
**Integration test:**
- Visual comparison of CPU vs GPU compositing
- Verify dependency ordering (inputs generated before composite)
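For the CPU vs GPU comparison, a CPU reference of the blend plus a tolerance check might look like this sketch. It assumes RGBA8 data read back into byte vectors, inputs at the same resolution as the output, and a GPU path that samples exactly one texel per output pixel. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// CPU equivalent of WGSL mix(a, b, t) for one 8-bit channel.
uint8_t mix_u8(uint8_t a, uint8_t b, float t) {
  t = std::min(1.0f, std::max(0.0f, t));  // keep the result in 0..255
  const float v = static_cast<float>(a) * (1.0f - t) +
                  static_cast<float>(b) * t;
  return static_cast<uint8_t>(std::lround(v));
}

// Blends two equally sized byte buffers channel by channel.
std::vector<uint8_t> blend_reference(const std::vector<uint8_t>& a,
                                     const std::vector<uint8_t>& b,
                                     float t) {
  std::vector<uint8_t> out(std::min(a.size(), b.size()));
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = mix_u8(a[i], b[i], t);
  }
  return out;
}

// True if every channel differs by at most `tolerance` (rounding slack).
bool images_match(const std::vector<uint8_t>& x,
                  const std::vector<uint8_t>& y,
                  int tolerance) {
  if (x.size() != y.size()) return false;
  for (std::size_t i = 0; i < x.size(); ++i) {
    if (std::abs(static_cast<int>(x[i]) - static_cast<int>(y[i])) > tolerance) {
      return false;
    }
  }
  return true;
}
```

A tolerance of 1 is a reasonable starting point, since GPU unorm conversion and CPU rounding can disagree by one step.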
## Future Extensions
**Domain Warping:**
```wgsl
// Use noise texture to distort UVs of another texture
let offset = textureSampleLevel(noise, tex_sampler, uv, 0.0).rg * 0.1;
let warped_uv = uv + offset;
let color = textureSampleLevel(base, tex_sampler, warped_uv, 0.0);
```
**Multi-octave FBM:**
```cpp
// Generate octaves at different frequencies.
// Assumes GpuProceduralParams::params is a const float*, so values are
// passed via named arrays rather than nested braces.
float oct0[2] = {0.0f, 2.0f};
float oct1[2] = {0.0f, 4.0f};
float oct2[2] = {0.0f, 8.0f};
tex_mgr.create_gpu_noise_texture("octave_0", {256, 256, oct0, 2});
tex_mgr.create_gpu_noise_texture("octave_1", {256, 256, oct1, 2});
tex_mgr.create_gpu_noise_texture("octave_2", {256, 256, oct2, 2});

// Composite with amplitude decay
std::vector<std::string> octaves = {"octave_0", "octave_1", "octave_2"};
tex_mgr.create_gpu_composite_texture("fbm", "gen_fbm", {256, 256, nullptr, 0}, octaves);
```
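The "amplitude decay" step is not specified further. One common choice, shown here only as an assumption about what the FBM shader might compute per texel, is to weight octave i by gain^i and normalize by the total weight. `fbm_combine` is a hypothetical CPU reference.

```cpp
#include <vector>

// Combines one sample per octave with weights 1, gain, gain^2, ...
// and normalizes so the result stays within the range of the inputs.
float fbm_combine(const std::vector<float>& octave_samples, float gain = 0.5f) {
  float amplitude = 1.0f;
  float sum = 0.0f;
  float total_weight = 0.0f;
  for (float sample : octave_samples) {
    sum += amplitude * sample;
    total_weight += amplitude;
    amplitude *= gain;  // each higher-frequency octave contributes less
  }
  return total_weight > 0.0f ? sum / total_weight : 0.0f;
}
```

With three octaves and a gain of 0.5 the weights are 1, 0.5, and 0.25, so the lowest-frequency octave contributes 4/7 of the result.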
**Mipmap Generation:**
- Use compute shaders to generate mipmaps
- Downsample with box/gaussian filter
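As a reference for the box-filter variant, one mip step averages each 2x2 block of the source level. The sketch below is a CPU version for a single 8-bit channel with even dimensions. Both restrictions, and the `box_downsample` name, are simplifying assumptions; a compute shader would do the same per output texel.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Halves a single-channel image by averaging 2x2 blocks (box filter).
// Assumes width and height are even and src holds width * height bytes.
std::vector<uint8_t> box_downsample(const std::vector<uint8_t>& src,
                                    int width, int height) {
  const int dst_w = width / 2;
  const int dst_h = height / 2;
  std::vector<uint8_t> dst(static_cast<std::size_t>(dst_w) * dst_h);
  for (int y = 0; y < dst_h; ++y) {
    for (int x = 0; x < dst_w; ++x) {
      const int sx = 2 * x;
      const int sy = 2 * y;
      const int sum = src[sy * width + sx] + src[sy * width + sx + 1] +
                      src[(sy + 1) * width + sx] +
                      src[(sy + 1) * width + sx + 1];
      dst[y * dst_w + x] = static_cast<uint8_t>((sum + 2) / 4);  // rounded mean
    }
  }
  return dst;
}
```

Applying the function repeatedly to its own output produces the full mip chain for power-of-two textures.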
## Architecture Notes
**Backward Compatibility:**
- Phase 1-3 generators unchanged (num_input_textures = 0)
- Existing API remains valid
- Optional feature (can defer to Phase 5+)
**Dependency Ordering:**
- Asset packer performs topological sort
- GPU init generates textures in dependency order
- Circular dependencies are rejected by the asset packer at build time
**Sampler Reuse:**
- Single linear sampler shared across all composite shaders
- Created once in TextureManager::init()
- Saves ~50 bytes per shader
## Critical Files
**New:**
- `assets/final/shaders/compute/gen_blend.wgsl`
- `assets/final/shaders/compute/gen_mask.wgsl`
- `src/tests/test_gpu_composite.cc`
**Modified:**
- `src/gpu/texture_manager.h` - Add composite API (~40 lines)
- `src/gpu/texture_manager.cc` - Implement dispatch_composite (~200 lines)
- `tools/asset_packer.cc` - Parse composite syntax (~80 lines)
**Total:** ~320 lines code + 2 shaders (~330 bytes)