# CNN Training Tools

Tools for training and preparing data for the CNN post-processing effect.

---

## train_cnn.py

PyTorch-based training script for image-to-image stylization.

### Basic Usage

```bash
python3 train_cnn.py --input <input_dir> --target <target_dir> [options]
```

### Examples

**Single layer, 3×3 kernel:**
```bash
python3 train_cnn.py --input training/input --target training/output \
  --layers 1 --kernel-sizes 3 --epochs 500
```

**Multi-layer, mixed kernels:**
```bash
python3 train_cnn.py --input training/input --target training/output \
  --layers 3 --kernel-sizes 3,5,3 --epochs 1000
```

**With checkpointing:**
```bash
python3 train_cnn.py --input training/input --target training/output \
  --epochs 500 --checkpoint-every 50
```

**Resume from checkpoint:**
```bash
python3 train_cnn.py --input training/input --target training/output \
  --resume training/checkpoints/checkpoint_epoch_200.pth
```
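
Checkpoints are produced with PyTorch, so they can be inspected outside the training script if needed. A minimal sketch, assuming the checkpoint is a plain dictionary (the actual keys written by `train_cnn.py` may differ):

```python
import torch

# Load a checkpoint for manual inspection (CPU is enough for this).
# Key names such as "epoch" are assumptions; check train_cnn.py for
# what it actually stores.
ckpt = torch.load("training/checkpoints/checkpoint_epoch_200.pth", map_location="cpu")
print(list(ckpt.keys()))   # see which tensors/metadata were saved
print(ckpt.get("epoch"))   # e.g. the epoch at which it was saved
```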

### Options

| Option | Default | Description |
|--------|---------|-------------|
| `--input` | *required* | Input image directory |
| `--target` | *required* | Target image directory |
| `--layers` | 1 | Number of CNN layers |
| `--kernel-sizes` | 3 | Comma-separated kernel sizes, one per layer (a single value is repeated for all layers) |
| `--epochs` | 100 | Training epochs |
| `--batch-size` | 4 | Batch size |
| `--learning-rate` | 0.001 | Learning rate |
| `--output` | `workspaces/main/shaders/cnn/cnn_weights_generated.wgsl` | Output WGSL file |
| `--checkpoint-every` | 0 | Save checkpoint every N epochs (0=disabled) |
| `--checkpoint-dir` | `training/checkpoints` | Checkpoint directory |
| `--resume` | None | Resume from checkpoint file |
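
The `--kernel-sizes`/`--layers` interaction can be pictured with a small parsing sketch (a hypothetical helper for illustration, not code from `train_cnn.py`):

```python
def parse_kernel_sizes(spec: str, layers: int) -> list[int]:
    """Hypothetical helper mirroring the documented behaviour:
    a single value is repeated for every layer; otherwise the
    comma-separated list must match --layers."""
    sizes = [int(s) for s in spec.split(",")]
    if len(sizes) == 1:
        sizes = sizes * layers              # "3" with --layers 3 -> [3, 3, 3]
    assert len(sizes) == layers, "kernel-sizes count must match --layers"
    return sizes

print(parse_kernel_sizes("3", 3))       # [3, 3, 3]
print(parse_kernel_sizes("3,5,3", 3))   # [3, 5, 3]
```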

### Architecture

- **Layer 0:** `CoordConv2d` - accepts (x,y) patch center + 3×3 RGBA samples
- **Layers 1+:** Standard `Conv2d` - 3×3 RGBA samples only
- **Activation:** Tanh between layers
- **Output:** Residual connection (30% stylization blend)
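
Put together, the forward pass corresponds roughly to the following PyTorch sketch. It is a minimal illustration under stated assumptions (hidden channel widths and the exact CoordConv formulation are guesses); `train_cnn.py` is the authoritative definition:

```python
import torch
import torch.nn as nn

class CoordConv2d(nn.Module):
    """Illustrative stand-in: a Conv2d plus a per-pixel 1x1 term driven by
    normalized (x, y) coordinates, so the coordinate weights stay independent
    of the spatial kernel size. Channel widths are assumptions."""
    def __init__(self, in_ch, out_ch, kernel_size):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=kernel_size // 2)
        self.coord = nn.Conv2d(2, out_ch, kernel_size=1, bias=False)

    def forward(self, x):
        b, _, h, w = x.shape
        ys = torch.linspace(-1, 1, h, device=x.device).view(1, 1, h, 1).expand(b, 1, h, w)
        xs = torch.linspace(-1, 1, w, device=x.device).view(1, 1, 1, w).expand(b, 1, h, w)
        return self.conv(x) + self.coord(torch.cat([xs, ys], dim=1))

class StylizeCNN(nn.Module):
    def __init__(self, layers=3, kernel_sizes=(3, 5, 3), hidden=8):
        super().__init__()
        chans = [4] + [hidden] * (layers - 1) + [4]   # RGBA in, RGBA out
        blocks = [CoordConv2d(chans[0], chans[1], kernel_sizes[0])]
        for i in range(1, layers):
            blocks.append(nn.Conv2d(chans[i], chans[i + 1], kernel_sizes[i],
                                    padding=kernel_sizes[i] // 2))
        self.blocks = nn.ModuleList(blocks)

    def forward(self, x):
        y = x
        for i, block in enumerate(self.blocks):
            y = block(y)
            if i < len(self.blocks) - 1:
                y = torch.tanh(y)             # Tanh between layers
        return 0.7 * x + 0.3 * y              # residual: 30% stylization blend
```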

### Requirements

```bash
pip install torch torchvision pillow
```

---

## image_style_processor.py

Generates stylized target images from raw renders.

### Usage

```bash
python3 image_style_processor.py <input_dir> <output_dir> <style>
```

### Available Styles

**Sketch:**
- `pencil_sketch` - Dense cross-hatching
- `ink_drawing` - Bold outlines, comic style
- `charcoal_pastel` - Soft, dramatic contrasts
- `conte_crayon` - Directional strokes
- `gesture_sketch` - Loose, energetic lines

**Futuristic:**
- `circuit_board` - Tech blueprint
- `glitch_art` - Digital corruption
- `wireframe_topo` - Topographic contours
- `data_mosaic` - Voronoi fragmentation
- `holographic_scan` - CRT/HUD aesthetic

### Examples

```bash
# Generate pencil sketch targets
python3 image_style_processor.py input/ output/ pencil_sketch

# Generate glitch art targets
python3 image_style_processor.py input/ output/ glitch_art
```
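
As an illustration of what a style pass does, a classic pencil-sketch look can be approximated with a grayscale invert/blur/dodge pipeline in OpenCV. This is a generic sketch of the technique, not the script's actual `pencil_sketch` implementation, and the file names are hypothetical:

```python
import cv2

def pencil_sketch_approx(path_in: str, path_out: str) -> None:
    """Generic dodge-blend sketch effect (illustrative only)."""
    img = cv2.imread(path_in)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(255 - gray, (21, 21), 0)
    # "Color dodge" blend: the image brightens where the blurred inverse is dark.
    sketch = cv2.divide(gray, 255 - blurred, scale=256)
    cv2.imwrite(path_out, sketch)

# Hypothetical file names, for illustration only.
pencil_sketch_approx("input/frame_0000.png", "output/frame_0000_sketch.png")
```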

### Requirements

```bash
pip install opencv-python numpy
```

---

## Workflow

### 1. Render Raw Frames

Generate raw 3D renders as input:
```bash
./build/demo64k --headless --duration 5 --output training/input/
```

### 2. Generate Stylized Targets

Apply an artistic style:
```bash
python3 training/image_style_processor.py training/input/ training/output/ pencil_sketch
```

### 3. Train CNN

Train the network to reproduce the style:
```bash
python3 training/train_cnn.py \
  --input training/input \
  --target training/output \
  --epochs 500 \
  --checkpoint-every 50
```

### 4. Rebuild Demo

The training script auto-exports the weights to `cnn_weights_generated.wgsl`; rebuild the demo to pick them up:
```bash
cmake --build build -j4
./build/demo64k
```
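
The exact WGSL layout is defined by the shader and emitted by `train_cnn.py`; conceptually, the export step just flattens the trained tensors into constant arrays. A heavily simplified sketch of that idea (the array name and formatting below are assumptions, not the real file format):

```python
import torch

def export_wgsl_array(name: str, tensor: torch.Tensor) -> str:
    """Illustrative only: flatten a trained tensor into a WGSL constant array.
    The real cnn_weights_generated.wgsl layout may differ."""
    values = ", ".join(f"{v:.6f}" for v in tensor.flatten().tolist())
    return f"const {name} = array<f32, {tensor.numel()}>({values});\n"

# Placeholder tensor standing in for a trained layer's weights.
print(export_wgsl_array("layer0_weights", torch.randn(2, 3)))
```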

---

## Tips

- **Training data:** 10-50 image pairs recommended
- **Resolution:** 256×256 (auto-resized during training)
- **Checkpoints:** Save every 50-100 epochs for long runs
- **Loss plateaus:** Try a lower learning rate (e.g. 0.0001) or more layers
- **Residual connection:** Prevents catastrophic divergence (input always blended in)

---

## Coordinate-Aware Layer 0

Layer 0 receives normalized (x,y) patch center coordinates, enabling position-dependent effects:

- **Vignetting:** Darker edges
- **Radial gradients:** Center-focused stylization
- **Corner effects:** Edge-specific treatments

The coordinate grid is auto-generated during the forward pass; no manual intervention is needed.

Size impact: +32 B of coordinate weights, independent of kernel size.
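
The 32-byte figure is consistent with a per-pixel coordinate term that bypasses the spatial kernel (an assumption here; see the `CoordConv2d` class in `train_cnn.py` for the real layout):

```python
coord_channels = 2     # normalized x and y
output_channels = 4    # RGBA (assumed)
bytes_per_f32 = 4
print(coord_channels * output_channels * bytes_per_f32)   # 32 bytes, independent of kernel size
```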

---

## References

- **CNN Effect Documentation:** `doc/CNN_EFFECT.md`
- **Training Architecture:** See `train_cnn.py` (`CoordConv2d` class)