docs(cnn_v3): add Windows 10 + CUDA training section to HOW_TO_CNN §2

author: skal <pascal.massimino@gmail.com> 2026-03-22 10:59:21 +0100
committer: skal <pascal.massimino@gmail.com> 2026-03-22 10:59:21 +0100
commit: 0d255535bbc135b5455a21701c31fdeecbe812d9 (patch)
tree: 7ccfe629c8dfd0bc0268df6718a1322a3423c92a /cnn_v3
parent: f768bd3b09e33691148a1b4ebaae5e2d94b8accc (diff)
1 files changed, 47 insertions, 1 deletions
diff --git a/cnn_v3/docs/HOW_TO_CNN.md b/cnn_v3/docs/HOW_TO_CNN.md
index 56ee101..4e64d23 100644
--- a/cnn_v3/docs/HOW_TO_CNN.md
+++ b/cnn_v3/docs/HOW_TO_CNN.md
@@ -346,11 +346,57 @@ python3 train_cnn_v3.py \
 
 The model prints its parameter count:
 ```
-Model: enc=[4, 8]  film_cond_dim=5  params=2097  (~3.9 KB f16)
+Model: enc=[4, 8]  film_cond_dim=5  params=2740  (~5.4 KB f16)
 ```
 
 If `params` is much higher, `--enc-channels` was changed; update C++ constants accordingly.
 
+### Windows 10 + CUDA
+
+**Prerequisites — run once in a CMD or PowerShell prompt:**
+
+1. Install [Python 3.11](https://www.python.org/downloads/) (add to PATH).
+2. Install the CUDA-enabled PyTorch wheel (pick the CUDA version that matches your driver — check with `nvidia-smi`):
+   ```bat
+   :: CUDA 12.1 (most common for RTX 20/30/40 series)
+   pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
+
+   :: CUDA 11.8 (older drivers / GTX 10xx)
+   pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
+   ```
+3. Install the remaining dependencies:
+   ```bat
+   pip install pillow numpy opencv-python
+   ```
+4. Verify that the GPU is visible:
+   ```bat
+   python -c "import torch; print(torch.cuda.get_device_name(0))"
+   ```
+
+**Training — from the repo root in CMD:**
+
+```bat
+cd cnn_v3\training
+python train_cnn_v3.py --input dataset/ --epochs 200
+```
+
+The script auto-detects CUDA and reports `Device: cuda` when a GPU is found. Forward slashes in paths work on Windows; Python accepts both separators.
+
+**Copying the dataset from macOS/Linux:**
+
+Use `scp`, a USB drive, or any file share. The dataset is plain PNG files — no conversion needed.
+
+```bat
+:: example: copy from a network share
+robocopy \\mac\share\cnn_v3\training\dataset dataset /E
+```
+
+**Tips:**
+
+- If training fails with a CUDA out-of-memory error, add `--batch-size 4 --patch-size 32`
+- `nvidia-smi` in a second window shows live VRAM usage
+- Checkpoints are `.pth` files — copy them back to macOS for export (`export_cnn_v3_weights.py` runs on any platform)
+
 ### FiLM joint training
 
 The conditioning vector `cond` is **randomised per sample** during training:
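Note on step 4 of the Windows 10 + CUDA section added by this patch: `torch.cuda.get_device_name(0)` raises an exception when PyTorch cannot see a GPU, which can be hard to tell apart from an installation problem. The following is a minimal sketch of a more descriptive check. The helper name `describe_device` is hypothetical and is not part of `train_cnn_v3.py`; only `torch.cuda.is_available()` and `torch.cuda.get_device_name()` are real PyTorch calls.

```python
import importlib.util


def describe_device() -> str:
    """Return a short description of the device PyTorch would use.

    Hypothetical helper for illustration; not taken from the repository.
    """
    # Avoid an ImportError traceback if the PyTorch wheel is missing.
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"

    import torch

    # False usually means a CPU-only wheel or a driver/CUDA mismatch.
    if not torch.cuda.is_available():
        return "cpu (CUDA not available)"

    # Name of the first visible GPU, as in the one-liner from step 4.
    return f"cuda: {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print(describe_device())
```

If this reports `cpu (CUDA not available)` on a machine with an NVIDIA GPU, a likely cause is that the default CPU-only wheel was installed instead of one from the `cu121` or `cu118` index in step 2.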