Train microWakeWord detection models using a simple web-based recorder + trainer UI, packaged in a Docker container.
No Jupyter notebooks required. No manual cell execution. Just record your voice (optional) and train.
microWakeWord_Trainer-Nvidia is available in the Unraid Community Apps store and can be installed with a one-click template.
```
docker pull ghcr.io/tatertotterson/microwakeword:latest
```

```
docker run -d \
  --gpus all \
  -p 8888:8888 \
  -v $(pwd):/data \
  ghcr.io/tatertotterson/microwakeword:latest
```

What these flags do:

- `--gpus all` → Enables GPU acceleration
- `-p 8888:8888` → Exposes the Recorder + Trainer WebUI
- `-v $(pwd):/data` → Persists all models, datasets, and cache
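Before starting the trainer, it can be worth confirming that the container actually sees your GPU. A minimal sanity check, assuming `nvidia-smi` is available inside the image (as it is in most CUDA-based images):

```shell
# One-off check: run nvidia-smi inside the container and exit.
# A driver/VRAM table means the GPU is visible; an error here usually
# means the NVIDIA Container Toolkit is not installed on the host.
docker run --rm --gpus all \
  ghcr.io/tatertotterson/microwakeword:latest \
  nvidia-smi
```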
Open your browser and go to `http://localhost:8888`.
You’ll see the microWakeWord Recorder & Trainer UI.
Personal voice recordings are optional.
- You may record your own voice for better accuracy
- Or simply click “Train” without recording anything
If no recordings are present, training will proceed using synthetic TTS samples only.
If you are running this on a remote PC / server, browser-based recording will not work unless:
- You use a reverse proxy (HTTPS + mic permissions), or
- You access the UI via localhost on the same machine
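One simple way to satisfy the localhost condition on a remote machine is an SSH tunnel: your browser then connects to `localhost:8888`, which browsers treat as a secure context for microphone access. A sketch (the hostname `user@server` is a placeholder for your own login):

```shell
# Forward local port 8888 to port 8888 on the remote server,
# then browse to http://localhost:8888 as if the container ran locally.
# -N: no remote command, just forwarding.
ssh -N -L 8888:localhost:8888 user@server
```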
Training itself works fine remotely — only recording requires local microphone access.
- Enter your wake word
- Test pronunciation with Test TTS
- Choose:
- Number of speakers (e.g. family members)
- Takes per speaker (default: 10)
- Click Begin recording
- Speak naturally — recording:
- Starts when you talk
- Stops automatically after silence
- Repeat for each speaker
Files are saved automatically to:

```
personal_samples/
  speaker01_take01.wav
  speaker01_take02.wav
  speaker02_take01.wav
  ...
```
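The naming scheme above (`speakerNN_takeNN.wav`) is easy to reproduce if you ever want to drop in recordings made elsewhere. A minimal sketch of the pattern (speaker and take counts here are illustrative, not the defaults):

```shell
# Print the expected filenames for 2 speakers x 3 takes each.
for s in 01 02; do
  for t in 01 02 03; do
    echo "personal_samples/speaker${s}_take${t}.wav"
  done
done
```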
The first time you click Train, the system will download large training datasets (background noise, speech corpora, etc.).
- This can take several minutes
- This happens only once
- Data is cached inside `/data`
You will NOT need to download these again unless you delete /data.
- You can train multiple wake words back-to-back
- You do NOT need to clear any folders between runs
- Old models are preserved in timestamped output directories
- All required cleanup and reuse logic is handled automatically
When training completes, you’ll get:
- `<wake_word>.tflite` – quantized streaming model
- `<wake_word>.json` – ESPHome-compatible metadata
Both are saved under `/data/output/`.
Each run is placed in its own timestamped folder.
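Because each run gets its own timestamped folder, the newest model can be picked up from the host with a small sketch like this (the `/data/output` path is from the README; the assumption that timestamped names sort lexicographically holds for ISO-style timestamps):

```shell
# Print the most recent run directory under the output folder.
# ISO-style timestamped names sort lexicographically, so plain sort works.
OUTPUT_DIR="${OUTPUT_DIR:-/data/output}"
latest=$(ls -1d "$OUTPUT_DIR"/*/ 2>/dev/null | sort | tail -n 1)
echo "Latest run: $latest"
```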
If you record personal samples:
- They are automatically augmented
- They are up-weighted during training
- This significantly improves real-world accuracy
No configuration required — detection is automatic.
If you want a completely clean slate, delete the `/data` folder and restart the container. This will:

- Remove cached datasets
- Require re-downloading training data
- Delete trained models
Built on top of the excellent [microWakeWord](https://github.com/kahrendt/microWakeWord).

Huge thanks to the original authors ❤️
