Custom microWakeWord models for on-device wake word detection on Home Assistant voice hardware.
These models run directly on the ESP32-S3 chip — no cloud services, no streaming to a server. Wake word detection happens entirely on the device itself.
| Wake Word | Language | Model |
|---|---|---|
| Nocturna | English | `models/nocturna/` |
Add the model to your Voice PE's ESPHome configuration by overriding the `micro_wake_word` section:

```yaml
substitutions:
  name: home-assistant-voice-XXXXXX
  friendly_name: My Voice PE

packages:
  Nabu Casa.Home Assistant Voice PE:
    github://esphome/home-assistant-voice-pe/home-assistant-voice.yaml

esphome:
  name: ${name}
  name_add_mac_suffix: false
  friendly_name: ${friendly_name}

api:
  encryption:
    key: "YOUR_API_KEY"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

micro_wake_word:
  id: mww
  models:
    - model: https://github.com/MorningstarOwl/wake-word-models/raw/main/models/nocturna/nocturna.json
      id: nocturna
```

Replace `name`, `friendly_name`, and `api.encryption.key` with the values from your existing Voice PE config.
After flashing, go to Settings → Devices & Services → ESPHome, find your Voice PE, and select Nocturna from the wake word dropdown.
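If you want to confirm on-device that the model is firing (for example while tuning sensitivity), microWakeWord's `on_wake_word_detected` trigger can log each detection. This is an optional debugging sketch, not part of the stock Voice PE configuration:

```yaml
micro_wake_word:
  id: mww
  models:
    - model: https://github.com/MorningstarOwl/wake-word-models/raw/main/models/nocturna/nocturna.json
      id: nocturna
  on_wake_word_detected:
    # `wake_word` holds the phrase of the model that fired.
    - logger.log:
        format: "Wake word detected: %s"
        args: ['wake_word.c_str()']
```

Detections then show up in the ESPHome device logs alongside the normal voice assistant flow.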
Any ESPHome device with microWakeWord support (ESP32-S3 with PSRAM) can use these models. Add the model to your micro_wake_word configuration:
```yaml
micro_wake_word:
  models:
    - model: https://github.com/MorningstarOwl/wake-word-models/raw/main/models/nocturna/nocturna.json
```

See the ESPHome microWakeWord documentation for full configuration options.
If the wake word triggers too easily or not reliably enough, you can override the detection parameters in your ESPHome YAML without re-training:
```yaml
micro_wake_word:
  id: mww
  models:
    - model: https://github.com/MorningstarOwl/wake-word-models/raw/main/models/nocturna/nocturna.json
      id: nocturna
      probability_cutoff: 80%  # Lower = more sensitive (default from manifest: 97%)
      sliding_window_size: 8   # Higher = smoother detection, slightly more latency
```

Models are trained locally using the microWakeWord Trainer Docker container, which automates the full pipeline:
- Synthetic speech samples are generated using Piper TTS
- Samples are augmented with room simulation, background noise, and speed variation
- A streaming MixConv neural network is trained with TensorFlow
- The model is quantized and exported to TFLite for deployment on microcontrollers
Personal voice recordings can optionally be added to improve detection accuracy for specific speakers.
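As a rough illustration of the augmentation step above, speed variation can be implemented by resampling a clip. This is a hand-rolled sketch, not the trainer's actual augmentation code, assuming mono float PCM at 16 kHz:

```python
import numpy as np

def speed_augment(samples: np.ndarray, factor: float) -> np.ndarray:
    """Resample a mono PCM clip to simulate a speed change.

    factor > 1.0 speeds the clip up (shorter output);
    factor < 1.0 slows it down (longer output).
    """
    n_out = int(round(len(samples) / factor))
    # Evaluate the clip at stretched sample positions via linear interpolation.
    old_idx = np.arange(len(samples))
    new_idx = np.linspace(0, len(samples) - 1, n_out)
    return np.interp(new_idx, old_idx, samples)

clip = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s of 440 Hz at 16 kHz
fast = speed_augment(clip, 1.1)  # ~0.91 s
slow = speed_augment(clip, 0.9)  # ~1.11 s
```

The trainer applies this kind of perturbation (plus room simulation and noise mixing) so the model generalizes beyond the clean synthetic voices.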
- microWakeWord — The training framework
- ESPHome micro_wake_word docs — Configuration reference
- Home Assistant Voice PE — Official Voice PE firmware
- ESPHome wake word models — Official model repository
This project is licensed under the Apache License 2.0.