This repository provides a set of ROS 2 packages to integrate piper TTS (Text-to-Speech) into ROS 2 using audio_common.

| ROS 2 Distro | Branch |
|---|---|
| Humble | main |
| Iron | main |
| Jazzy | main |
| Kilted | main |
| Rolling | main |

To install piper_ros, run the following commands:

```shell
cd ~/ros2_ws/src
git clone https://github.com/mgonzs13/piper_ros.git
cd ~/ros2_ws
vcs import src < src/piper_ros/dependencies.repos
rosdep install --from-paths src --ignore-src -r -y
colcon build
```

You can build the piper_ros Docker image:

```shell
docker build -t piper_ros .
```

Then, you can run the Docker container:

```shell
docker run -it --rm --device /dev/snd piper_ros
```

To synthesize English speech, launch piper and, in another terminal, send a goal to the `/say` action:

```shell
ros2 launch piper_bringup piper.launch.py
```

```shell
ros2 action send_goal /say audio_common_msgs/action/TTS "{'text': 'Hello World from ros 2'}"
```

For Spanish speech, use the Spanish launch file instead:

```shell
ros2 launch piper_bringup piper_spanish.launch.py
```

```shell
ros2 action send_goal /say audio_common_msgs/action/TTS "{'text': 'Hola Mundo desde ros 2'}"
```

The following parameters are available:

| Param | Type | Default | Description |
|---|---|---|---|
| `chunk` | `int32` | `512` | Chunk size in samples for audio publication. |
| `frame_id` | `string` | `""` | Frame ID attached to published AudioStamped headers. |
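
To illustrate what a chunk size in samples means for publication, here is a minimal sketch that splits a buffer of synthesized samples into pieces of at most `chunk` samples. The function name `split_into_chunks` is hypothetical, and leaving the final piece shorter than `chunk` (rather than padding it) is an assumption; the node's actual behavior may differ.

```python
from typing import List, Sequence


def split_into_chunks(samples: Sequence[int], chunk: int = 512) -> List[Sequence[int]]:
    """Split a sample buffer into consecutive pieces of at most `chunk` samples.

    Assumption: the last piece is left short instead of being padded.
    """
    if chunk <= 0:
        raise ValueError("chunk must be a positive number of samples")
    return [samples[i:i + chunk] for i in range(0, len(samples), chunk)]


# 1300 placeholder samples split with the default chunk size of 512
pieces = split_into_chunks(list(range(1300)))
print([len(p) for p in pieces])  # [512, 512, 276]
```

A smaller `chunk` yields more, shorter messages for the same audio; a larger one yields fewer, longer messages.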

Model parameters:

| Param | Type | Default | Description |
|---|---|---|---|
| `model.repo` | `string` | `"rhasspy/piper-voices"` | HuggingFace repository for model download. |
| `model.filename` | `string` | `"en/en_US/lessac/low/en_US-lessac-low.onnx"` | Filename of the model in the repository. |
| `model.path` | `string` | `""` | Local path to the voice model file. If empty, the model is downloaded from `model.repo`. |
| `model.config_repo` | `string` | `"rhasspy/piper-voices"` | HuggingFace repository for the model config download. |
| `model.config_filename` | `string` | `"en/en_US/lessac/low/en_US-lessac-low.onnx.json"` | Filename of the model config in the repository. |
| `model.config_path` | `string` | `""` | Local path to the JSON voice config file. If empty, the config is downloaded from `model.config_repo`. |
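
The fallback between a local path and a repository download can be sketched as follows. This is only an illustration of the rule described in the table: the helper `resolve_source` is hypothetical, and the `resolve/main` URL layout is an assumption about where the file lives on the Hugging Face Hub, not a description of how the node performs the download.

```python
def resolve_source(path: str, repo: str, filename: str) -> str:
    """Return the local path if one is set, otherwise a download location.

    Assumption: files are addressed on the "main" revision of the repository.
    """
    if path:
        # A non-empty path takes precedence over the repository settings.
        return path
    return f"https://huggingface.co/{repo}/resolve/main/{filename}"


# With the default parameters (empty model.path) the repository is used.
print(resolve_source(
    "",
    "rhasspy/piper-voices",
    "en/en_US/lessac/low/en_US-lessac-low.onnx",
))

# With a local file (hypothetical path) nothing is downloaded.
print(resolve_source("/models/voice.onnx", "rhasspy/piper-voices", "unused.onnx"))
```

The same rule applies to the `model.config_*` parameters, with `model.config_path` taking precedence over `model.config_repo` and `model.config_filename`.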

Synthesis parameters:

| Param | Type | Default | Description |
|---|---|---|---|
| `synthesis.speaker_id` | `int32` | `0` | Numerical speaker ID for multi-speaker voices. |
| `synthesis.noise_scale` | `float` | `0.667` | Amount of noise added during audio generation. |
| `synthesis.length_scale` | `float` | `1.0` | Speed of speaking (1 = normal, < 1 faster, > 1 slower). |
| `synthesis.noise_w_scale` | `float` | `0.8` | Variation in phoneme lengths during synthesis. |
| `synthesis.sentence_silence_seconds` | `float` | `0.2` | Seconds of silence inserted between sentences. |
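
As a rough worked example of two of these parameters, the sketch below converts `synthesis.sentence_silence_seconds` into a number of samples and estimates how `synthesis.length_scale` stretches an utterance. Both helper names are hypothetical, the 16000 Hz sample rate is an assumed value (the real rate is defined by the voice), and treating duration as exactly proportional to `length_scale` is a simplification.

```python
SAMPLE_RATE = 16000  # assumed value; the actual rate depends on the voice


def silence_samples(sentence_silence_seconds: float, sample_rate: int) -> int:
    """Number of silent samples inserted between two sentences."""
    return round(sentence_silence_seconds * sample_rate)


def scaled_duration(base_seconds: float, length_scale: float) -> float:
    """Approximate utterance duration after applying length_scale.

    Simplification: duration is assumed to scale linearly with length_scale.
    """
    return base_seconds * length_scale


print(silence_samples(0.2, SAMPLE_RATE))  # 3200
print(scaled_duration(2.0, 0.5))          # 1.0 (faster speech)
print(scaled_duration(2.0, 1.5))          # 3.0 (slower speech)
```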