Audio Service
The audio service handles TTS and simple sound playback.
Two TTS plugins may be loaded at once. If the primary plugin fails for any reason, the fallback plugin is used instead.
This allows you to keep a lower quality offline voice as a fallback for internet outages, so your device can always give you feedback.
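The fallback behaviour described above can be sketched as a plain try/except around the primary engine. This is a minimal illustration, not the ovos-audio implementation: synthesize_with_fallback, online_voice and offline_voice are hypothetical names, and the engines are modelled as simple callables that turn text into bytes.

```python
from typing import Callable

# Hypothetical stand-in type: a synthesizer takes text and returns audio bytes.
Synthesizer = Callable[[str], bytes]


def synthesize_with_fallback(text: str, primary: Synthesizer,
                             fallback: Synthesizer) -> bytes:
    """Try the primary engine first; use the fallback only if it raises."""
    try:
        return primary(text)
    except Exception:
        # e.g. a timeout or connection error when a remote server is unreachable
        return fallback(text)


def online_voice(text: str) -> bytes:
    # Simulates an internet outage affecting a remote TTS plugin
    raise ConnectionError("TTS server unreachable")


def offline_voice(text: str) -> bytes:
    # Simulates a lower quality local engine that always works
    return b"offline:" + text.encode("utf-8")


audio = synthesize_with_fallback("hello", online_voice, offline_voice)
```

Here the primary engine fails, so the audio comes from the offline engine and the user still gets spoken feedback.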
"tts": {
"pulse_duck": false,
// plugins to load
"module": "ovos-tts-plugin-server",
"fallback_module": "ovos-tts-plugin-mimic",
// individual plugin configs
"ovos-tts-plugin-server": {
"host": "",
"v2": true,
"verify_ssl": true,
"tts_timeout": 5,
Skill Methods
Skills can use the self.play_audio, self.acknowledge, self.speak and self.speak_dialog methods to interact with ovos-audio.
from typing import Optional, Union


def play_audio(self, filename: str, instant: bool = False):
    """
    Queue an audio file for playback
    @param filename: File to play
    @param instant: if True audio will be played instantly
                    instead of queued with TTS
    """

def acknowledge(self):
    """
    Acknowledge a successful request.

    This method plays a sound to acknowledge a request that does not
    require a verbal response. This is intended to provide simple feedback
    to the user that their request was handled successfully.
    """

def speak(self, utterance: str, expect_response: bool = False, wait: Union[bool, int] = False):
    """Speak a sentence.

    Args:
        utterance (str): sentence Mycroft should speak
        expect_response (bool): set to True if Mycroft should listen
                                for a response immediately after
                                speaking the utterance.
        wait (Union[bool, int]): set to True to block for up to 15 seconds
                                 while the text is being spoken. Alternatively,
                                 set to an integer to specify a timeout in seconds.
    """

def speak_dialog(self, key: str, data: Optional[dict] = None,
                 expect_response: bool = False, wait: Union[bool, int] = False):
    """
    Speak a random sentence from a dialog file.

    Args:
        key (str): dialog file key (e.g. "hello" to speak from the file
        data (dict): information used to populate sentence
        expect_response (bool): set to True if Mycroft should listen
                                for a response immediately after
                                speaking the utterance.
        wait (Union[bool, int]): set to True to block for up to 15 seconds
                                 while the text is being spoken. Alternatively,
                                 set to an integer to specify a timeout in seconds.
    """
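The wait parameter accepts either a bool or an int, which is easy to get wrong because bool is a subclass of int in Python. The sketch below shows one way to turn it into a timeout, using the 15 second default stated in the docstrings. The helper name resolve_wait_timeout is hypothetical and is not part of the skill API.

```python
from typing import Optional, Union

DEFAULT_WAIT_SECONDS = 15  # default taken from the speak/speak_dialog docstrings


def resolve_wait_timeout(wait: Union[bool, int] = False) -> Optional[int]:
    """Return how many seconds to block, or None to not block at all."""
    # bool is a subclass of int, so it has to be checked first
    if isinstance(wait, bool):
        return DEFAULT_WAIT_SECONDS if wait else None
    return wait


blocking_default = resolve_wait_timeout(True)   # block using the default timeout
non_blocking = resolve_wait_timeout(False)      # do not block
custom_timeout = resolve_wait_timeout(30)       # block with a custom timeout
```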
To play sounds via bus messages, emit "" or "" with data {"uri": "path/sound.mp3"}.
ovos-audio implements a queue for sounds; any OVOS component can queue a sound for playback.
Usually only TTS speech is queued for playback, but sound effects may also be queued for richer experiences, for example in a storytelling skill.
The PlaybackThread ensures sounds play sequentially instead of over each other. Listening may be triggered after TTS finishes playing, if requested in the "speak" message.
Short sounds can be played outside the PlaybackThread, usually when instant feedback is required, such as the listening sound or error sounds.
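The difference between queued and instant playback can be sketched with a standard worker thread consuming a queue. This is an illustration of the idea only, not the actual PlaybackThread code: play, playback_worker and the file paths are hypothetical, and playback is simulated by appending to a list.

```python
import queue
import threading

played = []  # records playback order; a real player would produce sound


def play(uri: str) -> None:
    played.append(uri)


def playback_worker(q: queue.Queue) -> None:
    """Consume queued sounds one at a time so they never overlap."""
    while True:
        uri = q.get()
        if uri is None:  # sentinel value: stop the worker
            q.task_done()
            break
        play(uri)
        q.task_done()


sound_queue = queue.Queue()
worker = threading.Thread(target=playback_worker, args=(sound_queue,), daemon=True)
worker.start()

# Queued playback: TTS audio and sound effects play in submission order
for uri in ["tts/sentence1.wav", "effects/door.mp3", "tts/sentence2.wav"]:
    sound_queue.put(uri)

sound_queue.put(None)
sound_queue.join()  # wait until everything queued has been handled
worker.join()

# Instant playback: bypass the queue entirely, e.g. for an error sound
play("snd/error.mp3")
```

Anything put on the queue waits its turn behind earlier items, while a direct call to play returns as soon as the sound has been handed to the player.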
You can configure the default sounds and the playback commands in mycroft.conf:
// File locations of sounds to play for default events
"sounds": {
    "start_listening": "snd/start_listening.wav",
    "end_listening": "snd/end_listening.wav",
    "acknowledge": "snd/acknowledge.mp3",
    "error": "snd/error.mp3"
},
// Mechanism used to play WAV audio files
// by default ovos-utils will try to detect best player
"play_wav_cmdline": "paplay %1 --stream-name=mycroft-voice",
// Mechanism used to play MP3 audio files
// by default ovos-utils will try to detect best player
"play_mp3_cmdline": "mpg123 %1",
// Mechanism used to play OGG audio files
// by default ovos-utils will try to detect best player
"play_ogg_cmdline": "ogg123 -q %1",
NOTE: by default the playback commands are not set, and OVOS will try to determine the best way to play a sound automatically.
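To make the %1 placeholder concrete, here is a sketch of how a configured command line could be turned into an argument list for a given file. It assumes %1 stands for the file to play and that the command is chosen by file extension. The helpers build_play_command and command_for are hypothetical and do not reflect how ovos-utils is implemented.

```python
import shlex
from pathlib import Path
from typing import Dict, List, Optional


def build_play_command(template: str, uri: str) -> List[str]:
    """Split the configured command line and substitute %1 with the file."""
    # Splitting before substituting keeps paths that contain spaces intact
    return [uri if token == "%1" else token for token in shlex.split(template)]


def command_for(uri: str, config: Dict[str, str]) -> Optional[List[str]]:
    """Pick play_<ext>_cmdline by file extension; None means auto-detect."""
    ext = Path(uri).suffix.lstrip(".").lower()
    template = config.get(f"play_{ext}_cmdline")
    return build_play_command(template, uri) if template else None


config = {
    "play_wav_cmdline": "paplay %1 --stream-name=mycroft-voice",
    "play_mp3_cmdline": "mpg123 %1",
}

mp3_cmd = command_for("snd/error.mp3", config)
wav_cmd = command_for("my sounds/start listening.wav", config)
ogg_cmd = command_for("snd/chime.ogg", config)  # no command configured
```

The resulting list can be passed to subprocess.run without a shell. A None result corresponds to the unset case in the note above, where the player is detected automatically.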
Native playback
Usually playback is triggered in response to an originating bus message, e.g. "recognizer_loop:utterance". This message contains metadata that is used to determine whether playback should happen.
message.context may contain a source and a destination. Playback is only triggered if the message destination is a native source, or if the destination is missing (in which case the message is considered a broadcast).
This separation of native sources allows remote clients, such as an Android app, to interact with OVOS without the device where ovos-core is running repeating all TTS and music playback out loud.
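The routing rule above can be written as a small predicate. This is a sketch under stated assumptions: the context is a plain dict, the destination may be a single string or a list of strings, and should_play is a hypothetical helper rather than an ovos-audio function. The native source names come from the native_sources setting in the configuration.

```python
from typing import Iterable, Optional


def should_play(context: Optional[dict], native_sources: Iterable[str]) -> bool:
    """Decide whether this device should play audio for a message."""
    destination = (context or {}).get("destination")
    if not destination:
        # No destination: treated as a broadcast, so play locally
        return True
    if isinstance(destination, str):
        destination = [destination]
    native = set(native_sources)
    return any(d in native for d in destination)


native_sources = ["debug_cli", "audio"]

local = should_play({"source": "skills", "destination": ["audio"]}, native_sources)
remote = should_play({"source": "skills", "destination": "android_app"}, native_sources)
broadcast = should_play({"source": "skills"}, native_sources)
```

A message addressed only to a remote client (android_app is a made-up name here) is not played on the device, while broadcasts and messages addressed to a native source are.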
You can learn more about message targeting here
By default, only utterances originating from the speech client are considered native:
"Audio": {
    "native_sources": ["debug_cli", "audio"]
}
Transformer Plugins
NEW in ovos-core version 0.0.8
Similarly to audio transformers in ovos-dinkum-listener, the utterance and audio data generated by TTS are exposed to a set of plugins that can transform them before playback.
Dialog Transformers
Similarly to utterance transformers in core, ovos-audio exposes the utterance and message.context to a set of plugins that can transform them before the TTS stage.
The utterance to be spoken is sent sequentially to all transformer plugins, ordered by priority (developer defined), until it is finally sent to the TTS stage.
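The sequential, priority-ordered processing described above can be sketched as a simple loop. This is an illustration only: run_dialog_transformers and the two example transformers are hypothetical, and the sketch assumes that lower priority numbers run first, which the text above does not specify.

```python
from typing import Callable, List, Tuple

# (priority, transform) pairs; each transform receives the utterance and context
Transformer = Tuple[int, Callable[[str, dict], str]]


def run_dialog_transformers(utterance: str, context: dict,
                            transformers: List[Transformer]) -> str:
    """Pass the utterance through every transformer in priority order."""
    # Assumption: lower priority numbers run first
    for _, transform in sorted(transformers, key=lambda t: t[0]):
        utterance = transform(utterance, context)
    return utterance


def expand_units(utterance: str, context: dict) -> str:
    return utterance.replace("km", "kilometers")


def shout(utterance: str, context: dict) -> str:
    # Uses the message context to decide whether to change the text
    return utterance.upper() if context.get("shout") else utterance


result = run_dialog_transformers(
    "it is 5 km away",
    {"shout": True},
    [(50, shout), (10, expand_units)],  # registration order does not matter
)
```

Because each plugin sees the output of the previous one, the order matters: here the unit expansion runs before the text is upper-cased, and the final string is what would be handed to TTS.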
To enable a transformer, add it to mycroft.conf:
// To enable a dialog transformer plugin just add its name with any relevant config
// these plugins can mutate utterances before TTS
"dialog_transformers": {
    "ovos-dialog-translation-plugin": {},
    "ovos-dialog-transformer-openai-plugin": {
        "rewrite_prompt": "rewrite the text as if you were explaining it to a 5 year old"
    }
}
TTS Transformers
The audio to be spoken is sent sequentially to all transformer plugins, ordered by priority (developer defined), until it is finally played back to the user.
NOTE: Does not work with StreamingTTS
To enable a transformer, add it to mycroft.conf:
// To enable a tts transformer plugin just add its name with any relevant config
// these plugins can mutate audio after TTS
"tts_transformers": {
    "ovos-tts-transformer-sox-plugin": {
        "default_effects": {
            "speed": {"factor": 1.1}
        }
    }
}