# Multimedia

# Volume

The framework provides some basic functions to control the volume of the audio sinks.

# Actions

  • setMasterVolume(float volume) : Sets the volume of the host machine (volume in range 0-1)
  • setMasterVolume(PercentType percent) : Sets the volume of the host machine
  • increaseMasterVolume(float percent) : Increases the volume by the given percent
  • decreaseMasterVolume(float percent) : Decreases the volume by the given percent
  • float getMasterVolume() : Returns the current volume as a float between 0 and 1
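
As a rough illustration, the volume actions listed above could be combined in a DSL rule like the following. This is a minimal sketch: the rule name, the cron schedule and the 30 % limit are assumptions made for the example, not values prescribed by openHAB.

```
rule "Limit master volume at night"
when
    Time cron "0 0 22 * * ?"    // assumed schedule: every day at 22:00
then
    // getMasterVolume() returns a float between 0 and 1
    if (getMasterVolume() > 0.3) {
        // cap the host volume at 30 % using the PercentType variant
        setMasterVolume(new PercentType(30))
    }
end
```

The PercentType variant is used here so that the limit reads as a percentage; the float variant expects a value between 0 and 1 instead.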

# Audio Capture

openHAB is able to capture audio.

Input devices are called audio sources. The distribution comes with the following audio source built in:

| Input Device | Audio Source | Description |
|--------------|--------------|-------------|
| javasound | System Microphone | This uses the Java Sound API for audio capture. |

Additionally, certain bindings register their supported devices as audio sources, e.g. PulseAudio.

# Console Commands

To check which audio sources are available, you can use the console:

openhab> openhab:audio sources
* System Microphone (javasound)

You can define the default audio source either by textual configuration in $OPENHAB_CONF/services/runtime.cfg or in the UI in Settings->Audio.

# Audio Playback

openHAB is able to play sound either from the file system (files need to be put in the folder $OPENHAB_CONF/sounds), from URLs (e.g. Internet radio streams) or generated by text-to-speech engines (which are available as optional Voice add-ons).

Output devices are called audio sinks. The distribution comes with the following audio sinks built in:

| Output Device | Audio Sink | Description |
|---------------|------------|-------------|
| enhancedjavasound | System Speaker (with mp3 support) | This uses the JRE sound drivers plus an additional third-party library, which adds support for mp3 files. |
| webaudio | Web Audio | Convenient if sounds should be played on the client rather than on the server: this sink sends the audio stream through HTTP to web clients, which then play it back in the browser. The browser must be open and running a compatible openHAB UI. Currently, this feature is supported by Main UI, Basic UI and HABPanel. |

Additionally, certain bindings register their supported devices as audio sinks, e.g. Sonos speakers.

# Console Commands

To check which audio sinks are available, you can use the console:

openhab> openhab:audio sinks
* System Speaker (enhancedjavasound)
  Web Audio (webaudio)

You can define the default audio sink either by textual configuration in $OPENHAB_CONF/services/runtime.cfg or in the UI in Settings->Audio.

In order to play a sound, you can use the following commands on the console:

openhab> openhab:audio play doorbell.mp3
openhab> openhab:audio play sonos:PLAY5:kitchen doorbell.mp3
openhab> openhab:audio play sonos:PLAY5:kitchen doorbell.mp3 25

openhab> openhab:audio stream example.com
openhab> openhab:audio stream sonos:PLAY5:kitchen example.com

You can optionally specify the audio sink between the play parameter and the file name and between the stream parameter and the URL. This parameter can even be a pattern including * and ? placeholders; in this case, the sound is played to all audio sinks matching the pattern. If this parameter is not provided, the sound is played to the default audio sink. The command to play a file accepts an optional last parameter to specify the volume of playback.

# Actions

Alternatively, the playSound() or playStream() functions can be used in DSL rules:

  • playSound(String filename) : plays a sound from the sounds folder to the default sink
  • playSound(String filename, PercentType volume) : plays a sound with the given volume from the sounds folder to the default sink
  • playSound(String sink, String filename) : plays a sound from the sounds folder to the given sink(s)
  • playSound(String sink, String filename, PercentType volume) : plays a sound with the given volume from the sounds folder to the given sink(s)
  • playStream(String url) : plays an audio stream from a URL to the default sink (set url to null if streaming should be stopped)
  • playStream(String sink, String url) : plays an audio stream from a URL to the given sink(s) (set url to null if streaming should be stopped)

# Examples

playSound("doorbell.mp3")
playSound("doorbell.mp3", new PercentType(25))
playSound("sonos:PLAY5:kitchen", "doorbell.mp3")
playSound("sonos:PLAY5:kitchen", "doorbell.mp3", new PercentType(25))

playStream("example.com")
playStream("sonos:PLAY5:kitchen", "example.com")
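
Inside a rule, these actions might be used as sketched below. The item names Doorbell_Button and Radio_Switch are hypothetical. The sink pattern in the second call assumes that the actions accept the same * and ? placeholders as the console commands, which is worth verifying on your installation.

```
rule "Ring the doorbell"
when
    Item Doorbell_Button changed to ON      // hypothetical Switch item
then
    // play on the default sink at 40 % volume
    playSound("doorbell.mp3", new PercentType(40))
    // play on every sink whose ID matches the pattern (assumed to behave as on the console)
    playSound("sonos:*", "doorbell.mp3")
end

rule "Stop the radio stream"
when
    Item Radio_Switch changed to OFF        // hypothetical Switch item
then
    // a null URL stops streaming on the given sink
    playStream("sonos:PLAY5:kitchen", null)
end
```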

# Voice

# Text-to-Speech

In order to use text-to-speech, you need to install at least one TTS service.

# Console Commands

To check which Text-to-Speech services are available, you can use the console:

openhab> openhab:voice ttsservices
* VoiceRSS (voicerss)

Once you have installed at least one text-to-speech service, you will find voices available in your system:

openhab> openhab:voice voices
  VoiceRSS - allemand (Allemagne) - Hanna (voicerss:deDE_Hanna)
  VoiceRSS - allemand (Allemagne) - Jonas (voicerss:deDE_Jonas)
  VoiceRSS - allemand (Allemagne) - Lina (voicerss:deDE_Lina)
  VoiceRSS - allemand (Allemagne) - default (voicerss:deDE)
  VoiceRSS - allemand (Autriche) - Lukas (voicerss:deAT_Lukas)
  VoiceRSS - allemand (Autriche) - default (voicerss:deAT)
  VoiceRSS - allemand (Suisse) - Tim (voicerss:deCH_Tim)
  VoiceRSS - allemand (Suisse) - default (voicerss:deCH)
...
  VoiceRSS - français (France) - Axel (voicerss:frFR_Axel)
  VoiceRSS - français (France) - Bette (voicerss:frFR_Bette)
  VoiceRSS - français (France) - Iva (voicerss:frFR_Iva)
* VoiceRSS - français (France) - Zola (voicerss:frFR_Zola)
  VoiceRSS - français (France) - default (voicerss:frFR)
...
  VoiceRSS - vietnamien (Vietnam) - Chi (voicerss:viVN_Chi)
  VoiceRSS - vietnamien (Vietnam) - default (voicerss:viVN)

You can define a default TTS service and a default voice to use either by textual configuration in $OPENHAB_CONF/services/runtime.cfg or in the UI in Settings->Voice.

To have a text spoken, you can enter a command like the following on the console (the default voice and default audio sink will be used):

openhab> openhab:voice say Hello world!

# Actions

Alternatively, you can execute such commands within DSL rules by using the say() function:

  • say(Object text) : says a given text with the default voice
  • say(Object text, PercentType volume) : says a given text with the default voice and the given volume
  • say(Object text, String voice) : says a given text with a given voice
  • say(Object text, String voice, PercentType volume) : says a given text with a given voice and the given volume
  • say(Object text, String voice, String sink) : says a given text with a given voice through the given sink
  • say(Object text, String voice, String sink, PercentType volume) : says a given text with a given voice and the given volume through the given sink

You can select a particular voice (second parameter) and a particular audio sink (third parameter). If no voice or no audio sink is provided, the default voice and default audio sink will be used.

# Examples

say("Hello world!")
say("Hello world!", new PercentType(25))
say("Hello world!", "voicerss:enGB")
say("Hello world!", "voicerss:enGB", new PercentType(25))
say("Hello world!", "voicerss:enUS", "sonos:PLAY5:kitchen")
say("Hello world!", "voicerss:enUS", "sonos:PLAY5:kitchen", new PercentType(25))
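
A complete rule built around say() could look like the following sketch. The item LivingRoom_Window is a hypothetical Contact item, and the announcement text and volume are arbitrary example values.

```
rule "Announce an open window"
when
    Item LivingRoom_Window changed to OPEN   // hypothetical Contact item
then
    // default voice and default audio sink, spoken at 50 % volume
    say("The living room window is open.", new PercentType(50))
end
```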

# Speech-to-Text

In order to use Speech-to-Text, you need to install at least one STT service.

# Console Commands

To check which Speech-to-Text services are available, you can use the console:

openhab> openhab:voice sttservices
* Vosk (voskstt)

You can define a default STT service to use either by textual configuration in $OPENHAB_CONF/services/runtime.cfg or in the UI in Settings->Voice.

# Keyword Spotter

Spotting a keyword is usually the first step to trigger a dialogue with a voice assistant. In order to spot a keyword, you need to install at least one Keyword Spotter service.

# Console Commands

To check which Keyword Spotter services are available, you can use the console:

openhab> openhab:voice keywordspotters
* Porcupine (porcupineks)

You can define a default Keyword Spotter service to use either by textual configuration in $OPENHAB_CONF/services/runtime.cfg or in the UI in Settings->Voice.

# Human Language Interpreter

Human language interpreters process natural-language text, for example the result of voice recognition or text from other sources.

Two implementations, rulehli and system, are available by default; opennlp is the interpreter that comes with HABot:

| Interpreter | Type | Description |
|-------------|------|-------------|
| rulehli | Rule-based Interpreter | This mimics the behavior of the Android app: it sends the string as a command to an item (configurable, "VoiceCommand" by default) and expects a rule to pick it up and process it further. |
| system | Built-in Interpreter | This is a simple implementation that understands basic home automation commands like "turn on the light" or "stop the music". It currently supports only English, German, French and Spanish, and the vocabulary is still very limited. The exact syntax still needs to be documented; for the moment, you need to refer to the source code. |
| opennlp | HABot OpenNLP Interpreter | A machine-learning natural language processor based on Apache OpenNLP for intent classification and entity extraction. |

# Console Commands

To check which human language interpreters are available, you can use the console:

openhab> openhab:voice interpreters
  Built-in Interpreter (system)
* Rule-based Interpreter (rulehli)

You can define a default human language interpreter to use either by textual configuration in $OPENHAB_CONF/services/runtime.cfg or in the UI in Settings->Voice.

To test the interpreter, you can enter a command like the following on the console (assuming you have an item with the label 'light'):

openhab> openhab:voice interpret turn on the light

The default human language interpreter will be used. If the interpretation fails, the error message is spoken using the default voice and default audio sink.

# Actions

Alternatively, you can execute such commands within DSL rules using the interpret() function:

  • interpret(Object text) : interprets a given text by the default human language interpreter
  • interpret(Object text, String interpreters) : interprets given text by given human language interpreter(s)
  • interpret(Object text, String interpreters, String sink) : interprets a given text by given human language interpreter(s) and using the given sink

You can select particular human language interpreter(s) (second parameter) and a particular audio sink (third parameter). If no human language interpreter or no audio sink is provided, the default human language interpreter and default audio sink will be used.

The human language interpreter(s) parameter must be the ID of an installed interpreter or a comma separated list of interpreter IDs; each provided interpreter is executed in the provided order until one is able to interpret the command.

The audio sink parameter is used when the interpretation fails; in this case, the error message is said using the default voice and the provided audio sink. If the provided audio sink is set to null, the error message will not be said.

The interpretation result is returned as a string. Note that this result is always a null string with the rule-based Interpreter (rulehli).

# Examples

interpret("turn on the light")
var String result = interpret("turn on the light", "system")
result = interpret("turn on the light", "system", null)
result = interpret("turn on the light", "system,rulehli")
result = interpret(VoiceCommand.state, "system", "sonos:PLAY5:kitchen")
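
The following sketch shows one way a rule could pick up the text that the rule-based interpreter sends to the VoiceCommand item and hand it to the built-in interpreter. The rule name and the logger name "voice" are arbitrary choices for the example, and the VoiceCommand item is assumed to exist as a String item.

```
rule "Process voice command"
when
    Item VoiceCommand received command     // item targeted by rulehli (default name)
then
    // use the built-in interpreter only; the null sink suppresses spoken error messages
    val String answer = interpret(receivedCommand, "system", null)
    logInfo("voice", "Interpreter result: {}", answer)
end
```

Passing null as the sink keeps failures silent, so the log entry is the only feedback in this sketch.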

# Voice Assistant

openHAB embeds a dialog processor based on the services previously presented on this page. With this dialog processor and these services, openHAB can become a voice assistant dedicated to home automation. Here are the components needed to instantiate a voice assistant:

  • an audio source: the audio device that listens to the user speaking,
  • a keyword spotter: this will detect the keyword defined by the user to start a dialogue,
  • a Speech-to-Text service: captured audio will be converted into text,
  • one (or more) interpreter(s): the text will be analyzed and converted into commands in the automation system and a response will be produced,
  • a Text-to-Speech service: the text response will be converted into an audio file,
  • an audio sink: the audio file will be played to be heard by the user.

The quality of the voice assistant will of course depend on the quality of each of the selected components.

Your openHAB server can run multiple voice assistants but can only run one voice assistant for a given audio source.

After you start a voice assistant, it runs until you stop it, which means it continues to detect the keyword and handle dialogues.

However, there is a special mode that handles a single dialogue: it bypasses keyword detection and starts listening for the user request immediately. You do not need to stop it; it stops automatically after handling the user request. You could, for example, run it in a rule triggered by a particular user action. This mode is executed using the listenAndAnswer command.

# Console Commands

To start and stop a voice assistant, you can enter commands like the following on the console:

openhab> openhab:voice startdialog
openhab> openhab:voice startdialog javasound
openhab> openhab:voice startdialog javasound sonos:PLAY5:kitchen
openhab> openhab:voice startdialog javasound sonos:PLAY5:kitchen system
openhab> openhab:voice startdialog javasound sonos:PLAY5:kitchen system voicerss
openhab> openhab:voice startdialog javasound sonos:PLAY5:kitchen system,rulehli voicerss voskstt
openhab> openhab:voice startdialog javasound sonos:PLAY5:kitchen system,rulehli voicerss voskstt porcupineks
openhab> openhab:voice startdialog javasound sonos:PLAY5:kitchen system voicerss voskstt porcupineks voicerss:frFR_Zola
openhab> openhab:voice startdialog javasound sonos:PLAY5:kitchen system voicerss voskstt porcupineks voicerss:frFR_Zola terminator

openhab> openhab:voice stopdialog
openhab> openhab:voice stopdialog javasound

openhab> openhab:voice listenandanswer
openhab> openhab:voice listenandanswer javasound
openhab> openhab:voice listenandanswer javasound sonos:PLAY5:kitchen
openhab> openhab:voice listenandanswer javasound sonos:PLAY5:kitchen system
openhab> openhab:voice listenandanswer javasound sonos:PLAY5:kitchen system voicerss
openhab> openhab:voice listenandanswer javasound sonos:PLAY5:kitchen system,rulehli voicerss voskstt
openhab> openhab:voice listenandanswer javasound sonos:PLAY5:kitchen system voicerss voskstt voicerss:frFR_Axel

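The commands expect their parameters in a fixed order. As the examples above show, startdialog takes the audio source, audio sink, interpreter(s), Text-to-Speech service, Speech-to-Text service, keyword spotter, voice and keyword; listenandanswer takes the same parameters without the keyword spotter and the keyword. For example, to provide the interpreter as a parameter, you must first provide the audio source and the audio sink.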

When a parameter is not provided in the command line, the default from the voice settings is used. If no default value is set in voice settings, the command will fail.

You can select particular human language interpreter(s). This parameter must be the ID of an installed interpreter or a comma separated list of interpreter IDs; each provided interpreter is executed in the provided order until one is able to interpret the command.

If the language is defined in the regional settings, it is used as the language for the voice assistant; if not set, the system default locale is assumed. For the command to succeed, the keyword spotter, the Speech-to-Text and Text-to-Speech services, and the interpreters must all support this language.

You can select a particular voice for the Text-to-Speech service. If no voice is provided, the voice defined in the regional settings is preferred. If this voice is not associated with the selected Text-to-Speech service or not applicable to the language used, any voice from the selected Text-to-Speech service applicable to the language being used will be selected.

If a default listening switch is set in the voice settings, it is used.

# Actions

Alternatively, you can execute such commands within DSL rules using the startDialog(), stopDialog() and listenAndAnswer() functions:

  • startDialog(String source, String sink) : starts dialog processing for a given audio source
  • startDialog(String ks, String stt, String tts, String voice, String interpreters, String source, String sink, String locale, String keyword, String listeningItem) : starts dialog processing for a given audio source
  • stopDialog(String source) : stops dialog processing for a given audio source
  • listenAndAnswer(String source, String sink) : executes a simple dialog sequence without keyword spotting for a given audio source
  • listenAndAnswer(String stt, String tts, String voice, String interpreters, String source, String sink, String locale, String listeningItem) : executes a simple dialog sequence without keyword spotting for a given audio source

Each parameter can be null; in this case, the default from the voice settings is used. If no default value is set in the voice settings, the action will fail.

You can select particular human language interpreter(s). The interpreters parameter for startDialog and listenAndAnswer must be the ID of an installed interpreter or a comma separated list of interpreter IDs; each provided interpreter is executed in the provided order until one is able to interpret the command.

The locale parameter for startDialog and listenAndAnswer is the language to be used by the voice assistant. If null is provided, the language defined in the regional settings is used; if not set, the system default locale is assumed. For the action to succeed, the keyword spotter, the Speech-to-Text and Text-to-Speech services, and the interpreters must all support this language.

The voice parameter for startDialog and listenAndAnswer is the voice to be used by the Text-to-Speech service. If null is provided, the voice defined in the regional settings is preferred. If this voice is not associated with the selected Text-to-Speech service or not applicable to the language used, any voice from the selected Text-to-Speech service applicable to the language being used will be selected.

The listeningItem parameter for startDialog and listenAndAnswer is the item name of the listening switch. This item is switched on during the period when the dialog processor has spotted the keyword and is listening for commands. If null is provided, the default item from the voice settings is used. If not set, no item will be switched on and off.
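
To illustrate the listening switch, a rule could mirror its state onto another item, as in this sketch. Both item names are hypothetical: Assistant_Listening stands for the Switch item whose name is passed as listeningItem (or set as default in the voice settings), and Assistant_Light for an indicator light.

```
rule "Indicate that the assistant is listening"
when
    Item Assistant_Listening changed       // hypothetical Switch item used as listeningItem
then
    // mirror the listening state onto a (hypothetical) indicator light
    if (Assistant_Listening.state == ON) {
        Assistant_Light.sendCommand(ON)
    } else {
        Assistant_Light.sendCommand(OFF)
    }
end
```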

# Examples

startDialog(null, null)
stopDialog(null)

startDialog("javasound", "sonos:PLAY5:kitchen")
stopDialog("javasound")

startDialog("porcupineks", "voskstt", "voicerss", "voicerss:frFR_Zola", "system,rulehli", "javasound", "sonos:PLAY5:kitchen", "fr-FR", "terminator", "listeningItem")
stopDialog("javasound")

listenAndAnswer(null, null)
listenAndAnswer("javasound", "sonos:PLAY5:kitchen")
listenAndAnswer("voskstt", "voicerss", "voicerss:frFR_Axel", "system,rulehli", "javasound", "sonos:PLAY5:kitchen", "fr-FR", "listeningItem")
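
Embedded in rules, the calls above might be used as in the following sketch. The item Talk_Button is hypothetical, and starting the dialog on "System started" is only one possible choice; the required services may not all be available yet at that point on every installation.

```
rule "Start the voice assistant"
when
    System started
then
    // use the defaults from the voice settings for source and sink
    startDialog(null, null)
end

rule "Push to talk"
when
    Item Talk_Button changed to ON          // hypothetical Switch item
then
    // handles one dialogue without keyword spotting, then stops by itself
    listenAndAnswer("javasound", "sonos:PLAY5:kitchen")
end
```

Since only one voice assistant can run per audio source, running both rules against the same source is best avoided.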