Best Voice Recognition Software for Raspberry Pi
This tutorial covers the best voice recognition software for the Raspberry Pi and how to use it. I installed and tested three different voice recognition systems for the Raspberry Pi. Two of them are online and depend on an internet connection, while the third works offline. The three packages tested were:
- Jasper – Voice Recognition Software.
- Raspberry Pi Voice Recognition by Oscar Liang.
- Raspberry Pi Voice Control by Steven Hickson.
Out of these three, I rate the Voice Control software created by Steven Hickson as the most accurate and capable. Jasper works offline, but it compromises on accuracy and speed. It is useful for systems that have no internet access, with one caveat: it takes up almost an entire 4GB memory card, so use at least an 8GB card with it. Some of its services are also cumbersome, and you may have to repeat a phrase several times before the system picks it up.
The packages by Oscar Liang and Steven Hickson both use the Google voice APIs and are very accurate. Both also use Google speech, so the system can be made to talk back and respond to your commands and queries. I prefer the third package because it has a simple and straightforward interface: you define each of your voice commands and link it to a particular task in the form of a bash command, all inside a configuration file. A detailed tutorial by DIY Hacking explaining the installation and use of this voice recognition software for the Raspberry Pi is given below, and the video at the bottom gives you a feel for the voice control software before you install it.
Bill of Materials
- Raspberry Pi model B with memory card preloaded with an OS.
- WiFi dongle (Optional) : Edimax EW 7811UN / LAN network cable.
- A USB webcam with microphone / USB microphone.
You cannot use an ordinary microphone with an audio jack because the Raspberry Pi has no audio input. Use a USB webcam with a built-in mic or a USB microphone instead. I am using a cheap USB webcam with a built-in mic.
How Does it Work?
The software described here uses the Google voice and speech APIs. The voice command from the user is captured by the microphone and converted to text using the Google voice API. The text is then compared with the commands previously defined in the commands configuration file. If it matches one of them, the bash command associated with it is executed. You can also use this system as an interactive voice response system by making the Raspberry Pi respond to your commands via speech. This is achieved using the Google speech API, which converts text into speech.
Block diagram: basic working of the voice recognition software for Raspberry Pi.
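The matching step described above can be sketched in a few lines of shell. This is only an illustration of the idea, not the actual code of the voice control software: the function name “lookup_command” is made up for this example, it handles only exact, case-insensitive matches on lines of the form “phrase==command”, and it prints the matching command instead of running it.

```bash
# Hypothetical sketch: look up a recognized phrase in a "phrase==command" file.
# Prints the matching command (does not execute it); returns 1 if nothing matches.
lookup_command() {
    spoken=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    config_file=$2
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '#'*|'') continue ;;   # skip comments and blank lines
            *==*) ;;               # keep "phrase==command" lines
            *) continue ;;         # ignore anything else
        esac
        phrase=${line%%==*}        # text before the first "=="
        action=${line#*==}         # text after the first "=="
        phrase_lc=$(printf '%s' "$phrase" | tr '[:upper:]' '[:lower:]')
        if [ "$spoken" = "$phrase_lc" ]; then
            printf '%s\n' "$action"
            return 0
        fi
    done < "$config_file"
    return 1
}

# Example (assumes $text holds the recognized text and $config the file path):
#   action=$(lookup_command "$text" "$config") && sh -c "$action"
```

Printing the command rather than executing it keeps the lookup easy to try out by hand before anything is actually run.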
Step 1: Checking Your Microphone
First, check that your microphone records properly and that its volume is set high enough. Start by running the command “lsusb” and check that your mic or webcam comes up on the list.
Next, set the mic recording volume to high. To do this, enter the command “alsamixer” in the terminal. A neat interface shows up; press the up/down arrow keys to set the volume. Press F6 (select sound card), then choose the webcam or mic from the list. Then use the up arrow key again to set the recording volume to high.
Now, check that recording works properly. Use the command “arecord -l” to check that your mic/webcam is listed. Then, use the command “arecord -D plughw:1,0 test.wav” to record sound (replace 1,0 with the card and device numbers shown by “arecord -l”, and press Ctrl+C to stop recording). The sound is recorded in the file “test.wav”. To listen to it, plug your headphones into the Pi and enter the command “aplay test.wav” in the terminal. If you can hear the sound, your microphone works; otherwise, try adjusting the volumes and repeat the previous steps.
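The device name “plughw:1,0” in the recording command assumes the microphone is card 1, device 0. As a rough sketch, those numbers can be read from the “arecord -l” listing instead of being typed by hand. The helper name “first_capture_device” is made up for this example, and it simply takes the first capture device it finds, which may not be the right one if several are connected.

```bash
# Hypothetical helper: read `arecord -l` output on stdin and print the first
# capture device as an ALSA "plughw:CARD,DEVICE" name (prints nothing if none).
first_capture_device() {
    sed -n 's/^card \([0-9][0-9]*\):.*, device \([0-9][0-9]*\):.*/plughw:\1,\2/p' | head -n 1
}

# Example use on the Pi (commented out so nothing is recorded by accident):
#   mic=$(arecord -l | first_capture_device)
#   arecord -D "$mic" -d 5 test.wav    # -d 5 stops after five seconds
#   aplay test.wav
```

The “-d 5” option limits the test recording to five seconds, so there is no need to stop it manually.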
Step 2: Installing the Voice Recognition Software for Raspberry Pi
This software was created by Steven Hickson and uses the Google voice API. To install it, execute the following commands one after the other:
- wget --no-check-certificate "http://goo.gl/KrwrBa" -O PiAUISuite.tar.gz
- tar -xvzf PiAUISuite.tar.gz
- cd PiAUISuite/Install/
- sudo ./InstallAUISuite.sh
Please note that the wget command in the first line uses two dashes (--) before “no-check”. During the installation, several questions will pop up. Read each of them carefully and press y/n accordingly; I recommend pressing y for all of them. Some of the questions include:
- Do you want to set a keyword? (The keyword is a voice command, like a name; the system is activated only when you first say it.)
- Do you want to set filler flag to zero? (Press y, otherwise you will always hear “Filler Fill” before every speech response from the Pi.)
- Do you want to install youtube-dl? (A command-line tool used here for playing YouTube videos.)
Options for changing the listening duration and the system response are also presented during the installation.
Step 3: Using the Voice Control Software and Setting up Your Own Commands
You can verify the voice-to-text conversion by running “./speech-recog.sh” in the directory /home/pi/PiAUISuite/VoiceCommand. The software runs continuously when you execute the command “sudo voicecommand -c” in the terminal. By default, the keyword used to activate it is “Pi”; the other commands are executed only after you say this keyword while the system is listening. Check the video below to get a feel for the software.
I would recommend changing the keyword from “Pi” to something else, as the system usually interpreted it as “Hi” for me. You can change the keyword and other voice commands and actions by opening the commands configuration file. This can be done by entering the command “voicecommand -e”. Inside it, you can see various options for setting the keyword, the speech response, etc. Remove the “#” at the start of a line in the file when you change it.
Here, each command is linked to a particular action. For example, with “Youtube==youtube-search …”, saying “Youtube android” runs the command “youtube-search android” in bash. The “…” stands for anything you say after the command “Youtube”. With the voice command definition “play $1 season $2 episode $3==playvideo -s $2 -e $3 $1”, saying “play Big Bang Theory season 1 episode 4” executes the command “playvideo -s 1 -e 4 Big Bang Theory”, i.e., it plays the 4th episode of the 1st season of Big Bang Theory.
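To make the “$1”, “$2”, “$3” substitution more concrete, here is a small shell sketch of how such an expansion could work. It is not how the voice control software implements it; the function name “expand_command” is made up, and it assumes the placeholders appear in ascending order in the spoken pattern and that neither the pattern nor the command contains characters that are special to sed (such as “/”, “&” or “.”).

```bash
# Hypothetical sketch of placeholder expansion, not the real implementation.
# Usage: expand_command "spoken text" "spoken pattern" "command template"
# Prints the expanded command, or nothing if the spoken text does not match.
expand_command() {
    spoken=$1
    pattern=$2
    template=$3
    # "play $1 season $2" becomes the sed regex "play \(.*\) season \(.*\)"
    regex=$(printf '%s' "$pattern" | sed 's/\$[0-9]/\\(.*\\)/g')
    # "playvideo -s $2 $1" becomes the sed replacement "playvideo -s \2 \1"
    replacement=$(printf '%s' "$template" | sed 's/\$\([0-9]\)/\\\1/g')
    printf '%s\n' "$spoken" | sed -n "s/^${regex}\$/${replacement}/p"
}

# Example:
#   expand_command 'play Big Bang Theory season 1 episode 4' \
#       'play $1 season $2 episode $3' 'playvideo -s $2 -e $3 $1'
```

Each placeholder in the pattern becomes a capture group, and the command template then refers back to those groups by number, which is why the words can be reordered between what you say and what gets executed.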
So, suppose you want to add a new voice command, for example “Check internet”, that uses “ping” to check your internet connection. In the configuration file, enter a line like this: “Check internet==ping google.com”. The system then executes “ping google.com” when you say “Check internet”.
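One thing to keep in mind is that on Linux “ping google.com” keeps running until it is interrupted, so a bounded check may suit a voice command better. Here is a small sketch of such a wrapper. The function name “check_internet” and the “PING_CMD” override (included only so the function can be tried without a network) are assumptions of this example and not part of the voice control software.

```bash
# Hypothetical wrapper: send a limited number of pings and report the result.
check_internet() {
    host=${1:-google.com}
    # -c 3: stop after three packets; -W 2: wait up to 2 seconds for each reply
    if ${PING_CMD:-ping} -c 3 -W 2 "$host" >/dev/null 2>&1; then
        echo "Internet connection is working"
    else
        echo "No internet connection"
    fi
}
```

If this were saved as an executable script on your PATH (say under the name “check-internet”, chosen here purely for illustration), the configuration line could point at that script instead of at “ping” directly.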
Use this system for home automation, robotics, and other cool stuff. In my tests, the software was quite accurate and quick to respond. Now, the video of the voice control software in action: