Make offline speech recognition on Linux easier


For almost any task you care to name, a Linux-based desktop computer can do the job with applications that rival or outperform those on other platforms. That doesn’t mean it’s always easy to get working, however, and speech recognition is one of those difficult setups.

A project called Voice2JSON tries to simplify voice workflows. While it doesn’t provide the actual speech recognition, it does make it easier to get started and then to use speech naturally.

The software can be integrated with multiple backends to perform offline speech recognition, including PocketSphinx from CMU, Kaldi from Dan Povey, DeepSpeech 0.9 from Mozilla and Julius from Kyoto University. However, the code is more than just a thin wrapper around these tools. The rapid training process creates both a speech recognizer and an intent recognizer. So not only does the system know there is a garage door, it also understands what it means to open and close that garage door.

Additionally, the tools are all designed to work in Unix-style pipelines, which is refreshing. Here is a sample configuration from the project’s website:

open the garage door
close the garage door

turn on the living room lamp
turn off the living room lamp
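Because the tools are built for pipelines, the last stage can be any small script that reads standard input. The sketch below shows one way that might look in Python. It assumes each recognized command arrives as one JSON object per line, with the intent name under `intent.name` and the transcript under `text`; those field names, the intent names `GarageDoor` and `Lamp`, and the handler functions are illustrative assumptions, not taken from the project's documentation.

```python
import io
import json


def handle_intents(stream, handlers):
    """Read one JSON object per line and call the handler for its intent."""
    results = []
    for line in stream:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between events
        event = json.loads(line)
        # ASSUMPTION: the intent name is stored at event["intent"]["name"].
        name = event.get("intent", {}).get("name", "")
        handler = handlers.get(name)
        if handler is not None:  # events with no matching handler are skipped
            results.append(handler(event))
    return results


# Hypothetical handlers; a real setup would drive hardware here.
def garage_door(event):
    return "garage: " + event.get("text", "")


def lamp(event):
    return "lamp: " + event.get("text", "")


# Hypothetical intent names, chosen only for this sketch.
HANDLERS = {"GarageDoor": garage_door, "Lamp": lamp}

# Stand-in for piped input; in a real pipeline, pass sys.stdin instead.
sample = io.StringIO(
    '{"text": "open the garage door", "intent": {"name": "GarageDoor"}}\n'
    '{"text": "turn on the living room lamp", "intent": {"name": "Lamp"}}\n'
)
for message in handle_intents(sample, HANDLERS):
    print(message)
```

Keeping the dispatch table separate from the parsing loop means a new voice command only needs one more entry in `HANDLERS`.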

Template features let you specify optional words and alternate words in a single rule. Other features include mapping a spoken phrase such as “living room lamp” to something more computer-friendly.
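To make the idea of optional and alternate words concrete, here is a toy expander in Python that turns one rule into every sentence it covers. It assumes a notation where `[word]` marks an optional word and `(a | b)` marks alternatives; the notation, the `expand` function, and the parsing approach are illustrative assumptions and not the project's own code.

```python
import re
from itertools import product

# Brackets, parentheses and "|" are single tokens; anything else is a word.
TOKEN = re.compile(r"[\[\]()|]|[^\s\[\]()|]+")


def expand(template):
    """Return every sentence a toy template can produce, in order."""
    tokens = TOKEN.findall(template)
    pos = 0

    def parse_alternatives(closer):
        # alternatives := sequence ('|' sequence)*
        nonlocal pos
        options = parse_sequence(closer)
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1  # skip "|"
            options = options + parse_sequence(closer)
        return options

    def parse_sequence(closer):
        # sequence := (word | '(' alternatives ')' | '[' alternatives ']')*
        nonlocal pos
        results = [[]]  # each entry is a list of words built so far
        while pos < len(tokens) and tokens[pos] not in ("|", closer):
            tok = tokens[pos]
            pos += 1
            if tok == "(":
                choices = parse_alternatives(")")
                pos += 1  # skip ")"
            elif tok == "[":
                # An optional group is its contents or nothing at all.
                choices = parse_alternatives("]") + [[]]
                pos += 1  # skip "]"
            else:
                choices = [[tok]]
            results = [left + right for left, right in product(results, choices)]
        return results

    return [" ".join(words) for words in parse_alternatives(None)]


for sentence in expand("turn (on | off) [the] living room lamp"):
    print(sentence)
```

For the rule shown, this prints four sentences: "turn on" and "turn off", each with and without "the". That is why a single templated rule can stand in for a long list of literal commands.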

Overall, this looks like a fun tool to have in your kit. If you do anything interesting with it, be sure to give us a tip so we can cover it. In the meantime, we’ve been looking at speech on Linux for quite a while. What we really want, of course, are voice commands like those on the USS Enterprise, and we have to admit that we are getting closer.

