Lab 31b: Training a new grammar - robotic-picker-sp22/fetch-picker GitHub Wiki

Sphinx makes available broad-coverage language and acoustic models, but we probably don't need or want these. In almost any robotics application, the vocabulary of possible utterances is much smaller than in general language. Certainly, the number of utterances for which we're able to get robots to take meaningful action is quite small.

Because of this, we're going to tailor Sphinx to a small domain of possible utterances by providing it with a custom knowledge base.

Building the knowledge base

First you'll need a target set of sentences. We encourage you to think about whatever utterances are most important to your project.

Compile these into a plain text file (one sentence per line), then upload it to the online Sphinx lmtool to build the knowledge base files: a language model and a pronunciation dictionary.
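For instance, the corpus is just one candidate utterance per line. The sentences and filename below are illustrative placeholders — substitute the commands that actually matter for your project:

```python
# Write a small corpus of candidate utterances, one per line,
# ready to upload to the Sphinx lmtool. These sentences are
# examples only; use the utterances your project needs.
sentences = [
    "pick up the red block",
    "put the block in the bin",
    "move to the table",
    "stop",
]

with open("corpus.txt", "w") as f:
    for sentence in sentences:
        f.write(sentence + "\n")
```

Keeping the sentences lowercase and free of punctuation tends to make the lmtool output cleaner.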

Using Sphinx

To avoid running a JVM on the robot, we'll use PocketSphinx, a lightweight implementation of Sphinx written in C.

We should be able to install PocketSphinx from a package manager:

pip install pocketsphinx
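Once installed, a minimal live-decoding loop might look like the sketch below. Be warned that this is an assumption-laden sketch, not a verified recipe: the keyword argument names (`lm`, `dic`) follow the pocketsphinx Python wrapper's conventions and may differ between versions, and `TAR1234.lm` / `TAR1234.dic` are placeholders for whatever files the lmtool actually gives you.

```python
# Sketch: live decoding against a custom knowledge base with the
# pocketsphinx Python wrapper. Requires a working microphone.
# 'TAR1234.lm' and 'TAR1234.dic' stand in for the language model
# and dictionary downloaded from the lmtool.
from pocketsphinx import LiveSpeech

speech = LiveSpeech(
    lm='TAR1234.lm',    # custom language model from lmtool
    dic='TAR1234.dic',  # custom pronunciation dictionary from lmtool
)

# Each iteration yields a hypothesis once an utterance is detected.
for phrase in speech:
    print(phrase)
```

This requires audio hardware, so it is best tried interactively on a lab machine with a microphone attached.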

Local build (optional)

PocketSphinx (like regular Sphinx) relies on libraries that are distributed in SphinxBase.

Clone and build both of these repos in ~/lib.

Configure your environment (library paths, Python package paths) to pick up the build output.

PocketSphinx with ROS

A lot of folks have used PocketSphinx on their robots, so we'll use a popular community wrapper, ros-pocketsphinx. Clone it into your workspace and make sure it builds.

You should be able to run the turtlebot example by running the correct launch file on your lab machine. We'll make microphones (webcams) available so you can test.

Using your custom knowledge base

Sub in your custom model files and modify the turtlebot demo to see how well PocketSphinx works for your sentences. You may need to adjust phrasing if some sentences don't transcribe reliably. Pay attention to the confidence assigned to each transcription.

Note that this testing is just with local audio from your development machine. Naturally, be ready to adjust once your model runs on the actual robot's microphones!

Control Kuri with voice commands

We aren't quite ready to put your model on the robot, but so that you can see a closed loop of voice interaction, modify the ros-pocketsphinx example to make Kuri respond in some way to utterances.
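One possible shape for that modification is sketched below. Treat everything in it as an assumption to check against your setup: the `/recognizer/output` topic name is what some community pocketsphinx wrappers publish recognized text on, and the response here is just a log line standing in for a real Kuri animation or sound.

```python
#!/usr/bin/env python
# Sketch: close the loop by listening for recognized utterances and
# triggering a response. The topic name '/recognizer/output' and the
# placeholder response are assumptions; adapt them to your
# ros-pocketsphinx wrapper and Kuri's actual interfaces.
import rospy
from std_msgs.msg import String


def on_utterance(msg):
    text = msg.data.lower()
    rospy.loginfo("heard: %s", text)
    if "hello" in text:
        # Replace with a Kuri animation or built-in sound.
        rospy.loginfo("responding to greeting")


def main():
    rospy.init_node("voice_responder")
    rospy.Subscriber("/recognizer/output", String, on_utterance)
    rospy.spin()


if __name__ == "__main__":
    main()
```

You can sanity-check the subscription side without a microphone by publishing test strings to the topic with rostopic pub.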

If it's important that your robot be able to speak arbitrary utterances back, consider looking into a lightweight speech synthesis package like espeak. Otherwise, leverage your animations and the robot's built-in sounds.
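If you go the espeak route, invoking it from Python is just a subprocess call. The helper name and fallback behavior below are our own additions, not part of espeak; the sketch only runs espeak when the binary is actually on the PATH, so it degrades gracefully on machines without it:

```python
import shutil
import subprocess


def say(text):
    """Speak text with espeak if it is installed; return the command used.

    Falls back to printing when espeak is not on the PATH, so the
    surrounding node keeps running on machines without speech output.
    """
    cmd = ["espeak", text]
    if shutil.which("espeak"):
        subprocess.run(cmd, check=True)
    else:
        print("espeak not found, would have said:", text)
    return cmd


say("I picked up the block")
```

Because synthesis blocks until espeak finishes, you may want to call this from a separate thread if your node also needs to keep processing sensor data.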