3. Speech Manager - mwgit00/cpox GitHub Wiki

Overview

This is a C# application that uses the Microsoft Speech API to say phrases and recognize when they are spoken. A GUI wrapper provides a convenient interface for testing and viewing status. The application includes a UDP server which can accept commands to say phrases and recognize them. It can be used in standalone mode or with any other application that can send and receive UDP messages.

It has been tested with Windows 10 and Visual Studio 2019.

Command Line Parameters

The default IP settings for the UDP server are 127.0.0.1 for a partner application's IP address, 60000 for the receiving port, and 60001 for the transmission port. The default culture is en-US and the default gender is female. These settings can be changed with command line parameters. The parameters are positional, so they must be given in the order shown below, and a later parameter can only be supplied if all the parameters before it are also supplied. Only the first character of the gender parameter is checked.

speechmanager.exe <IP address> <RX port> <TX port> <culture> <gender>

Usage examples are shown below:

speechmanager.exe
speechmanager.exe 192.168.0.2 61000 61001
speechmanager.exe 192.168.0.2 61000 61001 en-US m
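
The way positional parameters fall back to the defaults can be illustrated with a short sketch. This is not the application's C# code. The function name `resolve_settings`, the dictionary keys, the case-insensitive gender check, and the mapping of `m` to male are assumptions made for illustration only.

```python
# Hypothetical illustration of positional parameters with defaults.
# The default values come from the text above; everything else is assumed.
DEFAULTS = ("127.0.0.1", "60000", "60001", "en-US", "female")


def resolve_settings(args):
    """Map up to five positional arguments onto the documented defaults."""
    given = list(args[:5])
    # Any trailing parameter that was not supplied keeps its default value.
    ip, rx_port, tx_port, culture, gender = given + list(DEFAULTS[len(given):])
    # Only the first character of the gender is examined. Treating it
    # case-insensitively, and treating "m" as male, are assumptions.
    gender = "male" if gender[:1].lower() == "m" else "female"
    return {
        "ip": ip,
        "rx_port": int(rx_port),
        "tx_port": int(tx_port),
        "culture": culture,
        "gender": gender,
    }


if __name__ == "__main__":
    print(resolve_settings([]))
    print(resolve_settings(["192.168.0.2", "61000", "61001"]))
    print(resolve_settings(["192.168.0.2", "61000", "61001", "en-US", "m"]))
```

The three calls mirror the three usage examples above: no parameters, IP settings only, and all five parameters.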

Server Commands

  • cancel -- Halts a recognition that's in progress.
  • load <phrase> -- Load a phrase for repetition and recognition. No quotes are necessary. Use single spaces.
  • say <phrase> -- Speak the text that follows the say command. No quotes are necessary. Use single spaces.
  • wav <filename> -- Play a WAV file.
  • rec -- Recognize loaded phrase.
  • repeat -- Repeat loaded phrase.
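
As a rough sketch of how a partner application might send these commands, the Python snippet below builds each command string and sends it as a single UDP datagram. Several details are assumptions rather than documented behavior: that each command travels as plain UTF-8 text in one datagram with no terminator, and that the client sends to the server's receiving port (60000 by default). The helper names `build_command` and `send_command` are hypothetical.

```python
import socket

# Commands as listed above, grouped by whether they take an argument.
NO_ARGUMENT = {"cancel", "rec", "repeat"}
PHRASE_ARGUMENT = {"load", "say"}
FILE_ARGUMENT = {"wav"}


def build_command(verb, argument=""):
    """Return the datagram payload for one server command."""
    if verb in NO_ARGUMENT:
        return verb.encode("utf-8")
    if verb in PHRASE_ARGUMENT:
        # The wiki asks for single spaces, so collapse runs of whitespace.
        text = " ".join(argument.split())
    elif verb in FILE_ARGUMENT:
        # Leave a filename's interior untouched; only trim the ends.
        text = argument.strip()
    else:
        raise ValueError(f"unknown command: {verb!r}")
    if not text:
        raise ValueError(f"command {verb!r} needs an argument")
    return f"{verb} {text}".encode("utf-8")


def send_command(verb, argument="", host="127.0.0.1", port=60000):
    """Send one command; host and port are assumed server defaults."""
    payload = build_command(verb, argument)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


if __name__ == "__main__":
    send_command("load", "the quick brown fox")
    send_command("repeat")
```

Because UDP is connectionless, `send_command` succeeds even when no server is listening, so the sender cannot tell from the call alone whether the command arrived.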

Server Responses

  • tts 1 -- Speaking of phrase or playing of WAV file has completed.
  • rec 0 -- Recognition has completed with no match to the loaded phrase or a timeout.
  • rec 1 -- Recognition has completed with a match to the loaded phrase.
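
A partner application could interpret these responses along the following lines. The sketch assumes that each response arrives as one UTF-8 text datagram on the transmission port (60001 by default) and that stripping trailing whitespace or NUL bytes is harmless. Neither detail is stated above, and the names `parse_response` and `wait_for_response` are hypothetical.

```python
import socket

# The three responses documented above, mapped to (operation, success flag).
KNOWN_RESPONSES = {
    "tts 1": ("tts", True),   # speaking or WAV playback has completed
    "rec 0": ("rec", False),  # recognition finished: no match or timeout
    "rec 1": ("rec", True),   # recognition finished: phrase matched
}


def parse_response(data):
    """Turn a raw datagram into an (operation, flag) tuple."""
    text = data.decode("utf-8", errors="replace").strip(" \r\n\t\x00")
    try:
        return KNOWN_RESPONSES[text]
    except KeyError:
        raise ValueError(f"unrecognized response: {text!r}") from None


def wait_for_response(port=60001, timeout=10.0):
    """Block until one response arrives; return None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("", port))  # listen on all interfaces (an assumption)
        sock.settimeout(timeout)
        try:
            data, _sender = sock.recvfrom(1024)
        except socket.timeout:
            return None
    return parse_response(data)


if __name__ == "__main__":
    print(wait_for_response())
```

Rejecting anything outside the three documented strings keeps the sketch conservative. A real client may prefer to log unknown responses rather than raise an error.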

GUI

The GUI permits testing of all functions. There are check boxes for enabling extra diagnostics when doing speech recognition. A minimum score threshold and a timeout for a recognition operation can also be configured. The bar above the Min. Score label shows audio volume.

The Server Status box will be green if the UDP server is working properly. The boxes by the TTS (text-to-speech) and Recognize buttons will turn green while the respective operations are in progress. The Stop button will halt a recognition operation that's in progress. The phrase to be spoken and recognized can be changed by checking the Edit box. Uncheck it to apply the new phrase. The Select button to the right of the WAV filename box will open a browser for selecting a WAV file. The name of the selected file will be displayed in the filename box. The WAV button will play the file. The yellow text window displays status. The right-click menu for that window has a Clear item that will clear the window. The GUI window can be resized by dragging its bottom edge.
