Speech Recognition APIs - scribear/ScribeAR.github.io GitHub Wiki

Webspeech

Webspeech is an API available through a React library, so it is very simple to get working. It runs asynchronously from the rest of the code, which can make communicating with it a little tricky. The best approach I have found is to use refs, which React provides through "React.useRef()".
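The ref approach can be sketched without React at all, since "React.useRef()" returns a plain mutable object with a `current` field. The snippet below is only an illustration: `FakeRecognizer`, `attachRecognizer`, and `listeningRef` are hypothetical names, not part of ScribeAR or of any speech library. It shows why a handler registered once still sees the latest state when it reads `ref.current` at call time.

```typescript
// Same shape as the object returned by React.useRef(): a mutable box.
interface MutableRef<T> {
  current: T;
}

type ResultHandler = (transcript: string) => void;

// Hypothetical stand-in for a recognizer that delivers results asynchronously.
class FakeRecognizer {
  onresult: ResultHandler | null = null;

  // Simulates the recognizer producing a transcript at some later time.
  emit(transcript: string): void {
    this.onresult?.(transcript);
  }
}

// Registers the handler exactly once. The handler reads listeningRef.current
// each time it fires, so it never works from a stale copy of the flag.
function attachRecognizer(
  recognizer: FakeRecognizer,
  listeningRef: MutableRef<boolean>,
  sink: string[],
): void {
  recognizer.onresult = (transcript) => {
    if (listeningRef.current) {
      sink.push(transcript);
    }
  };
}

const listeningRef: MutableRef<boolean> = { current: true };
const captured: string[] = [];
const recognizer = new FakeRecognizer();
attachRecognizer(recognizer, listeningRef, captured);

recognizer.emit("hello");      // kept: the flag is true
listeningRef.current = false;  // the rest of the app pauses transcription
recognizer.emit("ignored");    // dropped: the handler sees the new value
listeningRef.current = true;
recognizer.emit("world");      // kept again
```

In a component, `listeningRef` would come from "React.useRef()" and be updated whenever the matching state changes, while the handler itself stays registered for the lifetime of the recognizer.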

Azure

Requires a key and a region for authentication. It also runs asynchronously, so we use "React.useRef()" to communicate with it as well. Azure is much more exciting because many of its capabilities are cutting edge, and many of the updates we have planned involve implementing Azure features.
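To make the role of the key and region concrete, here is a minimal sketch of how the two values map onto Azure's token endpoint for the Speech service. This is an assumption-laden illustration, not ScribeAR's code: the `AzureCredentials` and `TokenRequest` types and the `buildTokenRequest` helper are hypothetical, and an Azure client SDK would normally handle this step internally.

```typescript
// Hypothetical shape for the two values Azure needs; not ScribeAR's actual type.
interface AzureCredentials {
  key: string;    // subscription key of the Speech resource
  region: string; // region of the Speech resource, e.g. "eastus"
}

// Plain description of an HTTP request, so nothing is sent in this sketch.
interface TokenRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
}

// Builds the request that exchanges a subscription key for a short-lived token.
// The region selects the host; the key travels in a request header.
function buildTokenRequest(creds: AzureCredentials): TokenRequest {
  const key = creds.key.trim();
  const region = creds.region.trim();
  if (key === "" || region === "") {
    throw new Error("Azure key and region are both required");
  }
  return {
    url: `https://${region}.api.cognitive.microsoft.com/sts/v1.0/issueToken`,
    method: "POST",
    headers: { "Ocp-Apim-Subscription-Key": key },
  };
}

// "example-key" is a placeholder, not a real credential.
const exampleRequest = buildTokenRequest({ key: "example-key", region: "eastus" });
```

Validating both values up front gives a clear error before any recognizer is started, which is easier to debug than a failed connection reported asynchronously.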

StreamText

StreamText is a website that we simply render in an iframe. There is almost no coding involved, and because it uses an iframe, very little communication with it is possible. Anyone looking to help with StreamText would probably need to get comfortable with XML requests.

ScribeAR Server

A backend server that provides self-hosted transcriptions using a variety of speech recognition implementations. It also enables Kiosk Mode, a feature that lets other users receive synced transcriptions on their own devices by scanning a QR code. See Connecting From Frontend for how to connect to the ScribeAR server. Check out https://github.com/scribear/ScribeAR-NodeServer to learn more about the backend.

Whisper

An implementation of OpenAI's Whisper speech recognition model running inside the browser.
