Speech Recognition - Kagamma/satania-buddy GitHub Wiki

This page documents various speech recognition backends supported (or previously supported) by satania-buddy.

Vosk

This is the default speech recognition backend. It is accurate, very fast, and consumes few resources.

whisper.cpp

An experimental speech recognition backend, based on whisper.cpp. It has some disadvantages compared to Vosk:
- It does not support streaming mode, so real-time STT requires some workarounds on satania-buddy’s end. The current implementation is not very good and limits the speech buffer to a maximum of 8 seconds.
- It requires a lot of processing power compared to Vosk.
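
To illustrate the kind of workaround a non-streaming recognizer needs, here is a minimal Python sketch of a capped speech buffer. It is not satania-buddy's actual code: the `SpeechBuffer` class and its method names are hypothetical, and the 16 kHz sample rate is assumed because that is the rate whisper.cpp expects. Audio chunks are accumulated until the 8-second cap is reached, at which point a full window is handed off for transcription.

```python
SAMPLE_RATE = 16000  # assumed: whisper.cpp works on 16 kHz mono audio
MAX_SECONDS = 8      # buffer cap described above


class SpeechBuffer:
    """Hypothetical helper: collects samples, emits windows of at most MAX_SECONDS."""

    def __init__(self, sample_rate=SAMPLE_RATE, max_seconds=MAX_SECONDS):
        self.max_samples = sample_rate * max_seconds
        self.samples = []

    def push(self, chunk):
        """Append a chunk; return a full window once the cap is reached, else None."""
        self.samples.extend(chunk)
        if len(self.samples) >= self.max_samples:
            window = self.samples[:self.max_samples]
            # Keep the overflow so no audio is dropped between windows.
            self.samples = self.samples[self.max_samples:]
            return window
        return None

    def flush(self):
        """Return whatever is buffered (e.g. when the speaker pauses) and reset."""
        window, self.samples = self.samples, []
        return window
```

In such a design, each window returned by `push` or `flush` would be passed to the recognizer as one complete utterance, which is why speech longer than the cap gets split.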

Microsoft Speech Object Library

A legacy speech recognition backend. Supports Windows only.

CMU Sphinx

An obsolete speech recognition backend. It was removed in favor of Vosk.
