Chatfish - cmubuild18/Build18 GitHub Wiki
Chatfish
If you cannot make friends during 0 week or have goldfish memory when it come to putting names to faces, use Chatfish to make ends meet.
"Chatfish" is an innovative project for new students, using a Raspberry Pi 4 Model B, camera, and microphone for facial recognition and interaction. It aims to ease social awkwardness by remembering faces and names, and facilitating engaging conversations in the initial school week. Key concepts include facial recognition, text-to-speech, speech-to-text, text-to-text, and language models. Our team efficiently integrated these concepts during the build week, optimizing algorithms and fine-tuning systems for Chatfish to generate engaging responses.
Behind Chatfish�s conversation engine are three components: a local speech recognition model, a local instruction-tuned LLM, and a text-to-speech API. We use whisper.cpp, an efficient port of OpenAI�s whisper model. We use a 4-bit quantized C++ port of Meta�s llama2-7b-chat model to generate text on-device and the converted gguf format weights and code from the llama-cpp project, which is built on the efficient ggml framework - this allows us to run full inference of a large transformer model with only 8GB of RAM. To convert text into speech, we use the ElevenLabs API.
The project achieved its goals:
- Facial Recognition: Chatfish recognizes known and unknown faces.
- Interactive Conversations: It responds effectively, facilitating smooth interactions.
- Engagement: Chatfish's ability to speak and listen reduces intimidation.
- Personalization: Remembering faces and names creates a welcoming experience.
Team Members
Team Member | Photo |
---|---|
Gina Seo | |
Gio | |
Justin | |
Dylan | |
Aaron |
Photos
https://drive.google.com/drive/folders/1ypGFgv1UAcA2FQH4_R3uQQrDwWPXQ0h2?usp=sharing