AI - kamack38/Essentials GitHub Wiki
AI Tools
Upscaling
- Upscayl - Free and open-source AI image upscaler for Linux, macOS, and Windows.
AlphaGeometry
This article describes how to set it up inside a Docker container.
OCR
- OCRmyPDF - Adds an OCR text layer to scanned PDF files so that they can be searched.
- Frog - Intuitive text extraction tool for GNOME. Can extract text from images, videos, QR codes, and more.
Deepfake
- facefusion - Industry-leading face manipulation platform with Docker support
- Deep-Live-Cam - Real-time face swap and one-click video deepfake from a single image
Speech to Text
- transcribe-anything - Multi-backend Whisper app
Subtitles
- subsai - Subtitles generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants
  docker pull absadiki/subsai:main
  docker run --gpus=all -p 8501:8501 -v /path/to/your/media_files/folder:/media_files absadiki/subsai:main
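The subsai Docker invocation above can be wrapped in a small helper so the media folder does not have to be edited into the command by hand. This is only a sketch: `run_subsai`, the `DRY_RUN` variable, and the optional host-port argument are hypothetical names introduced here and are not part of subsai. The image name, the `--gpus=all` flag, container port 8501, and the `/media_files` mount point are taken from the commands above.

```shell
# Hypothetical helper (not part of subsai): start the subsai container
# for a given media folder, or just print the command when DRY_RUN=1.
run_subsai() {
    media_dir=${1:-}
    host_port=${2:-8501}   # host side of the port mapping; container side stays 8501

    if [ -z "$media_dir" ]; then
        echo "usage: run_subsai /path/to/media [host_port]" >&2
        return 2
    fi

    # Rebuild the positional parameters as the docker command, so that
    # paths containing spaces survive when the command is executed.
    set -- docker run --gpus=all \
        -p "${host_port}:8501" \
        -v "${media_dir}:/media_files" \
        absadiki/subsai:main

    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Print only; nothing is executed.
        echo "$*"
    else
        "$@"
    fi
}
```

For example, `DRY_RUN=1; run_subsai "$HOME/Videos"` prints the command for review, and unsetting `DRY_RUN` runs it. Given the `-p` mapping, the Web-UI should then presumably be reachable on the chosen host port (8501 by default).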
Text to Speech
- TTS - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
- Zonos - Zonos-v0.1 is an open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech. Its authors describe its expressiveness and quality as on par with, or even surpassing, top TTS providers.
- bark - Text-Prompted Generative Audio Model