AI - kamack38/Essentials GitHub Wiki

AI Tools

Upscaling

  • Upscayl - Free and open-source AI image upscaler for Linux, macOS and Windows.

AlphaGeometry

This article describes how to set up AlphaGeometry inside a Docker container.

PDF

  • OCRmyPDF - adds an OCR text layer to scanned PDF files, allowing them to be searched

OCR

  • Frog - Intuitive text extraction tool for GNOME. Can extract text from images, videos, QR codes, etc.

Deepfake

  • facefusion - Industry-leading face manipulation platform with Docker support
  • Deep-Live-Cam - real-time face swap and one-click video deepfake with only a single image

Speech to Text

Subtitles

  • subsai - Subtitle generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants

    docker pull absadiki/subsai:main
    docker run --gpus=all -p 8501:8501 -v /path/to/your/media_files/folder:/media_files absadiki/subsai:main
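
The `docker run` line above packs three settings into one command: GPU access, a port mapping, and a bind mount for the media folder. The sketch below only builds and prints that command for a given folder and host port, so the pieces can be checked before anything is started. `subsai_run_cmd` is a hypothetical helper name used here for illustration and is not part of subsai. It also assumes that 8501 is the port the Web-UI listens on inside the container, as the mapping above suggests.

```shell
#!/bin/sh
# Sketch: build (but do not execute) the subsai `docker run` command.
# subsai_run_cmd is a hypothetical helper, not something subsai provides.
subsai_run_cmd() {
    media_dir=$1          # host folder containing the media files
    host_port=${2:-8501}  # host port to publish; defaults to 8501
    # --gpus=all           gives the container access to all host GPUs
    # -p HOST:8501         publishes container port 8501 on the host port
    # -v DIR:/media_files  bind-mounts the host folder at /media_files
    printf 'docker run --gpus=all -p %s:8501 -v %s:/media_files absadiki/subsai:main\n' \
        "$host_port" "$media_dir"
}

# Example: print the command for ~/Videos, published on host port 8080.
subsai_run_cmd "$HOME/Videos" 8080
```

The printed command can be copied into a terminal once it looks right. Paths containing spaces would need quoting before use, and `--gpus=all` generally requires the NVIDIA Container Toolkit on the host.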
    

Text to Speech

  • TTS - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
  • Zonos - Zonos-v0.1 is an open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech. Its authors describe its expressiveness and quality as on par with, or even surpassing, top TTS providers.
  • bark - Text-Prompted Generative Audio Model