Machine Learning - romitagl/kgraph GitHub Wiki

Landscape

LF AI & Data Foundation Interactive Landscape

Datasets

Cleanlab: Automatically find and fix errors in your ML datasets.
Label Studio: Label Studio is a multi-type data labeling and annotation tool with standardized output format.
Roboflow: The world's largest collection of open source computer vision datasets and APIs.

Libraries

PyCaret: PyCaret is an open-source, low-code machine learning library in Python that automates machine learning workflows. It is an end-to-end machine learning and model management tool that aims to shorten the experiment cycle.
Gradio: Gradio is an open-source Python library that is used to build machine learning and data science demos and web applications. Gradio is useful for demoing your machine learning models and deploying your models quickly with automatic shareable links.
Composer: Composer is a library for training neural networks better, faster, and cheaper. Note: for the data loader's num_workers setting, a common starting point is the number of CPU cores in your machine divided by the number of GPUs.
DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
FairScale: FairScale is a PyTorch extension library for high performance and large scale training.
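
The num_workers rule of thumb noted under Composer above can be sketched as a small helper. This is only an illustration: the function name `suggested_num_workers` and the fallback used when no GPU is present are assumptions, not part of Composer or PyTorch, and the result is a starting point to tune rather than a fixed rule.

```python
import os


def suggested_num_workers(num_gpus, cpu_cores=None):
    """Return CPU cores divided by GPU count, never less than 1.

    Hypothetical helper: it only encodes the rule of thumb from the note.
    """
    if cpu_cores is None:
        # os.cpu_count() can return None, so fall back to a single core.
        cpu_cores = os.cpu_count() or 1
    if num_gpus < 1:
        # Assumption: on a CPU-only machine, all cores may be used for loading.
        return max(1, cpu_cores)
    return max(1, cpu_cores // num_gpus)


# Example: a 32-core machine with 4 GPUs gives 8 workers per data loader.
print(suggested_num_workers(num_gpus=4, cpu_cores=32))
```

The returned value would typically be passed as the `num_workers` argument of a PyTorch `DataLoader`, one loader per GPU process.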

Frameworks

Ludwig: Ludwig is a declarative machine learning framework that makes it easy to define machine learning pipelines using a simple and flexible data-driven configuration system.
Candle: Candle is a minimalist ML framework for Rust with a focus on performance (including GPU support) and ease of use.

Distributed training

Horovod: Horovod is a distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Testing

Testing Machine Learning Systems: https://madewithml.com/courses/mlops/testing/
Effective testing for machine learning systems: https://www.jeremyjordan.me/testing-ml/

Deep Learning

Intro to Deep Learning: https://www.kaggle.com/learn/intro-to-deep-learning
Train your data analysis model with PyTorch: https://learn.microsoft.com/en-us/windows/ai/windows-ml/tutorials/pytorch-analysis-train-model

CNN

Build convolutional neural networks with TensorFlow and Keras: https://www.kaggle.com/learn/computer-vision

Image classification with PyTorch

Face Recognition

InsightFace: deep face analysis library: https://insightface.ai, https://github.com/deepinsight/insightface

Automatic Speech Recognition

OpenAI's Whisper: The Whisper models are trained for speech recognition and translation tasks. They can transcribe speech audio into text in the language in which it is spoken (ASR) and translate it into English (speech translation).

Image generation

Stable Diffusion: an image generation model that takes a text prompt and produces an image. Stable Diffusion can be freely downloaded.
DALL·E: DALL·E 2 is an AI system that can create realistic images and art from a description in natural language.

LLMs

Edge inference

Web LLM: WebLLM is a modular, customizable JavaScript package that brings language model chats directly into web browsers with hardware acceleration. Everything runs inside the browser with no server support and is accelerated with WebGPU.

Run locally

https://ollama.ai: Ollama makes it easy to host large language models locally.
GPT4All: A free-to-use, locally running, privacy-aware chatbot. No GPU or internet required: https://gpt4all.io
LM Studio: Discover, download, and run local LLMs
Open WebUI: User-friendly WebUI for LLMs (formerly Ollama WebUI)

Code Generation

CodeGen: CodeGen is an open-source model for program synthesis. It was trained on TPU-v4 and is described as competitive with OpenAI Codex.
Code Llama: Code Llama is a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks.
Stable Code 3B: Stable Code 3B is a 3 billion parameter Large Language Model (LLM) that performs on par with models such as CodeLLaMA 7b, which are 2.5x larger. It operates offline, even without a GPU, on common laptops such as a MacBook Air.
OpenDevin: a platform for autonomous software engineers, powered by AI and LLMs. OpenDevin agents collaborate with human developers to write code, fix bugs, and ship features.

Transformers

Transformers-Tutorials: Demos made with the Transformers library by 🤗 HuggingFace
FastEdit: inject fresh and customized knowledge into large language models efficiently

LLM Frameworks

LangChain is a framework for developing applications powered by language models: https://python.langchain.com/docs/get_started/introduction
- Retrieval-augmented generation (RAG): https://python.langchain.com/docs/use_cases/question_answering/

LLM Architectures

https://github.blog/2023-10-30-the-architecture-of-todays-llm-applications/

LLM Models

Ferret - Refer and Ground Anything: https://github.com/apple/ml-ferret

Vector Databases

Milvus: Milvus is an open-source vector database built to power embedding similarity search and AI applications.
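
To make "embedding similarity search" concrete, here is a minimal brute-force sketch in plain Python. It does not use the Milvus API; the helper names (`cosine_similarity`, `top_k`) and the three-dimensional toy vectors are assumptions for illustration. A vector database performs the same kind of lookup at scale, typically with approximate indexes instead of a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the k keys whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]


# Toy "embeddings" (hypothetical values, not produced by a real model).
index = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

print(top_k([1.0, 0.0, 0.0], index, k=2))  # → ['cat', 'dog']
```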

Model explainability

LIME: Local Interpretable Model-agnostic Explanations. It is an explanation technique that interprets an individual prediction locally.
SHAP: SHapley Additive exPlanations. The key idea of SHAP is to calculate a Shapley value for each feature of the sample being interpreted, where each Shapley value represents the contribution of its feature to the prediction.
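
The idea of one Shapley value per feature can be illustrated with an exact computation on a tiny model. This sketch does not use the SHAP library: the function `shapley_values`, the `toy_model`, and the choice to represent an "absent" feature by a baseline value are all assumptions made for illustration. It enumerates every feature ordering, so it is only feasible for a handful of features.

```python
from itertools import permutations


def shapley_values(predict, sample, baseline):
    """Exact Shapley values by averaging marginal contributions.

    A feature counts as "absent" when it takes its baseline value
    (an illustrative simplification).
    """
    n = len(sample)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = sample[i]  # switch feature i on
            updated = predict(current)
            totals[i] += updated - previous  # marginal contribution of i
            previous = updated
    return [total / len(orderings) for total in totals]


def toy_model(x):
    # Hypothetical model with one interaction term between features 0 and 2.
    return 2.0 * x[0] + 1.0 * x[1] + x[0] * x[2]


sample = [1.0, 3.0, 2.0]
baseline = [0.0, 0.0, 0.0]
values = shapley_values(toy_model, sample, baseline)
print(values)  # → [3.0, 3.0, 1.0]
```

The values sum to the difference between the prediction for the sample (7.0) and for the baseline (0.0), and the interaction term is split evenly between features 0 and 2.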

Infrastructure

The true cost of building a Data Science platform: https://f.hubspotusercontent40.net/hubfs/6816846/The%20True%20Cost%20of%20Building%20a%20Data%20Science%20Platform_April%202021.pdf
Kubernetes GPU sharing with MetaGPU. YouTube talk: Fractional GPU Allocations With MetaGPU Device Plugin.
NVIDIA GPU Sharing - Kubernetes: https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/gpu-sharing.html, YouTube talk: https://youtu.be/1QfShSQLsbs?si=C1Ic5JKmr-cfViv9
The Missing Guide to the H100 GPU Market: https://blog.lepton.ai/the-missing-guide-to-the-h100-gpu-market-91ebfed34516
GPUd: an AI-native GPU management utility that, according to its developers, reduces GPU cluster unavailability by 4x. Developed at Lepton AI by engineers with experience at Meta, Alibaba, and Uber, GPUd automates monitoring, diagnostics, and issue identification for GPUs: https://github.com/leptonai/gpud
Infrastructure to train a 70B parameter model: https://imbue.com/research/70b-infrastructure/

Performance

PyTorch profiler: https://pytorch.org/blog/pytorch-profiler-1.9-released/#gpu-metric-on-timeline
Meta's AI Performance Profiling: Performance tuning for PyTorch in production environments
Storage Performance Basics for Deep Learning: https://developer.nvidia.com/blog/storage-performance-basics-for-deep-learning/

Projects

Self-hosted AI coding assistant: https://github.com/TabbyML/tabby

Tools

Domain-Specific Computer Vision Applications - Large Vision Models (LVMs): https://landing.ai
OpenUI lets you describe UI using your imagination, then see it rendered live. You can ask for changes and convert HTML to React, Svelte, Web Components: https://github.com/wandb/openui

Reference

GPU Glossary - CUDA: https://modal.com/gpu-glossary/readme

Venture Capital

https://aifund.ai: AI Fund is a venture studio for AI-based companies.