Lists on AI & LLM Tech

[Header image: abstract geometric motif]

This wiki page collects useful links on generative AI technology, systems, tools, and research & analysis.

Unless stated otherwise, all entries are open-source software (OSS/FOSS).

'*' marks the tools I use and would recommend taking a look at.

AI Models

LLM Leaderboards

  • Aider Leaderboard Aider is a well-regarded code assistant; the ranking table is focused on this use case.
  • LMSYS Leaderboard Ranking is based on blind, side-by-side human evaluation, so it is a good ranking of pure human opinion and intrinsically covers many of the qualities we value in an AI model.
  • Bigcode Bench Evaluates LLMs on practical and challenging programming tasks.

Table of LLM Models

Below is a list of models I have come across that have been noted by others (usually on Hugging Face or Reddit). All LLM models below have been released to the public and can be hosted locally.

_Embedding models are specialist AI models dedicated to creating embeddings. They are used with vector databases and the RAG process._

| LLM NAME | PARAMETERS | FILE SIZE | CONTEXT SIZE | RELEASE DATE | DESCRIPTION |
| --- | --- | --- | --- | --- | --- |
| Granite Code 3B | 3B | 2.0GB | 2K | May-24 | By IBM. Best for code completion tasks |
| Granite | 8B | 4.6GB | 8K | May-24 | By IBM. For code generation, code explanation, code fixing, etc. |
| Llama3.1 | 8B | 4.7GB | 128K | Jul-24 | By Meta. 70B (40GB) and 405B (231GB) also available |
| Llama3 Instruct | 8B | 4.7GB | - | - | By Meta. 8B is the budget model from its latest release |
| Mistral 7B | 7.3B | 4.1GB | 4K | Sep-23 | General purpose AI |
| Mistral NeMo | 12B | 7.1GB | 128K | Jul-24 | General purpose AI |
| Openhermes v2.5 | 7.24B | 4.1GB | 8K | Feb-23 | Fine-tuned by Teknium on Mistral with fully open datasets |
| Phi 3.5 | 3.82B | 2.2GB | 128K | Aug-24 | By Microsoft. Very safe. Lightweight, state-of-the-art open model |
| Phi 3.1 Mini 128K | 3.8B | 1.8GB | 128K | Jul-24 | By Microsoft. Additional post-training data gives big improvements across a range of benchmarks |
| Phi 3 Mini 4K | 3.8B | 2.3GB | 4K | - | By Microsoft. Update to the Phi 3 model |
| Phi 3 Small | 3.8B | 2.3GB | - | - | Microsoft's new small but powerful model (roughly equivalent to GPT-3.5) |
| Qwen2 | 7B / 1.5B | 4.4GB | - | Jul-24 | Range of models. The 1.5B model is suitable for IDE code completion |
| Wizard LM2 | 7B | 4.1GB | - | May-24 | By Microsoft. Previously the fastest model (weakest reasoning in the WLM2 range) |
| Yi Coder | 8.83B | 5.0GB | 128K | Sep-24 | Current no.1, state-of-the-art coding performance with fewer than 10B parameters |
| Yi v1.5 | 6B | 3.5GB | - | May-24 | OSS LLM from 01.AI, trained on 3 trillion tokens of data |

| LLM NAME | PARAMETERS | FILE SIZE | CONTEXT SIZE | RELEASE DATE | DESCRIPTION |
| --- | --- | --- | --- | --- | --- |
| Autocoder | 6.7B | 7.2GB | - | - | Its highest test score is for coding in Python |
| CodeGeex4 | 9B | 5.5GB | 128K | Jul-24 | Currently the highest-scoring sub-10B parameter model for coding |
| CodeGemma 2B | 2B | 1.6GB | 8K | May-24 | By Google. The 2B model is ideal for IDE code auto-completion |
| Codegemma Instruct | 7B | 5.0GB | - | - | - |
| CodeQwen 2.0 | 7B | 4.4GB | - | - | Its highest test score is for coding in JS |
| Codestral | 22B | 13GB | - | - | Very good for Python. Mistral's first model focused on code generation. N.B. needs a GPU |
| Deepseek v2.5 | 236B | N/A | 128K | Sep-24 | NOT LOCAL HOST. Best value: low cost and top tier for coding |
| DeepSeek-Coder-V2.1 (0724) | 16B | 8.9GB | 128K | Jun-24 | Aider.chat ranked it the second-best model for coding-related tasks at release |
| Deepseek-coder v2 | 16B | 8.9GB | - | Jul-24 | Former best coding model (July 2024); requires a large-capacity GPU |
| Deepseek-coder | 6.7B | 3.8GB | - | - | - |

| LLM NAME | PARAMETERS | FILE SIZE | CONTEXT SIZE | RELEASE DATE | DESCRIPTION |
| --- | --- | --- | --- | --- | --- |
| MiniCPM v2.6 | 8B | 5.5GB | N/A | Aug-24 | Vision model. Multimodal LLM (MLLM) designed for vision-language tasks |

| LLM NAME | PARAMETERS | FILE SIZE | CONTEXT SIZE | RELEASE DATE | DESCRIPTION |
| --- | --- | --- | --- | --- | --- |
| All-minilm | 22M | 45MB | - | - | Embedding (RAG) model |
| Mxbai-embed-large | 335M | 669MB | - | - | Embedding (RAG) model |
| Nomic-embed-text | 137M | 274MB | - | - | Embedding (RAG) model |

N.B. mxbai-embed-large expects search queries to be prefixed with 'Represent this sentence for searching relevant passages: ', e.g. 'Represent this sentence for searching relevant passages: A man is eating a piece of bread'.
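Below is a minimal sketch of generating embeddings with the Ollama Python library, including the mxbai query prefix. It assumes the Ollama server is running and `ollama pull mxbai-embed-large` has been done; the example strings are only illustrative.

```python
# Minimal embedding sketch using the Ollama Python library (assumed local setup).
import ollama

# Documents are embedded as-is...
doc = "Bread is a staple food prepared from a dough of flour and water."
doc_vec = ollama.embeddings(model="mxbai-embed-large", prompt=doc)["embedding"]

# ...while queries get the retrieval prefix that mxbai-embed-large expects.
query = ("Represent this sentence for searching relevant passages: "
         "A man is eating a piece of bread")
query_vec = ollama.embeddings(model="mxbai-embed-large", prompt=query)["embedding"]

print(len(doc_vec), len(query_vec))  # mxbai-embed-large produces 1024-dimensional vectors
```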

End-User Applications (LLM Interaction)

Desktop

| SYSTEM NAME | DESCRIPTION |
| --- | --- |
| AI Chat | CLI tool featuring Chat-REPL, Shell Assistant, RAG, AI tools & agents |
| AnythingLLM * | Desktop UI with RAG capabilities for local LLMs |
| ChatGPTerminator | - |
| Danswer AI | Self-hostable AI assistant (OSS) |
| Fabric | AI assistant with crowd-sourced prompt patterns |
| Kotaemon | RAG-based chat app (load your own documents to embed, ready for AI use). Supports local LLMs |
| Msty | Top tier. Desktop app for LLM chat; add documents, transcribe audio to text |
| PromptMixer * | Desktop UI with decent prompt text management |
| ShellGPT | Adds LLM access and use within the terminal |
| Verba | Desktop UI with RAG capabilities for local LLM chat |

Web App

| SYSTEM NAME | DESCRIPTION |
| --- | --- |
| ChatGPT (OpenAI) | No. 2 |
| Claude (Anthropic) * | v3.5 is the new leader, especially for code gen |
| Codestral Mamba | Uses a different (Mamba) architecture to current models; much smaller size |
| Google AI Studio * | Prototype text-based prompts; access v2.0 Gemma |
| Gemma (Google) | v1.5 models; ability to analyse/create images |
| GroqAI * | AI playground with fast inference |
| Humata (Tilda) | Model focused on the person; natural and realistic text |
| HuggingFace Chat * | Playground allowing access to a wide range of hosted models |
| HuggingFace Spaces: Whisper Web | Voice transcription from mic, file, or URL |
| Perplexity (Perplexity) | Solid AI offering |
| Pi (Inflection AI) | Interesting and original text analysis ability |
| Mistral AI (Mistral) | Very competent model |

Coding Assistants

Standalone tools, IDE extensions (code autocompletion), etc.

| SYSTEM NAME | DESCRIPTION |
| --- | --- |
| Aider | A Tier. Terminal CLI; pair program with LLMs; builds a tree map of the repo to improve AI responses |
| Amazon Q Developer | Amazon's developer assistant; use in VSCodium and derivatives |
| CodeGPT | VSCode extension that enables coding with AI |
| CodiumAI | Spider-Man 👇 pointing meme. Different companies, very similarly named |
| Codeium | Spider-Man 👆 pointing meme. Different companies, very similarly named |
| Git Co-Pilot | By GitHub and Microsoft |
| Claude Dev | A Tier |
| Cody | By Sourcegraph |
| Continue Dev * | A Tier. IDE extension. Use local LLMs (e.g. via Ollama) and remote LLMs |
| Cursor IDE * | Inbuilt AI for autocomplete; access external AI via API, including Ollama local models |
| Devika | - |
| Gemini UI-to-Code | Streamlit app to convert images of UI designs into code |
| Google Code Transformer | Very competent model; free access level is generous |
| Omni Engineer | - |
| Pear AI | IDE with built-in AI interaction |
| RapidPages | - |
| Sourcery | Python, JS, and TS AI code assistant; free for public open source |
| Tabby | Self-hosted AI coding assistant (OSS) |
| Tabnine * | Independent company, one of the first. Free autocompletion functionality for public repos |
| Twinny * | A Tier. Private code-completion plugin for VSCode; only uses locally hosted LLMs (OSS) |
| Vanna | Python RAG (Retrieval-Augmented Generation) framework for SQL generation (OSS) |
| Zed AI | - |

Application Development Platforms with LLM

Locally Hosted

| SYSTEM NAME | DESCRIPTION |
| --- | --- |
| Agenta | LLM development platform; Docker-hostable |
| AgentScope | Build multi-agent applications (OSS) |
| ChainForge * | LLM prompt engineering tool (OSS) |
| CodeAct Agent | Coding agent; uses Mistral |
| Cognita | RAG framework for building modular applications (OSS) |
| CoPilotKit | Incorporate AI into custom applications (OSS) |
| CrewAI | Multi-agent automation |
| DSpy | Python framework for algorithmically optimizing LM prompts and weights |
| Flowise | Low-code LLM application builder (OSS) |
| GorillaLLM | API 'link list' for LLMs and agents to access as 'function calls' |
| GPTCache | Semantic cache for LLMs |
| LAgent | Lightweight framework for LLM-based agent development |
| Ollama Grid Search | Desktop app supporting evaluation of LLM models, prompts, and inferencing |
| OpenAOE | Chat with multiple LLMs at the same time, aka LLM group chat |
| OpenDevin | Autonomous app-dev agent |
| OpenPrompt | Python prompt-learning library |
| Promptflow | Dev tools for E2E creation of LLM-based AI apps |
| PromptFoo | Testing system (incl. CI/CD triggers) for the LLM evaluation process (OSS) |
| Phidata | Framework for building AI agents with memory storage and contextual knowledge |
| Tasking AI | LLM application development |

3rd Party Hosted

| SYSTEM NAME | DESCRIPTION |
| --- | --- |
| Amazon Bedrock | Build with foundation models |
| Dify.ai | LLM app dev platform with RAG, agents, and LLMOps processes |
| Klu | Collaborative prompt engineering |
| Lambda AI Stack | AI training environments; managed upgrades for PyTorch®, TensorFlow, CUDA, cuDNN, etc. |
| LangChain | A framework to easily construct LLM-powered apps |
| LangSmith | Build LLM-powered applications |
| Lightning AI Studio | Run AI on external resources; zero setup; 22 free GPU hours/month |
| LLMOPS | Evaluate LLM output |
| Nebius | - |
| Promptchainer | Visual flow builder for AI flow creation; chain prompts |
| Promptmetheus.com | - |
| Promptlayer | Manage prompts, evaluate models, and oversee chat usage |
| Release.ai | AI development and deployment platform for private AI apps |
| Restack | - |
| Together AI | - |
| VectorShift | No/low-code builder for AI-focused apps and workflows |

LLM Training and Fine-tuning

| SYSTEM NAME | DESCRIPTION |
| --- | --- |
| Amazon SageMaker | Build, train, and deploy machine learning models at scale |
| Lightning AI Studio | Cloud-hosted end-to-end AI services, from training to LLM app dev. Zero setup; 22 free GPU hours/month |
| NanoGPT | Lightweight system for training and fine-tuning GPTs up to medium size |
| NVIDIA AI Workbench | Free-to-access LLM app development and AI training/tuning |

LLM Infrastructure

System Development (Local Host)

| SYSTEM NAME | DESCRIPTION |
| --- | --- |
| Autogen | Microsoft framework to create multiple agents with LLM access |
| LMDeploy | Deploy and serve LLMs; its test data show it faster than vLLM |
| Ollama Python library | Integrate Python 3.8+ projects with Ollama (see the sketch below this table) |
| RouteLLM | Framework for serving and evaluating LLM routers |
| Semantic text splitter | Python chunking by semantics, for the RAG method |
| Semchunk | Semantic chunking of text (used in the LLM RAG method) |
| Text-splitter | Semantic chunking of text (used in the LLM RAG method) |
| Vector Admin * | Manage datasets within LLM RAG db instances |
| vLLM | High-throughput, memory-efficient inference engine. Journal paper |
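A minimal sketch of the Ollama Python library mentioned above. It assumes the Ollama server is running locally and that `ollama pull llama3.1` has been done; the prompt is only illustrative.

```python
# Minimal sketch: chatting with a locally served model via the Ollama Python library.
import ollama

response = ollama.chat(
    model="llama3.1",
    messages=[{"role": "user", "content": "Summarise what a RAG pipeline does in one sentence."}],
)
print(response["message"]["content"])
```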

Systems Development (Remote Host)

| SYSTEM NAME | DESCRIPTION |
| --- | --- |
| Embedchain | Supports creation of RAG LLM apps; creates embedded data 'chunks' and stores them in a vector db |

Local LLM Server Hosting

| SYSTEM NAME | DESCRIPTION |
| --- | --- |
| Gollama | Dashboard for managing locally hosted Ollama models |
| gpt4all.cpp | Run "Assistant-Tuned Chat-Style LLMs". I have not used it |
| LocalAI | OSS backend |
| llama.cpp | For models from Meta and their fine-tuned derivatives |
| LM Studio * | OSS backend |
| NVIDIA ChatRTX | Run an LLM on a GFX card |
| Oobabooga | OSS backend |
| Ollama * | LLM host backend; as of v0.2 (July 2024) concurrency is enabled: run multiple models in parallel, great for agentic setups (OSS) |
| OpenLLM | Run OSS LLMs with OpenAI-compatible API endpoints (a sketch of calling such an endpoint follows this table) |
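Several of the backends above expose OpenAI-compatible endpoints (OpenLLM does, and Ollama has since mid-2024), so the standard `openai` Python client can be pointed at a local server. A minimal sketch; the base URL, port, and model name are assumptions that depend on which backend you run.

```python
# Minimal sketch: calling a locally hosted, OpenAI-compatible chat endpoint.
# The base_url below is Ollama's default; adjust it (and the model name) for your
# backend. Local servers generally ignore the api_key value.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="not-needed")

reply = client.chat.completions.create(
    model="llama3.1",
    messages=[{"role": "user", "content": "Name one benefit of hosting an LLM locally."}],
)
print(reply.choices[0].message.content)
```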

Remote LLM Server Hosting

| SYSTEM NAME | DESCRIPTION |
| --- | --- |
| Anyscale | AI compute platform |
| Paperspace | AI GPU platform |
| Run Diffusion | AI compute platform for Stable Diffusion (image) models |
| Vast AI | AI compute hosting |

Vector Databases (for RAG)

_Vector databases are required for Retrieval-Augmented Generation (RAG): the methodology uses embedding models to process documents/files, and the resulting embeddings are stored in a vector database (a minimal sketch follows the table below)._

N.B. LLMs using RAG are called retrieval-augmented language models (RALMs).

| SYSTEM NAME | DESCRIPTION |
| --- | --- |
| Chroma Db | Local bare-metal or container (OSS) |
| Pinecone | (OSS) |
| LanceDb | Embedded vector db (OSS); good for local storage by an LLM app |
| Milvus | Easily installed (via pip) vector database (OSS) |
| OpenSearch | Combines vector with traditional lexical and hybrid search, plus analytics (OSS) |
| PgVector | Vector add-on (extension) for PostgreSQL |
| Vespa | Db allows distributed inference, plus organizing vectors, tensors, text, and structured data |
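To make the RAG flow described above concrete, here is a minimal sketch using Chroma from the table above. The collection name and documents are made up, and Chroma's default built-in embedding function is used rather than one of the embedding models listed earlier.

```python
# Minimal RAG-storage sketch with Chroma (chromadb): embed, store, and retrieve documents.
import chromadb

client = chromadb.Client()  # in-memory; use chromadb.PersistentClient(path="...") to persist
collection = client.create_collection("wiki_notes")  # illustrative collection name

# Add documents: Chroma computes and stores embeddings using its default embedding function.
collection.add(
    ids=["doc1", "doc2"],
    documents=[
        "Ollama serves locally hosted LLMs over an HTTP API.",
        "mxbai-embed-large is an embedding model used for RAG.",
    ],
)

# Retrieve the most relevant document for a query; in a full RAG pipeline this
# text would then be passed to the LLM as context for generation.
results = collection.query(query_texts=["Which model creates embeddings?"], n_results=1)
print(results["documents"][0][0])
```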

To-do List

  • OpenRouter - unified interface for leveraging various Large Language Models (LLMs)
  • Add TLDRaw - Converts drawings / images to HTML
  • Add LangFlow - LangChain GUI, designed with react-flow for prototyping AI flows
  • Add LangFuse - LLM engineering platform, Docker-ready (OSS)
  • Add LangGraph - Build language agents as graphs
  • Add Plandex - Terminal-based AI programming engine (OSS)
  • Add IllaBuilder -
  • Add OpenUI
  • Add OpenAgents
  • Add Firecrawl - Convert any website into LLM-ready markdown or structured data
  • Add Draw-a-UI - convert image to code
  • Add [Agentic] - ?
  • Add [Autogen] - ?
  • Add TinyStories. It is a data set, new table?
  • Add Claude-engineer

Completed To-do

  • Add ShellGPT
  • Add GorillaLLM
  • Add Promptmetheus.com
  • Add Lightning AI Studio
  • Add Gollama - used to manage Ollama models more easily.
  • Add Ollama Grid Search - desktop app to evaluate and compare LLM models
  • LocalAI
  • Add NanoGPT
  • Option: Add table for AI related libraries e.g. [semantic-text-splitter]
  • Add Google Code Transformer - ?
  • Add Smith.Langchain.com - ?
  • Add Promptchainer.io - ?
  • Rearchitect the page - new structure and groupings
  • Add n8n as it now added AI Agent capability
  • Add coding / IDE assistant table, reallocate other table entries
  • Add table on local hosting models: Llama3, Mistral, Gemma, and Phi 3
  • Add link to HuggingFace site list
  • Add RAG supporting tools, such as embedded databases: Nomic-embed-text / mxbai-embed-large
  • Add another vector db: Lance