Lists on AI & LLM Tech - GRibbans/Gribbans GitHub Wiki

This wiki page collects useful links on generative AI technology, systems, tools, and research and analysis.

Unless stated otherwise, everything listed is open-source software (OSS/FOSS).

'*' marks those I use and would recommend taking a look at.

Table of Contents
AI Models
LLM Leader-boards
- Table of LLM Models
End-User Applications (LLM Interaction)
Application Development Platforms with LLM
- Locally Hosted
- 3rd Party Hosted
LLM Training and Fine-tuning
LLM Infrastructure
To-do List
- Completed To-do

AI Models

LLM Leader-boards

Aider Leader-board: Aider is a well-regarded code assistant; the ranking table focuses on this use case.
LMSYS Leader-board: Ranking is based on blind side-by-side human evaluation. It is therefore a good ranking of purely human opinion, and so intrinsically covers many aspects that we value in an AI model.
Bigcode Bench: Evaluates LLMs with practical and challenging programming tasks.

Table of LLM Models

Below is a list of models which I have come across that have been noted by others (usually from Huggingface or Reddit). All LLM models below have been released to the public, and can be hosted locally.

Embedding models are specialist AI models dedicated to creating embeddings. They are used with vector databases and the RAG process.

LLM NAME	PARAMETER	FILE SIZE	CONTEXT SIZE	RELEASE DATE	DESCRIPTION
Granite Code 3B	3B	2.0GB	2K	May-24	By IBM. Best for code completion tasks
Granite	8B	4.6GB	8.192K	May-24	By IBM. For code generation, code explanation, code fixing etc.
Llama3.1	8B	4.7GB	128K	Jul-24	By Meta. 70B (40GB) and 405B (231GB) also available
Llama3 Instruct	8B	4.7GB	-	-	By Meta. 8B is the budget model from its latest release
Mistral 7B	7.3B	4.1GB	4k	Sep-23	General purpose AI
Mistral NeMo	12B	7.1GB	128K	Jul-24	General purpose AI
Openhermes v2.5	7.24B	4.1GB	8.192K	Feb-23	Fine-tuned by Teknium on Mistral with fully open datasets
Phi 3.5	3.82B	2.2GB	128K	Aug-24	By Microsoft. Very safe. Lightweight, state-of-the-art open model
Phi 3.1 Mini 128K	3.8B	1.8GB	128K	Jul-24	By Microsoft. Additional post-training data gives massive improvements across a range of benchmarks
~~Phi 3 Mini 4K~~	3.8B	2.3GB	4K	-	By Microsoft. Update to the Phi 3 model.
~~Phi 3 Small~~	3.8B	2.3GB	-	-	Microsoft's new small but powerful model (equiv. GPT 3.5)
Qwen2	7B / 1.5B	4.4GB	-	July-24	Range of models. 1.5B model suitable for IDE code completion
Wizard LM2	7B	4.1GB	-	May-24	By Microsoft. Previous fastest model (worst reasoning in WLM2 range)
Yi Coder	8.83B	5.0GB	128K	Sep-24	Current no.1, state-of-the-art coding performance with fewer than 10B parameters
Yi v1.5	6B	3.5GB	-	May-24	OSS LLM from 01.AI, it was trained on 3 trillion tokens of data
:--------------------------------------------------------------------	---------	---------	------------	------------	:-----------------------------------------------------------------------------
Autocoder	6.7B	7.2GB	-	-	Its highest test score is for coding in: Python
CodeGeex4	9B	5.5GB	128k	Jul-24	Currently the highest-scoring sub-10B-parameter model for coding
CodeGemma 2B	2B	1.6GB	8K	May-24	By Google. 2B model is ideal for IDE code auto-completion
Codegemma Instruct	7B	5.0GB	-	-	-
CodeQwen 2.0	7B	4.4GB	-	-	Its highest test score is for coding in: JS
Codestral	22B	13GB	-	-	Very good for Python. Mistral's first model focused on code generation. N.B. needs a GPU
Deepseek v2.5	236B	N/A	128K	Sept-24	NOT LOCAL HOST. Best value, low cost and top tier for coding
~~DeepSeek-Coder-V2.1(0724)~~	16B	8.9GB	128k	June-24	Aider.chat ranked it the second-best model for coding-related tasks at release
~~Deepseek-coder v2~~	16B	8.9GB	-	July-24	Now former best coding model (July 2024), requires a large capacity GPU
~~Deepseek-coder~~	6.7B	3.8GB	-	-	-
:--------------------------------------------------------------------	---------	---------	------------	------------	:-----------------------------------------------------------------------------
MiniCPM v2.6	8B	5.5GB	N/A	Aug-24	Vision model. Multimodal LLM (MLLM) designed for vision-language understanding
:--------------------------------------------------------------------	---------	---------	------------	------------	:-----------------------------------------------------------------------------
All-minilm	22M	45MB	-	-	Embedding (RAG) model
Mxbai-embed-large	335M	669MB	-	-	Embedding (RAG) model
Nomic-embed-text	137M	274MB	-	-	Embedding (RAG) model

N.B. mxbai-embed-large needs a custom prompt prefix on search queries, e.g. 'Represent this sentence for searching relevant passages: A man is eating a piece of bread'.
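
A minimal sketch of how the prefix in the note above might be applied before a query is sent to the embedding model. The helper names `prepare_query` and `prepare_document` are hypothetical, and the choice to prefix only queries (not the stored documents) is an assumption based on the wording of the prompt itself; check the model card before relying on it.

```python
# Prefix quoted in the note above. Assumption: it applies to search queries only.
MXBAI_QUERY_PREFIX = "Represent this sentence for searching relevant passages: "


def prepare_query(query: str) -> str:
    """Return the query text with the retrieval prefix prepended (hypothetical helper)."""
    return MXBAI_QUERY_PREFIX + query.strip()


def prepare_document(text: str) -> str:
    """Documents are assumed to be embedded as-is, without the prefix."""
    return text.strip()


if __name__ == "__main__":
    # The returned string is what would be passed to the embedding model.
    print(prepare_query("A man is eating a piece of bread"))
```

The prepared strings would then be handed to whichever client is used to call the embedding model.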

End-User Applications (LLM Interaction)

Desktop

SYSTEM NAME	DESCRIPTION
AI Chat	CLI tool featuring Chat-REPL, Shell Assistant, RAG, AI tools & agents
AnythingLLM *	Desktop UI with RAG capabilities for local LLM
ChatGPTerminator	-
Danswer AI	Self hostable AI assistant (OSS)
Fabric	AI assistant with crowd sourced prompt patterns
Kotaemon	RAG-based (load your own documents to embed ready for AI use) chat app. Supports local LLM.
Msty	Top tier. Desktop app for LLM chat, add documents, transcribe audio to text.
PromptMixer *	Desktop UI with decent prompt text management
ShellGPT	Add LLM access and use within the terminal
Verba	Desktop UI with RAG capabilities to local LLM chat

Web App

SYSTEM NAME	DESCRIPTION
ChatGPT (OpenAI)	No.2
Claude (Anthropic) *	v3.5 is the new leader, especially for code gen.
Codestral Mamba	Model uses a different technique from current models; much smaller size
Google AI Studio *	Prototype text-based prompts, access v.2.0 Gemma
Gemma (Google)	v1.5 models, ability to analyse / create images.
GroqAI *	AI Playground with fast inference.
Humata (Tilda)	Model focused on the person, natural and realistic text.
HuggingFace Chat *	Playground allowing access to a wide range of hosted models.
HuggingFace Spaces: Whisper Web	Voice transcription from mic, file or url
Perplexity (Perplexity)	Solid AI offering
Pi (Inflection AI)	Interesting and original text analysis ability
Mistral AI (Mistral)	Very competent model

Coding Assistants

Standalone tools, IDE extensions (code autocompletion), etc.

SYSTEM NAME	DESCRIPTION
Aider	A Tier. Terminal CLI, pair program with LLMs, builds a tree map of the repo to improve AI responses
Amazon Q Developer	Amazon's developer assistant, use in VSCodium and derivatives
CodeGPT	VSCode extension for coding with AI
CodiumAI	Spidermen 👇 pointing meme. Different companies, very similarly named
Codeium	Spidermen 👆 pointing meme. Different companies, very similarly named
GitHub Copilot	By GitHub and Microsoft
Claude Dev	A Tier
Cody	by Sourcegraph
Continue Dev *	A Tier. IDE extension. Use local LLM e.g. Ollama, and remote LLMs
Cursor IDE *	Built-in AI for autocomplete; access external AI via API, including Ollama local models
Devika
Gemini UI-to-Code	Streamlit app to convert images of UI designs into code
Google Code Transformer	Very competent model, free access level is generous
Omni Engineer
Pear AI	IDE with built-in AI interaction
RapidPages	-
Sourcery	Python, JS and TS AI code assistant, free for public open source
Tabby	Self-hosted AI coding assistant (OSS)
Tabnine *	Independent company, one of the first. Free autocompletion functionality for public repos
Twinny *	A Tier. Private, code-completion plugin for VSCode, only uses local hosted LLMs (OSS)
Vanna	Python RAG (Retrieval-Augmented Generation) framework for SQL generation (OSS)
Zed AI

Application Development Platforms with LLM

Locally Hosted

SYSTEM NAME	DESCRIPTION
Agenta	LLM development platform, docker hostable
AgentScope	Build multi-agent applications (OSS)
ChainForge *	LLM prompt engineering tool (OSS)
CodeAct Agent	Coding agent, uses Mistral
Cognita	RAG framework for building modular applications (OSS)
CoPilotKit	Incorporate AI into custom applications (OSS)
CrewAI	Multi-agent automation
DSpy	Python framework for algorithmically optimizing LM prompts and weights
Flowise	Low code LLM application builder (OSS)
GorillaLLM	API 'link list' for LLM and Agents to access as 'function calls'
GPTCache	Semantic cache for LLMs
LAgent	Lightweight framework for LLM-based agent development
Ollama Grid Search	Desktop app supporting LLM evaluation of models, prompts, inferencing
OpenAOE	Chat with multiple LLMs at the same time aka LLM group chat
OpenDevin	Autonomous app.dev agent
OpenPrompt	Add Python library
Promptflow	Dev tools for E2E creation of LLM-based AI apps
PromptFoo	Testing system for LLM evaluation, including runs triggered in CI/CD (OSS)
Phidata	Framework for building AI agents with memory storage and contextual knowledge
Tasking AI	LLM application development

3rd Party Hosted

SYSTEM NAME	DESCRIPTION
Amazon Bedrock	Build with foundation models
Dify.ai	LLM app dev platform with RAG, Agents, and LLMOps process
Klu	Collaborative prompt engineering
Lambda AI Stack	AI training envs, managed upgrades for: PyTorch®, TensorFlow, CUDA, cuDNN etc.
LangChain	A framework to easily construct LLM‑powered apps.
LangSmith	Build LLM powered applications
Lightning AI Studio	Run AI from external resource, zero setup, 22 Free GPU hours/month
LLMOPS	Evaluate LLM output
Nebius	-
Promptchainer	Visual flow builder for AI flow creation, chain prompts
Promptmetheus.com	-
Promptlayer	Manage prompts, evaluate models and oversight of chat usage
Release.ai	AI Development and Deployment platform for private AI apps
Restack	-
Together AI	-
VectorShift	No-Low Code to build AI focused apps and workflows

LLM Training and Fine-tuning

SYSTEM NAME	DESCRIPTION
Amazon SageMaker	Build, train, and deploy machine learning models at scale
Lightning AI Studio	Cloud hosted e2e AI services, training to LLMApp dev. Zero setup, 22 Free GPU hours/month
NanoGPT	Lightweight system for training and fine-tuning up to medium size GPT
NVIDIA AI Workbench	Free to access LLMApp development, AI training/tuning

LLM Infrastructure

System Development (Local Host)

SYSTEM NAME	DESCRIPTION
Autogen	Microsoft framework to create multiple agents with LLM access
LMDeploy	Deploy and serve LLMs. Test data shows it to be faster than vLLM
Ollama Python library	Integrate Python 3.8+ projects with Ollama
RouteLLM	Framework for serving and evaluating LLM routers
Semantic text splitter	Python chunking by semantics, for RAG method
Semchunk	Semantic chunking of text (used in LLM RAG method)
Text-splitter	Semantic chunking of text (used in LLM RAG method)
Vector Admin *	Manage datasets within LLM RAG db instances
vLLM	High-throughput, memory-efficient inference engine. Journal Paper
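
The chunking tools in the table above prepare text for RAG by splitting it into pieces small enough to embed. The sketch below is a simplified stand-in for that idea, not the algorithm of any listed library: it splits on paragraph and sentence boundaries and packs sentences into chunks up to a size limit. The function name `chunk_text` and the `max_chars` parameter are hypothetical.

```python
import re


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Pack whole sentences into chunks of at most max_chars characters.

    Chunks never cross a paragraph boundary. A single sentence longer
    than max_chars is kept whole rather than cut mid-sentence.
    """
    chunks = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        current = ""
        # Split after sentence-ending punctuation followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
            if not sentence:
                continue
            candidate = f"{current} {sentence}".strip()
            if current and len(candidate) > max_chars:
                chunks.append(current)
                current = sentence
            else:
                current = candidate
        if current:
            chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "One two. Three four.\n\nFive six."
    for chunk in chunk_text(sample, max_chars=12):
        print(repr(chunk))
```

Real semantic chunkers typically measure size in tokens rather than characters and use richer boundary detection, so treat this only as an illustration of the packing step.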

Systems Development (Remote Host)

SYSTEM NAME	DESCRIPTION
Embedchain	Supports creation of RAG LLMApp. Create embedded data 'chunking', stores in vector db

Local LLM Server Hosting

SYSTEM NAME	DESCRIPTION
Gollama	Dashboard for managing local hosted Ollama models
gpt4all.cpp	Run "Assistant-Tuned Chat-Style LLM". I have not used it.
LocalAI	OSS backend
llama.cpp	For models coming from Meta, and its fine-tuned derivatives.
LM Studio *	OSS backend
NVIDIA ChatRTX	LLM on a GFX card
Oobabooga	OSS backend
Ollama *	LLM host backend, as of v0.2 (July) concurrency is enabled! Run multiple models in parallel, great for agentic setup (OSS)
OpenLLM	Run OSS LLM, OpenAI-compatible API endpoints

Remote LLM Server Hosting

SYSTEM NAME	DESCRIPTION
Anyscale	AI compute platform
Paperspace	AI GPU platform
Run Diffusion	AI compute platform for Stable Diffusion (image) models
Vast AI	AI compute hosting

Vector Databases (for RAG)

Vector databases are required for Retrieval-Augmented Generation (RAG). This methodology uses embedding models to process documents/files, with the resulting embeddings stored in vector databases.

N.B. LLMs using RAG are called retrieval-augmented language models (RALMs).
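
To make the RAG flow described above concrete, here is a toy sketch in pure Python. It is not how any of the databases below are implemented: `toy_embed` is a bag-of-words stand-in for a real embedding model, and `ToyVectorStore` and `build_prompt` are hypothetical names. It shows the three steps: embed and store documents, retrieve the closest match for a question by cosine similarity, and place it in the prompt.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """In-memory list of (vector, text) pairs with brute-force search."""

    def __init__(self) -> None:
        self.items = []

    def add(self, text: str) -> None:
        self.items.append((toy_embed(text), text))

    def query(self, question: str, k: int = 1) -> list:
        q = toy_embed(question)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(question: str, store: ToyVectorStore, k: int = 1) -> str:
    """Prepend the retrieved passages to the question (the 'augmented' step)."""
    context = "\n".join(store.query(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    store = ToyVectorStore()
    store.add("ollama runs local llm models")
    store.add("postgresql is a relational database")
    print(build_prompt("which tool runs local models", store))
```

A real setup would swap `toy_embed` for one of the embedding models listed earlier and `ToyVectorStore` for one of the databases in the table below.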

SYSTEM NAME	DESCRIPTION
Chroma Db	Local bare-metal or container (OSS)
Pinecone	Managed cloud vector database (proprietary, not OSS)
LanceDb	Embedded vector db (OSS), good for local storage by a LLMApp
Milvus	Easily installed (via pip) vector database (OSS)
OpenSearch	Combines vector with traditional lexical, and hybrid search and analytic (OSS)
PgVector	Vector model addon to PostgreSQL
Vespa	Db allows distributed inference, plus organize vectors, tensors, text, and structured data.

To-do List

Add OpenRouter - unified interface for leveraging various Large Language Models (LLMs)
Add TLDRaw - Converts drawings / images to HTML
Add LangFlow - LangChain GUI, designed with react-flow for prototyping AI flows
Add LangFuse - LLM engineering platform, Docker-ready (OSS)
Add LangGraph - Build language agents as graphs
Add Plandex - Terminal-based AI programming engine (OSS)
Add IllaBuilder -
Add OpenUI
Add OpenAgents
Add Firecrawl - Convert any website into LLM-ready markdown or structured data
Add Draw-a-UI - convert image to code
Add [Agentic] - ?
Add [Autogen] - ?
Add TinyStories. It is a data set, new table?
Add Claude-engineer

Completed To-do

~~Add ShellGPT~~
~~Add GorillaLLM~~
~~Add Promptmetheus.com~~
~~Add Lightning AI Studio~~
~~Add Gollama - used to manage Ollama models more easily.~~
~~Add Ollama Grid Search - desktop app to evaluate and compare LLM models~~
~~LocalAI~~
~~Add NanoGPT~~
~~Option: Add table for AI related libraries e.g. [semantic-text-splitter]~~
~~Add Google Code Transformer - ?~~
~~Add Smith.Langchain.com - ?~~
~~Add Promptchainer.io - ?~~
~~Rearchitect the page - new structure and groupings~~
~~Add n8n as it has now added AI Agent capability~~
~~Add coding / IDE assistant table, reallocate other table entries~~
~~Add table on local hosting models: Llama3, Mistral, Gemma, and Phi 3~~
~~Add link to HuggingFace site list~~
~~Add RAG supporting tools, such as embedded databases: Nomic-embed-text / mxbai-embed-large~~
~~Add another vector db: Lance~~