LLM - gusenov/kb GitHub Wiki

Cake is a Rust framework, built on Candle, for distributed inference of large models such as Llama 3.
YouTube / @TheOfficialACM / Understanding the LLM Development Cycle: Building, Training, and Finetuning
middleware.io / What is LLM Observability? By Vivek Tilva
Hackaday.com / An Animated Walkthrough Of How Large Language Models Work
A ChatGPT clone, in 3000 bytes of C, backed by GPT-2 by Nicholas Carlini
- Habr / A ChatGPT clone in 3000 bytes of C, based on GPT-2
Habr / Google introduced Titan: a neural network architecture that could become the new silver bullet for LLMs
- All modern LLMs are built on the transformer architecture: GPT-4o from OpenAI, Gemini from Google, Claude Sonnet from Anthropic, Grok from xAI...
Ollama
- ollama/ollama Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.
darrenburns/elia A snappy, keyboard-centric terminal user interface for interacting with large language models. Chat with ChatGPT, Claude, Llama 3, Phi 3, Mistral, Gemma and more.

MCP (Model Context Protocol)

HuggingFace.com / Learn / MCP Course
The GitHub Blog / What the heck is MCP and why is everyone talking about it? TL;DR: It’s an open standard for connecting LLMs to data and tools.
GitHub / tadata-org/fastapi_mcp A zero-configuration tool for automatically exposing FastAPI endpoints as Model Context Protocol (MCP) tools.
- Turn any FastAPI app into an MCP server!
KDnuggets.com / Building A Simple MCP Server Give your LLMs the extra ability to fetch live stock prices, compare them, and provide historical analysis by implementing tools within the MCP server.
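
As a rough illustration of what "exposing tools" to an LLM means in the links above, here is a minimal, stdlib-only sketch of the JSON-RPC message shapes MCP uses for `tools/list` and `tools/call`. It is not a real MCP server: the initialization handshake, the transport (stdio or HTTP), and most error handling are left out. The `get_price` tool, the `PRICES` table, and the `handle` function are hypothetical names made up for this example, and the prices are sample data rather than live quotes.

```python
import json

# Made-up sample data standing in for a live price feed (assumption).
PRICES = {"ACME": 12.5, "INIT": 40.0}


def get_price(symbol: str) -> str:
    """Hypothetical tool: return the sample price for a ticker symbol."""
    return f"{symbol}: {PRICES[symbol]}"


# Hypothetical tool registry: name -> metadata plus the Python handler.
TOOLS = {
    "get_price": {
        "description": "Return the latest price for a ticker symbol (sample data).",
        "inputSchema": {
            "type": "object",
            "properties": {"symbol": {"type": "string"}},
            "required": ["symbol"],
        },
        "handler": get_price,
    }
}


def _error(req_id, code: int, message: str) -> str:
    """Build a JSON-RPC 2.0 error response."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": req_id, "error": {"code": code, "message": message}}
    )


def handle(raw: str) -> str:
    """Dispatch one JSON-RPC request string and return the response string."""
    req = json.loads(raw)
    req_id = req.get("id")
    method = req.get("method")

    if method == "tools/list":
        # Advertise each tool's name, description, and JSON Schema for its inputs.
        result = {
            "tools": [
                {
                    "name": name,
                    "description": tool["description"],
                    "inputSchema": tool["inputSchema"],
                }
                for name, tool in TOOLS.items()
            ]
        }
    elif method == "tools/call":
        params = req.get("params", {})
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return _error(req_id, -32602, "Unknown tool")
        # Run the tool and wrap its output as a text content item.
        text = tool["handler"](**params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return _error(req_id, -32601, "Method not found")

    return json.dumps({"jsonrpc": "2.0", "id": req_id, "result": result})


if __name__ == "__main__":
    call = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {"name": "get_price", "arguments": {"symbol": "ACME"}},
    }
    print(handle(json.dumps(call)))
```

In practice an SDK or a wrapper such as the fastapi_mcp project linked above would generate the tool list and handle the wire protocol; the sketch only shows why a tool needs a name, a description, and an input schema for a model to call it.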