Large language model (LLM)

  • Excels at a wide range of tasks
  • Parameter count on the order of billions or more (a rough memory estimate is sketched below)
  • The more resources (training data, parameters, compute) devoted to them, the more capable they become
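
Parameter count translates directly into memory: every parameter has to be stored at some numeric precision. A minimal back-of-the-envelope sketch in Python (weights only; activations and the KV cache are ignored, and the helper name is purely illustrative):

```python
# Back-of-the-envelope memory estimate for holding LLM weights.
# Assumption: weights dominate; activations and KV cache are ignored.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

def weight_memory_gib(num_params_billions: float, dtype: str = "fp16") -> float:
    """Approximate GiB needed just to store the model weights."""
    total_bytes = num_params_billions * 1e9 * BYTES_PER_PARAM[dtype]
    return total_bytes / (1024 ** 3)

# A 70B-parameter model in fp16 needs roughly 130 GiB for weights alone.
print(f"{weight_memory_gib(70, 'fp16'):.0f} GiB")
```

This is why larger models are quantized (int8, int4) or run as mixture-of-experts, where only a fraction of the parameters is active per token.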

Abilities

  • Capture much of the syntax and semantics of human language
  • Considerable general knowledge about the world
  • Able to "memorize" a great number of facts during training

Leading Open-Source LLMs (August 2025)

| Model Name | Company | Release Date | Parameters (total / active) | License | Commercial Use | Specialization / Notes |
|---|---|---|---|---|---|---|
| GPT-OSS-120B | OpenAI | Aug 2025 | 117B total / 5.1B active | Apache 2.0 | Yes | Reasoning, tool use |
| GPT-OSS-20B | OpenAI | Aug 2025 | 21B total / 3.6B active | Apache 2.0 | Yes | Edge deployment |
| Qwen3-235B-A22B | Alibaba | Aug 2025 | 235B total / 22B active | Apache 2.0 | Yes | Multilingual, generalist |
| DeepSeek R1 | DeepSeek | Jan 2025 | 671B total / 37B active | MIT | Yes | Reasoning, math, code |
| LLaMA 3.3 | Meta | Dec 2024 | 70B | Meta Llama license | Yes (limited) | Instruction tuning |
| Phi-3 Series | Microsoft | 2024 | 3.8B / 7B / 14B | MIT | Yes | Edge, mobile AI |
| Kimi K2 | Moonshot AI | Aug 2025 | 1T total / 32B active | Modified MIT | Yes | Coding, agentic tasks |
| Mixtral 8x7B | Mistral AI | Dec 2023 | 56B total / 7B active | Apache 2.0 | Yes | Efficient MoE design |
| Gemma 2.0 Flash | Google | 2025 | Not stated | Apache 2.0 | Yes | Multimodal, fast |
| Doubao-1.5-Pro | ByteDance | Jan 2025 | Not stated | Apache 2.0 | Yes | Reasoning, Chinese NLP |
| MPT Series | MosaicML | 2023 | 7B–30B | Apache 2.0 | Yes | Enterprise-ready |
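
Open-weight models like these can be run locally. A minimal sketch using the Hugging Face `transformers` pipeline (assumes `transformers` and PyTorch are installed, sufficient memory is available, and that the model id below is the correct Hub id for GPT-OSS-20B):

```python
# Minimal sketch: run an open-weight LLM locally with Hugging Face transformers.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",  # assumed Hugging Face Hub id
    device_map="auto",           # spread weights across available devices
)

out = generator("Explain what a mixture-of-experts model is.", max_new_tokens=128)
print(out[0]["generated_text"])
```

Smaller models (e.g. the Phi-3 series) fit on a single consumer GPU or even a laptop, while MoE models with large total parameter counts need multiple accelerators despite their smaller active parameter counts.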

Questions

  • What are large language models?
  • How are large language models trained?
  • What are the different tasks that large language models can be used for?
  • What are the advantages and disadvantages of large language models?

See also

  • [[GPT]]
  • [[ChatGPT]]
  • [[LLaMA]]