LLM - AshokBhat/ml GitHub Wiki
## Large language model (LLM)

- Excels at a wide range of language tasks
- Parameter count on the order of billions or more
- The more resources (training data, parameters, compute) devoted to them, the more capable they become
## Abilities

- Captures much of the syntax and semantics of human language
- Holds considerable general knowledge about the world
- Able to "memorize" a great number of facts during training
## Leading Open-Source LLMs (August 2025)
| Model Name | Company | Release Date | Parameters (B) | License | Commercial Use | Specialization / Notes |
|---|---|---|---|---|---|---|
| GPT-OSS-120B | OpenAI | Aug 2025 | 117 total / 5.1 active | Apache 2.0 | Yes | Reasoning, tool use |
| GPT-OSS-20B | OpenAI | Aug 2025 | 21 total / 3.6 active | Apache 2.0 | Yes | Edge deployment |
| Qwen3-235B-A22B | Alibaba | Aug 2025 | 235 total / 22 active | Apache 2.0 | Yes | Multilingual, generalist |
| DeepSeek R1 | DeepSeek | Jan 2025 | 671 total / 37 active | MIT | Yes | Reasoning, math, code |
| LLaMA 3.3 | Meta | Jul 2025 | 70 | Meta | Yes (limited) | Instruction tuning |
| Phi-3 Series | Microsoft | 2025 | 3.8 / 7 / 14 | MIT | Yes | Edge, mobile AI |
| Kimi K2 | Moonshot AI | Aug 2025 | 1,000 total / 32 active | MIT (mod.) | Yes | Coding, agentic tasks |
| Mixtral 8x7B | Mistral AI | 2025 | 56 total / 7 active | Apache 2.0 | Yes | Efficient MoE design |
| Gemma 2.0 Flash | Google | 2025 | Not stated | Apache 2.0 | Yes | Multimodal, fast |
| Doubao-1.5-Pro | ByteDance | Jan 2025 | Not stated | Apache 2.0 | Yes | Reasoning, Chinese NLP |
| MPT Series | MosaicML | 2023–2025 | 7–30 | Apache 2.0 | Yes | Enterprise-ready |
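The "Parameters (B)" column lists both total and active parameters because several of these models use a Mixture-of-Experts (MoE) design: only a subset of expert sub-networks runs for each token, so per-token compute is far below what the total parameter count suggests. A minimal Python sketch of that ratio, using figures from the table above:

```python
# Total vs. active parameters for some MoE models from the table above.
# All figures are in billions of parameters.
moe_models = {
    "GPT-OSS-120B": (117, 5.1),
    "Qwen3-235B-A22B": (235, 22),
    "DeepSeek R1": (671, 37),
}

for name, (total_b, active_b) in moe_models.items():
    # Only this fraction of the weights participates in any single forward pass.
    fraction = active_b / total_b
    print(f"{name}: {active_b}B of {total_b}B parameters active per token ({fraction:.1%})")
```

In each case well under 10% of the weights are active per token, which is why a 671B-parameter model like DeepSeek R1 can run with roughly the per-token compute of a ~37B dense model.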
## Questions
- What are large language models?
- How are large language models trained?
- What are the different tasks that large language models can be used for?
- What are the advantages and disadvantages of large language models?
## See also

- [[GPT]]
- [[ChatGPT]]
- [[LLAMA]]