LLaMA

About

  • Llama (Large Language Model Meta AI)
  • A family of foundation LLMs released by Meta

Models

| Name | Release Date | Parameters (B) | Context Length | Corpus Size | Commercial Use? |
|---|---|---|---|---|---|
| Llama 4 Behemoth | TBD | 288 active / ~2,000 total, 16 experts | TBD | TBD | TBD |
| Llama 4 Scout | Apr 2025 | 17 active / 109 total, 16 experts | 10M (Instruct), 256K (pretrain) | ~40T | Yes |
| Llama 4 Maverick | Apr 2025 | 17 active / 400 total, 128 experts | 1M (Instruct), 256K (pretrain) | ~22T | Yes |
| Llama 3.2 | Sep 2024 | 1, 3, 11, 90 | 128K | >15T | Yes |
| Llama 3.1 | Jul 2024 | 8, 70, 405 | 128K | >15T | Yes |
| Llama 3 | Apr 2024 | 8, 70 | 8K | 15T | Yes |
| Llama 2 | Jul 2023 | 7, 13, 70 | 4K | 2T | Yes |
| Llama | Feb 2023 | 7, 13, 33, 65 | 2K | 1–1.4T | No |
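
All of the commercially licensed releases above are also distributed through Hugging Face. As a minimal usage sketch, assuming the `transformers` and `accelerate` libraries are installed and that you have accepted Meta's license for the gated repository (the model ID below is the Llama 3 8B Instruct checkpoint):

```python
# Minimal sketch: load a Llama checkpoint from Hugging Face and generate text.
# Assumes `pip install transformers accelerate` and a HF token with access
# to the gated meta-llama repo (license accepted on huggingface.co).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # place weights across available GPUs/CPU
)

prompt = "In one sentence, what is a mixture-of-experts model?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The smaller checkpoints in the table (e.g. Llama 3.2 1B and 3B) follow the same pattern and are practical to run on a single consumer GPU.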

Llama 4 Key Features:

  • Mixture-of-Experts (MoE) architecture (Maverick: 128 experts, Scout: 16 experts); see the routing sketch after this list
  • Native multimodal input (text and images)
  • Dramatically increased context length (up to 10 million tokens for Scout, 1 million for Maverick)
  • Pretrained on up to ~40 trillion tokens covering 200 languages (with dedicated fine-tuning for 12 major languages)
  • Designed for both high performance (Maverick) and efficient deployment (Scout fits on a single H100-class GPU when quantized)
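
To make the MoE bullet concrete, here is a self-contained toy sketch of top-1 expert routing in PyTorch. The class name, dimensions, and single-expert routing are assumptions for illustration only and do not reproduce Meta's actual Llama 4 layer (which, per Meta, also routes every token through a shared expert):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    """Toy mixture-of-experts feed-forward layer with top-1 token routing.

    Illustrative only: names and sizes are invented for this sketch and
    do not match Meta's Llama 4 implementation.
    """

    def __init__(self, d_model: int, d_ff: int, n_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_ff),
                nn.SiLU(),
                nn.Linear(d_ff, d_model),
            )
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model). Each token goes to its single
        # highest-scoring expert, so only ~1/n_experts of the FFN
        # parameters are active for any given token.
        gate = F.softmax(self.router(x), dim=-1)   # (n_tokens, n_experts)
        weight, expert_idx = gate.max(dim=-1)      # top-1 expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = expert_idx == e
            if mask.any():
                out[mask] = weight[mask].unsqueeze(1) * expert(x[mask])
        return out

layer = ToyMoELayer(d_model=64, d_ff=256, n_experts=16)
tokens = torch.randn(8, 64)         # 8 tokens
print(layer(tokens).shape)          # torch.Size([8, 64])
```

This is what the "active" vs. "total" parameter counts in the table reflect: Maverick has ~400B parameters in total, but routing of this kind activates only a small subset of experts per token, so roughly 17B parameters do work on any given token.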

See also
