# GPT-J
## About
- A 6-billion-parameter English autoregressive language model.
- Trained on the Pile.
- Released in June 2021.
- Part of the MLPerf Inference benchmark suite.
- Developed by EleutherAI.
## Characteristics
| Model Component | Value |
|---|---|
| Number of Layers | 28 |
| Model Dimension | 4096 |
| Feedforward Dimension | 16384 |
| Number of Heads | 16 |
| Head Dimension | 256 |
| Rotary Position Embedding (RoPE) | 64 dimensions per head |
| Tokenization Vocabulary | 50257 |
| BPEs | Same as GPT-2/GPT-3 |