EleutherAI
About
- A grass-roots non-profit AI research group
- Often described as an open-source counterpart to OpenAI
- Formed in July 2020 to organize a replication of GPT-3
Pile
- A curated dataset of diverse text for training large language models (LLMs)
- An 825 GiB English text corpus
- Drawn from 22 different sources
GPT-J
- 6-billion-parameter model
- Open source
- English autoregressive language model
- Trained on the Pile
- Released in June 2021
- Used as a workload in the MLPerf Inference benchmark
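Since GPT-J's weights are openly available, it can be run locally. A minimal sketch using the Hugging Face `transformers` library, which hosts the checkpoint (the repo id `EleutherAI/gpt-j-6b` and the default generation settings below are assumptions, not part of this wiki page):

```python
# Minimal sketch: text generation with GPT-J-6B via Hugging Face
# transformers. Assumes the checkpoint repo id "EleutherAI/gpt-j-6b".
MODEL_ID = "EleutherAI/gpt-j-6b"

def generate(prompt: str, max_new_tokens: int = 32) -> str:
    # Lazy import keeps the heavy dependency optional until generation time.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Full-precision weights are large (~24 GB); on constrained hardware,
    # consider a half-precision variant of the checkpoint.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example (downloads the weights on first run):
# print(generate("EleutherAI is"))
```

The first call downloads the full checkpoint, so a GPU or a machine with ample RAM is advisable for interactive use.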