Links: Transformer Networks - touretzkyds/ai4k12 GitHub Wiki
Transformer networks are deep neural networks now widely used in natural language processing, including handling search queries, question answering, image captioning, and translating between languages.
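At the core of every Transformer layer is self-attention, which lets each token weigh every other token when computing its representation. Below is a minimal NumPy sketch of scaled dot-product self-attention; it is illustrative only (real models add multiple heads, residual connections, layer normalization, and feed-forward sublayers), and the matrix shapes are chosen arbitrarily for the example.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X.

    X:          (seq_len, d_model) input token embeddings
    Wq, Wk, Wv: (d_model, d_k) learned projection matrices
    """
    Q = X @ Wq                       # queries
    K = X @ Wk                       # keys
    V = X @ Wv                       # values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of every token pair
    # Softmax over each row: attention weights sum to 1 per token.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V               # each output mixes all value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))          # 5 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)                     # (5, 4): one 4-dim output per token
```

The tutorials linked below walk through this same computation in much more detail, including how multiple attention heads run in parallel.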
Introductory Tutorials on Transformers
- video (9:10) and text: Transformers Explained: Understand the Model Behind GPT, BERT, and T5
- Transformer: A Novel Neural Network Architecture for Language Understanding, Google AI blog. Very accessible introduction.
- Language Processing with BERT: The 3 Minute Intro (Deep learning for NLP)
- How Transformers Work in Deep Learning and NLP
- Getting Meaning From Text
- A deep dive into BERT: How BERT launched a rocket into natural language understanding
More Technical Tutorials
- Transformers From Scratch (Rohrer)
- Transformers From Scratch (Bloem)
- The Annotated Transformer
- Transformer model for language understanding (TensorFlow tutorial on language translation)
- Language modeling with nn.Transformer and TorchText (PyTorch tutorial)
Technical Videos on Transformers
- The Narrated Transformer Language Model
- Tensor2Tensor Transformers
- GPT-3: Language Models are Few-Shot Learners (Paper Explained) (1:04:29)
- A Visual Guide to Transformer Neural Networks (series)
- Rasa Algorithm Whiteboard - Transformers & Attention 1: Self Attention
Question Answering Demos Using Transformers
- Google BERT demo [direct link]
- ML4K BERT Q&A model [direct link]
Text Generation Demos Using Transformers
- Talk to Transformer [direct link]
- TextSynth [direct link]
Important Papers
- Attention Is All You Need, Vaswani et al. 2017.
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Devlin et al. 2019.
- Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, Wu et al. 2016.
Capabilities of Large Language Models
- Google's AI Is Something Even Stranger Than Conscious, Stephen Marche, The Atlantic, June 19, 2022
- How Does ChatGPT Work? Tracing the Evolution of AIGC, DTonomy, December 31, 2022
Other Resources
- Simple Transformer Language Model (Python notebook in Colab)
- SQuAD: Stanford Question Answering Dataset used to train some BERT models