Transformer Book Notes - doraithodla/notes GitHub Wiki
Started 10/3
So what is it about transformers that changed the field almost overnight? Like many great scientific breakthroughs, it was the synthesis of several ideas that were percolating in the research community at the time: attention, transfer learning, and the scaling up of neural networks.
Notes
- RNNs
- Attention Mechanisms
- Self-Attention
- Transfer Learning
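The self-attention item above can be sketched as scaled dot-product attention, the core operation inside a transformer layer. This is a toy NumPy version (the function name, shapes, and numbers are illustrative, not from the book):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Toy scaled dot-product attention: mixes values by query-key similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (seq, seq) similarity scores
    # softmax over the key dimension, shifted for numerical stability
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights      # each output row is a weighted mix of values

# Three tokens with four-dimensional embeddings (illustrative data)
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))

# "Self"-attention: queries, keys, and values all come from the same sequence
out, w = scaled_dot_product_attention(x, x, x)
print(w.sum(axis=-1))  # each row of attention weights sums to 1
```

In a real transformer, Q, K, and V are learned linear projections of x rather than x itself, and several such attention "heads" run in parallel.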
Books
- Hands-On Machine Learning with Scikit-Learn and TensorFlow, by Aurélien Géron (O’Reilly)
- Deep Learning for Coders with fastai and PyTorch, by Jeremy Howard and Sylvain Gugger (O’Reilly)
- Natural Language Processing with PyTorch, by Delip Rao and Brian McMahan (O’Reilly)
- The Hugging Face Course, by the open source team at Hugging Face

Hugging Face Transformers offers several layers of abstraction for using and training transformer models. We’ll start with the easy-to-use pipelines that allow us to pass text examples through the models and investigate the predictions in just a few lines of code. Then we’ll move on to tokenizers, model classes, and the Trainer API, which allow us to train models for our own use cases.
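The "few lines of code" claim about pipelines can be sketched like this (a minimal example; when no model is named, Transformers downloads a default sentiment-analysis checkpoint on first use, so this needs a network connection the first time):

```python
from transformers import pipeline

# Build a text-classification pipeline; with no model argument,
# a default sentiment-analysis checkpoint is fetched on first use.
classifier = pipeline("text-classification")

# Pass raw text in and get predictions back, as the notes describe.
preds = classifier("Transformers changed the field almost overnight!")
print(preds)  # a list of dicts with 'label' and 'score' keys
```

The same `pipeline` entry point covers other tasks (e.g. `"summarization"`, `"translation"`); the tokenizer, model classes, and Trainer API sit underneath this abstraction.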
Online resources
- Google Colaboratory
- Kaggle Notebooks
- Paperspace Gradient Notebooks
Transformer - a network architecture for sequence modeling.
Well-known transformers:
- Generative Pre-trained Transformer (GPT)
- Bidirectional Encoder Representations from Transformers (BERT)
By combining the transformer architecture with unsupervised pretraining, these models removed the need to train task-specific architectures from scratch.
Encoder-Decoder Framework