Model Architecture - HanjieChen/Reading-List GitHub Wiki
State Space Models (SSMs)
- Long Range Arena: A Benchmark for Efficient Transformers
- HiPPO: Recurrent Memory with Optimal Polynomial Projections
- Combining Recurrent, Convolutional, and Continuous-time Models with Structured Learned Linear State-Space Layers
- Efficiently Modeling Long Sequences with Structured State Spaces - S4
- The Annotated S4 - hands-on S4 implementation walkthrough from Sasha Rush
- Simplified State Space Layers for Sequence Modeling
- Hungry Hungry Hippos: Towards Language Modeling with State Space Models - SSM Architecture
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces
- MambaByte: Token-free Selective State Space Model
Backpack Language Models
- Backpack Language Models - ACL 2023 outstanding paper
- Character-level Chinese Backpack Language Models
- Model Editing with Canonical Examples