AI Video Prediction

Transformers for Video Prediction

PredFormer - Video prediction transformers without recurrence or convolution
VPTR: Efficient Transformers for Video Prediction - GitHub

Papers

A Survey of Transformers in Video Prediction - W. Ji, 2023
Video Transformers: A Survey - Selva et al., 2023: "We delve into how videos are handled at the input level first. Then, we study the architectural changes made to deal with video more efficiently, reduce redundancy, re-introduce useful inductive biases, and capture long-term temporal dynamics. In addition, we provide an overview of different training regimes and explore effective self-supervised learning strategies for video. Finally, we conduct a performance comparison on the most common benchmark for Video Transformers (i.e., action classification), finding them to outperform 3D ConvNets even with less computational complexity."
A Comprehensive Survey of Recent Transformers in Image, Video and Diffusion Models - Le et al., 2024
Survey: Transformer-based Models in Data Modality Conversion - Rashno et al., 2024