bibliography_bk.md - hassony2/inria-research-wiki GitHub Wiki

TOC

First person body pose estimation
Hand detection
Object affordances
Object discovery
Motion clustering
Action Recognition

First person body pose estimation

You2Me: Inferring Body Pose in Egocentric Video via First and Second Person Interactions, ArXiv'19 {notes} {paper} {project page} {missing dataset?}

Evonne Ng, Donglai Xiang, Hanbyul Joo, Kristen Grauman

Seeing Invisible Poses: Estimating 3D Body Pose from Egocentric Video, CVPR'17 {notes} {paper} {project page} {code.gz} {dataset.zip}

Hao Jiang, Kristen Grauman

hand-crafted and neural-network feature based ego-pose estimation

Introduces task of estimating full ego-pose when joints are not necessarily visible
Small synchronized 1st + 3rd person dataset

3D ego-pose estimation via imitation learning. ECCV'18 {paper}

Ye Yuan and Kris Kitani

Relevant and looks worth reading!

Also relevant related work for learning in simulation.
Evaluates quantitatively on synthetic sequences

Motion Capture from Body-Mounted Cameras, SIGGRAPH'11 {paper} {project page}

Takaaki Shiratori, Hyun Soo Park, Leonid Sigal, Yaser Sheikh, Jessica Hodgins

SfM from joint-attached wearable cameras

"Uses structure from motion (SfM) to reconstruct the 3D location of 16 body mounted cameras placed on a person’s joints."

Hand detection

Analysis of the hands in egocentric vision: A survey, ArXiv'19 {paper}

Andrea Bandini, Jose Zariffa

Extensive survey with recent list of datasets with hand annotations

Contextual Attention for Hand Detection in the Wild, ICCV'19 {notes} {paper} {project page} {code, Keras}

Supreeth Narasimhaswamy†, Zhengwei Wei†, Yang Wang, Justin Zhang, Minh Hoai

Details

No left-right hands annotations, but general 2D orientation annotation
Most benefits seem to come from introduced datasets

Analysis of Hand Segmentation in the Wild, CVPR'19 {paper} {code, Matlab}

Aisha Urooj Khan, Ali Borji

Pixel-level Hand Detection for Ego-centric Videos, CVPR'13 {paper}

Cheng Li, Kris Kitani

Object affordances

Grounded Human-Object Interaction Hotspots from Video, ICCV'19 {notes} {paper} {project page} {code PyTorch}

Tushar Nagarajan, Christoph Feichtenhofer, Kristen Grauman

Generating 3D People in Scenes without People, ArXiv'19 {paper} {notes}

Yan Zhang, Mohamed Hassan, Heiko Neumann, Michael J. Black, Siyu Tang

Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments, CVPR'19 {paper} {notes}

Xueting Li, Sifei Liu, Kihwan Kim, Xiaolong Wang, Ming-Hsuan Yang, and Jan Kautz

EGO-TOPO: Environment Affordances from Egocentric Video, ArXiv'20 {project page} {paper} {notes}

Tushar Nagarajan, Yanghao Li, Christoph Feichtenhofer, Kristen Grauman

Object manipulation

Learning a Generative Model for Multi-Step Human–Object Interactions from Videos, Eurographics'19, {paper} {video}

He Wang, Sören Pirk, Ersin Yumer, Vladimir G. Kim, Ozan Sener, Srinath Sridhar, and Leonidas J. Guibas

Video-based Hand Manipulation Capture Through Composite Motion Control, SIGGRAPH'13 {paper} {video}

Yangang Wang, Jianyuan Min, Jianjie Zhang, Yebin Liu, Feng Xu, Qionghai Dai, Jinxiang Chai

Object discovery

Object Discovery in Videos as Foreground Motion Clustering, CVPR'19 {paper} {poster} {notes}

Christopher Xie, Yu Xiang, Zaid Harchaoui, Dieter Fox

Next-Active-Object prediction from Egocentric Videos, JVCI'17 {paper} {project page}

Antonino Furnari, Sebastiano Battiato, Kristen Grauman, Giovanni Maria Farinella

Figure-Ground Segmentation Improves Handled Object Recognition in Egocentric Video, CVPR'10 {paper}

Xiaofeng Ren, Chunhui Gu

Manipulation generation

paper

An Example-Based Motion Synthesis Technique for Locomotion and Object Manipulation, SIGGRAPH'12 {paper} {video}

Andrew W. Feng, Yuyu Xu, Ari Shapiro

Task-based Locomotion, TOG'16 {project page} {paper}

Shailen Agrawal, Michiel van de Panne

Data-Driven Animation of Hand-Object Interactions, Conference on Automatic Face and Gesture Recognition'11 {paper} {video}

Henning Hamer, Jürgen Gall, Raquel Urtasun, Luc van Gool

Motion clustering

Convolutional Sequence Generation for Skeleton-Based Action Synthesis, ICCV'19 {paper} {code}

Sijie Yan, Zhizhong Li, Yuanjun Xiong, Dahua Lin

Learning Trajectory Dependencies for Human Motion Prediction, ICCV'19 {paper} {code} {notes}

Wei Mao, Miaomiao Liu, Mathieu Salzmann, Hongdong Li

Details

graph convolutions on DCT (discrete cosine transform) coefficients
Is the DCT representation equivalent to the trajectory? Why? (I think the speed of motion is not abstracted away.)
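
A minimal sketch related to the DCT question in the note above, not the authors' code. It assumes an orthonormal DCT-II applied along time to a single joint coordinate. With all coefficients kept, the transform is invertible and the trajectory (including its timing) is recovered; keeping only the first few coefficients gives a smoothed approximation. The helper names (`dct_matrix`, `dct`, `idct`) and the toy trajectory are made up for illustration.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis, one row per frequency index k."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
                     for t in range(n)])
    return rows

def dct(x):
    """Project a 1D trajectory onto the DCT basis."""
    basis = dct_matrix(len(x))
    return [sum(b * v for b, v in zip(row, x)) for row in basis]

def idct(coeffs, keep=None):
    """Rebuild the trajectory from the first `keep` coefficients (all by default)."""
    n = len(coeffs)
    basis = dct_matrix(n)
    keep = n if keep is None else keep
    return [sum(coeffs[k] * basis[k][t] for k in range(keep)) for t in range(n)]

# Toy trajectory of one joint coordinate over 20 frames (hypothetical data).
trajectory = [math.sin(0.3 * t) + 0.05 * t for t in range(20)]
coeffs = dct(trajectory)

full = idct(coeffs)            # all coefficients: exact up to rounding
smooth = idct(coeffs, keep=5)  # low frequencies only: smoothed version

max_err_full = max(abs(a - b) for a, b in zip(trajectory, full))
max_err_smooth = max(abs(a - b) for a, b in zip(trajectory, smooth))
print(max_err_full, max_err_smooth)
```

Because the basis is orthonormal, the full coefficient vector is just a change of basis for the trajectory. Any loss of information comes only from truncation.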

Graph Embedded Pose Clustering for Anomaly Detection, ArXiv'19 {paper} {code}

Amir Markovitz, Gilad Sharir, Itamar Friedman, Lihi Zelnik-Manor, and Shai Avidan

Efficient Unsupervised Temporal Segmentation of Motion Data, TPAMI'15 (?) {paper}

Björn Krüger, Anna Vögele, Tobias Willig, Angela Yao, Reinhard Klein

Metric Learning from Poses for Temporal Clustering of Human Motion, BMVC'12 {paper}

Adolfo López-Méndez, Juergen Gall, Josep R. Casas, Luc van Gool

Human Motion Analysis with Deep Metric Learning, ECCV'18 {paper} {non official code} {ArXiv} {notes}

Huseyin Coskun, David Joseph Tan, Sailesh Conjeti, Nassir Navab, Federico Tombari

Human Motion Prediction via Spatio-Temporal Inpainting, ICCV'19 {paper} {ArXiv}

Alejandro Hernandez, Jurgen Gall, Francesc Moreno-Noguer

Structured Prediction Helps 3D Human Motion Modelling, ICCV'19 {project page} {paper} {code} {notes}

Emre Aksan, Manuel Kaufmann, Otmar Hilliges

Human Motion Anticipation with Symbolic Label, ArXiv'19 {paper}

Julian Tanke, Andreas Weber, Jurgen Gall

Action Recognition Based on Joint Trajectory Maps Using Convolutional Neural Networks, ACM Multimedia'16 {paper}

Pichao Wang, Zhaoyang Li, Yonghong Hou, Wanqing Li

Details

temporal color-encoding of trajectories
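
A rough sketch of what color-encoding time along a trajectory can look like, assuming a 2D projected joint trajectory with coordinates in [0, 1] and a hue ramp from red (first frame) to blue (last frame). The paper's actual colormap and any additional encodings are not reproduced here; `trajectory_map`, the grid size, and the sample points are hypothetical.

```python
import colorsys

def trajectory_map(points, size=16):
    """Rasterize (x, y) points into {(row, col): (r, g, b)}, colored by time."""
    image = {}
    last = max(len(points) - 1, 1)
    for t, (x, y) in enumerate(points):
        # Hue goes from 0 (red) at the first frame to 2/3 (blue) at the last.
        hue = (2.0 / 3.0) * t / last
        r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        image[(row, col)] = (round(r * 255), round(g * 255), round(b * 255))
    return image

# Hypothetical joint moving left to right at constant height over 10 frames.
points = [(t / 9, 0.5) for t in range(10)]
jtm = trajectory_map(points)
for pixel in sorted(jtm):
    print(pixel, jtm[pixel])
```

The resulting image keeps the temporal order of the motion in its colors, which is what lets a standard image CNN consume it.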

Imitation Learning for Human Pose Prediction, ICCV'19 [{paper}](http://openaccess.thecvf.com/content_ICCV_2019/papers/Wang_Imitation_Learning_for_Human_Pose_Prediction_ICCV_2019_paper.pdf) {poster}

Borui Wang, Ehsan Adeli, Hsu-kuang Chiu, De-An Huang, Juan Carlos Niebles

Details

Report mean angle errors over different time spans
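
A small sketch of reporting a mean angle error at several prediction horizons. It assumes the error at a frame is the Euclidean norm of the difference between predicted and ground-truth joint angles, averaged over sequences, with no angle wrap-around handling. The exact protocol used in the paper may differ; `mean_angle_error` and the toy data are hypothetical.

```python
import math

def mean_angle_error(pred, target, horizons):
    """pred, target: [sequence][frame][angle]; horizons: frame indices to report."""
    errors = {}
    for h in horizons:
        per_sequence = [
            math.sqrt(sum((p - t) ** 2 for p, t in zip(p_seq[h], t_seq[h])))
            for p_seq, t_seq in zip(pred, target)
        ]
        errors[h] = sum(per_sequence) / len(per_sequence)
    return errors

# Toy data: 2 sequences, 5 frames, 3 angles; the prediction drifts over time.
target = [[[0.0, 0.0, 0.0] for _ in range(5)] for _ in range(2)]
pred = [[[0.1 * (f + 1)] * 3 for f in range(5)] for _ in range(2)]

errors = mean_angle_error(pred, target, horizons=[1, 3])
for horizon, value in errors.items():
    print(f"frame {horizon}: {value:.4f}")
```

Horizons are given as frame indices here; converting them to time spans in milliseconds depends on the frame rate of the data.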

DLow: Diversifying Latent Flows for Diverse Human Motion Prediction {paper}

Ye Yuan, Kris Kitani

Action recognition

Skeleton-based

awesome skeleton-based action recognition

Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition, CVPR'19, {paper} {code} {notes}

Lei Shi, Yifan Zhang, Jian Cheng, Hanqing Lu

Bayesian Hierarchical Dynamic Model for Human Action Recognition, CVPR'19 {paper} {matlab code} [{bib, supp}](http://openaccess.thecvf.com/content_CVPR_2019/html/Zhao_Bayesian_Hierarchical_Dynamic_Model_for_Human_Action_Recognition_CVPR_2019_paper.html)

Rui Zhao, Wanru Xu, Hui Su and Qiang Ji

Graph Embedded Pose Clustering for Anomaly Detection, ArXiv'19, {paper} {code}

Amir Markovitz, Gilad Sharir, Itamar Friedman, Lihi Zelnik-Manor, Shai Avidan

Efficient Temporal Sequence Comparison and Classification using Gram Matrix Embeddings On a Riemannian Manifold, CVPR'16 {paper} {matlab code} {project page}

Xikang Zhang, Yin Wang, Mengran Gou, Mario Sznaier, Octavia Camps

Details

Use velocities
No DTW

Ego-centric

Fine-Grained Action Retrieval through Multiple Parts-of-Speech Embeddings, ICCV'19 {project page} {paper} {notes}

Michael Wray, Diane Larlus, Gabriela Csurka, Dima Damen

Learning Visual Actions Using Multiple Verb-Only Labels, BMVC'19 {paper} {notes}

Michael Wray, Dima Damen

Details

Special focus on open actions, which are visually encoded differently

Evaluates on EPIC-Kitchens

| Method | S1 verb | S2 verb | S1 noun | S2 noun |
| --- | --- | --- | --- | --- |
| AVSlowFast | 65.7 | 55.8 | 46.4 | 32.7 |
| EPIC-Fusion | 64.8 | 52.7 | 46.0 | 27.9 |
| LeaderBoard GT-WISC-MPI | 68.51 | 60.05 | 49.96 | 38.14 |

EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition ICCV'19 {paper} {code (PyTorch)} {project page}

Evangelos Kazakos, Arsha Nagrani, Andrew Zisserman, Dima Damen

Audiovisual SlowFast Networks for Video Recognition, ArXiv'20 {paper} {code soon}

Fanyi Xiao, Yong Jae Lee, Kristen Grauman, Jitendra Malik, Christoph Feichtenhofer

LSTA: Long Short-Term Attention for Egocentric Action Recognition, CVPR'19 {paper}

Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz

Action recognition

Video Action Transformer Network, CVPR'19 {paper} {project page}

Rohit Girdhar, Joao Carreira, Carl Doersch, Andrew Zisserman
