Attalos Reading - Lab41/attalos GitHub Wiki

##Text Feature Extraction

Paragraph Vectors: https://cs.stanford.edu/~quocle/paragraph_vector.pdf
Word2Vec: https://arxiv.org/pdf/1301.3781.pdf
Improving Distributional Similarity with Lessons Learned from Word Embeddings: http://www.aclweb.org/anthology/Q15-1016 (Or, "Why is Word2Vec so good at downstream tasks?")

##Image Feature Extraction

Very Deep Convolutional Networks for Large-Scale Image Recognition: https://arxiv.org/pdf/1409.1556.pdf (VGG)
ImageNet Classification with Deep Convolutional Neural Networks: https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf (AlexNet)

##Multimodal: Text and Images

Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models: https://arxiv.org/abs/1411.2539 (Encoder-Decoder)
DeViSE: A Deep Visual-Semantic Embedding Model: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41473.pdf

##Localization

DenseCap: Fully Convolutional Localization Networks for Dense Captioning arXiv | code | about
Image Retrieval using Scene Graphs: http://hci.stanford.edu/publications/2015/scenegraphs/JohnsonCVPR2015.pdf (no code)
Generating Semantically Precise Scene Graphs from Textual Descriptions for Improved Image Retrieval: http://nlp.stanford.edu/pubs/schuster-krishna-chang-feifei-manning-vl15.pdf
Scene graph parser code: http://nlp.stanford.edu/software/scenegraphparser.shtml

##Graph Feature Extraction

Learning Convolutional Neural Networks for Graphs: https://arxiv.org/pdf/1605.05273.pdf

##Multimodal: Text and Graphs