Attalos Reading - Lab41/attalos GitHub Wiki
##Text Feature Extraction
- Paragraph Vectors: https://cs.stanford.edu/~quocle/paragraph_vector.pdf
- Word2Vec: https://arxiv.org/pdf/1301.3781.pdf
- Improving Distributional Similarity with Lessons Learned from Word Embeddings: http://www.aclweb.org/anthology/Q15-1016 (Or, "Why is Word2Vec so good at downstream tasks?")
##Image Feature Extraction
- Very Deep Convolutional Networks for Large-Scale Image Recognition: https://arxiv.org/pdf/1409.1556.pdf (VGG)
- ImageNet Classification with Deep Convolutional Neural Networks: https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf (AlexNet)
##Multimodal: Text and Images
- Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models: https://arxiv.org/abs/1411.2539 (Encoder-Decoder)
- DeViSE: A Deep Visual-Semantic Embedding Model: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41473.pdf
##Localization
- DenseCap: Fully Convolutional Localization Networks for Dense Captioning (arXiv | code | about)
- Image Retrieval using Scene Graphs: http://hci.stanford.edu/publications/2015/scenegraphs/JohnsonCVPR2015.pdf (no code)
- Generating Semantically Precise Scene Graphs from Textual Descriptions for Improved Image Retrieval: http://nlp.stanford.edu/pubs/schuster-krishna-chang-feifei-manning-vl15.pdf
- Scene graph parser code: http://nlp.stanford.edu/software/scenegraphparser.shtml
##Graph Feature Extraction
- Learning Convolutional Neural Networks for Graphs: https://arxiv.org/pdf/1605.05273.pdf
##Multimodal: Text and Graphs
- Knowledge Graph and Text Jointly Embedding: http://www.aclweb.org/anthology/D14-1167
##Multimodal: Other
- Text and Brain Scans: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4497373/ http://aclweb.org/anthology/P14-1046
- Text and Video: https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/viewFile/9734/956