Multimodal RNN - rugbyprof/5443-Data-Mining GitHub Wiki
The model consists of two sub-networks: a deep recurrent neural network for sentences and a deep convolutional network for images. These two sub-networks interact with each other in a multimodal layer to form the whole m-RNN model. http://www.stat.ucla.edu/~junhua.mao/m-RNN.html