General: Deep Learning Methods
{Mikolov13} Tomas Mikolov, Ilya Sutskever, Kai Chen, Gregory S. Corrado, Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. Proc. of NIPS, 2013. word2vec
{Cho14} Kyunghyun Cho, Bart van Merrienboer, Çaglar Gülçehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. Proc. of EMNLP, 2014. GRU
{Koutník14} Jan Koutník, Klaus Greff, Faustino Gomez, Jürgen Schmidhuber. A Clockwork RNN. Proc. of ICML, 2014. clockwork RNN
{Le14} Quoc V. Le, Tomas Mikolov. Distributed Representations of Sentences and Documents. Proc. of ICML, 2014. doc2vec
{LeCun15} Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 2015, 521:436–444.
{Xu15} Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Rich Zemel, Yoshua Bengio. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. Proc. of ICML, 2015. attention model
{Bahdanau15} Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio. Neural Machine Translation by Jointly Learning to Align and Translate. Proc. of ICLR, 2015. attention model
{Jozefowicz15} Rafal Jozefowicz, Wojciech Zaremba, and Ilya Sutskever. An Empirical Exploration of Recurrent Network Architectures. Proc. of ICML, 2015.
{Chorowski15} Jan Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, and Yoshua Bengio. Attention-Based Models for Speech Recognition. Proc. of NIPS, 2015. extends the attention mechanism with features needed for speech recognition
{Dong15} Chao Dong, Chen Change Loy, Kaiming He, Xiaoou Tang. Image Super-Resolution Using Deep Convolutional Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2015, 38(2):295-307. The mapping is represented as a deep convolutional neural network (CNN) that takes the low-resolution image as the input and outputs the high-resolution one
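A minimal sketch of that mapping in PyTorch, following the 9-1-5 filter configuration reported in the paper; the network assumes the low-resolution input has already been upscaled to the target size (e.g. by bicubic interpolation), and the padding here is only to keep the spatial size fixed:

```python
import torch
import torch.nn as nn

class SRCNN(nn.Module):
    """Sketch of SRCNN: patch extraction -> non-linear mapping -> reconstruction."""
    def __init__(self, channels: int = 1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=9, padding=4),  # patch extraction and representation
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=1),                   # non-linear mapping between feature spaces
            nn.ReLU(inplace=True),
            nn.Conv2d(32, channels, kernel_size=5, padding=2),  # reconstruction of the HR image
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)

# x = torch.randn(1, 1, 33, 33)   # one bicubic-upscaled Y-channel patch
# y = SRCNN()(x)                  # same spatial size as the input
```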
{Neil16} Daniel Neil, Michael Pfeiffer, and Shih-Chii Liu. Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences. Proc. of NIPS, 2016. the Phased LSTM achieves faster convergence than regular LSTMs on tasks that require learning long sequences
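The key mechanism is a periodic time gate k_t that only lets the cell and hidden states update during a short open phase of each oscillation cycle. A sketch of the gate's openness function, following the paper's notation (period τ, phase shift s, open ratio r_on, leak rate α):

```python
import numpy as np

def phased_lstm_time_gate(t, tau, s, r_on, alpha=0.001):
    """Openness k_t of the Phased LSTM time gate at timestamp t."""
    phi = ((t - s) % tau) / tau     # phase within the current cycle, in [0, 1)
    if phi < 0.5 * r_on:            # first half of the open phase: gate rises 0 -> 1
        return 2.0 * phi / r_on
    elif phi < r_on:                # second half of the open phase: gate falls 1 -> 0
        return 2.0 - 2.0 * phi / r_on
    else:                           # closed phase: small leak keeps gradients flowing
        return alpha * phi

# States update only in proportion to k_t, so most units stay untouched at any
# given timestamp; this sparseness is what speeds up training on long sequences.
```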
{Chen17} Jingyuan Chen, Hanwang Zhang, Xiangnan He, Liqiang Nie, Wei Liu, Tat-Seng Chua. Attentive Collaborative Filtering: Multimedia Recommendation with Item- and Component-Level Attention. Proc. of ACM SIGIR, 2017.
{Chang17} Shiyu Chang, Yang Zhang, Wei Han, Mo Yu, Xiaoxiao Guo, Wei Tan, Xiaodong Cui, Michael Witbrock, Mark Hasegawa-Johnson, Thomas S. Huang. Dilated Recurrent Neural Networks. Proc. of NIPS, 2017.
{Zhu17} Yu Zhu, Hao Li, Yikang Liao, Beidou Wang, Ziyu Guan, Haifeng Liu, and Deng Cai. What to Do Next: Modeling User Behaviors by Time-LSTM. Proc. of IJCAI, 2017. Time-LSTM equips LSTM with time gates to model time intervals
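A minimal illustrative sketch of the idea, not the paper's exact T1/T2/T3 variants: a standard LSTM cell extended with a learned time gate driven by the elapsed interval dt between consecutive actions (the class name and gate parameterization here are hypothetical):

```python
import torch
import torch.nn as nn

class TimeGateLSTMCell(nn.Module):
    """Illustrative LSTM cell with a time gate conditioned on elapsed intervals."""
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        # Maps the scalar interval dt to a per-unit gate in (0, 1).
        self.time_gate = nn.Sequential(nn.Linear(1, hidden_size), nn.Sigmoid())

    def forward(self, x, dt, state):
        h, c = self.cell(x, state)
        g = self.time_gate(dt.unsqueeze(-1))  # dt: (batch,) time since last action
        c = g * c + (1.0 - g) * state[1]      # interpolate new vs. previous cell state by interval
        return h, (h, c)
```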
{Li18} Zhuohan Li, Di He, Fei Tian, Wei Chen, Tao Qin, Liwei Wang, and Tie-Yan Liu. Towards Binary-Valued Gates for Robust LSTM Training. Proc. of ICML, 2018.
{Prokhorenkova18} Liudmila Prokhorenkova, Gleb Gusev, Aleksandr Vorobev, Anna Veronika Dorogush, Andrey Gulin. CatBoost: unbiased boosting with categorical features. Proc. of NIPS, 2018. CatBoost is a fast, scalable, high-performance open-source library for gradient boosting on decision trees
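As it is an open-source library, a minimal usage sketch; the toy data is hypothetical, while CatBoostClassifier, fit, and the cat_features argument are the library's documented interface:

```python
from catboost import CatBoostClassifier

# Hypothetical toy data: one categorical column and one numeric column.
X = [["red", 1.0], ["blue", 2.0], ["red", 3.0], ["green", 0.5]]
y = [1, 0, 1, 0]

model = CatBoostClassifier(iterations=50, learning_rate=0.1, depth=4, verbose=False)
model.fit(X, y, cat_features=[0])    # raw categorical values, no manual encoding
print(model.predict([["blue", 1.5]]))
```

Passing raw categorical columns via cat_features, rather than one-hot or target encoding them by hand, is the library's headline feature; the paper's ordered boosting and ordered target statistics are designed to avoid the target leakage that naive category encodings introduce.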
{Wang19} Bolun Wang, Yuanshun Yao, Shawn Shan, Huiying Li, Bimal Viswanath, Haitao Zheng and Ben Y. Zhao. Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks. Proc. of 40th IEEE Symposium on Security and Privacy (Oakland), 2019. the first robust and generalizable detection and mitigation system for DNN backdoor attacks
{Xu19} Keyulu Xu, Weihua Hu, Jure Leskovec and Stefanie Jegelka. How Powerful are Graph Neural Networks? Proc. of ICLR, 2019. present a theoretical framework for analyzing the expressive power of GNNs to capture different graph structures
{Ke19} Guolin Ke, Zhenhui Xu, Jia Zhang, Jiang Bian, and Tie-Yan Liu. DeepGBM: A Deep Learning Framework Distilled by GBDT for Online Prediction Tasks. Proc. of KDD, 2019.
{Guo19} Qipeng Guo, Xipeng Qiu, Pengfei Liu, Yunfan Shao, Xiangyang Xue, Zheng Zhang. Star-Transformer. Proc. of NAACL, 2019. reduces the computational complexity of the standard Transformer by carefully sparsifying the topology
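A sketch of what that sparsified topology looks like as a boolean attention mask (illustrative only; the paper alternates updates of the satellite nodes and a central relay node rather than running a single masked attention):

```python
import torch

def star_topology_mask(n: int) -> torch.Tensor:
    """Allowed attention pairs for n satellite tokens plus one relay node (index n).
    Each satellite sees itself, its immediate ring neighbours, and the relay;
    the relay sees everything. True = attention allowed."""
    m = torch.zeros(n + 1, n + 1, dtype=torch.bool)
    for i in range(n):
        for j in (i - 1, i, i + 1):
            if 0 <= j < n:
                m[i, j] = True   # local ring connections
        m[i, n] = True           # satellite -> relay
    m[n, :] = True               # relay -> all positions
    return m

# O(n) allowed pairs instead of the O(n^2) of full self-attention.
```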
{Yao19} Liang Yao, Chengsheng Mao, Yuan Luo. Graph Convolutional Networks for Text Classification. Proc. of AAAI, 2019. models the corpus as a single graph and applies a Graph Convolutional Network (GCN), a simple and effective graph neural network that captures high-order neighborhood information; the edge between two word nodes is built from word co-occurrence information, and the edge between a word node and a document node from the word's frequency in the document and its document frequency (TF-IDF), turning text classification into a node classification problem
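The propagation rule is the standard GCN layer of Kipf and Welling; a minimal numpy sketch, where the co-occurrence and TF-IDF edge weights described above would populate the adjacency matrix a:

```python
import numpy as np

def gcn_layer(a: np.ndarray, h: np.ndarray, w: np.ndarray) -> np.ndarray:
    """One GCN step, H' = ReLU(A_hat @ H @ W), with A_hat the symmetrically
    normalized adjacency including self-loops."""
    a_tilde = a + np.eye(a.shape[0])             # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_tilde.sum(axis=1)))
    a_hat = d_inv_sqrt @ a_tilde @ d_inv_sqrt    # symmetric normalization
    return np.maximum(a_hat @ h @ w, 0.0)        # ReLU

# In the paper's corpus graph, word-word entries of `a` carry PMI weights and
# word-document entries carry TF-IDF; two such layers followed by a softmax
# classify the document nodes.
```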