Reading - mohsensalari/cs571 GitHub Wiki
Stochastic Gradient Descent
- Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms, Michael Collins, EMNLP, 2002.
- Solving Large Scale Linear Prediction Problems Using Stochastic Gradient Descent Algorithms, Tong Zhang, ICML, 2004.
- Large-Scale Machine Learning with Stochastic Gradient Descent, Léon Bottou, COMPSTAT, 2010.
- Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, John Duchi et al., JMLR, 2011.
Part-of-Speech Tagging
- SVMTool: A general POS tagger generator based on Support Vector Machines, Gimenez and Marquez, LREC, 2004.
- Guided Learning for Bidirectional Sequence Classification, Shen et al., ACL, 2007.
- Part-of-Speech Tagging from 97% to 100%: Is It Time for Some Linguistics?, Manning, CICLing, 2011.
- Fast and Robust Part-of-Speech Tagging Using Dynamic Model Selection, Choi and Palmer, ACL, 2012.
- A Universal Part-of-Speech Tagset, Petrov et al., LREC, 2012.
Dependency Parsing
- An Efficient Algorithm for Projective Dependency Parsing, Nivre, IWPT, 2003.
- Dynamic Programming for Linear-Time Incremental Parsing, Huang and Sagae, ACL, 2010.
- A Dynamic Oracle for Arc-Eager Dependency Parsing, Goldberg and Nivre, COLING, 2012.
- Transition-based Dependency Parsing with Selectional Branching, Choi and McCallum, ACL, 2013.
- Online Large-Margin Training of Dependency Parsers, McDonald et al., ACL, 2005.
- Online Learning of Approximate Dependency Parsing Algorithms, McDonald and Pereira, EACL, 2006.