Part of Speech Tagging on WSJ - mohsensalari/cs571 GitHub Wiki

I ran the part-of-speech tagger using an Adaptive Gradient Descent algorithm with the following parameters:

batch ratio = 0.1, average = false, learning rate = 0.01, decaying rate = 0.4, bias = 0.0

I got a final score of 97.19.
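
As a rough illustration of how two of these parameters act, here is a minimal sketch of a mini-batch size computation and a single adaptive-gradient (AdaGrad-style) update. It assumes that the batch ratio is the fraction of training instances used per mini-batch and that the learning rate is the base step size; the function names `batch_size` and `adagrad_step` are hypothetical and not taken from the course code. Averaging, decaying rate, and bias are left out because their exact definitions are not given here.

```python
import math


def batch_size(n_instances, batch_ratio):
    """Instances per mini-batch, ASSUMING batch_ratio is a fraction of the training set."""
    return max(1, int(round(n_instances * batch_ratio)))


def adagrad_step(weights, grad_sq_sum, gradient, learning_rate=0.01, eps=1e-8):
    """Apply one AdaGrad-style update in place.

    grad_sq_sum accumulates the squared gradient of each weight, so weights
    that keep receiving large gradients take progressively smaller steps.
    """
    for i, g in enumerate(gradient):
        grad_sq_sum[i] += g * g
        weights[i] -= learning_rate * g / (math.sqrt(grad_sq_sum[i]) + eps)
    return weights


# Hypothetical usage: a larger batch ratio means more instances per update.
small = batch_size(1000, 0.1)
large = batch_size(1000, 0.2)

w, g2 = [0.0], [0.0]
adagrad_step(w, g2, [2.0])  # first step is close to the full learning rate
first = w[0]
adagrad_step(w, g2, [2.0])  # same gradient, smaller step
second_step = first - w[0]
```

Under this reading, doubling the batch ratio doubles the number of instances whose gradients are computed before each weight update, which is where the extra computation comes from.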

A higher batch ratio should intuitively improve precision, at the cost of more computation. With the limited computing power I had, I raised the batch ratio to 0.2. This gave a slight improvement, with a final score of 97.22.

I then reduced the batch ratio to increase speed and lowered the label cutoff and feature cutoff to 2 and 1, respectively. The final score remained 97.22.
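
For context, a feature cutoff is commonly applied by dropping features that occur too rarely in the training data. The sketch below assumes that a feature is kept only if its count is strictly greater than the cutoff; the tagger used here may define the threshold differently, and `apply_feature_cutoff` is a hypothetical helper, not part of the course code.

```python
from collections import Counter


def apply_feature_cutoff(instances, cutoff):
    """Drop features whose training-set count is not above the cutoff.

    instances: list of feature lists, one list per token.
    ASSUMPTION: a feature is kept only if count > cutoff.
    """
    counts = Counter(f for feats in instances for f in feats)
    kept = {f for f, c in counts.items() if c > cutoff}
    return [[f for f in feats if f in kept] for feats in instances]


# Hypothetical toy data: only "w=the" occurs more than once.
toy = [["w=the", "suf=e"], ["w=the"], ["w=dog"]]
filtered = apply_feature_cutoff(toy, cutoff=1)
```

A lower cutoff keeps more rare features, which enlarges the model without necessarily changing the score, consistent with the unchanged result above.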