Part of Speech Tagging on WSJ - mohsensalari/cs571 GitHub Wiki
I ran the part-of-speech tagger using an Adaptive Gradient Descent algorithm with the following parameters:
batch ratio = 0.1, average = false, learning rate = 0.01, decaying rate = 0.4, bias = 0.0
This gave a final score of 97.19.
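For reference, here is a minimal sketch of what an adaptive gradient (AdaGrad-style) update looks like. It is not the course code. The function names `adagrad_step` and `batch_size` are hypothetical. Reading the batch ratio as the fraction of training instances per mini-batch is an assumption. The decaying rate and bias parameters are left out because this page does not say how they are used.

```python
import math


def adagrad_step(weights, grad_sq_sum, gradient, learning_rate=0.01, epsilon=1e-8):
    """Apply one AdaGrad-style update in place and return the weights.

    Each weight gets its own effective step size: the learning rate divided
    by the root of the accumulated squared gradients for that weight.
    """
    for i, g in enumerate(gradient):
        grad_sq_sum[i] += g * g
        weights[i] -= learning_rate * g / (math.sqrt(grad_sq_sum[i]) + epsilon)
    return weights


def batch_size(num_instances, batch_ratio=0.1):
    """Assumed meaning of 'batch ratio': fraction of instances per mini-batch."""
    return max(1, int(num_instances * batch_ratio))


weights = [0.0, 0.0]
grad_sq_sum = [0.0, 0.0]
adagrad_step(weights, grad_sq_sum, [1.0, 0.0], learning_rate=0.01)
```

Under that reading, a batch ratio of 0.2 processes twice as many instances per update as 0.1, which is where the extra computation per step comes from.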
A higher batch ratio should, intuitively, improve the score at the cost of more computation. With the limited computing power available to me, I raised the batch ratio to 0.2, which produced a slight improvement and a final score of 97.22.
Reducing the batch ratio again to speed up training, and then lowering the label cutoff and feature cutoff to 2 and 1, respectively, left the final score unchanged at 97.22.
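As a rough illustration of what a cutoff does, the sketch below keeps only the items (labels or features) that occur more often than the cutoff in the training data. The helper `apply_cutoff` is hypothetical, and the strict comparison (`count > cutoff`) is an assumption. The actual tagger may use a different rule.

```python
from collections import Counter


def apply_cutoff(items, cutoff):
    """Return the set of items whose frequency is strictly above the cutoff."""
    counts = Counter(items)
    return {item for item, n in counts.items() if n > cutoff}


# Toy feature occurrences: "a" appears 3 times, "b" twice, "c" once.
observed = ["a", "a", "a", "b", "b", "c"]
kept_low_cutoff = apply_cutoff(observed, 1)
kept_high_cutoff = apply_cutoff(observed, 2)
```

Lowering a cutoff keeps more rare labels or features in the model, so the model grows. Here the score did not change as a result.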