Home - sameerwadkar/largelda GitHub Wiki

LargeLDA is an adaptation of the "MALLET" (MAchine Learning for LanguagE Toolkit) API. Reference: McCallum, Andrew Kachites. "MALLET: A Machine Learning for Language Toolkit." http://mallet.cs.umass.edu. 2002.

LargeLDA uses memory-mapped files to scale beyond the limits of JVM heap memory. It has been tested with approximately 2 million documents and 1,000 topics.
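
To illustrate the general idea of keeping large count tables outside the heap, here is a minimal sketch using the standard `java.nio` memory-mapping API. It is not taken from the LargeLDA source: the class `MappedTopicCounts`, its methods, and the row-major type-by-topic layout are all hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.nio.IntBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper (not part of LargeLDA or MALLET): a type-by-topic
// count table backed by a memory-mapped file instead of an int[] on the heap.
final class MappedTopicCounts implements AutoCloseable {
    private final FileChannel channel;
    private final IntBuffer counts;
    private final int numTopics;

    MappedTopicCounts(Path file, int numTypes, int numTopics) throws IOException {
        this.numTopics = numTopics;
        long bytes = (long) numTypes * numTopics * Integer.BYTES;
        // A single FileChannel mapping cannot exceed Integer.MAX_VALUE bytes;
        // a larger table would need several mappings (not shown here).
        if (bytes > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("table too large for one mapping");
        }
        channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE);
        // In READ_WRITE mode the file is extended to the requested size if needed.
        MappedByteBuffer mapped = channel.map(FileChannel.MapMode.READ_WRITE, 0, bytes);
        counts = mapped.asIntBuffer();
    }

    // Row-major index: one row of numTopics counters per word type.
    private int index(int type, int topic) {
        return type * numTopics + topic;
    }

    int get(int type, int topic) {
        return counts.get(index(type, topic));
    }

    void increment(int type, int topic) {
        int i = index(type, topic);
        counts.put(i, counts.get(i) + 1);
    }

    @Override
    public void close() throws IOException {
        // Closing the channel does not unmap the buffer; the mapping stays
        // valid until the buffer itself is garbage-collected.
        channel.close();
    }
}
```

In this sketch the counters live in pages managed by the operating system, so the table size is bounded by disk and address space rather than by the `-Xmx` heap setting. A caller would construct it with a file path and dimensions, for example `new MappedTopicCounts(path, numTypes, numTopics)`, and then call `increment` and `get` as it would on an in-memory array.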