Text preprocessing in Topic Modeling
Simple preprocessing techniques to apply before building a document-term matrix:
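- Minimum term length: drop tokens shorter than a chosen number of characters
- Case conversion: convert all text to lowercase so that "Topic" and "topic" count as the same term
- Stop-word filtering: remove very common function words such as "the" and "of"
- Minimum frequency filtering: drop terms that appear in too few documents
- Maximum frequency filtering: drop terms that appear in too large a share of documents
- Stemming: reduce inflected words to a common stem so that variants share one column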
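
The steps listed above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the method from the referenced slides: the names `STOP_WORDS`, `crude_stem`, `tokenize`, and `build_dtm` are hypothetical, the stop-word list is deliberately tiny, and `crude_stem` is a rough suffix stripper standing in for a real stemmer such as Porter. In scikit-learn, `CountVectorizer` covers most of these steps through its `lowercase`, `stop_words`, `token_pattern`, `min_df`, and `max_df` parameters, while stemming has to be supplied through a custom tokenizer.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "and", "of", "in", "is", "are", "to", "on"}


def crude_stem(token):
    """Strip a few common English suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "es", "s"):
        # Only strip when at least three characters would remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tokenize(text, min_length=2):
    """Apply case conversion, minimum term length, stop-word filtering, stemming."""
    tokens = re.findall(r"[a-z]+", text.lower())           # case conversion
    tokens = [t for t in tokens if len(t) >= min_length]   # minimum term length
    tokens = [t for t in tokens if t not in STOP_WORDS]    # stop-word filtering
    return [crude_stem(t) for t in tokens]                 # stemming


def build_dtm(docs, min_df=1, max_df=1.0, min_length=2):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in document i.

    min_df is an absolute document count; max_df is a fraction of all documents.
    """
    tokenized = [tokenize(d, min_length) for d in docs]

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    n_docs = len(docs)
    vocab = sorted(
        t for t, df in doc_freq.items()
        if df >= min_df and df / n_docs <= max_df  # min / max frequency filtering
    )
    index = {t: j for j, t in enumerate(vocab)}

    matrix = []
    for tokens in tokenized:
        row = [0] * len(vocab)
        for t in tokens:
            if t in index:
                row[index[t]] += 1
        matrix.append(row)
    return vocab, matrix


docs = [
    "The cats are chasing a mouse",
    "A cat chases the mice in the garden",
    "Topic modeling of cat videos",
]
vocab, matrix = build_dtm(docs, min_df=2, max_df=0.9)
print(vocab)   # ['chas']
print(matrix)  # [[1], [1], [0]]
```

With these toy documents, "cat" is removed by the maximum frequency filter because it occurs in every document, and the terms that occur in only one document are removed by the minimum frequency filter. Only the stem "chas" (from "chasing" and "chases") survives, which also shows how rough the stand-in stemmer is.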
Reference: http://derekgreene.com/slides/topic-modelling-with-scikitlearn.pdf