Copying the Project - irsaal/urdu GitHub Wiki

We encourage you to make a copy of the project for yourself. This will enable you to input your own cleaned data, build out the word bank, and experiment with additional regex formulas to improve the tool's results on your corpus.

To begin, click this link, which automatically creates a copy of the tool in your Google Drive. You can then rename the project and replace the test data in the clean corpus sheet.

Next, we recommend Using Voyant to quickly split the corpus into individual strings (i.e., the individual words in your corpus).
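
To illustrate what splitting a corpus into individual word strings involves, here is a minimal Python sketch. It is not how Voyant works internally and is not part of this project. The function name `tokenize`, the separator set, and the sample sentence are all assumptions made for illustration, and a real corpus may need a different punctuation list.

```python
import re

# Hypothetical separator set: whitespace plus common Urdu and Latin punctuation.
# Adjust this for your own corpus; it is an assumption, not the project's rule.
SEPARATORS = re.compile(r"[\s۔،؟؛!:.,?()\[\]]+")

def tokenize(text):
    """Split text into word strings, dropping empty pieces."""
    return [token for token in SEPARATORS.split(text) if token]

# Made-up sample sentence, used only to show the shape of the output.
sample = "یہ ایک مثال ہے۔ یہ اردو ہے"
words = tokenize(sample)

# Unique words, e.g. as a starting point for a word bank.
unique_words = sorted(set(words))
print(len(words), len(unique_words))
```

The resulting list can be pasted into a spreadsheet column, one word per row. Deduplicating with `set` is one possible way to seed the word bank, though frequency counts are lost in that step.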