How to add a new wiki - aburillo/WikiChron GitHub Wiki

Follow these steps to add a new wiki to WikiChron. WikiChron already includes some wikis by default, but you can add more in order to analyze any wiki you wish:

  • First, determine which kind of wiki you are going to analyze: a Wikia wiki, a Wikimedia project wiki, or another kind of wiki (such as a self-hosted wiki). The procedure for downloading the XML dump depends on the kind of wiki, so start with the "XML Dumps" section of the README at https://github.com/Grasia/WikiChron/ , which provides all the information you need to get the XML dump of your wiki.
  • Once you have your XML dump, process it to get the corresponding .csv file. To do so, run the dump_parser.py script, located in WikiChron's scripts directory:
python3 -m dump_parser data/<name_of_your.xml>

This will create the .csv file in your local data/ directory. If you have more than one XML file, run the script as follows:

python3 -m dump_parser data/*.xml

NOTE: all this information can be found in the "Process the dump" subsection of the "XML Dumps" section of the readme.
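
If you prefer to drive this step from Python, the two commands above can be wrapped as in the following minimal sketch. It assumes that it is launched from the same directory in which the python3 -m dump_parser command works and that your dumps sit in data/. The helper names (find_dumps, parser_command, parse_all) are hypothetical and are not part of WikiChron.

```python
import subprocess
import sys
from pathlib import Path


def find_dumps(data_dir="data"):
    """Return every .xml file directly inside data_dir, sorted by name."""
    return sorted(Path(data_dir).glob("*.xml"))


def parser_command(dumps):
    """Build the dump_parser command line, passing one argument per dump."""
    # sys.executable stands in for "python3": it is the interpreter running this code.
    return [sys.executable, "-m", "dump_parser", *[str(d) for d in dumps]]


def parse_all(data_dir="data", dry_run=True):
    """Return the command for all dumps; run it only when dry_run is False."""
    dumps = find_dumps(data_dir)
    if not dumps:
        raise FileNotFoundError(f"no .xml dumps found in {data_dir}")
    command = parser_command(dumps)
    if not dry_run:
        # check=True raises CalledProcessError if dump_parser exits with an error.
        subprocess.run(command, check=True)
    return command
```

Calling parse_all("data") only returns the command so you can inspect it; pass dry_run=False to actually run dump_parser on every dump it found.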

  • After creating the .csv file, go to your local data/ directory and edit the wikis.csv file so that it also includes your wiki's URL and the name of the .csv file you just obtained.
  • After editing the wikis.csv file, you also need a wikis.json file in your data/ directory, as stated in the "provide some metadata of the wiki" section of the README. This file holds some metadata about the wikis you want to analyze: the number of pages, the number of users, and the ids of the bot users. Getting this metadata is not straightforward. To include the metadata of your wiki in the wikis.json file, go to the scripts folder in WikiChron (https://github.com/Grasia/WikiChron/tree/develop/scripts) and run the two scripts available there:
python3 generate_wikis_json.py data/<name_of_your.csv>

This adds the basic metadata of your wiki (number of users and number of pages) to the wikis.json file. To also include the bot ids, run the query_bot_users.py script:

python3 query_bot_users.py data/<name_of_your.csv>
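
The two metadata commands above can also be chained from Python so that the second script only runs if the first one succeeds. This is a minimal sketch, assuming both scripts are reachable from the current directory exactly as in the commands above. The names METADATA_SCRIPTS, metadata_commands and run_metadata_steps are hypothetical helpers, not part of WikiChron.

```python
import subprocess
import sys

# The two scripts named above, in the order the steps run them.
METADATA_SCRIPTS = ("generate_wikis_json.py", "query_bot_users.py")


def metadata_commands(csv_path):
    """Return one command per metadata script for the given .csv file."""
    # sys.executable stands in for "python3": it is the interpreter running this code.
    return [[sys.executable, script, csv_path] for script in METADATA_SCRIPTS]


def run_metadata_steps(csv_path, dry_run=True):
    """Return both commands; run them in order only when dry_run is False."""
    commands = metadata_commands(csv_path)
    for command in commands:
        if not dry_run:
            # check=True stops at the first script that exits with an error.
            subprocess.run(command, check=True)
    return commands
```

For example, run_metadata_steps("data/<name_of_your.csv>", dry_run=False) would run generate_wikis_json.py first and query_bot_users.py second, with the placeholder replaced by your own file name.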

Now, you're done and everything is ready for you to start analyzing your wiki!
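
Before launching WikiChron, it can help to confirm that the files produced by the steps above are in place. The following sketch only checks that wikis.csv, wikis.json and your wiki's .csv file exist in data/ and that wikis.json parses as JSON. It does not validate what the files contain, and the function names are hypothetical.

```python
import json
from pathlib import Path


def missing_inputs(data_dir, wiki_csv_name):
    """Return the names of the expected files that are absent from data_dir."""
    expected = ["wikis.csv", "wikis.json", wiki_csv_name]
    return [name for name in expected if not (Path(data_dir) / name).is_file()]


def wikis_json_is_valid(data_dir):
    """Return True if wikis.json exists and parses as JSON (content is not inspected)."""
    try:
        with open(Path(data_dir) / "wikis.json", encoding="utf-8") as handle:
            json.load(handle)
    except (OSError, ValueError):
        # OSError: file missing or unreadable; ValueError: not valid JSON.
        return False
    return True
```

An empty list from missing_inputs("data", "<name_of_your.csv>") together with True from wikis_json_is_valid("data") suggests that the files from the previous steps are present.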
