Home - gousiosg/github-mirror GitHub Wiki

# GHTorrent

Welcome to the GHTorrent project, an effort to bring GitHub's data into the hands of the software engineering research community without taxing GitHub.

GHTorrent reads events from the GitHub Events stream, stores them in a MongoDB database, retrieves the data they link to, extracts everything into a relational format, and distributes the result as MongoDB and MySQL dumps over BitTorrent.
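
The store-then-extract flow described above can be sketched roughly as follows. This is a minimal illustration, not GHTorrent's actual code: a plain dictionary stands in for the MongoDB collection, an in-memory SQLite database stands in for MySQL, and the function names, table layout, and sample events are all hypothetical. The event fields used (`id`, `type`, `actor.login`, `repo.name`, `created_at`) follow the shape of events in GitHub's public events API. Retrieving linked data is left out.

```python
import sqlite3


def store_raw(raw_store, event):
    """Keep the unmodified event keyed by its id (stand-in for a MongoDB collection)."""
    raw_store[event["id"]] = event


def extract_row(event):
    """Flatten one raw event into a relational row."""
    return (
        event["id"],
        event["type"],
        event["actor"]["login"],
        event["repo"]["name"],
        event["created_at"],
    )


def mirror(events, raw_store, db):
    """Store each unseen event raw, then insert its relational row."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id TEXT PRIMARY KEY, type TEXT, actor TEXT, repo TEXT, created_at TEXT)"
    )
    for event in events:
        if event["id"] in raw_store:
            continue  # already mirrored on an earlier pass
        store_raw(raw_store, event)
        db.execute("INSERT INTO events VALUES (?, ?, ?, ?, ?)", extract_row(event))
    db.commit()


# Hypothetical sample events, shaped like entries from the events stream.
sample_events = [
    {"id": "1", "type": "PushEvent", "actor": {"login": "alice"},
     "repo": {"name": "alice/demo"}, "created_at": "2012-06-02T10:00:00Z"},
    {"id": "2", "type": "WatchEvent", "actor": {"login": "bob"},
     "repo": {"name": "alice/demo"}, "created_at": "2012-06-02T10:05:00Z"},
]

raw_store = {}
db = sqlite3.connect(":memory:")
mirror(sample_events, raw_store, db)
mirror(sample_events, raw_store, db)  # a repeated pass adds no duplicates
```

Keeping the raw event alongside the extracted row means the relational schema can be rebuilt later without fetching the data again, which matches the goal of not taxing GitHub.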

All original data is copyrighted by its respective owners.

The GHTorrent project is brought to you by the SENSE group at the Athens University of Economics and Business and the SERG group at the Technical University of Delft.

If you use the dataset or the provided tools for research, please cite the following work:

Georgios Gousios and Diomidis Spinellis, "GHTorrent: GitHub’s data from a firehose," in MSR '12: Proceedings of the 9th Working Conference on Mining Software Repositories, June 2–3, 2012. Zurich, Switzerland.