Tools - softwaresaved/rse-repo-analysis GitHub Wiki
Augur
Large tool for building a database of information from Git repositories. Link to resulting schema.
It provides Docker containers that run a database and an API. Repositories can be added using their git ID (docs).
Open question: can it be used for repositories to which I do not have, e.g., push access?
GrimoireLab
Perceval: component providing a Python API for retrieving data from repositories.
Arthur: schedules and executes Perceval for larger numbers of software repositories. Uses a Redis queue.
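As a rough illustration of the Perceval Python API mentioned above, the following minimal sketch fetches commits with Perceval's Git backend and counts them per author. The function names `fetch_commits` and `commits_per_author` are hypothetical helpers introduced here, and the repository URL and clone path in the usage note are placeholders. It assumes Perceval is installed (e.g. via `pip install perceval`).

```python
from collections import Counter


def fetch_commits(repo_url, clone_path):
    """Yield commit items for a repository using Perceval's Git backend."""
    # Imported lazily so the helper below also works without Perceval installed.
    from perceval.backends.core.git import Git

    # Perceval clones (or updates) the repository at clone_path and parses its log.
    repo = Git(uri=repo_url, gitpath=clone_path)
    return repo.fetch()


def commits_per_author(items):
    """Count commits per author from Perceval Git items (hypothetical helper)."""
    # Each item wraps the parsed commit under the "data" key;
    # "Author" holds a "Name <email>" string.
    return Counter(item["data"]["Author"] for item in items)
```

A call such as `commits_per_author(fetch_commits("https://github.com/<owner>/<repo>.git", "/tmp/repo-clone"))` would then return a `Counter` keyed by author string.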
GH Archive
Records the public GitHub timeline, archives it and makes it easily accessible.
Data is available as raw, hourly JSON-encoded event files from data.gharchive.org.
Moreover, the data is on Google BigQuery, which needs Google Developer access but allows SQL-like queries.
There is a limit of 1 TB of data processing per month, though.
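The raw hourly files from data.gharchive.org can also be processed without BigQuery. The sketch below uses only the Python standard library. It assumes the file naming scheme `YYYY-MM-DD-H.json.gz` (hour not zero-padded) and that each file is gzip-compressed with one JSON event per line. The function names are hypothetical helpers, not part of any GH Archive tooling.

```python
import gzip
import json
import urllib.request
from collections import Counter
from datetime import datetime


def archive_url(hour: datetime) -> str:
    """Build the URL of one hourly archive file."""
    # Assumption: files are named YYYY-MM-DD-H.json.gz, hour without leading zero.
    return f"https://data.gharchive.org/{hour:%Y-%m-%d}-{hour.hour}.json.gz"


def parse_events(raw: bytes) -> list:
    """Decompress one hourly file and parse its newline-delimited JSON events."""
    text = gzip.decompress(raw).decode("utf-8")
    # Split on "\n" only, so unusual Unicode line separators inside
    # JSON strings do not break a record apart.
    return [json.loads(line) for line in text.split("\n") if line.strip()]


def count_event_types(events) -> Counter:
    """Count events by their "type" field (e.g. PushEvent, WatchEvent)."""
    return Counter(event["type"] for event in events)


def fetch_hour(hour: datetime) -> list:
    """Download and parse all events for one hour (requires network access)."""
    with urllib.request.urlopen(archive_url(hour)) as response:
        return parse_events(response.read())
```

For example, `count_event_types(fetch_hour(datetime(2015, 1, 1, 15))).most_common(5)` would list the five most frequent event types in that hour. Each hourly file is downloaded in full, so looping over long time ranges can transfer a lot of data.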