GraphBIG Dataset

To address the diverse features of graph data, GraphBIG presents two types of graph data sets: real-world data and synthetic data. The real-world data sets illustrate the features of real graph data, while the synthetic data helps with workload characterization because its size can be scaled flexibly.

As shown below, we collect four real-world data sets to represent all graph data types, plus a synthetic data set of arbitrary size. In addition, GraphBIG's well-defined dataset interface supports any third-party dataset in CSV format.

| Data Set | Vertices | Edges | Download Link |
| --- | --- | --- | --- |
| Twitter Graph* | 120M / 13M (subset) | 1.9B / 60M (subset) | Link |
| Knowledge Repo | 154K | 1.72M | Dataset |
| Watson Gene Graph | 2M | 12.2M | Dataset |
| CA RoadNet | 1.9M | 2.8M | Dataset |
| LDBC Graph | Any | Any | Dataset |

(* The full Twitter graph dataset is extremely large: the raw data file is more than 50 GB. We are still looking for a public server to host downloads of such large files. For now, we suggest that users try the Twitter data from the link provided above.)
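
As a rough illustration of preparing a third-party dataset in CSV form, the sketch below converts a whitespace-separated edge list (one `src dst` pair per line, `#` for comments) into vertex and edge CSV text. This page does not specify the layout GraphBIG's dataset interface expects, so the function name `edge_list_to_csv` and the column headers `ID`, `SRC`, and `DEST` are assumptions made only for this example. Check them against the dataset format GraphBIG actually documents before use. The sketch also assumes integer vertex IDs.

```python
import csv
import io


def edge_list_to_csv(lines):
    """Convert "src dst" edge-list lines into (vertex_csv, edge_csv) text.

    Blank lines and lines starting with '#' are skipped.
    The column names ID, SRC, and DEST are hypothetical, not taken
    from GraphBIG; adjust them to the format your loader expects.
    """
    vertices = set()
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Only the first two fields are used; extra columns are ignored.
        src, dst = line.split()[:2]
        vertices.update((src, dst))
        edges.append((src, dst))

    # Vertex file: one row per distinct vertex ID, sorted numerically.
    vertex_buf = io.StringIO()
    writer = csv.writer(vertex_buf, lineterminator="\n")
    writer.writerow(["ID"])
    for v in sorted(vertices, key=int):
        writer.writerow([v])

    # Edge file: one row per edge, in input order.
    edge_buf = io.StringIO()
    writer = csv.writer(edge_buf, lineterminator="\n")
    writer.writerow(["SRC", "DEST"])
    writer.writerows(edges)

    return vertex_buf.getvalue(), edge_buf.getvalue()


raw = ["# FromNodeId ToNodeId", "0 1", "0 2", "2 1"]
vertex_csv, edge_csv = edge_list_to_csv(raw)
```

The function returns strings rather than writing files so that the result can be inspected first. Writing `vertex_csv` and `edge_csv` to disk is a plain `open(..., "w")` call once the target file names are known.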