GraphBIG Workload Selection

Representativeness and coverage are the major concerns of our graph benchmarking effort. To cover representative and diverse graph workloads, we analyzed 21 key use cases from System G customers. As shown in Figure 1(B), the 21 use cases fall into six categories, each accounting for a different share of the total.

Figure 1. Real-world use case analysis

We then select representative workloads from the use cases according to how many use cases employ each workload. We also summarize the computation types and graph data types, as shown in Figure 2.

Figure 2. GraphBIG workload selection flow

After that, a reselection step ensures that all computation types are covered. Figure 1(A) shows how many use cases employ each chosen workload, broken down by category. The chosen workloads are all widely used in real-world use cases: the most popular workload, BFS, is used by 10 different use cases, while the least popular one, TC, is still used by 4. Moreover, thanks to the reselection step, the chosen workloads cover all graph computation types.
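The two-step flow described here (rank workloads by usage, then reselect for coverage) can be illustrated with a minimal Python sketch. It is not the procedure actually used for GraphBIG: the function name `select_workloads`, the `top_k` cutoff, the toy use cases, and the labels `type_x`, `type_y`, `type_z` are all hypothetical. It also assumes that reselection adds the most-used workload of each uncovered computation type; the text does not say whether workloads are added or swapped.

```python
from collections import Counter


def select_workloads(use_cases, computation_type, top_k):
    """Pick workloads by usage count, then reselect for type coverage.

    use_cases:        dict mapping a use-case name to the set of workloads it uses
    computation_type: dict mapping a workload to its computation type label
    top_k:            hypothetical cutoff for the initial popularity-based pick
    """
    # Count the number of use cases that employ each workload.
    usage = Counter(w for workloads in use_cases.values() for w in set(workloads))

    # Step 1: rank by usage (ties broken by name so the result is deterministic).
    ranked = sorted(usage, key=lambda w: (-usage[w], w))
    chosen = ranked[:top_k]

    # Step 2 (assumed reselection rule): for every computation type that is
    # still uncovered, add the most-used remaining workload of that type.
    covered = {computation_type[w] for w in chosen}
    for w in ranked[top_k:]:
        if computation_type[w] not in covered:
            chosen.append(w)
            covered.add(computation_type[w])

    return chosen, usage


# Toy input: placeholder names only, not the 21 System G use cases.
use_cases = {
    "uc1": {"a", "b"},
    "uc2": {"a", "c", "e"},
    "uc3": {"a", "b", "d"},
}
computation_type = {
    "a": "type_x", "b": "type_x",
    "c": "type_y", "e": "type_y",
    "d": "type_z",
}

chosen, usage = select_workloads(use_cases, computation_type, top_k=2)
print(chosen)
```

In this toy input, the popularity step alone would pick only `type_x` workloads; the reselection pass is what brings in one workload each for `type_y` and `type_z`.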