GraphBrewOrder - UVA-LavaLab/GraphBrew GitHub Wiki
GraphBrewOrder (algorithm 12) is the core reordering algorithm that applies different optimizations to different communities. This page explains how it works and when to use it.
Unlike traditional reordering that treats the entire graph uniformly, GraphBrewOrder:
- Detects communities using Leiden
- Analyzes each community's structure
- Applies configurable reordering within communities (default: RabbitOrder)
- Preserves community locality in the final ordering
Input Graph β Leiden β Communities β Per-Community Reorder β Output Graph
+------------------+
| INPUT GRAPH |
+--------+---------+
|
v
+------------------+
| Leiden Community |
| Detection |
+--------+---------+
|
+----+----+----+----+
| | |
v v v
+-------+ +-------+ +-------+
| Comm0 | | Comm1 | | CommN |
| 15234 | | 8912 | | 500 |
| nodes | | nodes | | nodes |
+---+---+ +---+---+ +---+---+
| | |
v v v
+-------+ +-------+ +-------+
|REORDER| |REORDER| |REORDER|
| (per- | | (per- | | (per- |
| comm) | | comm) | | comm) |
+---+---+ +---+---+ +---+---+
| | |
+----+----+----+----+
|
v
+------------------+
| SIZE-SORTED MERGE|
| [Comm0][Comm1]...|
+------------------+
|
v
+------------------+
| OUTPUT GRAPH |
| (reordered IDs) |
+------------------+
Global algorithms like HUBCLUSTER or RCM optimize for one metric across the entire graph:
Global HUBCLUSTER:
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β [All Hubs] ββββββββ [All Non-Hubs] β
β Good hub locality, but communities are scattered β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
This can break community locality - vertices from different communities get interleaved.
GraphBrewOrder keeps communities together while optimizing within each:
GraphBrewOrder:
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β [Community 0 ] [Community 1 ] [Community 2 ]β
β [ reordered ] [ reordered ] [ reordered ] β
β Both community AND cache locality! β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
// Leiden community detection
auto communities = leidenCommunityDetection(graph);
// Result: each vertex mapped to a community ID
// vertex 0 β community 3
// vertex 1 β community 0
// vertex 2 β community 3
// ...For each community, compute:
struct CommunityFeatures {
size_t num_nodes;
size_t num_edges;
double internal_density; // edges / possible_edges
double avg_degree;
double degree_variance; // normalized variance in degrees
double hub_concentration; // fraction of edges from top 10% nodes
double modularity; // community/subgraph modularity
double clustering_coeff; // local clustering coefficient (sampled)
double avg_path_length; // estimated average path length
double diameter_estimate; // estimated graph diameter
double community_count; // number of sub-communities
double reorder_time; // estimated reorder time (if known)
};Within each community, vertices are reordered using a configurable algorithm (default: RabbitOrder):
Community 0 (before):
Vertices: [45, 12, 89, 3, 67, 23]
Community 0 (after per-community reordering):
Vertices: [12, 3, 23, 67, 89, 45]
RabbitOrder optimizes cache locality within the community subgraph
The algorithm used for per-community ordering can be configured via the format string (see Usage section).
Communities are concatenated in order of size (largest first):
Final ordering:
[Community 2 (5000 nodes)] [Community 0 (3000 nodes)] [Community 1 (1500 nodes)]
[reordered internally ] [reordered internally ] [reordered internally ]
# Use GraphBrewOrder (algorithm 12) with auto-resolution
./bench/bin/pr -f graph.el -s -o 12 -n 5# Format: -o 12:cluster_variant:final_algo:resolution:levels
# cluster_variant: leiden (default), gve, gveopt, rabbit, hubcluster, gvefast, gveoptfast
# final_algo: per-community ordering algorithm ID (default: 8 = RabbitOrder)
# resolution: Leiden resolution parameter (default: dynamic for best PR)
# levels: recursion depth (default: 1)
# Examples:
./bench/bin/pr -f graph.el -s -o 12:gveopt -n 5 # GVE-Leiden optimized, auto-resolution
./bench/bin/pr -f graph.el -s -o 12:rabbit -n 5 # Coarse communities (resolution=0.5)
./bench/bin/pr -f graph.el -s -o 12:gveopt:6 -n 5 # Use HubSortDBG as final algorithm
./bench/bin/pr -f graph.el -s -o 12:gveopt:8:0.75 -n 5 # Custom resolution 0.75# Format: -o 12:freq:algo:resolution:maxIterations:maxPasses
# freq: frequency threshold for small communities (default: 10)
# algo: per-community ordering algorithm ID (default: 8 = RabbitOrder)
# resolution: Leiden resolution parameter (default: dynamic)
# Use HubClusterDBG (7) for per-community ordering with resolution 1.0
./bench/bin/pr -f graph.el -s -o 12:10:7:1.0
# Full example with all parameters (explicit resolution)
./bench/bin/pr -f graph.el -s -o 12:10:8:1.0:30:30# Verbose output shows community breakdown
./bench/bin/pr -f graph.el -s -o 12 -n 1 2>&1 | head -30Output:
=== GraphBrew Per-Community Reordering ===
Communities detected: 847
Total reorder time: 0.234 seconds
Community Size Distribution:
< 100 nodes: 812 communities
100-1000: 28 communities
1000-10000: 6 communities
> 10000: 1 community
Largest communities:
Comm 0: 15,234 nodes, 189,456 edges (hub_conc: 0.58)
Comm 1: 8,912 nodes, 98,234 edges (hub_conc: 0.42)
...
| Aspect | HUBCLUSTER | GraphBrewOrder |
|---|---|---|
| Hub locality | β Excellent | β Good (per-community) |
| Community locality | β Poor | β Excellent |
| Scalability | β Fast | β‘ Moderate |
| Best for | Uniform hub graphs | Modular graphs |
| Aspect | LeidenOrder | GraphBrewOrder |
|---|---|---|
| Community detection | β Leiden | β Leiden |
| Internal ordering | Original | Per-community (RabbitOrder) |
| Cache optimization | β None | β Yes |
| Best for | Basic community grouping | Better cache locality |
| Aspect | LeidenCSR | GraphBrewOrder |
|---|---|---|
| Community detection | β Leiden | β Leiden |
| Internal ordering | Configurable (gve/gveopt/gveopt2/gveadaptive/gveturbo/gvefast/gverabbit/dfs/bfs/hubsort/modularity) | Configurable (default: RabbitOrder) |
| Speed | Very fast (CSR-native) | Moderate |
| Best for | Large graphs, tunable quality | Modular graphs |
Vertex IDs scattered, poor locality:
Memory: [v892][v12][v5601][v234][v8923][v45][v167][v4501]...
ββC0βββC1βββC2ββββC0ββββC1ββββC2βββC0ββββC2ββββ
Cache lines load vertices from multiple communities
= Many cache misses during graph traversal
Communities grouped, optimized within each:
Memory: [v12][v234][v167][v892][v45][v8923][v5601][v4501]...
βββCommunity 0ββββββββββCommunity 1ββββββCommunity 2ββ
βββ optimized ββββββ βββ optimized ββ ββ optimized ββ
Sequential access within communities
= Fewer cache misses
β Modular graphs with clear community structure β Social networks (communities = friend groups) β Web graphs (communities = websites/domains) β Citation networks (communities = research areas)
β Road networks - low modularity, use RCM instead β Random graphs - no community structure, use HUBCLUSTER β Very small graphs - overhead not worth it, use ORIGINAL
Leiden detection: ~10-15% of total time
Hub sorting: ~2-5% of total time
Memory: O(n) for community mapping
For graphs > 10K vertices, the overhead is usually recovered through better cache performance.
GraphBrewOrder's implementation is organized in modular header files in bench/include/graphbrew/reorder/:
| File | Purpose |
|---|---|
reorder_graphbrew.h |
GraphBrewConfig struct, GraphBrewClusterVariant enum |
reorder_types.h |
Base types, perceptron model, feature computation |
GraphBrewClusterVariant Enum:
// bench/include/graphbrew/reorder/reorder_graphbrew.h
enum class GraphBrewClusterVariant {
LEIDEN, // Original Leiden library (igraph)
GVE, // GVE-Leiden CSR-native
GVEOPT, // Cache-optimized GVE
GVEFAST, // Single-pass GVE
GVEOPTFAST, // Cache-optimized single-pass
RABBIT, // RabbitOrder-based
HUBCLUSTER // Hub-clustering based
};GraphBrewConfig Struct:
// bench/include/graphbrew/reorder/reorder_graphbrew.h
struct GraphBrewConfig {
GraphBrewClusterVariant variant = GraphBrewClusterVariant::LEIDEN;
int frequency = 10; // Hub frequency threshold
int intra_algo = 8; // Algorithm ID for within-community reordering
double resolution = -1.0; // Leiden resolution (-1 = auto)
int maxIterations = 30; // Max Leiden iterations
int maxPasses = 30; // Max Leiden passes
// Parse from command-line options string
static GraphBrewConfig FromOptions(const std::string& options);
// Convert to internal reordering options
ReorderingOptions toInternalOptions() const;
// Print configuration for debugging
void print() const;
};
// Usage in builder.h:
GraphBrewConfig config = GraphBrewConfig::FromOptions("gve:10:8:1.0:30:30");
// β variant=GVE, frequency=10, intra_algo=8, resolution=1.0, etc.Key Functions in builder.h:
// Unified entry point for GraphBrewOrder
void GenerateGraphBrewMappingUnified(
const CSRGraph& g, pvector<NodeID_>& new_ids,
const ReorderingOptions& opts);Communities are processed in this order:
// Sort communities by size (largest first)
// Rationale: largest communities benefit most from optimization
sort(communities.begin(), communities.end(),
[](auto& a, auto& b) { return a.size > b.size; });Each community subgraph is reordered using the configurable algorithm:
// Default: RabbitOrder (algorithm 8 with csr variant)
// RabbitOrder has variants: csr (default), boost
// Can be changed via format string: -o 12:freq:algo:resolution
ReorderingAlgo algo = getReorderingAlgo(reorderingOptions[1].c_str()); // default: "8"
// Apply the selected algorithm to each community subgraph
GenerateMappingLocalEdgelist(g, edge_list, new_ids_sub, algo, ...);GraphBrewOrder includes guards for graphs with extreme community structure (e.g., Kronecker graphs):
// In ReorderCommunitySubgraphStandalone (reorder.h):
// If community has no internal edges (all edges go to other communities),
// skip reordering and assign nodes in original order
if (sub_edges.empty()) {
for (NodeID_ node : nodes) {
new_ids[node] = current_id++;
}
return;
}This prevents division-by-zero errors when:
- Leiden creates communities where nodes only connect to OTHER communities
- The induced subgraph has 0 edges
- Hub-based algorithms compute
avgDegree = num_edges / num_nodes
Graphs that benefit from this handling:
- Kronecker graphs (kron_g500-logn*)
- Synthetic power-law graphs with extreme degree distributions
- Graphs with highly disconnected community structure
// New vertex ID assignment
new_id = community_start_offset + position_within_community
// Example:
// Community 0 starts at offset 0, has 1000 vertices
// Community 1 starts at offset 1000, has 500 vertices
// Vertex 45 in Community 1 (position 23) gets ID: 1000 + 23 = 1023GraphBrewOrder supports multiple community detection backends via variant selection:
| Variant | Description | Final Algorithm |
|---|---|---|
leiden |
Default. GVE-Leiden optimized, auto-resolution | RabbitOrder |
gve |
GVE-Leiden CSR-native (non-optimized) | RabbitOrder |
gveopt |
GVE-Leiden with cache optimization | RabbitOrder |
gvefast |
GVE-Leiden (non-optimized) | HubSortDBG |
gveoptfast |
GVE-Leiden optimized | HubSortDBG |
rabbit |
GVE-Leiden with resolution=0.5 (coarser communities) | RabbitOrder |
hubcluster |
Hub-degree based clustering | RabbitOrder |
# Test specific GraphBrewOrder variants
python3 scripts/graphbrew_experiment.py --phase benchmark \
--graph wiki-topcats \
--algo-list GraphBrewOrder \
--all-variants \
--graphbrew-variants leiden gve gveopt \
--benchmarks pr --trials 3# Format: -o 12:variant:final_algo:resolution:levels
./bench/bin/pr -f graph.el -s -o 12:leiden -n 5 # Default (GVE-Leiden optimized)
./bench/bin/pr -f graph.el -s -o 12:gve -n 5 # GVE-Leiden non-optimized
./bench/bin/pr -f graph.el -s -o 12:gveopt -n 5 # GVE-Leiden optimized
./bench/bin/pr -f graph.el -s -o 12:rabbit -n 5 # Coarse communities (resolution=0.5)
./bench/bin/pr -f graph.el -s -o 12:gveopt:6 -n 5 # Custom final algo (HubSortDBG)
./bench/bin/pr -f graph.el -s -o 12:gveopt:8:0.75 -n 5 # Custom resolutionThe implementation uses dynamic thresholds based on graph statistics:
// Dynamic threshold formula:
// min_size = max(ABSOLUTE_MIN, min(avg_community_size / FACTOR, sqrt(N)))
// Capped at MAX_THRESHOLD (2000)
size_t ComputeDynamicMinCommunitySize(size_t num_nodes,
size_t num_communities,
size_t avg_community_size) {
const size_t ABSOLUTE_MIN = 50;
const size_t MAX_THRESHOLD = 2000;
const size_t FACTOR = 4;
size_t dynamic_threshold = std::max(ABSOLUTE_MIN,
std::min(avg_community_size / FACTOR,
static_cast<size_t>(std::sqrt(num_nodes))));
return std::min(dynamic_threshold, MAX_THRESHOLD);
}This ensures thresholds scale appropriately across different graph sizes.
Communities must exceed this size to apply internal reordering:
// Local reorder threshold = 2 Γ min_community_size (capped at 5000)
size_t ComputeDynamicLocalReorderThreshold(size_t num_nodes,
size_t num_communities,
size_t avg_community_size) {
size_t min_size = ComputeDynamicMinCommunitySize(...);
return std::min(min_size * 2, static_cast<size_t>(5000));
}Small communities (below the threshold) are grouped together and processed as a single "mega community" using feature-based algorithm selection:
// Small communities grouped together
// Features computed for the entire group
// Algorithm selected via perceptron (AdaptiveOrder) or heuristic (GraphBrew)Apply Degree-Based Grouping after GraphBrewOrder:
# First apply GraphBrewOrder, then DBG
./bench/bin/pr -f graph.el -s -o 12 -n 1 # GraphBrewOrder
# Manual DBG pass would need code modificationAdaptiveOrder (14) can select GraphBrewOrder for appropriate communities. Per-type perceptron weights are stored in:
scripts/weights/active/type_N.json # Per-cluster weights
scripts/weights/active/type_registry.json # Graph β type mappingGraphBrewOrder default weights (from builder.h):
// GraphBrewOrder: Leiden + per-community RabbitOrder
{GraphBrewOrder, {
.bias = 0.6,
.w_modularity = -0.25, // better on low-mod graphs
.w_log_nodes = 0.1, // scales well
.w_log_edges = 0.08,
.w_density = -0.3,
.w_avg_degree = 0.0,
.w_degree_variance = 0.2,
.w_hub_concentration = 0.1,
.w_reorder_time = -0.5 // highest reorder cost
}}./bench/bin/pr -f graph.el -s -o 12 -n 10# Test different algorithms
for algo in 0 4 7 12 15 17; do
echo "Algorithm $algo:"
./bench/bin/pr -f graph.el -s -o $algo -n 5 2>&1 | grep "Average"
doneFor social networks:
Algorithm 0 (ORIGINAL): Average: 0.0523 seconds
Algorithm 4 (HUBCLUSTER): Average: 0.0445 seconds
Algorithm 7 (HUBCLUSTERDBG): Average: 0.0412 seconds
Algorithm 12 (GraphBrewOrder): Average: 0.0385 seconds β Good
Algorithm 15 (LeidenOrder): Average: 0.0398 seconds
Algorithm 17 (LeidenCSR): Average: 0.0371 seconds β Better
If Leiden creates too many tiny communities:
# Check community distribution
./bench/bin/pr -f graph.el -s -o 12 -n 1 2>&1 | grep -A 10 "Community Size"If most communities have < 10 vertices, the graph may have weak community structure.
For graphs without clear communities, global hub ordering may be better:
# Compare
./bench/bin/pr -f graph.el -s -o 7 -n 5 # HUBCLUSTERDBG
./bench/bin/pr -f graph.el -s -o 12 -n 5 # GraphBrewOrderIf HUBCLUSTERDBG is faster, the graph has weak community structure.
GraphBrewOrder needs memory for:
- Community mapping: O(n) integers
- Temporary degree arrays: O(n) integers
For very large graphs:
# Check memory before running
free -h
# GraphBrewOrder needs ~16 bytes per vertex
# 1B vertices β 16GB additional memory- AdaptiveOrder-ML - Automatic algorithm selection per community
- Reordering-Algorithms - All available algorithms
- Running-Benchmarks - Benchmark commands