# Performance Benchmark
A Python script that generates PDF reports comparing Unity Performance Test Framework results across different test scenarios.
```
unity-explorer/
├── scripts/
│   ├── generate_perf_report.py       # Main script
│   ├── perf_report_config.json       # Configuration file
│   ├── requirements-perf-report.txt  # Python dependencies
│   └── PERF_REPORT.md                # This documentation
└── Explorer/
    └── PerformanceTestResults.json   # Test output (generated)
```
Install the Python dependencies. On Windows, with the `py` launcher:

```
py -m pip install -r scripts\requirements-perf-report.txt
```

or directly with Python:

```
python -m pip install -r scripts\requirements-perf-report.txt
```

On macOS/Linux:

```
pip3 install -r scripts/requirements-perf-report.txt
```

Or with a virtual environment (recommended):

```
python3 -m venv venv
source venv/bin/activate
pip install -r scripts/requirements-perf-report.txt
```

In a CI workflow:

```yaml
- name: Set up Python
  uses: actions/setup-python@v5
  with:
    python-version: '3.11'
- name: Install dependencies
  run: pip install -r scripts/requirements-perf-report.txt
```

Generate a full PDF report with summary page and detailed charts:
```
python generate_perf_report.py <input.json> [output.pdf]
```

Example:

```
python generate_perf_report.py PerformanceTestResults.json report.pdf
```

Generate only the summary page:

```
python generate_perf_report.py <input.json> <output.pdf> --summary-only
```

Output markdown summary for `$GITHUB_STEP_SUMMARY`:

```
python generate_perf_report.py <input.json> <output.pdf> --github-summary summary.md
```

In CI workflow:

```yaml
- name: Generate performance report
  run: |
    python scripts/generate_perf_report.py results.json report.pdf --github-summary summary.md
    cat summary.md >> $GITHUB_STEP_SUMMARY
```

Compare results from two test runs side-by-side:

```
python generate_perf_report.py --compare <report1.json> <report2.json> <output.pdf> --fixture <TestFixtureName>
```

Options:
- `--fixture <name>` - (Required) Filter to a specific test fixture class
- `--label1 <name>` - Label for first report (default: "Report 1")
- `--label2 <name>` - Label for second report (default: "Report 2")
Example:
```
python generate_perf_report.py --compare baseline.json new.json comparison.pdf \
    --fixture ProfilesPerformanceTest \
    --label1 "Before" \
    --label2 "After"
```

The script uses `perf_report_config.json` in the same directory.
```json
{
  "difference_thresholds": [...],
  "metrics": {...},
  "default_summary_metrics": [...],
  "parallelism": {...},
  "summary_cases": [...]
}
```

Define how percentage differences are categorized and colored:
"difference_thresholds": [
{
"min": null,
"max": -30,
"label": "Major Improvement",
"color": "#006400"
},
{
"min": -10,
"max": 10,
"label": "Within Margin of Error",
"color": "#808080"
},
{
"min": 30,
"max": null,
"label": "Major Regression",
"color": "#8B0000"
}
]-
min/max: Percentage bounds (usenullfor unbounded) -
label: Human-readable category name -
color: Hex color for PDF output
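As a rough illustration of how these entries drive categorization, the sketch below picks the first threshold whose range contains a given percentage difference. The `categorize_diff` helper is illustrative, not the script's actual code, and the threshold list only mirrors the excerpt above (the real config may include further entries, e.g. for slight improvements/regressions).

```python
# Minimal sketch: map a percentage difference to a threshold entry.
from typing import Optional

THRESHOLDS = [
    {"min": None, "max": -30, "label": "Major Improvement", "color": "#006400"},
    {"min": -10, "max": 10, "label": "Within Margin of Error", "color": "#808080"},
    {"min": 30, "max": None, "label": "Major Regression", "color": "#8B0000"},
]

def categorize_diff(diff_pct: float, thresholds=THRESHOLDS) -> Optional[dict]:
    """Return the first threshold whose [min, max] range contains diff_pct."""
    for entry in thresholds:
        lower_ok = entry["min"] is None or diff_pct >= entry["min"]
        upper_ok = entry["max"] is None or diff_pct <= entry["max"]
        if lower_ok and upper_ok:
            return entry
    return None  # falls into a gap not covered by this excerpt

print(categorize_diff(-66.7)["label"])  # Major Improvement
print(categorize_diff(2.2)["label"])    # Within Margin of Error
```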
Define human-readable names for sample groups:
"metrics": {
"WebRequest.Send": {
"name": "Web Request Send Time",
"description": "time to send the web request"
},
"Iteration Total Time": {
"name": "Total Iteration Time",
"description": "total time for the complete operation"
}
}Specify which metrics appear in summary by default:
"default_summary_metrics": [
"WebRequest.Send",
"WebRequest.ProcessData",
"Iteration Total Time"
]Group results by concurrency level:
"parallelism": {
"no_concurrency": {
"min": 1,
"max": 1,
"label": "No Concurrency"
},
"low_concurrency": {
"min": 2,
"max": 10,
"label": "Low Concurrency"
},
"high_concurrency": {
"min": 11,
"max": null,
"label": "High Concurrency"
}
}The script extracts concurrency from the first test case argument.
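A minimal sketch of that idea, assuming the concurrency level is the first argument of the trailing `MethodName(...)` part of the result name (the regex and helper names below are illustrative, not the script's actual implementation):

```python
# Sketch: pull the first test-case argument out of a result name like
# "Ns.Fixture(url,True).PostProfilesAsync(16)" and bucket it with the
# "parallelism" config ranges shown above.
import re

PARALLELISM = {
    "no_concurrency":   {"min": 1,  "max": 1,    "label": "No Concurrency"},
    "low_concurrency":  {"min": 2,  "max": 10,   "label": "Low Concurrency"},
    "high_concurrency": {"min": 11, "max": None, "label": "High Concurrency"},
}

def concurrency_from_name(test_name: str) -> int:
    # Assumes the method call is the last "(...)" group and that its
    # first argument is the concurrency level.
    last_args = re.findall(r"\(([^)]*)\)", test_name)[-1]
    return int(last_args.split(",")[0])

def parallelism_label(level: int) -> str:
    for bucket in PARALLELISM.values():
        if level >= bucket["min"] and (bucket["max"] is None or level <= bucket["max"]):
            return bucket["label"]
    return f"Concurrency {level}"

name = 'Tests.ProfilesPerformanceTest("https://peer-ap1.decentraland.org/",True).PostProfilesAsync(16)'
print(parallelism_label(concurrency_from_name(name)))  # High Concurrency
```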
Configure which tests appear on the summary page and how they're compared.
Compare one endpoint against others in the same test:
```json
{
  "test": "ProfilesPerformanceTest.PostProfilesAsync",
  "endpoint": "asset-bundle-registry"
}
```

Output shows percentage difference of `asset-bundle-registry` vs other endpoints.
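The exact aggregation is not documented here; as an assumption, a sketch like the following compares the target endpoint's median against the average of the other endpoints' medians within the same parallelism group (the `percent_vs_others` helper is hypothetical):

```python
# Sketch of the comparison idea; the real script may aggregate differently.
def percent_vs_others(medians_by_endpoint: dict[str, float], target: str) -> float:
    others = [v for k, v in medians_by_endpoint.items() if k != target]
    baseline = sum(others) / len(others)          # average of the other endpoints
    return (medians_by_endpoint[target] - baseline) / baseline * 100.0

medians = {"asset-bundle-registry": 51.1, "peer-ap1": 284.5, "peer-ec1": 80.2}
print(f"{percent_vs_others(medians, 'asset-bundle-registry'):+.1f}%")  # -72.0%
```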
Compare an endpoint from one test against endpoints from another test:
```json
{
  "test": "ProfilesPerformanceTest.PostMetadataAsync",
  "endpoint": "asset-bundle-registry",
  "compare_test": "ProfilesPerformanceTest.PostProfilesAsync"
}
```

Output shows how `PostMetadataAsync.asset-bundle-registry` performs vs all `PostProfilesAsync` endpoints.
List raw metric values without comparison (when endpoint is omitted):
```json
{
  "test": "RPCFriendsServiceBenchmark.GetFriendsAsync"
}
```

Output shows Median and P95 values for each scenario grouped by parallelism.
Override default metrics for specific cases:
```json
{
  "test": "ProfilesPerformanceTest.PostMetadataAsync",
  "endpoint": "asset-bundle-registry",
  "compare_test": "ProfilesPerformanceTest.PostProfilesAsync",
  "metrics": [
    "WebRequest.Send",
    "WebRequest.ProcessData",
    "Iteration Total Time",
    "Iteration Downloaded Data"
  ]
}
```

The script expects Unity Performance Test Framework JSON output:
```json
{
  "Results": [
    {
      "Name": "Namespace.TestClass(fixtureArg1,fixtureArg2).MethodName(testArg1,testArg2)",
      "SampleGroups": [
        {
          "Name": "WebRequest.Send",
          "Unit": 1,
          "Median": 12345.67,
          "Min": 1000.0,
          "Max": 50000.0,
          "StandardDeviation": 5000.0,
          "Samples": [1000, 2000, 3000, ...]
        }
      ]
    }
  ]
}
```

The `Unit` field maps to:

| Value | Unit |
|---|---|
| 0 | Undefined |
| 1 | Microsecond |
| 2 | Millisecond |
| 3 | Second |
| 4 | Byte |
| 5 | Kilobyte |
| 6 | Megabyte |
| 7 | Gigabyte |
| 8 | Nanosecond |
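For illustration, a minimal loader could normalize the time-based sample groups to milliseconds using the table above. The `load_medians_ms` helper below is a sketch, not the script's code; it simply skips byte-based groups.

```python
# Sketch: load Unity Performance Test Framework output and normalize the
# time-based sample groups to milliseconds, per the Unit table above.
import json

MS_PER_UNIT = {1: 0.001, 2: 1.0, 3: 1000.0, 8: 1e-6}  # Microsecond, Millisecond, Second, Nanosecond

def load_medians_ms(path: str) -> dict[str, dict[str, float]]:
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    medians: dict[str, dict[str, float]] = {}
    for result in data["Results"]:
        groups = {}
        for group in result["SampleGroups"]:
            factor = MS_PER_UNIT.get(group["Unit"])
            if factor is not None:  # skip byte-based groups in this sketch
                groups[group["Name"]] = group["Median"] * factor
        medians[result["Name"]] = groups
    return medians

# medians = load_medians_ms("Explorer/PerformanceTestResults.json")
```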
The script determines the baseline from the TestFixture arguments (see the parsing sketch after the label rules below):
- Extracts the second parameter from the fixture args
- If it is `True`, that scenario is the baseline
- Example: `TestClass("https://endpoint.com",True)` marks this scenario as the baseline
Labels are extracted from the TestFixture URL arguments:
- `https://peer-ap1.decentraland.org/...` -> `peer-ap1`
- `https://asset-bundle-registry.decentraland.today/...` -> `asset-bundle-registry`
- `https://gateway.decentraland.zone/...` -> `gateway.zone` (includes TLD for distinction)
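A combined sketch of the baseline and label rules, assuming the fixture argument string looks like `"https://...",True`. The parsing and the TLD heuristic below are illustrative guesses based on the examples above, not the script's exact logic.

```python
# Sketch: derive the scenario label and baseline flag from a fixture
# argument string such as '"https://asset-bundle-registry.decentraland.today/",True'.
from urllib.parse import urlparse

def parse_fixture_args(fixture_args: str) -> tuple[str, bool]:
    url, _, baseline_flag = fixture_args.partition(",")
    host = urlparse(url.strip('"')).hostname or ""
    subdomain, _, tld = host.partition(".decentraland.")
    # Heuristic (assumption): keep the TLD only when it is needed to tell
    # hosts apart, e.g. gateway.decentraland.zone -> "gateway.zone".
    label = f"{subdomain}.{tld}" if tld == "zone" else subdomain
    return label, baseline_flag.strip() == "True"

print(parse_fixture_args('"https://peer-ap1.decentraland.org/",False'))
# ('peer-ap1', False)
print(parse_fixture_args('"https://gateway.decentraland.zone/",True'))
# ('gateway.zone', True)
```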
The generated PDF report contains:
- **Summary Page**: Overview of configured test cases with percentage comparisons or raw metrics
- **Detail Pages**: Per-method charts showing Median, P95, and P99 with error bars (see the percentile sketch below)
  - The baseline scenario is marked with a "baseline" label
  - Percentage annotations are colored by threshold category
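The percentile values plotted on the detail pages can be derived from a sample group's raw `Samples`. The nearest-rank method below is an assumption for illustration; the script may compute percentiles differently (e.g. with numpy).

```python
# Sketch: derive chart statistics from a sample group's raw Samples array.
from statistics import median

def percentile(samples: list[float], pct: float) -> float:
    ordered = sorted(samples)
    # Nearest-rank style: pick the value at the pct position of the sorted list.
    rank = max(0, min(len(ordered) - 1, round(pct / 100 * (len(ordered) - 1))))
    return ordered[rank]

samples = [1000, 2000, 3000, 4000, 50000]
print(median(samples), percentile(samples, 95), percentile(samples, 99))
```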
The GitHub summary uses emoji indicators (a small mapping sketch follows the list):
- 🟢 Green: Improvement
- 🔴 Red: Regression
- ⚪ White: Within margin of error
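A tiny sketch of that mapping; the `summary_emoji` helper is hypothetical and assumes lower times count as improvements:

```python
# Sketch: choose the GitHub-summary emoji from a percentage difference and
# its threshold category label.
def summary_emoji(diff_pct: float, category_label: str) -> str:
    if category_label == "Within Margin of Error":
        return "⚪"
    return "🟢" if diff_pct < 0 else "🔴"

print(summary_emoji(-65.1, "Major Improvement"))      # 🟢
print(summary_emoji(+12.3, "Slight Regression"))      # 🔴
print(summary_emoji(+2.2, "Within Margin of Error"))  # ⚪
```

The examples below show the summary sections produced for several `summary_cases` configurations.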
Config:
```json
{
  "test": "ProfilesPerformanceTest.PostProfilesAsync",
  "endpoint": "asset-bundle-registry"
}
```

Output:
### ProfilesPerformanceTest.PostProfilesAsync
*Endpoint: `asset-bundle-registry` (performance vs other endpoints)*
**No Concurrency:**
- 🟢 **Web Request Send Time**: -65.1% (Major Improvement)
- 🟢 **Data Processing Time**: -15.7% (Slight Improvement)
- 🟢 **Total Iteration Time**: -66.7% (Major Improvement)
**Low Concurrency:**
- 🟢 **Web Request Send Time**: -52.6% (Major Improvement)
- 🔴 **Data Processing Time**: +144.7% (Major Regression)
- 🟢 **Total Iteration Time**: -53.3% (Major Improvement)
**High Concurrency:**
- 🟢 **Web Request Send Time**: -66.9% (Major Improvement)
- 🔴 **Data Processing Time**: +307.3% (Major Regression)
- 🟢 **Total Iteration Time**: -56.0% (Major Improvement)

Config:
```json
{
  "test": "ProfilesPerformanceTest.PostMetadataAsync",
  "endpoint": "asset-bundle-registry",
  "compare_test": "ProfilesPerformanceTest.PostProfilesAsync",
  "metrics": [
    "WebRequest.Send",
    "WebRequest.ProcessData",
    "Iteration Total Time",
    "Iteration Downloaded Data"
  ]
}
```

Output:
### ProfilesPerformanceTest.PostMetadataAsync
*Endpoint: `asset-bundle-registry` vs `ProfilesPerformanceTest.PostProfilesAsync` endpoints*
**No Concurrency:**
- 🟢 **Web Request Send Time**: -53.3% (Major Improvement)
- 🟢 **Data Processing Time**: -91.2% (Major Improvement)
- 🟢 **Total Iteration Time**: -55.2% (Major Improvement)
- 🟢 **Downloaded Data**: -92.2% (Major Improvement)
**Low Concurrency:**
- 🟢 **Web Request Send Time**: -63.4% (Major Improvement)
- 🟢 **Data Processing Time**: -93.3% (Major Improvement)
- 🟢 **Total Iteration Time**: -66.2% (Major Improvement)
- 🟢 **Downloaded Data**: -92.2% (Major Improvement)

Config:
```json
{
  "test": "ProfilesPerformanceTest.PostProfilesAsync"
}
```

Output:
### ProfilesPerformanceTest.PostProfilesAsync
*Metrics by scenario*
**No Concurrency:**
**asset-bundle-registry:**
- Web Request Send Time: 51.1ms (P95: 71.3ms)
- Data Processing Time: 18.1ms (P95: 32.5ms)
- Total Iteration Time: 5554.4ms (P95: 5771.7ms)
**peer-ap1:**
- Web Request Send Time: 284.5ms (P95: 770.9ms)
- Data Processing Time: 21.5ms (P95: 34.2ms)
- Total Iteration Time: 33759.2ms (P95: 34983.2ms)
**peer-ec1:**
- Web Request Send Time: 80.2ms (P95: 162.7ms)
- Data Processing Time: 21.6ms (P95: 36.2ms)
- Total Iteration Time: 9017.6ms (P95: 10293.9ms)
**Low Concurrency:**
**asset-bundle-registry:**
- Web Request Send Time: 86.0ms (P95: 132.9ms)
- Data Processing Time: 57.7ms (P95: 124.6ms)
- Total Iteration Time: 1014.2ms (P95: 1362.6ms)
**peer-ap1:**
- Web Request Send Time: 299.6ms (P95: 767.2ms)
- Data Processing Time: 21.8ms (P95: 39.1ms)
- Total Iteration Time: 3791.6ms (P95: 4116.4ms)

Config:
```json
{
  "test": "AssetBundleRegistryPerformanceTests.GetEntitiesActive",
  "endpoint": "gateway.zone"
}
```

Output:
### AssetBundleRegistryPerformanceTests.GetEntitiesActive
*Endpoint: `gateway.zone` (performance vs other endpoints)*
**Low Concurrency:**
- ⚪ **Web Request Send Time**: +10.0% (Within Margin of Error)
- ⚪ **Data Processing Time**: +2.2% (Within Margin of Error)
- 🔴 **Total Iteration Time**: +12.3% (Slight Regression)
**High Concurrency:**
- ⚪ **Web Request Send Time**: +9.5% (Within Margin of Error)
- ⚪ **Data Processing Time**: -1.6% (Within Margin of Error)
- ⚪ **Total Iteration Time**: +9.7% (Within Margin of Error)

Command:
```
python generate_perf_report.py --compare baseline.json new.json comparison.pdf \
    --fixture ProfilesPerformanceTest --label1 "Before" --label2 "After"
```

PDF Output (per chart; a plotting sketch follows the list):
- Grouped bar chart with two bars per scenario
- Blue bars: "Before" (Report 1)
- Orange bars: "After" (Report 2)
- Baseline scenario marked with "base" label on Report 1 bar
- Report 2 bars show percentage difference vs Report 1 baseline
- Charts for Median, P95, P99 side by side
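A matplotlib sketch of that chart layout, with made-up example values. It assumes matplotlib and numpy are available via `requirements-perf-report.txt` and is not the script's actual plotting code.

```python
# Sketch: two bars per scenario, Report 2 annotated with % diff vs Report 1.
import matplotlib.pyplot as plt
import numpy as np

scenarios = ["asset-bundle-registry", "peer-ap1", "peer-ec1"]  # example labels
before = np.array([51.1, 284.5, 80.2])  # Report 1 medians (ms), example values
after = np.array([48.0, 301.2, 75.4])   # Report 2 medians (ms), example values

x = np.arange(len(scenarios))
width = 0.38

fig, ax = plt.subplots(figsize=(7, 4))
ax.bar(x - width / 2, before, width, label="Before", color="tab:blue")
ax.bar(x + width / 2, after, width, label="After", color="tab:orange")

# Annotate the Report 2 bars with the percentage difference vs Report 1.
for xi, b, a in zip(x, before, after):
    diff = (a - b) / b * 100.0
    ax.annotate(f"{diff:+.1f}%", (xi + width / 2, a), ha="center", va="bottom", fontsize=8)

ax.set_xticks(x)
ax.set_xticklabels(scenarios, rotation=15)
ax.set_ylabel("Median (ms)")
ax.set_title("WebRequest.Send - Median")
ax.legend()
fig.tight_layout()
fig.savefig("comparison_sketch.pdf")
```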
In the CI workflow, the report is generated and uploaded as an artifact:

```yaml
- name: Generate performance report
  run: |
    python scripts/generate_perf_report.py \
      Explorer/PerformanceTestResults.json \
      Explorer/PerformanceBenchmarkReport.pdf \
      --github-summary Explorer/summary.md
    cat Explorer/summary.md >> $GITHUB_STEP_SUMMARY
- name: Upload report
  uses: actions/upload-artifact@v4
  with:
    name: Performance Report
    path: Explorer/PerformanceBenchmarkReport.pdf
```

To compare results from two branches:

```
# Run baseline tests
python generate_perf_report.py --compare \
    results_main.json \
    results_feature.json \
    comparison.pdf \
    --fixture ProfilesPerformanceTest \
    --label1 "main" \
    --label2 "feature-branch"
```