Advanced GPU profiling tools for Meta Quest 2 - o3de/o3de-extras GitHub Wiki
This document lists tools for advanced GPU profiling on the Quest 2. For each tool you'll find steps for installing and using it.
- RenderDoc Meta Fork: Great for in-depth capture and analysis of GPU data. You can see a trace of your frame, view all draw calls sent to the GPU, and get render stages info with times from the tile render GPU.
- GPU Systrace: Great for getting render stages info with times from the tile render GPU. This is the only tool that can visualize GPU and CPU work concurrently.
- ovrgpuprofiler: Great for getting render stages info with times from the tile render GPU. It also reports generic GPU stats. Since it's a simple command line tool, it could be used in Automatic Review (AR) to discover sudden changes in performance caused by PRs.
A very good video about GPU profiling in Quest: State of the Art GPU Profiling on Quest.
Steps to build and run OpenXR gems in O3DE for Quest 2 can be found here: Build and Run OpenXR in O3DE
NOTE
To capture more accurate profiling data, run your app on Quest 2 using a profile build with rhi-device-validation disabled (it is disabled by default in profile builds).
RenderDoc is a graphics debugger that allows quick and easy single-frame capture and detailed introspection of any application. It also allows you to get a rough idea of how long draws and logical groups of draws (denoted by debug markers) are taking to execute on the GPU. The numbers themselves are not 100% accurate due to the nature of tile-based rendering, but they are accurate enough relative to each other.
Meta maintains its own fork of RenderDoc. On top of the familiar UI, RenderDoc Meta Fork offers two profiling features unavailable in standard RenderDoc: Render Stage Tracing and Draw Call Tracing. With Render Stage Tracing, you get a detailed graphical view of how the GPU works in the form of the Tile Timeline. With Draw Call Tracing, you get 45 different low level metrics specific to the draw calls your app makes. The new data comes in the form of a Tile Timeline view and can be opened through the Window->Tile Timeline menu. RenderDoc’s UI makes render stage capture more convenient than GPU Systrace, but without the ability to see CPU-GPU concurrency.
Installation:
- The easiest way to install RenderDoc Meta Fork is through Oculus Developer Hub (ODH). Download and install ODH on your PC: Meta Quest Developer Hub for Windows
- Open ODH
- Go to the Downloads tab
- Under the Tools tab, find the "RenderDoc for Oculus" entry and click Download.
How to do a capture:
- Connect Quest 2 to RenderDoc Meta Fork:
- Connect Quest 2 to PC
- Open RenderDoc Meta Fork on PC
- In the bottom-left corner click on "Replay Context: Local", select "Oculus Quest 2" and wait until it says "Remote server Ready". RenderDoc Meta Fork will install the necessary remote files. The first time you connect, you have to accept the permission request inside the headset.
- The replay context text will change to "Replay Context: Oculus Quest 2"
- Launch your app on Quest 2 from RenderDoc Meta Fork:
- In the Launch Application tab, click "..." next to the Executable Path field. It will list the apps on your Quest 2. Select your app and click OK.
- Click Launch button to run the app on the Quest 2.
- A new tab named after the device and app name will appear.
- Take a capture from your app:
- On the new tab click on "Capture Frame(s) Immediately".
- Captures will appear in the "Captures collected" section.
- Since development apps are prone to crashes, it is recommended that you right-click and save captures as soon as possible. Saving transfers the `.rdc` file from the Quest 2 to the PC. If the app crashes when capturing, it typically means the device is out of system memory.
- Profile capture:
Note: A Quest 2 is necessary to profile the capture, but it doesn't have to have your app installed, so the capture `.rdc` file can be shared with others.
- To load a capture, first change the Replay Context to "Oculus Quest 2 Profiling Mode" in the bottom-left corner. Wait until it says "Remote server Ready".
This puts the headset in detailed GPU profiling mode. This mode introduces a 5-10% GPU performance hit in exchange for getting accurate profiling timings.
- Sometimes it fails to switch to profiling mode. If so, change to "Oculus Quest 2" and back to "Oculus Quest 2 Profiling Mode" again.
- Note: to take more captures, remember to put the Replay Context back to "Oculus Quest 2".
- Open the capture by double clicking on it in the tab, or selecting File > Open Capture from the menu bar.
- Note: to take more captures, you first have to close the currently open one by clicking File > Close Capture.
- With RenderDoc Meta Fork you can now capture additional GPU info:
- Capture RenderStage data by clicking the meta button in the Event Browser.
The render stage data will appear in both the Tile Timeline window and the Tile Browser tab.
- Capture render pass timings by clicking the clock button in the Event Browser. RenderDoc will now show you the per-renderpass time.
Note: This capture failed during my tests so I wasn't able to capture render pass timings.
- Capture metrics per draw call by opening Window > Performance Counter Viewer and clicking the "Capture Counters" button.
In the Counters list open Generic > Meta Quest, select the counters you want from the list and click the "Sample counters" button.
After replaying the scene on the Quest 2, a table of metrics will appear. Each row represents a draw call, and you can double click on a row to select the corresponding draw call in the Event Browser.
- When finished, check that GPU detailed profiling mode has not been left enabled on the Quest 2 by running `adb shell ovrgpuprofiler -i`. If it reports that it is enabled, you can disable it by running `adb shell ovrgpuprofiler -d`.
Useful Links:
- RenderDoc for Oculus
- Use RenderDoc Meta Fork for GPU Profiling
- How to Optimize your Oculus Quest App w/ RenderDoc: Getting Started + Frame Capture
IMPORTANT
Oculus GPU Systrace modifies Android Systrace, which was deprecated and removed from Android SDK Platform-Tools in version 33.0.1 (March 2022). To use Oculus GPU Systrace you need Android SDK Platform-Tools 33.0.0 (February 2022) or earlier.
WARNING
Systrace does not support Python 3; it requires Python 2.7.
GPU Systrace extends the Android SDK tool Systrace in order to provide low-level GPU pipeline data for apps running on Quest 2. GPU Systrace integrates render stage information into Systrace for a better visualization experience and lets you visualize the GPU and CPU workloads in the same view, allowing you to see how your application’s CPU and GPU workloads work together.
With GPU Systrace we can obtain render stage tracing metrics from the GPU, which effectively tell us, for example, “the GPU rendered a 1216x1344 surface with 96 tiles that are all of size 192x168, and that took 5.2ms.”
Installation:
- Systrace requires Python 2.7 with the following modules installed:
- win32con: You can install it with the command `pip2 install pypiwin32`
- six: You can install it with the command `pip2 install six`
- Check that your "<ANDROID_SDK_DIR>/platform-tools" folder contains the folder "systrace". If it doesn't, download Android SDK platform-tools version 33.0.0 and replace your platform-tools folder with it.
- Download GPU systrace.
- Unzip it and copy the files into your "<ANDROID_SDK_DIR>/platform-tools/systrace/catapult/systrace/systrace" folder. Replace any files that already exist.
- For apps to communicate with GPU Systrace they need the "android.permission.INTERNET" permission in their manifest. O3DE apps already have that permission by default.
How to do render stage tracing:
- Connect Quest 2 to PC
- Enable GPU detailed profiling by running the command `adb shell ovrgpuprofiler -e`.
- Open your app
- In a command line, go to the "<ANDROID_SDK_DIR>/platform-tools/systrace" folder and run `python2 systrace.py --app="<app name>" app renderstage`.
For `<app name>` use your app package name (for example "org.o3de.OpenXRTest").
- It will start recording data; press Enter to stop. The trace will be saved in "<ANDROID_SDK_DIR>/platform-tools/systrace/trace.html".
- Open the trace by dragging and dropping the trace.html file into Google Chrome.
- To disable GPU detailed profiling, run the command `adb shell ovrgpuprofiler -d`.
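The enable, trace, disable sequence above can be wrapped so that detailed profiling is switched off again even if the tracing step fails. The following is a minimal Python 3 sketch, not part of the O3DE or Meta tooling. The helper names `run_adb` and `detailed_gpu_profiling` are hypothetical, `adb` is assumed to be on the PATH, and the only `ovrgpuprofiler` flags used are the `-e` and `-d` flags described in this document.

```python
import subprocess
from contextlib import contextmanager


def run_adb(args):
    """Run an adb command and raise if it exits with a non-zero status."""
    subprocess.run(["adb", *args], check=True)


@contextmanager
def detailed_gpu_profiling(run=run_adb):
    """Enable GPU detailed profiling for the duration of a `with` block.

    `run` is injectable so the sequence can be exercised without a headset.
    """
    run(["shell", "ovrgpuprofiler", "-e"])
    try:
        yield
    finally:
        # Always disable again so profiling mode is not left enabled.
        run(["shell", "ovrgpuprofiler", "-d"])
```

Inside `with detailed_gpu_profiling():` you would then start the trace, for example by launching `python2 systrace.py ...` through `subprocess.run` from the systrace folder. The wrapper itself runs under Python 3; only systrace requires Python 2.7.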
Ovrgpuprofiler is a low level CLI tool on Oculus Quest that provides access to detailed GPU information. It’s built as a super lightweight CLI client that effectively acts as a wrapper on top of the PIL Qualcomm library. It allows you to retrieve two types of information, render stage tracing metrics (like GPU systrace, although simply in text form) and real-time metrics.
How to do render stage tracing:
- Connect Quest 2 to PC
- Enable GPU detailed profiling by running the command `adb shell ovrgpuprofiler -e`.
- Open your app
- Run `adb shell ovrgpuprofiler --trace` to do the trace.
- The default duration is 0.1 seconds, but the `--trace` option accepts a number of seconds as an argument. For example, `--trace=2` records a 2-second trace.
- NOTE: If your app performance is very low you might have to increase the trace time to capture data.
- To disable GPU detailed profiling, run the command `adb shell ovrgpuprofiler -d`.
Surface 1 | 1216x1344 | color 32bit, depth 24bit, stencil 0 bit, MSAA 4 | 60 128x224 bins | 5.12 ms | 123 stages : Binning : 0.643ms Render : 2.17ms StoreColor : 0.474ms Blit : 0.002ms Preempt : 1.411ms
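Because the trace output is plain text, it can be turned into structured data for scripted checks such as the Automatic Review use mentioned earlier. The sketch below is only an illustration: the function name `parse_surface_line` is hypothetical, and the field layout is inferred from the single sample line above, so other firmware or tool versions may print something different. The bin-count check assumes bins tile the surface in a grid, which happens to match the sample (10 x 6 = 60).

```python
import re

SAMPLE_TRACE_LINE = (
    "Surface 1 | 1216x1344 | color 32bit, depth 24bit, stencil 0 bit, MSAA 4 | "
    "60 128x224 bins | 5.12 ms | 123 stages : Binning : 0.643ms Render : 2.17ms "
    "StoreColor : 0.474ms Blit : 0.002ms Preempt : 1.411ms"
)


def parse_surface_line(line):
    """Split one 'Surface ...' line into a dict (layout assumed from the sample)."""
    fields = [field.strip() for field in line.split("|")]
    width, height = (int(v) for v in fields[1].split("x"))
    bins = re.match(r"(\d+)\s+(\d+)x(\d+)\s+bins", fields[3])
    bin_count, bin_width, bin_height = (int(g) for g in bins.groups())
    # Collect every "Name : <number>ms" pair from the last field.
    stages = {
        name: float(ms)
        for name, ms in re.findall(r"(\w+)\s*:\s*([\d.]+)ms", fields[5])
    }
    return {
        "surface": fields[0],
        "size": (width, height),
        "bin_count": bin_count,
        "bin_size": (bin_width, bin_height),
        "total_ms": float(fields[4].split()[0]),
        "stages_ms": stages,
    }


def expected_bin_count(size, bin_size):
    """Bins needed to cover the surface, rounding up in each dimension."""
    (width, height), (bin_width, bin_height) = size, bin_size
    return -(-width // bin_width) * -(-height // bin_height)


info = parse_surface_line(SAMPLE_TRACE_LINE)
slowest_stage = max(info["stages_ms"], key=info["stages_ms"].get)
```

For the sample line, `slowest_stage` is `"Render"`, and a script could compare `info["total_ms"]` against a per-surface budget to flag regressions.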
How to show GPU real-time metrics:
- Connect Quest 2 to PC
- Run the command `adb shell ovrgpuprofiler -m` to print the list of all real-time metrics that the tool supports. Pick the numbers of the metrics you want to enable.
- Open your app
- Enter `adb shell` from a command line and then run the command `ovrgpuprofiler --realtime="STATS_NUMBER_LIST"`.
For example, to show the stats "4) % Texture Fetch Stall" and "6) % Texture L1 Miss", run the command `ovrgpuprofiler --realtime="4,6"`.
NOTE: Running `adb shell ovrgpuprofiler --realtime="4,6"` directly didn't work; ovrgpuprofiler has to be run within `adb shell`.
% Texture Fetch Stall : 2.449
% Texture L1 Miss : 20.338
% Texture Fetch Stall : 2.369
% Texture L1 Miss : 20.130
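The real-time output repeats the same metrics once per sample, so it is usually easier to read after averaging. The following sketch assumes each line has the form `<metric name> : <value>`, as in the sample above. The function name `average_metrics` is hypothetical, and the text is assumed to have been copied or redirected from the `adb shell` session.

```python
from collections import defaultdict

SAMPLE_REALTIME_OUTPUT = """\
% Texture Fetch Stall : 2.449
% Texture L1 Miss : 20.338
% Texture Fetch Stall : 2.369
% Texture L1 Miss : 20.130
"""


def average_metrics(text):
    """Return the mean value of each metric found in the captured output."""
    samples = defaultdict(list)
    for line in text.splitlines():
        # Split on the last colon so metric names may contain any other text.
        name, separator, value = line.rpartition(":")
        if not separator:
            continue
        try:
            number = float(value)
        except ValueError:
            continue  # Skip lines that do not end in a number.
        samples[name.strip()].append(number)
    return {name: sum(values) / len(values) for name, values in samples.items()}


averages = average_metrics(SAMPLE_REALTIME_OUTPUT)
```

With the sample above, `averages` holds one entry per metric, keyed by the metric name exactly as printed by the tool.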
Useful Links: