ArmPkg Profiling - jljusten/tianocore GitHub Wiki
This page uses the ARM Versatile Express TC2 (big.LITTLE test chip) as an example.
1. Build UEFI as a Release build and copy the binary to the board. See the instructions in this wiki page.
2. Identify the start and end points of your trace. On TC2, for example, we want to measure from the start of UEFI at 0xB0000000, as defined in ArmVExpress-CTA15-A7.fdf:
[FD.ARM_VEXPRESS_CTA15A7_EFI]
BaseAddress = 0xB0000000|gArmTokenSpaceGuid.PcdFdBaseAddress # The base address of the Firmware in remapped DRAM.
Size = 0x000B0000|gArmTokenSpaceGuid.PcdFdSize # The size in bytes of the FLASH Device
... until the start of Linux.
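The firmware base address and size from the FDF file are needed again when loading symbols, so it can help to extract them once. The following is a minimal sketch, assuming the FDF text has been read into a string; the helper name `fd_region` is hypothetical and not part of any EDK2 tool.

```python
import re

# The [FD] section shown above, as it might appear in the .fdf file.
FDF_SNIPPET = """\
[FD.ARM_VEXPRESS_CTA15A7_EFI]
BaseAddress = 0xB0000000|gArmTokenSpaceGuid.PcdFdBaseAddress
Size = 0x000B0000|gArmTokenSpaceGuid.PcdFdSize
"""

def fd_region(fdf_text):
    """Return (base, size) parsed from the BaseAddress and Size lines."""
    values = {}
    for line in fdf_text.splitlines():
        match = re.match(r"\s*(BaseAddress|Size)\s*=\s*(0[xX][0-9a-fA-F]+)", line)
        if match:
            values[match.group(1)] = int(match.group(2), 16)
    return values["BaseAddress"], values["Size"]

base, size = fd_region(FDF_SNIPPET)
end = base + size  # first address past the firmware device

print("firmware region: 0x%08X-0x%08X" % (base, end - 1))
print("(0x%08X,0x%08X)" % (base, size))  # (base,size) pair
```

The start breakpoint of the trace corresponds to `base`; the stop breakpoint (the start of Linux) cannot be derived this way and has to be found with the debugger.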
To identify where Linux is started, we need to load the symbols of the BDS.
So start UEFI once on the target and let it run up to the Boot menu.
Load all the symbols with DS-5 (symbols for Pre-EFI and UEFI phases):
source "/home/olivier/tianocore/ArmPlatformPkg/Scripts/Ds5/cmd_load_symbols.py" -f (0xB0000000,0x000B0000) -m (0x80000000,0x40000000) -v -a
Save the output to a file and rewrite each line as a DS-5 command. Example:
From:
Add symbols of /home/olivier/tianocore/Build/ArmVExpress-CTA15-A7/RELEASE_GCC48/ARM/ArmPlatformPkg/PrePi/PeiMPCore/DEBUG/ArmPlatformPrePiMPCore.dll at 0xb0000180
Add symbols of /home/olivier/tianocore/Build/ArmVExpress-CTA15-A7/RELEASE_GCC48/ARM/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll at 0xbfd62240
Add symbols of /home/olivier/tianocore/Build/ArmVExpress-CTA15-A7/RELEASE_GCC48/ARM/ArmPkg/Drivers/CpuDxe/CpuDxe/DEBUG/ArmCpuDxe.dll at 0xbfd14240
(...)
To:
add-symbol-file /home/olivier/tianocore/Build/ArmVExpress-CTA15-A7/RELEASE_GCC48/ARM/ArmPlatformPkg/PrePi/PeiMPCore/DEBUG/ArmPlatformPrePiMPCore.dll 0xb0000180
add-symbol-file /home/olivier/tianocore/Build/ArmVExpress-CTA15-A7/RELEASE_GCC48/ARM/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll 0xbfd62240
add-symbol-file /home/olivier/tianocore/Build/ArmVExpress-CTA15-A7/RELEASE_GCC48/ARM/ArmPkg/Drivers/CpuDxe/CpuDxe/DEBUG/ArmCpuDxe.dll 0xbfd14240
(...)
Save the file as symbols.ds.
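The From/To rewrite above is mechanical, so it can be scripted instead of edited by hand. The following is a minimal sketch, assuming the saved output contains lines of exactly the form "Add symbols of <path> at <address>"; the function name `to_ds5_commands` is hypothetical, and any line that does not match that form is skipped.

```python
import re

# Matches one line of the symbol-loading script output shown above.
LINE_RE = re.compile(r"^Add symbols of (?P<path>.+) at (?P<addr>0x[0-9a-fA-F]+)$")

def to_ds5_commands(script_output):
    """Convert 'Add symbols of X at Y' lines into 'add-symbol-file X Y' lines."""
    commands = []
    for line in script_output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            commands.append("add-symbol-file %s %s" % (match.group("path"), match.group("addr")))
    return commands

BUILD = "/home/olivier/tianocore/Build/ArmVExpress-CTA15-A7/RELEASE_GCC48/ARM"
SAMPLE = "\n".join([
    "Add symbols of " + BUILD + "/ArmPlatformPkg/PrePi/PeiMPCore/DEBUG/ArmPlatformPrePiMPCore.dll at 0xb0000180",
    "Add symbols of " + BUILD + "/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll at 0xbfd62240",
    "Add symbols of " + BUILD + "/ArmPkg/Drivers/CpuDxe/CpuDxe/DEBUG/ArmCpuDxe.dll at 0xbfd14240",
    "(...)",  # non-matching lines are ignored
])

commands = to_ds5_commands(SAMPLE)
print("\n".join(commands))
```

Redirecting the printed lines into symbols.ds gives the file that is later loaded with 'source symbols.ds'.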
Go to the assembly view of 'StartLinux'. Find where Linux is started and set a hardware breakpoint at that location.
Image:uefi-profile1.png
Set a hardware breakpoint at the start of UEFI:
hbreak -p *S:0xB0000000
4. After disconnecting the debugger, restart the ARM Versatile Express.
5. Setting up DS-5 for DSTREAM trace with UEFI
This tutorial assumes you have already set up DS-5 for hardware debugging UEFI. If you haven't, help can be found here.
Ensure that your DSTREAM's debug probe is connected via a Mictor-38 to the target's trace port. This is in addition to the JTAG connection.
6. Open the Debug Configurations menu and from there open the DTSL Options window:
Image:uefi-profile2.png
In the Trace Capture tab, select "DSTREAM 4GB Trace Buffer". The other settings are optional.
Image:uefi-profile3.png
In the Core Trace tab, select "Enable core trace" for each core you want to trace, and optionally "Cycle accurate trace".
Image:uefi-profile4.png
On TC2, you should now be in the ARM Boot Monitor Menu.
Image:uefi-profile5.png
7. Connect the debugger.
8. Enable your two hardware breakpoints (the start and stop of your trace).
9. Load all the symbols (i.e. 'source symbols.ds'). Symbols must be loaded before starting the acquisition.
10. Resume the execution. Start UEFI by typing:
flash run uefi
Program execution should stop at address 0xB0000000.
11. Clear the trace and resume the execution.
UEFI will now boot. As soon as you enter the Boot Menu, select the boot entry (alternatively, start Linux automatically by setting PcdTimeout... to 0). Program execution should stop again when it reaches the breakpoint you set at the stop address.
Image:uefi-profile6.png
After some post-processing, here is the list of the 20 functions that consume the most cycles in the UEFI firmware of the ARM Versatile Express TC2.
Module Name / Function Name | Cycle | Percentage | Count |
---|---|---|---|
ArmPlatformPrePiMPCore.dll/LzmaDec_DecodeReal | 1538397019 | 33% | 9 |
ArmCpuDxe.dll/ArmCleanInvalidateDataCacheEntryBySetWay | 1210112914 | 26% | 18487296 |
ArmPlatformBds.dll/InternalMemCopyMem | 557552792 | 12% | 99 |
DxeCore.dll/InternalMemCopyMem | 372182434 | 8% | 26152 |
ArmPlatformPrePiMPCore.dll/InternalMemCopyMem | 158259634 | 3% | 65 |
ArmCpuDxe.dll/ArmV7AllDataCachesOperation | 156004379 | 3% | 18489420 |
VariableRuntimeDxe.dll/__aeabi_uread4 | 67701311 | 1% | 68442 |
SerialDxe.dll/MmioRead32 | 49657481 | 1% | 167543 |
DxeCore.dll/InternalMemCompareMem | 48489155 | 1% | 70456 |
HdLcdGraphicsDxe.dll/MmioRead32 | 28337835 | 0% | 95824 |
DxeCore.dll/FwVolBlockReadBlock | 27751950 | 0% | 45740 |
HiiDatabase.dll/InternalMemCopyMem | 21179285 | 0% | 8105 |
ArmVeNorFlashDxe.dll/InternalMemCopyMem | 14168462 | 0% | 281 |
DxeCore.dll/FvCheck | 12961574 | 0% | 23193 |
DxeCore.dll/ReadUnaligned16 | 12397148 | 0% | 27832 |
ArmCpuDxe.dll/UpdatePageEntries | 11785691 | 0% | 20544 |
DxeCore.dll/ProduceFVBProtocolOnBuffer | 11522208 | 0% | 14 |
HiiDatabase.dll/__aeabi_memcpy | 9341275 | 0% | 20710 |
DxeCore.dll/CoreSetInterruptState | 8332436 | 0% | 32373 |
ArmPlatformPrePiMPCore.dll/MmioRead32 | 8261530 | 0% | 27246 |

The same measurements aggregated by function name across all modules:

Function Name | Cycle | Percentage | Count |
---|---|---|---|
LzmaDec_DecodeReal | 1538397019 | 33% | 9 |
ArmCleanInvalidateDataCacheEntryBySetWay | 1215995571 | 26% | 18504704 |
InternalMemCopyMem | 1133754058 | 24% | 36242 |
ArmV7AllDataCachesOperation | 160028408 | 3% | 18515535 |
MmioRead32 | 86717719 | 1% | 292770 |
__aeabi_uread4 | 67707303 | 1% | 68453 |
InternalMemCompareMem | 52567407 | 1% | 76317 |
FwVolBlockReadBlock | 27751950 | 0% | 45740 |
ReadUnaligned16 | 13271790 | 0% | 29780 |
FvCheck | 12961574 | 0% | 23193 |
UpdatePageEntries | 11785691 | 0% | 20544 |
ProduceFVBProtocolOnBuffer | 11522208 | 0% | 14 |
__aeabi_memcpy | 9399629 | 0% | 20814 |
CoreSetInterruptState | 8332436 | 0% | 32373 |
VariableWriteServiceInitialize | 7058548 | 0% | 64019 |
CompareGuid | 6685126 | 0% | 141296 |
IsErasedFlashBuffer | 6550933 | 0% | 1 |
InternalMemSetMem32 | 6450910 | 0% | 1800 |
CoreRestoreTpl | 6072718 | 0% | 15192 |
NarrowGlyphToBlt | 5892173 | 0% | 20809 |
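The step from the per-module table to the per-function table is a group-by on the function name. The following is a minimal sketch of that aggregation, not the post-processing actually used for this page; the function name `aggregate_by_function` is hypothetical. It uses only six rows copied from the first table, so its totals are lower than those in the second table, which also include modules outside the top 20.

```python
from collections import defaultdict

# (module, function, cycles, count) rows copied from the per-module table.
ROWS = [
    ("ArmPlatformBds.dll", "InternalMemCopyMem", 557552792, 99),
    ("DxeCore.dll", "InternalMemCopyMem", 372182434, 26152),
    ("ArmPlatformPrePiMPCore.dll", "InternalMemCopyMem", 158259634, 65),
    ("SerialDxe.dll", "MmioRead32", 49657481, 167543),
    ("HdLcdGraphicsDxe.dll", "MmioRead32", 28337835, 95824),
    ("ArmPlatformPrePiMPCore.dll", "MmioRead32", 8261530, 27246),
]

def aggregate_by_function(rows):
    """Sum cycles and call counts per function, sorted by cycles descending."""
    totals = defaultdict(lambda: [0, 0])
    for _module, function, cycles, count in rows:
        totals[function][0] += cycles
        totals[function][1] += count
    merged = [(name, cycles, count) for name, (cycles, count) in totals.items()]
    return sorted(merged, key=lambda row: row[1], reverse=True)

result = aggregate_by_function(ROWS)
for name, cycles, count in result:
    print("%s | %d | %d |" % (name, cycles, count))
```

Percentages are left out on purpose: they require the total cycle count of the whole trace, which is not listed on this page.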