Collecting memory snapshots during high CPU and analyzing them - mitikov/KeepSitecoreSimple GitHub Wiki
Agenda
Our application randomly consumes 100500 % CPU:

We have a right to know why!
Game plan
A) We'll use ProcDump and Processor performance counters together to get snapshots just-in-time.
-
ProcDump can be safely used in the Production environment since it does not need the installer.
-
It must be executed with admin permissions, though.
B) Expect to get 2-3 sequential snapshots within the short interval.
C) Gonna check how much CPU time did each thread spend, and find one what threads have high increment between snapshots.
D) Will see if threads are system ( like GC ), or user threads ( like ThreadPool worker threads ).
E) Will compare call stacks of the most CPU time-consuming threads in both snapshots.
Memory dump collection part
Configuring ProcDump
The ProcDump is a powerful and lightweight tool by Sysinternals. We are going to launch it with following args:
procdump64.exe -c 80 -s 2 -ma w3wp.exe -w -n 3

-c 80 specifies the CPU load percent (80% in the case) to trigger the rule
-s 2 number of seconds (2 in the case) CPU load stays above limit
-ma produce full memory snapshot/userdump
w3wp.exe name or ID of the process to inspect
-w can wait if application has not started yet
-n 3 create a number (3 in the case) snapshots before exiting.
Notes
-
ProcDump must be running with
admincredentials. -
At least 2 snapshots are needed to find out the cause of the high CPU.
-
The system should have enough free space ( at least 3 times more than process memory usage ).
-
The exact process would be monitored. Do not expect snapshots once another process started.
-
Since process must be fully suspended to create a memory snapshot, it will not reply to IIS Ping command and could be terminated before full snapshot produced.
Investigation part
Please refer to Opening Memory Snapshots generated on other machines locally article in order to load snapshots into WinDbg.
Open a few instances of WinDbg with different memory snapshots.
Getting time-related information
You can get snapshot generation time via .time command:
We have 2 snapshots produced in 10 seconds for the demo.
Confirming high CPU
Next, we gonna check the CPU usage when the snapshot was created via !ThreadPool command:
The CPU usage in both snapshots is more than 80 percent per ProcDump setting.
Locating suspects
Wanna see what threads took CPU time? Easy with !runaway command:
There are 4 threads that took almost 5 sec of CPU time in 10 seconds.
Simple math shows 4 threads *5 seconds = 20 seconds [ not 10 :) ]. I have the multi-core processor, so it is okay.
NOTES
The value is aggregated time taken by the thread from the moment of creation. The longer application works, the higher numbers you'll get.
It is vital to have at least 2 snapshots to measure aggregated time growth in the short interval.
The longer delay between snapshots, the higher chances not to meet the same operation in both snapshots.
As a result, snapshots may not bring closer to the reason.
Do we suspect system threads?
Next we gonna find system threads ( f.e. GC, Finalizer... ) via !threads -special command:
None of suspected time-consuming threads is the system. It means user code provokes CPU usage.
NOTES
GC threads would take noticeable CPU time in applications that produce high memory pressure ( a lot of allocations & tons of short lived objects ).
The more objects stay in RAM, the more difficult it gets for GC to clean up.
Examine suspects one by one
We need go to check call stacks for threads one by one ( via !clrstack or !mk (sosex) ), and check if same operation is being executed:

Yes! Same operation and same object :) Although this is a thread without information who launched it, we can find who prevents an object from being collected via !gcroot command:
It seems like StartupProcessor constructs an instance of CpuLoadProducer in the GiveHighCPU method.
The regenerated code from snapshot confirms our findings:
private void GiveHighCPU()
using (new CpuLoadProducer(85))
Thread.Sleep(TimeSpan.FromMinutes(5));
Why is StartupProcessor called?
Switching to 3248 thread (aka #3 from !runaway ) shows us the exact reason:
The StartupProcessor is located inside initialize pipeline that is executed when Application starts.
Case solved.
Summary
ProcDumptool was configured to capture snapshots just in time.- Resulting snapshots were loaded into WinDbg and ordered by time. The high CPU was confirmed.
- CPU time accumulated per thread was compared between snapshots, and a few with maximum gain were added to
suspectslist. Suspectsare not system threads.- Same operation by same object instance was performed by
suspectedthread in both snapshots. - The source code of the callee object class was regenerated.
- The logic produced
suspectedobject was located & reason of a call was determined.
Life is not that easy
The demo show is the simplest high CPU case, things will get complicated fast in real life.
I strongly encourage you to follow the same steps locally using the processor & configuration under the link.
Good luck :)





