Collecting memory snapshots during high CPU and analyzing them - mitikov/KeepSitecoreSimple GitHub Wiki
Agenda
Our application randomly consumes 100500% CPU.
We have a right to know why!
Game plan
A) We'll use ProcDump and Processor performance counters together to get snapshots just in time.
- ProcDump can be safely used in the Production environment since it does not need an installer.
- It must be executed with admin permissions, though.
B) Expect to get 2-3 sequential snapshots within a short interval.
C) We'll check how much CPU time each thread spent, and find out which threads have a high increment between snapshots.
D) We'll see whether those threads are system threads (like GC) or user threads (like ThreadPool worker threads).
E) We'll compare call stacks of the most CPU time-consuming threads in both snapshots.
Memory dump collection part
Configuring ProcDump
ProcDump is a powerful and lightweight tool by Sysinternals. We are going to launch it with the following arguments:
procdump64.exe -c 80 -s 2 -ma w3wp.exe -w -n 3
-c 80: CPU load percent (80% in this case) that triggers the rule
-s 2: number of seconds (2 in this case) the CPU load must stay above the limit
-ma: produce a full memory snapshot/userdump
w3wp.exe: name or ID of the process to inspect
-w: wait for the process if the application has not started yet
-n 3: number of snapshots (3 in this case) to create before exiting
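If you launch ProcDump from a script, the argument list above can be assembled programmatically. This is only a sketch: the helper name `build_procdump_args` and its parameter names are made up for illustration, and it uses nothing but the flags already described above.

```python
def build_procdump_args(process, cpu_percent=80, seconds=2, snapshots=3,
                        exe="procdump64.exe"):
    """Assemble the ProcDump command line described above as an argument list."""
    if not 0 < cpu_percent <= 100:
        raise ValueError("cpu_percent must be in (0, 100]")
    return [
        exe,
        "-c", str(cpu_percent),  # CPU load threshold, percent
        "-s", str(seconds),      # seconds the load must stay above the threshold
        "-ma",                   # full memory snapshot
        process,                 # process name or ID
        "-w",                    # wait if the process has not started yet
        "-n", str(snapshots),    # number of snapshots before exiting
    ]


args = build_procdump_args("w3wp.exe")
print(" ".join(args))
```

The resulting list can be handed to `subprocess.run(args)` from an elevated prompt; keeping the arguments as a list avoids quoting issues.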
Notes
- ProcDump must be running with admin credentials.
- At least 2 snapshots are needed to find out the cause of the high CPU.
- The system should have enough free disk space (at least 3 times the process memory usage).
- Only the exact process instance is monitored. Do not expect snapshots from another process started later.
- Since the process must be fully suspended to create a memory snapshot, it will not reply to the IIS ping command and could be terminated before the full snapshot is produced.
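The free space rule from the notes is easy to check before starting ProcDump. A minimal sketch, assuming you already know the process memory usage (the 4 GiB figure below is made up) and that dumps are written to the current drive:

```python
import shutil

GIB = 1024 ** 3


def enough_space_for_dumps(free_bytes, process_bytes, factor=3):
    """Return True when free space is at least `factor` times the process memory."""
    return free_bytes >= factor * process_bytes


process_bytes = 4 * GIB                   # hypothetical w3wp.exe memory usage
free_bytes = shutil.disk_usage(".").free  # free space on the current drive
print(enough_space_for_dumps(free_bytes, process_bytes))
```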
Investigation part
Please refer to the "Opening Memory Snapshots generated on other machines locally" article to load snapshots into WinDbg.
Open a few instances of WinDbg with different memory snapshots.
Getting time-related information
You can get the snapshot generation time via the .time command:
For the demo, we have 2 snapshots taken 10 seconds apart.
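With more than two snapshots it helps to compute the interval instead of eyeballing it. The sketch below assumes the .time output contains a line shaped like `Debug session time: Mon Jan 15 10:00:00.000 2018 (UTC + 2:00)`; the sample values are made up, and the format may differ on your WinDbg version.

```python
import re
from datetime import datetime

# Assumed shape of the relevant .time line; adjust if your output differs.
LINE_RE = re.compile(r"Debug session time:\s+(.+?)\s+\(UTC")


def snapshot_time(time_output):
    """Extract the snapshot timestamp from the text printed by .time."""
    match = LINE_RE.search(time_output)
    if match is None:
        raise ValueError("no 'Debug session time' line found")
    return datetime.strptime(match.group(1), "%a %b %d %H:%M:%S.%f %Y")


# Hypothetical lines copied from two WinDbg instances.
first = "Debug session time: Mon Jan 15 10:00:00.000 2018 (UTC + 2:00)"
second = "Debug session time: Mon Jan 15 10:00:10.000 2018 (UTC + 2:00)"

interval = (snapshot_time(second) - snapshot_time(first)).total_seconds()
print(interval)
```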
Confirming high CPU
Next, we are going to check the CPU usage at the moment the snapshot was created via the !ThreadPool command:
The CPU usage in both snapshots is above 80 percent, matching the ProcDump setting.
Locating suspects
Want to see which threads took CPU time? Easy with the !runaway command:
There are 4 threads that each took almost 5 seconds of CPU time within the 10-second interval.
Simple math shows 4 threads * 5 seconds = 20 seconds [ not 10 :) ]. I have a multi-core processor, so the threads ran in parallel and it is okay.
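The arithmetic above as a tiny sketch: total CPU time divided by wall-clock time gives the minimum number of cores that had to be busy. The function name is made up for illustration.

```python
import math


def min_busy_cores(cpu_seconds_per_thread, wall_seconds):
    """Return total CPU seconds and the minimum number of cores they require."""
    total = sum(cpu_seconds_per_thread)
    return total, math.ceil(total / wall_seconds)


total, cores = min_busy_cores([5, 5, 5, 5], 10)
print(total, cores)  # 20 2
```

So 20 CPU-seconds in a 10-second window needs at least 2 cores working in parallel.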
NOTES
The value is the aggregated time taken by the thread since its creation. The longer the application runs, the higher the numbers you'll get.
It is vital to have at least 2 snapshots to measure how the aggregated time grows over a short interval.
The longer the delay between snapshots, the higher the chance that the same operation is not present in both snapshots.
As a result, the snapshots may not bring you closer to the cause.
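Comparing !runaway output by hand gets tedious once there are dozens of threads. Here is a rough sketch that parses two outputs and ranks threads by CPU time gained. It assumes rows look like `20:a10  0 days 0:01:00.000` (debugger thread number, OS thread ID in hex, accumulated time); the thread IDs and timings are made up.

```python
import re

ROW = re.compile(
    r"^\s*\d+:([0-9a-fA-F]+)\s+(\d+) days (\d+):(\d\d):(\d\d)\.(\d+)\s*$"
)


def parse_runaway(text):
    """Map OS thread ID (hex string) to accumulated CPU seconds."""
    times = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            tid, days, hours, minutes, seconds, frac = m.groups()
            times[tid.lower()] = (int(days) * 86400 + int(hours) * 3600
                                  + int(minutes) * 60 + int(seconds)
                                  + float("0." + frac))
    return times


def cpu_growth(first, second):
    """Return (thread, delta) pairs sorted by CPU time gained between snapshots."""
    deltas = {tid: second[tid] - first[tid] for tid in second if tid in first}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical !runaway output from two snapshots.
snapshot_1 = """
 User Mode Time
  Thread       Time
  20:a10       0 days 0:01:00.000
  21:b20       0 days 0:00:30.000
   5:c30       0 days 0:10:00.000
"""
snapshot_2 = """
 User Mode Time
  Thread       Time
  20:a10       0 days 0:01:04.900
  21:b20       0 days 0:00:30.100
   5:c30       0 days 0:10:00.000
"""

growth = cpu_growth(parse_runaway(snapshot_1), parse_runaway(snapshot_2))
for tid, delta in growth:
    print(tid, round(delta, 3))
```

Note that thread `c30` has the largest total but zero growth, so it is not a suspect. Threads are keyed by OS thread ID because the debugger thread number is not guaranteed to match between snapshots.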
Do we suspect system threads?
Next, we are going to find system threads (e.g. GC, Finalizer...) via the !threads -special command:
None of the suspected time-consuming threads is a system thread. It means user code causes the CPU usage.
NOTES
GC threads take noticeable CPU time in applications that produce high memory pressure (a lot of allocations and tons of short-lived objects).
The more objects stay in RAM, the more difficult it gets for GC to clean up.
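Once you have the suspect list and the special threads list, splitting suspects into user and system threads is a simple lookup. A sketch with made-up thread IDs; the special thread types would be copied by hand from the !threads -special output.

```python
def split_suspects(suspects, special_threads):
    """Split suspect thread IDs into (user, system) lists."""
    user = [tid for tid in suspects if tid not in special_threads]
    system = [(tid, special_threads[tid]) for tid in suspects
              if tid in special_threads]
    return user, system


suspects = ["a10", "b20", "c30", "d40"]          # hypothetical top CPU gainers
special = {"e50": "GC", "f60": "Finalizer"}      # hypothetical special threads

user, system = split_suspects(suspects, special)
print(user, system)
```

An empty `system` list matches the demo outcome: all suspects run user code.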
Examine suspects one by one
We need to check the call stacks of the suspect threads one by one (via !clrstack or !mk from SOSEX), and see whether the same operation is being executed:
Yes! Same operation and same object :) Although this thread carries no information about who launched it, we can find what keeps the object from being collected via the !gcroot command:
It seems like StartupProcessor constructs an instance of CpuLoadProducer in the GiveHighCPU method.
The code regenerated from the snapshot confirms our findings:
private void GiveHighCPU()
{
    using (new CpuLoadProducer(85))
        Thread.Sleep(TimeSpan.FromMinutes(5));
}
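The "same operation in both snapshots" check from this section can be sketched as a comparison of call stacks starting from the outermost caller. The leaf frames often differ between snapshots while the outer frames stay the same. The frame names below are made up; in practice you would paste the frames printed by !clrstack, most recent frame first.

```python
def shared_outer_frames(stack_a, stack_b):
    """Return frames shared by both stacks, counted from the outermost caller."""
    shared = []
    for a, b in zip(reversed(stack_a), reversed(stack_b)):
        if a != b:
            break
        shared.append(a)
    return list(reversed(shared))


# Hypothetical stacks of the same thread in two snapshots (top of stack first).
stack_1 = ["Demo.Worker.Spin()", "Demo.Worker.Run()",
           "System.Threading.ThreadHelper.ThreadStart()"]
stack_2 = ["Demo.Worker.Burn()", "Demo.Worker.Run()",
           "System.Threading.ThreadHelper.ThreadStart()"]

common = shared_outer_frames(stack_1, stack_2)
print(common)
```

A long shared tail suggests the thread is still inside the same operation; an empty result means the two snapshots caught unrelated work.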
Why is StartupProcessor called?
Switching to thread 3248 (aka #3 from !runaway) shows us the exact reason:
The StartupProcessor is located inside the initialize pipeline, which is executed when the application starts.
Case solved.
Summary
- The ProcDump tool was configured to capture snapshots just in time.
- Resulting snapshots were loaded into WinDbg and ordered by time. The high CPU was confirmed.
- CPU time accumulated per thread was compared between snapshots, and a few threads with the maximum gain were added to the suspects list.
- Suspects are not system threads.
- The same operation on the same object instance was performed by the suspected thread in both snapshots.
- The source code of the callee object class was regenerated.
- The logic that produced the suspected object was located, and the reason for the call was determined.
Life is not that easy
The demo shows the simplest high CPU case; things get complicated fast in real life.
I strongly encourage you to follow the same steps locally using the processor and configuration under the link.
Good luck :)