profiling - casadi/casadi GitHub Wiki
Profiling is the process of measuring how much execution time each line of code or function takes, which lets you spot bottlenecks. It is sometimes loosely referred to as benchmarking.
See also: tracing.
How to profile a CasADi C++ program?
Compile CasADi and your C++ program with a -pg flag.
How to profile a CasADi Python program?
With the runsnakerun package:
python -m cProfile -o stats.prof myscript.py
runsnake stats.prof
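If a GUI viewer such as runsnake is not available, the same cProfile data can be inspected as text. The following is a minimal sketch using only the standard library; `workload` is a hypothetical stand-in for whatever myscript.py does, and `profile_to_text` is an illustrative helper, not part of CasADi.

```python
import cProfile
import io
import pstats


def workload(n=20000):
    # Hypothetical stand-in for the body of myscript.py
    return sum(i * i for i in range(n))


def profile_to_text(func, limit=10):
    """Profile func() and return the top entries sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


report = profile_to_text(workload)
print(report)
```

A stats.prof file written by `python -m cProfile -o stats.prof` can be loaded the same way with `pstats.Stats("stats.prof")`.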
Note that Python profiling stops at the SWIG boundary. To profile the C++ side as well, try 'perf', a statistical profiler:
perf record -g -- python myscript.py
perf report -n -g graph,0.5,caller --comm=python
Things to try: press E/C to expand or collapse the call chains. Press the right arrow twice to see timings annotated on the source code.
Note that perf can possibly disrupt your system permanently.
Profiling the memory usage of a Python program:
import os
import psutil
from casadi import *

# Handle to the current process
p = psutil.Process(os.getpid())

def getinfo(p):
    # Overall memory counters, plus the memory maps that belong to CasADi libraries
    ret = []
    ret.append(p.memory_info())
    ret.append(p.memory_full_info())
    ret += [i for i in p.memory_maps() if "casadi" in i.path]
    return ret

pre = getinfo(p)
print(pre)

n = 1000
s = Sparsity.dense(n, n)
# Assuming int32
expected = ((n + 1) + n * n) * 4
print("expected [bytes]:", expected)

post = getinfo(p)
print(post)

def showdiff(pre, post):
    # psutil returns namedtuples, so iterate over their fields
    for k in pre._fields:
        pre_ = getattr(pre, k)
        post_ = getattr(post, k)
        if isinstance(pre_, int):
            print("%20s: %10d -> %10d | delta = %10d (%0.2f %%)" % (k, pre_, post_, post_ - pre_, 100 * float(post_ - pre_) / expected))

for pre_, post_ in zip(pre, post):
    print("-" * 80)
    if hasattr(pre_, "path"):
        print(pre_.path)
    showdiff(pre_, post_)
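When psutil is not installed, a rough process-wide figure can be obtained from the standard library instead. The sketch below assumes a Unix system (the `resource` module is not available on Windows) and uses a large bytearray as a hypothetical stand-in for the CasADi object being measured; `peak_rss_bytes` is an illustrative helper. Because it reads the peak resident set size of the whole process, it also counts memory allocated on the C++ side, but the value can only grow, so it will not show memory being released.

```python
import resource
import sys


def peak_rss_bytes():
    """Peak resident set size of this process, in bytes (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS
    return peak if sys.platform == "darwin" else peak * 1024


before = peak_rss_bytes()
# Hypothetical stand-in for the CasADi allocation being measured;
# filling with non-zero bytes forces the pages to be touched
payload = bytearray(b"\x01") * (50 * 1024 * 1024)
after = peak_rss_bytes()
delta = after - before
print("peak RSS grew by %d bytes" % delta)
```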
How to profile CasADi virtual machines
In your CasADi script, do:
CasadiOptions.startProfiling('prof.log')
On the terminal, run
casadi-build-dir/bin/profilereport prof.log
This will generate a local web page with statistics.
This requires CasADi to be built with the ENABLE_PROFILING option set to ON.
There is some work underway to make a combined call-graph and treemap representation in the web page.

Generate C code and use C profiling tools
First generate C code with
Function::generateCode(generate_main=true)
You may want to edit the generated main function to make inputs never 0 (avoid divide by zero) or increase the length of the loop for better statistics.
Gprof
Compile that code with profiling options, like
gcc -pg my_fun.c -lm -o my_fun
Then you run that code with
./my_fun
and generate a profile report with
gprof my_fun
If you want to get really fancy, get graphviz and gprof2dot.py, and run something like
gprof my_fun | ./gprof2dot.py | dot -Tpng -o output.png
You will get a beautiful graph.

callgrind + kcachegrind
With callgrind you can run a binary and get profiling output without compiling in any instrumentation. The disadvantage is that, unlike gprof, it does not cope well with very large generated code. First compile your binary with debug symbols:
gcc -g my_fun.c -lm -o my_fun
Run the binary with
valgrind --tool=callgrind ./my_fun
This generates a file like callgrind.out.1234, where the suffix is the process ID. You can view this output with a tool like kcachegrind:
kcachegrind callgrind.out.1234
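Since the numeric suffix changes on every run, it can be handy to pick up the newest output file automatically. A minimal sketch, assuming the files sit in a single directory; `latest_callgrind_output` is an illustrative helper, and the two file names in the demonstration are made up.

```python
import glob
import os
import tempfile


def latest_callgrind_output(directory="."):
    """Return the most recently modified callgrind.out.* file, or None."""
    candidates = glob.glob(os.path.join(directory, "callgrind.out.*"))
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


# Demonstration on a scratch directory with two made-up output files
with tempfile.TemporaryDirectory() as scratch:
    for age, pid in ((100, 1234), (0, 5678)):
        path = os.path.join(scratch, "callgrind.out.%d" % pid)
        open(path, "w").close()
        stamp = os.path.getmtime(path) - age
        os.utime(path, (stamp, stamp))  # make 1234 look older than 5678
    newest = os.path.basename(latest_callgrind_output(scratch))

print(newest)
```

The returned path can then be passed to kcachegrind in place of a hand-typed file name.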
KCachegrind has great interactive tools for exploring this output.

More advanced usages of callgrind:
- Investigate only a portion of your program: --instr-atstart=no. Perform callgrind_control -i on within your program and end with callgrind_control -i off.
- Profiling jitted/external parts of CasADi needs some extra care. shell_compiler issues a dlclose, which makes callgrind ignore the annotated source code. Make sure to dump before the compiled function goes out of scope: callgrind_control -d

See https://docs.kde.org/stable/en/kdesdk/kcachegrind/using-kcachegrind.html for more info.
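Toggling instrumentation around a region of interest, as described in the first item above, could be wrapped in a small context manager. This is only a sketch: `callgrind_control` (the Python function) and `instrumented` are illustrative names, the wrapper silently does nothing when the callgrind_control executable is not on the PATH, and without a PID argument the command addresses every running callgrind process.

```python
import contextlib
import shutil
import subprocess


def callgrind_control(*args):
    """Invoke callgrind_control if it is installed; return True when it ran."""
    exe = shutil.which("callgrind_control")
    if exe is None:
        return False  # valgrind is not installed, so do nothing
    subprocess.run([exe, *args], check=False)
    return True


@contextlib.contextmanager
def instrumented(control=callgrind_control):
    """Switch callgrind instrumentation on for the duration of the block."""
    control("-i", "on")
    try:
        yield
    finally:
        control("-i", "off")


with instrumented():
    total = sum(range(1000))  # hypothetical region of interest
```

The script would then be started with something like `valgrind --tool=callgrind --instr-atstart=no python myscript.py`, so that only the wrapped region is instrumented.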
pprof (part of gperftools 2.0)
LD_PRELOAD=/usr/local/lib/libprofiler.so CPUPROFILE=test.prof python file.py
pprof --callgrind `which python` test.prof > test.callgrind
kcachegrind test.callgrind