Optimizing Python - BKJackson/BKJackson_Wiki GitHub Wiki
- Why Python is Slow: Looking Under the Hood - Jake VDP, May 2014
- How to profile memory usage in Python
- 7 tips to Time Python scripts and control Memory & CPU usage
- memory_profiler Docs
- Pytorch vs numpy - how pytorch will replace numpy
- Data Analysis in Parallel - with ipyparallel (video)
- 5 Minute Guide to Numba
Profiling and Timing Code
IPython magic commands
Note: Use double %% for multiline notebook cells.
- %time: Time the execution of a single statement
- %timeit: Time repeated execution of a single statement for more accuracy
- %prun: Run code with the profiler
- %lprun: Run code with the line-by-line profiler
- %memit: Measure the memory use of a single statement
- %mprun: Run code with the line-by-line memory profiler
The last four commands are not bundled with IPython; you'll need to install the line_profiler and memory_profiler extensions (e.g. pip install line_profiler memory_profiler).
Using line_profiler in a notebook
import line_profiler
%load_ext line_profiler
%lprun -f cavity_flow run_cavity()
Profiling your code line-by-line with line_profiler
Save line_profiler results to a text file
%lprun -T timings.txt -f simulate simulate(12)
Look at timings.txt in a notebook
%load timings.txt
Profiling on the command line
Open the file, add the @profile decorator to any function you want to profile, then run
kernprof -l script_to_profile.py
which will generate script_to_profile.py.lprof (pickled result). To view the results, run
python -m line_profiler script_to_profile.py.lprof
Difference between %time and %timeit
%timeit does some clever things under the hood to prevent system calls from interfering with the timing. For example, it prevents cleanup of unused Python objects (known as garbage collection), which might otherwise affect the timing. For this reason, %timeit results are usually noticeably faster than %time results.
timeit is not perfect, but it is helpful.
Potential concerns re: timeit
- By default it only repeats the benchmark a few times (3 repeats in older Python versions; 5 since Python 3.6)
- It disables garbage collection
Increase the number of repeats:
python -m timeit -r 25 "print(42)"
Re-enable garbage collection (the setup statement must import gc first):
python -m timeit -s "import gc; gc.enable()" "print(42)"
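The same options are available from Python code via the standard-library timeit module. A minimal sketch (the timed statement and repeat counts here are arbitrary examples):

```python
import timeit

# Repeat the benchmark 25 times, 10,000 loops each, and keep the best result.
times = timeit.repeat("sum(range(100))", repeat=25, number=10_000)
best = min(times)

# timeit disables garbage collection by default; re-enable it in the setup
# statement if you want GC pauses included in the measurement.
with_gc = timeit.timeit("sum(range(100))",
                        setup="import gc; gc.enable()",
                        number=10_000)

print(f"best of 25: {best:.6f}s, with GC enabled: {with_gc:.6f}s")
```

Taking the minimum of the repeats is the usual convention, since slower runs reflect other processes interfering rather than the code itself.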
Victor Stinner's perf module (since renamed pyperf) is more robust and addresses these concerns: https://perf.readthedocs.io/en/latest/user_guide.html
Optimizing .py Files
Trick for copying notebook cell functions to .py files for use with %mprun
Create module called mprun_demo.py in notebook cell:
%%file mprun_demo.py
def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
        del L  # remove reference to L
    return total
Then import this function from the mprun_demo module:
from mprun_demo import sum_of_lists
%mprun -f sum_of_lists sum_of_lists(1000000)
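If memory_profiler isn't available, the standard library's tracemalloc gives a rough equivalent of %memit (overall peak rather than line-by-line). A sketch, reusing the sum_of_lists example above:

```python
import tracemalloc

def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
        del L
    return total

# Trace allocations around the call and report the peak.
tracemalloc.start()
sum_of_lists(100_000)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"peak traced memory: {peak / 1e6:.1f} MB")
```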
Numpy variable sizes
itemsize is the size in bytes of each element; nbytes is the total size in bytes of the array (itemsize times the number of elements).
print("itemsize:", x3.itemsize, "bytes")
print("nbytes:", x3.nbytes, "bytes")
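For example (x3 here is just an assumed int64 array; any ndarray works the same way):

```python
import numpy as np

x3 = np.arange(24, dtype=np.int64).reshape(2, 3, 4)
print("itemsize:", x3.itemsize, "bytes")  # 8 bytes per int64 element
print("nbytes:", x3.nbytes, "bytes")      # itemsize * x3.size = 8 * 24 = 192
```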
Vectorization
- What it is: replacing explicit Python for-loops with whole-array operations that run in compiled code
Losing your Loops: Fast Numerical Computing with NumPy
Using %timeit
def func_python(N):
    d = 0.0
    for i in range(N):
        d += (i % 3 - 1) * i
    return d
Usage:
%timeit func_python(10000)
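A vectorized NumPy equivalent of func_python replaces the loop with whole-array operations (a sketch; func_numpy is a name introduced here for comparison):

```python
import numpy as np

def func_python(N):
    d = 0.0
    for i in range(N):
        d += (i % 3 - 1) * i
    return d

def func_numpy(N):
    # Build the whole index array at once; the "loop" runs in compiled code.
    i = np.arange(N)
    return ((i % 3 - 1) * i).sum()

assert func_python(10_000) == func_numpy(10_000)
```

Timing both with %timeit shows the vectorized version is typically orders of magnitude faster for large N.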
Using line_profiler
import line_profiler
lp = line_profiler.LineProfiler()
lp.add_function(some_useless_slow_function)
lp.runctx('some_useless_slow_function()', locals=locals(), globals=globals())
lp.print_stats()
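If line_profiler isn't installed, the standard library's cProfile gives function-level (not line-by-line) statistics. A minimal sketch, with slow_function standing in for whatever you want to profile:

```python
import cProfile
import io
import pstats

def slow_function():
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
slow_function()
profiler.disable()

# Print the top entries sorted by cumulative time.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
print(buf.getvalue())
```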
Vaex
Vaex docs
Vaex: Out of Core Dataframes for Python and Fast Visualization
Vaex: A DataFrame with super strings
Numba
Numba uses JIT (just-in-time) compilation to translate numeric Python functions into fast machine code at runtime.
Cython
Optimization of Scientific Code with Cython: Ising Model - Jake VDP, Dec. 2017
Cython Tutorials - Official Cython documentation
Numba vs. Cython - Jake VDP, Aug. 2012
Numba vs. Cython: Take 2 - Jake VDP, June 2013
Python multithreading and the GIL
- Has the Python GIL been slain? - Threads in CPython run concurrently but not in parallel for CPU-bound work; if you want truly parallel code, you have to use multiple processes. See code examples in the article.