Optimizing Python - BKJackson/BKJackson_Wiki GitHub Wiki
- Why Python is Slow: Looking Under the Hood - Jake VDP, May 2014
- How to profile memory usage in Python
- 7 tips to Time Python scripts and control Memory & CPU usage
- memory_profiler Docs
- Pytorch vs numpy - how pytorch will replace numpy
- Data Analysis in Parallel - with ipyparallel (video)
- 5 Minute Guide to Numba
Profiling and Timing Code
IPython magic commands
Note: Use double %% for multiline notebook cells.
- %time: Time the execution of a single statement
- %timeit: Time repeated execution of a single statement for more accuracy
- %prun: Run code with the profiler
- %lprun: Run code with the line-by-line profiler
- %memit: Measure the memory use of a single statement
- %mprun: Run code with the line-by-line memory profiler
The last four commands are not bundled with IPython; you'll need to install the line_profiler and memory_profiler extensions (e.g. pip install line_profiler memory_profiler).
Using line_profiler in a notebook
import line_profiler
%load_ext line_profiler
%lprun -f cavity_flow run_cavity()
Profiling your code line-by-line with line_profiler
Save line_profiler results to a text file
%lprun -T timings.txt -f simulate simulate(12)
Look at timings.txt in a notebook
%load timings.txt
Profiling on the command line
Open the file, add the @profile decorator to any function you want to profile, then run
kernprof -l script_to_profile.py
which will generate script_to_profile.py.lprof (pickled result). To view the results, run
python -m line_profiler script_to_profile.py.lprof
Difference between %time and %timeit
%timeit does some clever things under the hood to prevent system calls from interfering with the timing. For example, it prevents cleanup of unused Python objects (known as garbage collection), which might otherwise affect the timing. For this reason, %timeit results are usually noticeably faster than %time results.
timeit is not perfect, but it is helpful.
Potential concerns re: timeit
- By default it only repeats the benchmark a few times (3 repeats in older Python versions; 5 since Python 3.6)
- It disables garbage collection
Increase the number of repeats:
python -m timeit -r 25 "print(42)"
Re-enable garbage collection (the setup statement must import gc first):
python -m timeit -s "import gc; gc.enable()" "print(42)"
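The same options are available from Python code via the standard-library timeit module. A minimal sketch (the timed statement and repeat counts here are arbitrary examples):

```python
import timeit

# Repeat the benchmark 25 times, 10,000 loops each, and keep the best result.
times = timeit.repeat("sum(range(100))", repeat=25, number=10_000)
best = min(times)

# timeit disables garbage collection by default; re-enable it in the setup
# statement if you want GC pauses included in the measurement.
with_gc = timeit.timeit("sum(range(100))",
                        setup="import gc; gc.enable()",
                        number=10_000)

print(f"best of 25: {best:.6f}s, with GC enabled: {with_gc:.6f}s")
```

Taking the minimum of the repeats is the usual convention, since slower runs reflect other processes interfering rather than the code itself.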
Victor Stinner's perf module (since renamed pyperf) is more robust and addresses these concerns: https://perf.readthedocs.io/en/latest/user_guide.html
Optimizing .py Files
Trick for copying notebook cell functions to .py files for use with %mprun
Create module called mprun_demo.py in notebook cell:
%%file mprun_demo.py
def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
        del L  # remove reference to L
    return total
Then import this function from the mprun_demo module:
from mprun_demo import sum_of_lists
%mprun -f sum_of_lists sum_of_lists(1000000)
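If memory_profiler isn't available, the standard library's tracemalloc gives a rough equivalent of %memit (overall peak rather than line-by-line). A sketch, reusing the sum_of_lists example above:

```python
import tracemalloc

def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
        del L
    return total

# Trace allocations around the call and report the peak.
tracemalloc.start()
sum_of_lists(100_000)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"peak traced memory: {peak / 1e6:.1f} MB")
```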
Numpy variable sizes
itemsize is the size in bytes of each element; nbytes is the total size in bytes of the array (itemsize times the number of elements).
print("itemsize:", x3.itemsize, "bytes")
print("nbytes:", x3.nbytes, "bytes")
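For example (x3 here is just an assumed int64 array; any ndarray works the same way):

```python
import numpy as np

x3 = np.arange(24, dtype=np.int64).reshape(2, 3, 4)
print("itemsize:", x3.itemsize, "bytes")  # 8 bytes per int64 element
print("nbytes:", x3.nbytes, "bytes")      # itemsize * x3.size = 8 * 24 = 192
```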
Vectorization
- What it is: replacing explicit Python for-loops with whole-array operations that run in compiled code
Losing your Loops: Fast Numerical Computing with NumPy
Using %timeit
def func_python(N):
    d = 0.0
    for i in range(N):
        d += (i % 3 - 1) * i
    return d
Usage:
%timeit func_python(10000)
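A vectorized NumPy equivalent of func_python replaces the loop with whole-array operations (a sketch; func_numpy is a name introduced here for comparison):

```python
import numpy as np

def func_python(N):
    d = 0.0
    for i in range(N):
        d += (i % 3 - 1) * i
    return d

def func_numpy(N):
    # Build the whole index array at once; the "loop" runs in compiled code.
    i = np.arange(N)
    return ((i % 3 - 1) * i).sum()

assert func_python(10_000) == func_numpy(10_000)
```

Timing both with %timeit shows the vectorized version is typically orders of magnitude faster for large N.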
Using line_profiler
import line_profiler
lp = line_profiler.LineProfiler()
lp.add_function(some_useless_slow_function)
lp.runctx('some_useless_slow_function()', locals=locals(), globals=globals())
lp.print_stats()
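If line_profiler isn't installed, the standard library's cProfile gives function-level (not line-by-line) statistics. A minimal sketch, with slow_function standing in for whatever you want to profile:

```python
import cProfile
import io
import pstats

def slow_function():
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
slow_function()
profiler.disable()

# Print the top entries sorted by cumulative time.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
print(buf.getvalue())
```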
Vaex
Vaex docs
Vaex: Out of Core Dataframes for Python and Fast Visualization
Vaex: A DataFrame with super strings
Numba
Numba uses JIT (just-in-time) compilation to translate numeric Python functions into fast machine code at runtime.
Cython
Optimization of Scientific Code with Cython: Ising Model - Jake VDP, Dec. 2017
Cython Tutorials - Official Cython documentation
Numba vs. Cython - Jake VDP, Aug. 2012
Numba vs. Cython: Take 2 - Jake VDP, June 2013
Python multithreading and the GIL
- Has the Python GIL been slain? - Threads in CPython run concurrently but not in parallel for CPU-bound work; if you want truly parallel code, you have to use multiple processes. See code examples in the article.