1. Python Boot Camp - Spinmob/spinmob GitHub Wiki
Overview
This tutorial is intended to give those new to programming and/or Python a starting point in the "Spyder" coding environment. The Python programming language is similar to Matlab, but free, more stable / reliable, behaves the same on every system, and has better web support. Out of the box, however, Python is just a programming language, so one must install "modules" to do science. In our lab, we primarily use the following modules:
- Scipy & Numpy for numerical calculations.
- Matplotlib / Pylab for slow-but-publication-quality figures.
- PyQtGraph for fast and amazing visualization, mostly for data-taking situations.
- Spyder for writing code and interacting with it.
- Spinmob for high-level data handling, plotting, fitting, and more.
If none of this makes sense yet, no worries. It will be explained! In the end, these Python packages, together with Inkscape and TeXstudio or LyX, are all that is needed to acquire, analyze, plot, and publish scientific results.
1. Installing everything
The most reliable way to install everything can be found on the Home Page of this wiki.
2. Introduction to Spyder
Spyder is a solid way to code / interact with Python.
Getting Spyder up and running
I. Open Spyder: In Windows, this should be in your start menu. In OSX or Linux, this should be an application, or you can type “spyder” into the terminal. When it launches (takes a long time), you should see something like this:
The panels might be in different locations, but you can move them around to your liking by clicking the little "pop-out" button next to the "x" in the upper-right corner of each panel. The upper left area is a file editor, the bottom is the "IPython console" (make sure "IPython console" is the currently selected tab at the bottom!), and in the upper right you can get real-time information about the objects and functions in your code, or explore variables, etc. In the following, we will use the IPython console to test / run code on the fly (one command at a time), and the editor to create multi-command scripts.
Simple workflow: writing a script and running it in the console
Let's run our first Python script:
I. Write some code: In the editor, delete all the junk at the top of the file and write the following code:
print("Hello world!")

a = 27
II. Save the file: Select from the menu "File -> Save As..." and save the file into a convenient directory of your choosing (we will use this directory for all of our tutorial scripts). At this point, you now have a plain text file with two python commands in it.
III. Configure the run: Select from the menu "Run -> Configure...", and make sure it looks like the one below, and click "OK". (Note you can set the default run configuration in the menu "Tools -> Preferences", selecting "Run" on the left.)
- Execute in a dedicated console: sends all the commands from the script, one by one, to a new IPython Console. This keeps the operations of your scripts separate from one another.
- Remove all variables before execution: this creates a "clean slate", preventing "Mathematica Disease" wherein previously defined variables interfere with your script's execution (sometimes, however, you can save time by not clearing the memory before launching, for example when you have to load a 2GB file!).
- Working directory: you will not have to modify this; it is the first directory Python checks whenever you specify a path.
IV. Run the script: Either press the “run file” button, type F5, or select from the menu “Run -> Run”. You should see the following:
The three lines of code in the editor window have now been executed by a fresh Python console. The first line printed the "string" (an array of characters) "Hello world!". The second line (blank) did nothing, and the third set the "variable" named a to value 27. The = sign is used to assign values to variables.
Note that the last command did not produce any visible result in the console. However, it did do some stuff behind the scenes. If we now type
In [2]: a
into the IPython console and press enter, we see the value of a.
Out[2]: 27
Furthermore, we can attempt some math using a (more on this below):
In [3]: a+7
Out[3]: 34
We could also run each of the commands in the script manually by typing them into the console, e.g.:
In [4]: print("Hello world!")
Hello world!
So, basically that's it. You can write a bunch of commands into a script then run it in the console, or you can run commands directly from the console, or some combination of the two. Whatever has been previously defined (by hand or in a script that has been run) can be subsequently monkeyed with in the console.
3. Python Basics
Built-in math
Python is a calculator. You can add, subtract, multiply, exponentiate numbers, and divide as shown below, and Python will respond as you might expect:
In []: 1+1
Out[]: 2
In []: 2*2
Out[]: 4
In []: 2**5
Out[]: 32
In []: 5/3
Out[]: 1
I intentionally showed the Python 2 result for the last command to illustrate a common stumbling block: in Python 2, unless you somehow tell Python you are working with non-integers, it performs calculations using integer math, meaning 5/3 = 1 remainder 2. This behavior was changed in Python 3, where 5/3 returns 1.6666666666666667 and integer division is written 5//3 (try this!). To get the remainder, use the mod operation:
In []: 5%3
Out[]: 2
Why default to integer math? Because it's fast. If you want to stick to "floating point" math (i.e. "having a decimal point"), just add a decimal point in the right place. As soon as Python "sees" a floating point, it switches to floating point math. More examples:
In []: 3.0/2
Out[]: 1.5
In []: 2.0**0.5
Out[]: 1.4142135623730951
In []: (5+2.0)/2
Out[]: 3.5
Notice how this decision is made following the order of operations. The third command above was computed as (5+2.0)/2 = 7.0/2 = 3.5, but in Python 2 the inner 3/2 below is evaluated first with integer math (giving 1), so
In []: (5+2.0)/(3/2)
Out[]: 7.0
Note also: this behavior depends on the version of Python (this weird behavior doesn't happen in Python 3) and the shell you are using. As of 2014, the Spyder shell seems to always assume floating point, and other Python 2 shells do not. Hence, it is best to always take this into consideration so that when you send someone your code it doesn't fall apart completely. I recommend always using a decimal point to be safe.
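To make the Python 3 side of this concrete, here is a minimal sketch (assuming Python 3; the variable names are just for illustration) of the different division-related operators:

```python
# Division in Python 3 (assumed here; Python 2 treats "/" differently for integers)
true_quotient = 5 / 3     # "true" division always returns a float
floor_quotient = 5 // 3   # floor division discards the remainder
remainder = 5 % 3         # mod returns the remainder
both = divmod(5, 3)       # (floor quotient, remainder) in one call

print(true_quotient)
print(floor_quotient)     # 1
print(remainder)          # 2
print(both)               # (1, 2)
```

If you want integer math in Python 3, ask for it explicitly with //.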
Variables
Like most programming languages, values can be stored in variables for later use, e.g.
In []: a = 5.0/2.0
In []: b = 3.0*a
In []: a+b
Out[]: 10.0
Good times.
Functions
Functions are objects that receive inputs, do things with those inputs, and make some outputs. Let's define one at the command line:
In []: def f(x,y): return x*y
Press enter twice after this to finish defining the function (note you can also make multiline functions at the command line; how this works is different from console to console). This function takes two inputs, x and y, multiplies them, and returns the result. Try it!
In []: f(32,2)
Out[]: 64
You can also do things with the result of any function directly, by treating the function itself as a variable, e.g.
In []: f(32,2)*2+1
Out[]: 129
This function is too simple, and super boring. To define a multiline function, let's make a new script in Spyder and add this code:
def f2(x,y,z):
    print("x = ", x)
    print("y = ", y)
    return x+y*z
When we run this script, nothing visible will happen, but we can now use this function from the command line:
In []: a = f2(1,2,3)
x =  1
y =  2
In []: a
Out[]: 7
Here the function printed the x and y values we sent it, then returned x+y*z. We stored this result (equal to 7 in this case) as the variable a.
Note that Python is very strict about indentation. The "contents" of the function must all be indented by the same amount, and Python stops defining the function when it sees that you have stopped indenting. Thus you can define several functions in one script (and run them in the same script if you like!):
def f2(x,y,z):
    print("f2 executed")
    return x+y*z

def f3(x):
    print("f3 executed")
    return x**x

print(f3(2) + f2(1,2,3))
Try predicting the output of this script and then run it to see if you're right!
Strings, Lists, Tuples, and Dictionaries
There are many objects in Python other than numbers, too. I constantly use the following.
Strings
One of the most amazing things Python can do is efficiently manipulate text. Text objects are called "strings" in Python. Let's make two strings:
In []: s1 = "my first string"
In []: s2 = 'my "second" string'
Strings are defined by enclosing some text within either single or double quotes. There is no difference between the two, though enclosing by single quotes allows you to use double quotes in the string itself, and vice versa. If you need to use both (or other funny characters like "line breaks"), use "escape characters", which consist of a backslash \ followed by a character. For example:
In []: s3 = "my \"third\" string \n has a line break."
The \" combination allows you to include the quotes without ending the string, and the \n combination adds a line break. Try inspecting s3:
In []: print(s3)
my "third" string
has a line break.
- Note the space on either side of \n is not necessary.
For strings with multiple line breaks, however, this can become quite tedious and inelegant. For these, Python graciously provides triple quotes, which parse line breaks as they are written in the code.
In []: s4 = """This is a
totally valid
string in Python"""
In []: print(s4)
This is a
totally valid
string in Python
As before, there is no difference between single and double quotes; s = '''spinmob''' is equivalent to s = """spinmob""".
Strings also have some amazing functionality, for example you can add them:
In []: s1+s3
Out[]: 'my first stringmy "third" string \n has a line break.'
You can get a particular character (zero corresponds to the first):
In []: s1[3]
Out[]: 'f'
You can get a subset of characters:
In []: s1[3:8]
Out[]: 'first'
And you can split them:
In []: s1.split('s')
Out[]: ['my fir', 't ', 'tring']
and recombine them as Daffy Duck:
In []: x = s1.split('s')
In []: "th".join(x)
Out[]: 'my firtht thtring'
- Note the difference between square brackets [] and parentheses (). Parentheses are used for calling functions, and square brackets are used for referencing data (in this case, the characters of the string).
- Note also we have used a dot "." to access built-in functionality of the string s1. This split() function splits the string (in this case by the delimiter 's') and returns a "list" (see next section) of three sub-strings, without affecting the original string s1. The functions accessed by a "." are referred to as "methods of an object"; this is a nifty feature of "object-oriented coding": all Python objects have their own built-in functionality, in addition to the data they hold. We'll talk more about this later, but for now, just know that you can access all object methods and data by typing a "." (even if the object does not have a variable name, such as "th" above) and looking at the list that pops up in the console. Playtime ensues, and this is a great way to learn new libraries!
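As a small recap of the string operations above, here is a sketch that can be pasted into a script. The replace() and upper() methods are not used elsewhere in this tutorial; they are standard string methods included here just to show what else lives behind the dot.

```python
s1 = "my first string"

# Slicing and methods return new strings; s1 itself is never modified.
word = s1[3:8]                    # characters 3 through 7: 'first'
lisp = "th".join(s1.split("s"))   # split on 's', then rejoin with 'th'
same = s1.replace("s", "th")      # one-step equivalent of split + join
shout = s1.upper()                # upper-case copy

print(word)
print(lisp)
print(same == lisp)
print(shout)
print(s1)                         # still 'my first string'
```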
Lists
A list is an object that can hold many other objects (including other lists). Let's create one.
In []: a = [3.4, 27, "test"]
Here the list is surrounded by square brackets, and each element is separated by a comma. This list has 3 elements: two numbers and a string. To access an element or range of elements,
In []: a[1]
Out[]: 27
In []: a[1:3]
Out[]: [27, 'test']
Indexing and slicing like this return values and do not change the original list. Try typing a. and looking for all of the list functionality. You can remove elements with a.pop(), add elements with a.append(), sort them, etc. (note that these methods do modify the list itself).
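Here is a minimal sketch of the difference between operations that return something new and methods that change the list in place (the list named numbers is just an illustrative extra):

```python
a = [3.4, 27, "test"]

b = a[1:3]         # slicing builds a new list; a is unchanged
a.append(12)       # modifies a in place: [3.4, 27, 'test', 12]
last = a.pop()     # removes and returns the last element (12)
first = a.pop(0)   # removes and returns the element at index 0 (3.4)

numbers = [5, 2, 9, 1]
numbers.sort()     # sorts in place and returns None

print(a)           # [27, 'test']
print(b)           # [27, 'test']
print(last, first)
print(numbers)     # [1, 2, 5, 9]
```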
Tuples
Tuples are like lists but they are defined with parentheses instead of brackets:
In []: a = (3.4, 27, "test")
In []: a[0]
Out[]: 3.4
You can read about the various differences between the two (the main one being that a tuple cannot be modified after it is created), but in practice I don't use tuples unless I have to, because they lack much of the list functionality I rely on. For me, they most often appear in the context of function definitions, e.g.:
In []: def f(*a): print(a)
Putting a * before the input specifies that you can call f() with as many arguments as you like, and they will be stored in a tuple named a. So:
In []: f(1,2,3,4,"test")
(1, 2, 3, 4, 'test')
This adds some serious flexibility to function definitions!
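To illustrate both points, here is a short sketch: a hypothetical function total() that accepts any number of arguments, and a demonstration that tuples cannot be changed once created.

```python
def total(*values):
    """Add up any number of arguments (hypothetical example function)."""
    # values arrives as a tuple, e.g. (1, 2, 3)
    result = 0
    for v in values:
        result = result + v
    return result

t = (3.4, 27, "test")
try:
    t[0] = 5             # tuples cannot be modified after creation
except TypeError as error:
    print("tuples are immutable:", error)

print(total(1, 2, 3))    # 6
print(total())           # 0
```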
Dictionaries
Dictionaries are one of the most powerful Python objects I have found. As the name suggests, it's an object with which you can "look up" values. Dictionaries are defined with some odd-looking syntax involving curly braces:
In []: d = {"python":"a programming language", 32:"entry 32", 44:128}
This dictionary has 3 entries. Each entry has a “key” (i.e. the thing to the left of the colon) and a “value” (i.e. the thing to the right of the colon). To get a list of keys or values (in Python 3, d.keys() and d.values() return "view" objects, so we wrap them in list()):
In []: list(d.keys())
Out[]: ['python', 32, 44]
In []: list(d.values())
Out[]: ['a programming language', 'entry 32', 128]
To get values and play with them:
In []: d["python"]
Out[]: 'a programming language'
In []: d[32]
Out[]: 'entry 32'
In []: d[44] + 1
Out[]: 129
And to add new entries or remove entries:
In []: d['new entry'] = 444
In []: d.pop('python')
Out[]: 'a programming language'
In []: d
Out[]: {32: 'entry 32', 44: 128, 'new entry': 444}
So, there are a few options to store / interact with information.
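A few more everyday dictionary operations, as a sketch. The get() and items() methods are not used above; they are standard dictionary methods, and the "ruby" key is just a made-up example of something that is not in the dictionary.

```python
d = {"python": "a programming language", 32: "entry 32", 44: 128}

# "in" checks the keys, not the values
has_python = "python" in d   # True
has_ruby = "ruby" in d       # False

# get() returns a fallback instead of raising KeyError for a missing key
missing = d.get("ruby", "no such entry")

# loop over key/value pairs
for key, value in d.items():
    print(key, "->", value)

print(has_python, has_ruby, missing)
```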
Logic and Loops
Boolean logic
Python can evaluate equalities and other logic. For example
In []: not a == 3 and b >= 2 or c <=1
is almost as readable as text. "not" before an item negates it, == is "equal to", >= is "greater than or equal to", etc. The keywords "and" and "or" are exactly the logical AND and OR statements. numpy also has logical statements that can be applied to arrays. These days, you can also even write things like x is None and a is not b or "pants" in ["pants", "shoes"] (which returns True because "pants" is in that list). It's worth looking up all the logical operations that exist, but these are the basics.
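One thing worth knowing is the order in which these operators are applied. The sketch below (with made-up values for a, b, and c) shows the expression above next to a fully parenthesized version that evaluates the same way:

```python
a, b, c = 3, 5, 0   # example values, chosen only for illustration

# "not" binds tighter than "and", which binds tighter than "or", so this is
# evaluated as ((not (a == 3)) and (b >= 2)) or (c <= 1)
result = not a == 3 and b >= 2 or c <= 1
explicit = ((not (a == 3)) and (b >= 2)) or (c <= 1)
print(result, explicit)        # True True

x = None
print(x is None)               # True
print("pants" in ["pants", "shoes"])   # True
```

When in doubt, adding the parentheses yourself costs nothing and makes the intent obvious.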
"If" Statements
Often you want to do different things based on logic. For this, we use "if" statements. Here is an example:
In []: if 3 == 2: print("what??")
As with the inline function definitions above, you must push enter twice to make this work. Here Python checks whether 3 is equal to 2, and if it is, it prints a confused sentence. Since Python is logical, nothing will be printed. If we change this line to
In []: if 3 > 2: print("fine.")
it will print "fine.", because 3 is actually greater than 2. Let's add a bit more logic and define the following function:
def f(a):
    if a%2 == 0:
        print("it's even.")
        print("nice work.")
    elif a%3 == 0:
        print("it's a multiple of 3.")
    else:
        print("I don't know.")
- This function first checks if the argument is even by "modding" it with 2. If it is even (and thus has a remainder of 0 when divided by 2) it prints "it's even" and "nice work".
- The following line should be read "else if" or "otherwise, if". If it's not even, this line of code is called: it checks if it is a multiple of 3 and prints "it's a multiple of 3" if it is. You can have as many elif statements as you like.
- Finally, if none of the if's are satisfied, the "else" or "otherwise" code is called, and it prints "I don't know."
Note that just like a function definition, indentation for if, elif, and else statements is strictly obeyed.
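A subtle point is that only the first matching branch runs. Here is a sketch using a hypothetical variant of the function above, describe(), which returns the message instead of printing it so the result can be stored or tested:

```python
def describe(a):
    """Return a description of a (hypothetical variant of f that returns text)."""
    if a % 2 == 0:
        return "it's even."
    elif a % 3 == 0:
        # only reached when a is NOT even
        return "it's a multiple of 3."
    else:
        return "I don't know."

for n in [4, 9, 6, 7]:
    print(n, describe(n))
```

Note that 6 is reported as even and never as a multiple of 3, because the elif is skipped once the if has matched.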
"For" Loops
Often one needs to repeat an action many times, and for this I often use “for” loops. For example, say we have a long list of numbers and we need to print all of the even values:
for n in [0,1,2,3,4,5,6,7,8]:
    if n%2 == 0: print(n)
Note that this is just iterating over the supplied list, and any list will do. For example, this can be written more succinctly as
for n in range(10):
    if n%2 == 0: print(n)
The range() function returns an iterable "range" object in Python 3 (or a list in Python 2).
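For reference, here is a short sketch (assuming Python 3) of what range() produces, including its optional start and step arguments, which the text above does not use:

```python
# wrap range() in list() to see its contents in Python 3
print(list(range(5)))           # [0, 1, 2, 3, 4]
print(list(range(2, 6)))        # [2, 3, 4, 5]
print(list(range(0, 10, 2)))    # [0, 2, 4, 6, 8]

# collect the even values instead of printing them
evens = []
for n in range(10):
    if n % 2 == 0:
        evens.append(n)
print(evens)                    # [0, 2, 4, 6, 8]
```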
"While" Loops
Sometimes you want to loop until some condition is met, but you do not know ahead of time how many steps this will take. For this we use a "while" loop. For example, we might want to find the first prime number above 1000. Here we first define a (totally inefficient) function to determine whether a number is prime, then perform a "while" loop that tests numbers above 1000.
def is_prime(x):
    for i in range(2,x):
        if x%i == 0:
            print(x, "has factor", i)
            return False
    return True

n = 1000
while not is_prime(n): n = n+1
print("First prime above 1000 =", n)
This code results in the following output:
1000 has factor 2
1001 has factor 7
1002 has factor 2
1003 has factor 17
1004 has factor 2
1005 has factor 3
1006 has factor 2
1007 has factor 19
1008 has factor 2
First prime above 1000 = 1009
I did not know that until this moment. Thank you, Python.
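If the "totally inefficient" part bothers you, here is one possible improvement, sketched as a hypothetical function is_prime_fast(). It is not the function used above: it uses a "while" loop internally and stops testing once the candidate factor exceeds the square root of x, since any larger factor would have a smaller partner that was already tested.

```python
def is_prime_fast(x):
    """Hypothetical faster variant: only test factors up to sqrt(x)."""
    if x < 2:
        return False
    i = 2
    # keep going only while i*i <= x, i.e. i <= sqrt(x)
    while i * i <= x:
        if x % i == 0:
            return False
        i = i + 1
    return True

n = 1000
while not is_prime_fast(n):
    n = n + 1
print("First prime above 1000 =", n)
```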
Comments (a.k.a. the most important things you'll ever finally learn to use a few years from now)
Until now we have only written code. The other thing we can (MUST!) include in scripts are comments. A comment is a portion of the script that Python will ignore. They provide information to the human being that is reading your code. This includes other lab members, your future self, and your present self.
However, I can guarantee that, no matter what I say, two things will happen to you:
- You will write code without comments, because what you are doing is so simple "it doesn't need comments".
- You will return to your code N months later, have no idea what the code is doing, and lose several days of progress trying to add one small feature.
This is a promise, and you will find it happening even for N ~ π. I have yet to find a way to sufficiently emphasize the importance of comments so as to avoid these events and for this I am sorry.
There are several ways to add comments in Python. The simplest is to type the pound sign "#" before some code, e.g.
# some notes about what is about to happen
print("some stuff") # some notes about this particular line
What this will do is simply print "some stuff". Everything after the # signs on each line is ignored by Python. You can also use the "#" symbol to temporarily disable some code without deleting it, and in Spyder, you can select a big block of code and comment all of it by typing ctrl-1.
Another way to add comments is to insert them in triple-quote strings (without saving the string to a variable):
"""
Some stuff Python will ignore.
You can write a few lines
or even some code:
x = 32
"""
print("some stuff.")
Everything but the print statement will be ignored. Typically, the “three quote” comment is used below a function definition, such as:
def f(x):
    """
    You can describe what the function does here. In
    this case, it returns the square of x.
    """
    return x*x
This is formally called a docstring, and the great part about this type of commenting is that the information in between the triple quotes will appear as the documentation of the function, either above the function as you are typing it or when you type help(f).
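The docstring is also stored on the function itself, so you can get at it from code. A minimal sketch, using a hypothetical function named square() and the standard-library inspect module:

```python
import inspect

def square(x):
    """
    Return the square of x.
    """
    return x * x

# the raw docstring, including the surrounding line breaks and indentation
raw = square.__doc__

# inspect.getdoc() returns the same text with the indentation cleaned up
clean = inspect.getdoc(square)
print(clean)
```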
For example, we should comment the "while loop" code above:
def is_prime(x):
    """
    Function that (in the silliest way possible)
    determines whether x is prime.
    """
    # loop over all possible factors and
    # check to see if it is not prime
    for i in range(2,x):
        # if there is no remainder, i is
        # a factor of x
        if x%i == 0:
            # print the naughty factor
            print(x, "has factor", i)
            return False
    # no factors found!
    return True

# look for the first prime number above 1000
n = 1000
while not is_prime(n):
    n = n+1
print("First prime above 1000 =", n)
Much easier to read!
Advice: write the comments describing what is about to happen before writing the code. You will spend a lot more time debugging code than writing code, so having clear, natural-language blocks of text will greatly simplify your experience. I typically try to write more comments than code, because it simply does not add any time to the coding process compared to learning how the actual code works or looking up functions online, etc. It also will save you factors of 10 in debugging time.
You'll see.
A few years from now, anyway.
Modules
Python's built-in functionality (some of which is described above) can be extended dramatically with "modules". For example, for very fast numerical calculations, people typically use the "numpy" module, which can be “imported” (after you install it, of course!) as follows:
In []: import numpy
The module itself is an object, and after you run the above command, you can access all the module's functionality with the dot ".", for example:
In []: numpy.linspace(0,10,3)
Out[]: array([ 0., 5., 10.])
In this case, we used the linspace() function to create a 3-element array (discussed below) spanning the range 0 to 10. There are a lot of modules. We can also write our own modules. If you don't like typing numpy all the time, you can shorten it:
In []: import numpy as np
In []: np.linspace(0, np.pi, 5)
Out[]: array([ 0. , 0.78539816, 1.57079633, 2.35619449, 3.14159265])
Or you can import everything into your “name space” so that you have access to it directly:
In []: from numpy import *
In []: linspace(0, pi, 5)
Out[]: array([ 0. , 0.78539816, 1.57079633, 2.35619449, 3.14159265])
I'm not a fan because it clutters the name space, making any modules I write difficult to navigate with the pop-up code completion.
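A middle ground is to import only the names you need. The sketch below shows the import styles side by side using the standard-library math module (chosen here only because it requires no installation; the same syntax works for numpy):

```python
import math                  # access everything as math.<name>
import math as m             # same module, shorter name
from math import pi, sqrt    # import only the names you need

print(math.sqrt(16.0))       # 4.0
print(m.pi == pi)            # True: all three refer to the same objects
print(sqrt(2.0))
```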
Third-party Python modules are typically installed in a single "site-packages" ("dist-packages" on some Linux systems) folder. To find the location of this folder on your computer, use the following commands:
In []: import sys
In []: for p in sys.path: print(p)
...
C:\Python27\lib\site-packages\spyderlib\utils\external
C:\Python27
C:\Python27\Scripts\
C:\Python27\Lib\site-packages
...
Note you'll have to push enter twice after the for loop. My actual list is larger than the one shown here, but you can quickly see where “site-packages” lives on my computer. Once you find this, remember it forever, because if you ever have to manually install a Python package (such as spinmob, described below), you will need to have access to this folder.
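If scanning sys.path by eye is tedious, here is an alternative sketch that asks Python directly. It assumes a reasonably recent Python and uses the standard-library sysconfig module; the json module is just an arbitrary example of something importable.

```python
import sysconfig
import json   # any importable module works here; json ships with Python

# directory where third-party pure-Python packages get installed
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)

# every imported module written in Python also knows which file it came from
print(json.__file__)
```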
The numpy module
Python (like Matlab) is an interpreted language, meaning it is super powerful and easy to use, but not compiled into fast machine-level code before executing, meaning it is slow. To perform fast calculations, it is almost always possible to formulate the problem in terms of “numpy arrays” (like those above), which are handled by fast machine-level code. For example, if I want to numerically integrate the function sin(x) from 0 to 1, I could do this:
import numpy as np

# create an array of a million x-values between 0 and 1
xs = np.arange(0.0, 1.0, 1e-6)

# use Python to loop over each value and add it to the sum:
# start with sum = 0
sum = 0

# loop over each element of the array
for x in xs:
    # add this value times dx to the total sum
    sum = sum + np.sin(x) * 1e-6

print(sum)
This works, and it is correct. However, on my laptop, this takes about 10 seconds, because it takes time for Python to interpret what amounts to a million lines of code! If instead I do the following:
sum = np.sum(np.sin(xs)) * 1e-6
I get the answer immediately because the underlying numpy code is blindingly fast. Note that when I type np.sin(xs), numpy loops over the entire array, taking sin() of each element, returning an array of the same size. The command np.sum() then tells numpy to add all the elements together.
In short, anything that can be done to an individual number in Python can also be done to a numpy array of any size or shape. See the numpy documentation for more details!
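To see the speed difference for yourself, here is a sketch that times both approaches and compares them to the exact answer, 1 - cos(1). It assumes numpy is installed, and it uses a coarser step (1e-5 rather than 1e-6) so the slow loop finishes quickly; the actual timings will vary from machine to machine.

```python
import time
import numpy as np

dx = 1e-5                          # coarser than in the text, to keep the loop short
xs = np.arange(0.0, 1.0, dx)

# slow way: Python loops over every element
start = time.perf_counter()
loop_total = 0.0
for x in xs:
    loop_total = loop_total + np.sin(x) * dx
loop_time = time.perf_counter() - start

# fast way: numpy does the looping in compiled code
start = time.perf_counter()
array_total = np.sum(np.sin(xs)) * dx
array_time = time.perf_counter() - start

exact = 1.0 - np.cos(1.0)          # analytic value of the integral
print("loop: ", loop_total, "in", loop_time, "seconds")
print("array:", array_total, "in", array_time, "seconds")
print("exact:", exact)
```

Both sums agree with each other and are close to the exact value; the small leftover difference comes from approximating the integral with rectangles of width dx.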
The scipy module
This is the other biggy. In this module you will find fast-compiled code for solving differential equations, taking smarter integrals than the one above, fitting, and a whole lot more. Definitely look into this module. Basically every well-established mathematical trick you can think of has already been ported to Python and lives in here somewhere.
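As one example of a "smarter integral", here is a minimal sketch (assuming scipy and numpy are installed) that integrates sin(x) from 0 to 1 with scipy.integrate.quad, which returns both the value and an estimate of its numerical error:

```python
import numpy as np
from scipy import integrate

# quad(function, lower limit, upper limit) returns (value, error estimate)
value, error_estimate = integrate.quad(np.sin, 0.0, 1.0)

exact = 1.0 - np.cos(1.0)   # analytic value of the integral
print("quad: ", value)
print("exact:", exact)
print("estimated error:", error_estimate)
```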