Python - JasonLocklin/jasonlocklin.github.com GitHub Wiki
Intro to Programming
- How to Think Like a Computer Scientist: a downloadable book that teaches with Python.
- Project Euler: for "learning while doing." A list of simple problems to solve computationally; each builds on the last and introduces new concepts.
Python Resources
- Getting started in Python: Getting started page.
- Official Documentation: Python's standard documentation is substantial. See also the complete list of documentation by Python version. If you're not sure where to go, try the help page.
- Dive into Python: a book available for free online that teaches the "Python" way of tackling typical programming tasks.
- Intro Video: This lecture is a fast-paced introduction to Python. It assumes that viewers have some previous experience of programming, and know at least a little about loops, lists, if statements, functions, and file I/O.
- Mathesaurus: Thesaurus-like references for those transitioning between R, Python(numpy), Matlab/Octave, and scilab.
- "Python for Data Analysis." The main documentation is here
Python in the Literature
- Python in Neuroscience Special Topic in Frontiers in Neuroinformatics. 2009.
Useful Packages
Python itself is rather minimal. It is the packages that extend the language and make it useful for so many things. Keeping track of important packages, where to find them, and what they are good for can be tricky. Here are some of the most useful ones:
Data Analysis
- IPython - extends the Python shell to turn it into a more powerful interactive analysis tool. It now includes a "Notebook interface" that provides a great way of writing a self-documenting analysis right in your web browser, effectively replacing a text editor or IDE and a console, and streamlining everything.
- Numpy - adds basic numerical programming to Python. Arrays, matrices, that sort of thing.
- Scipy - extends `Numpy` to do things like linear algebra, statistics, and other higher-level maths.
- Pandas - adds a convenient new data structure (`dataframe`) for holding data, along with methods for working with it, including some stats and summary functions.
- Matplotlib - adds plotting functionality. If using IPython, importing `pylab` brings in `matplotlib` tools automatically.
- Seaborn - turns the basic (Matlab-ugly) plots produced by `matplotlib` into aesthetically pleasing figures for publication.
- Rpy - basic interface for calling R commands or scripts.
- Spyder - a Python "integrated development environment" designed explicitly for the purpose of using the packages above.
- NeuroImaging in Python (a group of packages for brain imaging research)
Also see my R Python Cheatsheet
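As a rough illustration of the "arrays, matrices, that sort of thing" that Numpy adds, here is a minimal sketch. The numbers and variable names are made up for the example and do not come from any real dataset.

```python
import numpy as np

# A 1-D array: arithmetic applies element-wise, with no explicit loop.
reaction_times = np.array([0.42, 0.51, 0.38, 0.47])  # made-up values
centered = reaction_times - reaction_times.mean()

# A 2-D array used as a matrix.
m = np.array([[2.0, 0.0],
              [0.0, 4.0]])
v = np.array([1.0, 1.0])

product = m @ v                   # matrix-vector product
solution = np.linalg.solve(m, v)  # solves m @ x = v for x
```

Scipy and Pandas build on the same array type, so objects like these can usually be passed between the packages without conversion.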
More info about Pandas:
Pandas is very powerful, but suffers from "bleeding edge syndrome." The documentation is difficult to follow, and there are often various ways of doing things that don't seem to follow a consistent design.
- Normally imported with `import pandas as pd`.
- Debian testing has up-to-date versions of it.
- read_csv can directly read compressed csv files (neat). I use bz2.
- Data frame variables can be accessed with `df.variable` or `df['variable']`. Like R, the shorthand version should never be used for writing to data frames. I avoid the shorthand altogether outside of the interactive console.
- Unlike R, Pandas does things in an object-oriented way, i.e., `data.describe()` rather than `summary(data)`.
- Be careful writing to objects, as some functions work on copies of data while others do not, and it's far from obvious which do which. The documentation isn't perfectly clear, so always test that your commands are doing what you intend.
- Hierarchical indexing is very cool, but takes some practice.
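The notes above can be tried out with a small sketch like the following. The data frame contents, the column names (`subject`, `trial`, `rt`), and the file name are all invented for illustration, and the attribute-assignment behaviour shown is what recent pandas versions do; older releases may differ, so it is worth checking against your own installation.

```python
import tempfile
import warnings
from pathlib import Path

import pandas as pd

# Made-up data, for illustration only.
df = pd.DataFrame({
    "subject": [1, 1, 2, 2],
    "trial": [1, 2, 1, 2],
    "rt": [0.42, 0.51, 0.38, 0.47],
})

# Reading: attribute and bracket access return the same column.
same_column = df.rt.equals(df["rt"])

# Writing: bracket notation adds a real column.
df["rt_ms"] = df["rt"] * 1000

# Attribute assignment to a new name only sets a Python attribute;
# recent pandas versions warn about it, and no column is created.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df.accuracy = [1, 0, 1, 1]
created_column = "accuracy" in df.columns

# Object-oriented summary, the counterpart of R's summary(data).
rt_summary = df["rt"].describe()

# An explicit .copy() removes any doubt about whether edits reach df.
subset = df[df["subject"] == 1].copy()
subset["rt"] = 0.0

# Compression is inferred from the .bz2 extension on write and on read.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.csv.bz2"
    df.to_csv(path, index=False)
    restored = pd.read_csv(path)

# Hierarchical index: one level per key, selected with a tuple in .loc.
indexed = df.set_index(["subject", "trial"])
rt_subject2_trial1 = indexed.loc[(2, 1), "rt"]
```

Taking an explicit copy before modifying a subset is one way to sidestep the copy-versus-view ambiguity; it costs some memory but makes the intent obvious.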
Psychology Experiments
- Psychopy (Environment for creating psychology experiments)
- VisionEgg (Python Library for psychophysics experiments)
- Pylink (Eyelink II Python Interface)
Tips and Tricks
Memory leaks with fast loops.
I have observed several people run into issues with fast-looping psychology experiments. Two things generally cause this. One is assignment in the loop. If you make an assignment, say `a=1`, Python allocates memory, assigns it a value of 1, and creates a pointer called `a` that points to it. If you do the same assignment a second time, Python allocates memory again, assigns it a value of 1, and changes the `a` pointer to point there. Since nothing is pointing to the old memory location, it is marked for "garbage collection" and Python de-allocates it when it has some spare time. If you are doing this in a fast loop, you can imagine that those assignments add up. This is especially true in a fast loop where Python never has time to do its garbage collection.

If you are doing something like loading stimuli from files, or creating them, do so outside of the loop, and simply turn them on or off in the loop. If you must do calculations that involve assignments inside a loop, you may need to add a wait function of a few milliseconds to free up some CPU time for garbage collection.
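The advice above (build stimuli once, toggle them inside the loop, and optionally pause briefly) might look something like this sketch. The `Stimulus` class and `run_trials` function are hypothetical stand-ins, not part of Psychopy, VisionEgg, or any other library, and the 1 ms pause is an arbitrary choice.

```python
import time


class Stimulus:
    """Hypothetical stand-in for an image or sound loaded from disk."""

    created = 0  # counts constructions, to make allocations visible

    def __init__(self, name):
        Stimulus.created += 1
        self.name = name
        self.visible = False


def run_trials(stimuli, n_frames, pause_s=0.001):
    """Toggle pre-built stimuli each frame; no stimulus is constructed here."""
    shown = []
    for frame in range(n_frames):
        current = stimuli[frame % len(stimuli)]
        current.visible = True   # "turn on"
        shown.append(current.name)
        current.visible = False  # "turn off"
        time.sleep(pause_s)      # brief pause, as the text suggests
    return shown


# Build everything once, before the time-critical loop starts.
stimuli = [Stimulus(name) for name in ("fixation", "target", "mask")]
shown = run_trials(stimuli, n_frames=6)
```

After the run, `Stimulus.created` is still 3, which confirms that the loop reused the existing objects rather than building new ones on every frame.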