Python Notes - fcrimins/fcrimins.github.io GitHub Wiki

Pandas Cheat Sheet (1/30/17)

How Python Makes Working With Data More Difficult in the Long Run

  • "Let's consider two definitions of 'good code' so we can be clear what we mean by better.
    1. Code that is short, concise, and can be written quickly
    2. Code that is maintainable
  • If we're using the first definition, the Python version is "better". If we're using the second, it's far, far worse."
  • "I've painted a rather bleak picture of using Python to manipulate complex (and even not-so-complex) data structures in a maintainable way. In truth, however, it's a shortcoming shared by most dynamic languages. In the second half of this article, I'll describe what various people/companies are doing about it, from simple things like the movement towards 'live data in the editor' all the way to the Dropboxian 'type-annotate all the things' (Static Typing in Python). In short, there's a lot of interesting work going on in this space and lots of people are involved (notice the second presenter name [Guido] in that Dropbox deck)."
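
To make the "type-annotate all the things" idea concrete, here is a minimal sketch that contrasts a list of plain dicts with an annotated dataclass. The `Trade` class, its fields, and `notional_by_symbol` are hypothetical names invented for this note; they do not come from the article.

```python
from dataclasses import dataclass
from typing import Dict, List


# Hypothetical record type; the field names are illustrative only.
@dataclass(frozen=True)
class Trade:
    symbol: str
    quantity: int
    price: float


def notional_by_symbol(trades: List[Trade]) -> Dict[str, float]:
    """Sum quantity * price per symbol."""
    totals: Dict[str, float] = {}
    for t in trades:
        totals[t.symbol] = totals.get(t.symbol, 0.0) + t.quantity * t.price
    return totals


# The untyped equivalent: nothing documents which keys must exist.
raw_trades = [
    {"symbol": "ABC", "quantity": 10, "price": 2.5},
    {"symbol": "ABC", "quantity": 4, "price": 3.0},
    {"symbol": "XYZ", "quantity": 1, "price": 100.0},
]

# Converting at the boundary fails early if a key is missing or misspelled.
trades = [Trade(**row) for row in raw_trades]
totals = notional_by_symbol(trades)
```

With the annotated version, a static checker such as mypy can flag a misspelled attribute before the code runs, whereas a misspelled dict key only surfaces as a `KeyError` at runtime.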

Practical Machine Learning Tutorial with Python Introduction

Data Mining in Python: A Guide

  • Good overview of the tools and IPython Notebook

Vladimir Iakolev: Abusing annotations with dependency injection (8/7/16)

Why Python is Slow (7/5/16)

  • it's the C code that's slow, not the JIT interpreter

Asynchronous Programming with Python 3 (5/6/16)

  • Good explanation of the async and await keywords introduced in Python 3.5 (loosely comparable to working with Future in Java)
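
As a reminder of what the two keywords look like in practice, here is a minimal sketch. The `fetch` coroutine and its delays are made up for illustration, and `asyncio.run` assumes Python 3.7 or later (on 3.5 and 3.6 you would drive the event loop by hand).

```python
import asyncio


async def fetch(name: str, delay: float) -> str:
    # asyncio.sleep stands in for a real I/O wait (network, disk).
    await asyncio.sleep(delay)
    return f"{name} done"


async def main() -> list:
    # gather runs both coroutines concurrently and returns results
    # in argument order, regardless of which one finishes first.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


results = asyncio.run(main())
```

`async def` marks a function as a coroutine, and `await` hands control back to the event loop until the awaited operation completes.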

Jamal Moir: An Introduction to Scientific Python (and a Bit of the Maths Behind It) - Matplotlib (4/28/16)

A Speed Comparison Of C, Julia, Python, Numba, and Cython on LU Factorization (4/15/16)

  • bottom line: use scipy
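
For reference, a minimal sketch of what "use scipy" looks like for LU factorization, assuming SciPy and NumPy are installed. The 2x2 matrix is an arbitrary example, not one from the article.

```python
import numpy as np
from scipy.linalg import lu

# Arbitrary small matrix, chosen only for illustration.
A = np.array([[4.0, 3.0],
              [6.0, 3.0]])

# lu returns a permutation matrix P, a unit lower-triangular L,
# and an upper-triangular U such that A = P @ L @ U.
P, L, U = lu(A)

reconstructed = P @ L @ U
```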

How does Python compare to C#? (1/11/16)

Machine Learning in Python

  • Includes an intro to Pandas, Matplotlib, and Scikit-Learn

Learn Python interactively with IPython - A Complete Tutorial!

Probability, Paradox, and the Reasonable Person Principle (in IPython Notebook, by Peter Norvig)

Straightening Loops: How to Vectorize Data Aggregation with pandas and NumPy

  • To summarize, ranked from fastest to slowest at summing a list: NumPy ndarray sum, pandas Series sum, standard library sum, for loop, standard library reduce.
  • DataFrame methods for aggregation and grouping are typically faster to run and quicker to write than the equivalent standard library implementation with loops. Where performance is a serious consideration, NumPy ndarray methods can be as much as an order of magnitude faster than DataFrame methods and the standard library.
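
A rough way to check the ranking on your own machine is sketched below, assuming NumPy and pandas are installed. The input size and repeat count are arbitrary choices, and the actual timings (and possibly the ordering) will depend on hardware and library versions.

```python
import operator
import timeit
from functools import reduce

import numpy as np
import pandas as pd

values = list(range(100_000))
# int64 is set explicitly so the total cannot overflow a 32-bit default.
arr = np.array(values, dtype=np.int64)
series = pd.Series(arr)


def loop_sum(xs):
    total = 0
    for x in xs:
        total += x
    return total


candidates = {
    "ndarray.sum": lambda: arr.sum(),
    "Series.sum": lambda: series.sum(),
    "builtin sum": lambda: sum(values),
    "for loop": lambda: loop_sum(values),
    "functools.reduce": lambda: reduce(operator.add, values),
}

# Confirm every approach computes the same total before timing it.
results = {name: fn() for name, fn in candidates.items()}
timings = {name: timeit.timeit(fn, number=20) for name, fn in candidates.items()}

# Print from fastest to slowest.
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:18s} {seconds:.4f}s")
```

Note that the NumPy and pandas timings here exclude the cost of building the array or Series from the list; for a one-off sum of an existing Python list, that conversion can outweigh the gain.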

Email: OOC Dataframes

Email: (no subject)