07 Python Data Science Toolbox - HannaAA17/Data-Scientist-With-Python-datacamp GitHub Wiki

01 Writing functions
02 Default arguments, variable-length arguments and scope
03 Lambda functions and error-handling
04 Using iterators in PythonLand
05 List comprehensions and generators

01 Writing functions

Multiple parameters and return values

  • Tuples.
    • Like a list.
    • Immutable.
    • Constructed using parentheses () : even_nums = (2, 4, 6)
  • Unpacking tuples:
even_nums = (2, 4, 6)
a, b, c = even_nums
  • accessing tuple elements: print(even_nums[0])
  • an example: Functions that return multiple values
# Define shout_all with parameters word1 and word2
def shout_all(word1, word2):
    
    # Concatenate word1 with '!!!': shout1
    shout1 = word1 + '!!!'
    
    # Concatenate word2 with '!!!': shout2
    shout2 = word2 + '!!!'
    
    # Construct a tuple with shout1 and shout2: shout_words
    shout_words = (shout1, shout2) 

    # Return shout_words
    return shout_words  # "return shout1, shout2" would do the same thing

# Pass 'congratulations' and 'you' to shout_all(): yell1, yell2
yell1, yell2 = shout_all('congratulations', 'you')

# Print yell1 and yell2
print(yell1)
print(yell2)
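
To illustrate the "immutable" point in the list above, here is a minimal sketch contrasting item assignment on a list and on a tuple. The names nums_list, nums_tuple and tuple_changed are illustrative and not from the course.

```python
# Illustrative names; not part of the original notes.
nums_list = [2, 4, 6]
nums_tuple = (2, 4, 6)

# Lists are mutable, so item assignment works.
nums_list[0] = 10

# Tuples are immutable, so the same assignment raises TypeError.
try:
    nums_tuple[0] = 10
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(nums_list)      # [10, 4, 6]
print(nums_tuple)     # (2, 4, 6)
print(tuple_changed)  # False
```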

02 Default arguments, variable-length arguments and scope

Global vs. local scope

  • Use the keyword global within a function to alter the value of a variable defined in the global scope.
new_val = 10

def square(value):
    """Square the global new_val and return it (value is unused)."""
    global new_val
    new_val = new_val ** 2
    return new_val
square(3) # output: 100  
new_val # output: 100
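
For contrast, a small sketch of what happens without the global keyword: the assignment creates a new local variable and leaves the global one untouched. The names counter, set_local and set_global are hypothetical.

```python
counter = 10

def set_local():
    """Assign to a local name; the global counter is untouched."""
    counter = 99  # new local variable that shadows the global
    return counter

def set_global():
    """Rebind the global counter using the global keyword."""
    global counter
    counter = 99
    return counter

local_result = set_local()
after_local = counter      # the global is still 10 here

global_result = set_global()
after_global = counter     # the global has been rebound to 99

print(local_result, after_local)    # 99 10
print(global_result, after_global)  # 99 99
```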

Nested Functions

  • to avoid writing out the same computations within functions repeatedly
  • closure
    • the nested or inner function remembers the state of its enclosing scope when called
  • the keyword nonlocal can be used to alter the value of a variable defined in the enclosing scope
# Define echo
def echo(n):
    """Return the inner_echo function."""

    # Define inner_echo
    def inner_echo(word1):
        """Concatenate n copies of word1."""
        echo_word = word1 * n
        return echo_word

    # Return inner_echo
    return inner_echo

# Call echo: twice
twice = echo(2)

# Call echo: thrice
thrice = echo(3)

# Call twice() and thrice() then print
print(twice('hello'), thrice('hello')) # hellohello hellohellohello
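
The echo example reads n from the enclosing scope but never rebinds it. A minimal sketch of the nonlocal keyword mentioned above, using a hypothetical make_counter function, could look like this:

```python
def make_counter():
    """Return a function that counts how many times it has been called."""
    count = 0

    def increment():
        # nonlocal lets the inner function rebind count in make_counter's scope
        nonlocal count
        count += 1
        return count

    return increment

counter_a = make_counter()
counter_b = make_counter()

first = counter_a()
second = counter_a()
other = counter_b()  # each closure keeps its own count

print(first, second, other)  # 1 2 1
```

Without the nonlocal statement, count += 1 would raise UnboundLocalError, because the assignment would make count local to increment.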

Default and flexible arguments

  • Functions with variable-length arguments (*args)
# Define gibberish
def gibberish(*args):
    """Concatenate strings in *args together."""

    # Initialize an empty string: hodgepodge
    hodgepodge=''

    # Concatenate the strings in args
    for word in args:
        hodgepodge += word

    # Return hodgepodge
    return hodgepodge

# Call gibberish() with one string: one_word
one_word = gibberish('luke')

# Call gibberish() with five strings: many_words
many_words = gibberish("luke", "leia", "han", "obi", "darth")

# Print one_word and many_words
print(one_word)
print(many_words)
  • Functions with variable-length keyword arguments (**kwargs)
    • kwargs is a dictionary : keyword-value
# Define report_status
def report_status(**kwargs):
    """Print out the status of a movie character."""

    print("\nBEGIN: REPORT\n")

    # Iterate over the key-value pairs of kwargs
    for key, value in kwargs.items():
        # Print out the keys and values, separated by a colon ':'
        print(key + ": " + value)

    print("\nEND REPORT")
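
The section title also mentions default arguments, which the snippets above do not show. The following is a sketch only: shout_echo and describe are illustrative names, and describe returns its lines instead of printing them so the result is easy to inspect.

```python
def shout_echo(word1, echo=1):
    """Repeat word1 echo times and append '!!!'."""
    return word1 * echo + '!!!'

no_echo = shout_echo('Hey')            # falls back to the default echo=1
with_echo = shout_echo('Hey', echo=3)  # overrides the default

print(no_echo)    # Hey!!!
print(with_echo)  # HeyHeyHey!!!

def describe(**kwargs):
    """Return 'key: value' lines built from the keyword arguments."""
    return [key + ': ' + value for key, value in kwargs.items()]

# Keyword arguments arrive in the function as a dictionary
lines = describe(name='luke', affiliation='jedi')
print(lines)  # ['name: luke', 'affiliation: jedi']
```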

03 Lambda functions and error-handling

Lambda functions

  • Some function definitions are simple enough that they can be converted to a lambda function.
raise_to_power = lambda x, y: x ** y

raise_to_power(2,3) #8
  • The best use case for lambda functions, however, is embedding simple functionality anonymously within larger expressions.
  • use in map(func, seq)
    • map() applies the function to all the elements in the given sequence
# Create a list of strings: spells
spells = ["protego", "accio", "expecto patronum", "legilimens"]

# Use map() to apply a lambda function over spells: shout_spells
shout_spells = map(lambda item: item + '!!!', spells)

# Convert shout_spells to a list: shout_spells_list
shout_spells_list = list(shout_spells)

# Print the result
print(shout_spells_list)
  • use in filter(func, seq)
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'pippin', 'aragorn', 'boromir', 'legolas', 'gimli', 'gandalf']

# Use filter() to apply a lambda function over fellowship: result
result = filter(lambda x: len(x)>6, fellowship)

# Convert result to a list: result_list
result_list = list(result)

# Print result_list
print(result_list) #['samwise', 'aragorn', 'boromir', 'legolas', 'gandalf']
  • use in reduce(func, seq): applies the function cumulatively to the elements and returns a single value.
# Import reduce from functools
from functools import reduce

# Create a list of strings: stark
stark = ['robb', 'sansa', 'arya', 'brandon', 'rickon']

# Use reduce() to apply a lambda function over stark: result
result = reduce(lambda x,y : x+y, stark)

# Print the result
print(result) #   robbsansaaryabrandonrickon
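
Another place where an anonymous function fits inside a larger expression is the key argument of sorted() and max(). This is a sketch that reuses the spells list from above; the sort criteria are chosen only for illustration.

```python
spells = ["protego", "accio", "expecto patronum", "legilimens"]

# Sort by the last letter of each spell using an anonymous key function
by_last_letter = sorted(spells, key=lambda spell: spell[-1])
print(by_last_letter)  # ['expecto patronum', 'protego', 'accio', 'legilimens']

# Pick the longest spell; the lambda is evaluated once per element
longest = max(spells, key=lambda spell: len(spell))
print(longest)  # expecto patronum
```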

Error handling

  • try-except
def sqrt(x): 
    """Returns the square root of a number."""

    try:
        return x ** 0.5
    except:
        print('x must be a float or an integer')
  • can also specify the error type
def sqrt(x): 
    """Returns the square root of a number."""

    try:
        return x ** 0.5
    except TypeError:
        print('x must be a float or an integer')
  • raise an error
def sqrt(x): 
    """Returns the square root of a number."""
    try:
        # The comparison sits inside try so that a non-numeric x is also caught
        if x < 0:
            raise ValueError('x must be non-negative.')
        return x ** 0.5
    except TypeError:
        print('x must be a float or an integer')
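
On the calling side, a raised error can be caught by type. The sketch below uses a hypothetical checked_sqrt function (a variant that raises for both kinds of bad input instead of printing) so that it runs on its own.

```python
def checked_sqrt(x):
    """Return the square root of x, raising on invalid input."""
    if not isinstance(x, (int, float)):
        raise TypeError('x must be a float or an integer')
    if x < 0:
        raise ValueError('x must be non-negative.')
    return x ** 0.5

results = []
for candidate in [9, -4, 'hello']:
    try:
        results.append(checked_sqrt(candidate))
    except ValueError as err:
        results.append('ValueError: ' + str(err))
    except TypeError as err:
        results.append('TypeError: ' + str(err))

print(results)
# [3.0, 'ValueError: x must be non-negative.', 'TypeError: x must be a float or an integer']
```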

04 Using iterators in PythonLand

Iterators vs. Iterables

  • Iterable
    • Examples: lists, strings, dictionaries, file connections
    • An object with an associated __iter__() method, which the built-in iter() calls
    • Applying iter() to an iterable creates an iterator
  • Iterator
    • Produces next value with next()
  • Iterating over iterables
# Create a list of strings: flash
flash = ['jay garrick', 'barry allen', 'wally west', 'bart allen']

# Print each list item in flash using a for loop
for person in flash:
    print(person)

# Create an iterator for flash: superhero
superhero = iter(flash)

# Print each item from the iterator
print(next(superhero))
print(next(superhero))
print(next(superhero))
print(next(superhero)) # same output
  • Iterating all at once with *; this exhausts the iterator, so to repeat we must define it = iter(word) again
word = 'Data'
it = iter(word)
print(*it)   # D a t a
  • Iterating with range() function
    • range() doesn't actually create the list; instead, it creates a range object with an iterator that produces the values until it reaches the limit.
# Create an iterator for range(10 ** 100): googol
googol = iter(range(10 ** 100))

# Print the first 5 values from googol
print(next(googol))
print(next(googol))
print(next(googol))
print(next(googol))
print(next(googol))
  • Iterating over dictionaries
for key, value in dic.items():
    print(key, value)
  • Iterating over file connection
# The with block closes the file once we are done
with open('file.txt') as file:
    it = iter(file)
    print(next(it))
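
What happens once an iterator runs out is not shown above. A minimal sketch, using a shortened version of the flash list:

```python
flash = ['jay garrick', 'barry allen']
superhero = iter(flash)

first = next(superhero)
second = next(superhero)

# The iterator is now exhausted: a further next() raises StopIteration
try:
    next(superhero)
    exhausted = False
except StopIteration:
    exhausted = True

# next() accepts a default that is returned instead of raising
fallback = next(superhero, 'no more heroes')

# The list itself is unaffected; iter() returns a fresh iterator each time
fresh = list(iter(flash))

print(first, second, exhausted, fallback)
print(fresh)
```

A for loop handles StopIteration internally, which is why it simply stops at the end of the iterable.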

enumerate() and zip()

  • enumerate() returns an enumerate object that produces a sequence of tuples, and each of the tuples is an index-value pair

# Create a list of strings: mutants
mutants = ['charles xavier', 
            'bobby drake', 
            'kurt wagner', 
            'max eisenhardt', 
            'kitty pryde']

# Create a list of tuples: mutant_list
mutant_list = list(enumerate(mutants))

# Print the list of tuples
print(mutant_list)   # [(0, 'charles xavier'), (1, 'bobby drake'), (2, 'kurt wagner'), (3, 'max eisenhardt'), (4, 'kitty pryde')]

# Unpack and print the tuple pairs
for index1,value1 in enumerate(mutants):
    print(index1, value1)

# Change the start index
for index2,value2 in enumerate(mutants,start=1):
    print(index2, value2)
# Output of the second loop:
# 1 charles xavier
# 2 bobby drake
# 3 kurt wagner
# 4 max eisenhardt
# 5 kitty pryde
  • zip()
    • takes any number of iterables and returns a zip object that is an iterator of tuples.
    • To print the values of a zip object, convert it to a list first or unpack it. Printing the zip object itself shows only its type and memory address, not the values.
# Create a list of tuples: mutant_data
mutant_data = list(zip(mutants, aliases, powers))

# Print the list of tuples
print(mutant_data)

# Create a zip object using the three lists: mutant_zip
mutant_zip = zip(mutants, aliases, powers)

# Print the zip object 
print(mutant_zip)  # <zip object at 0x7f3e2856bfc8>

# Unpack the zip object and print the tuple values
for value1, value2, value3 in mutant_zip:
    print(value1, value2, value3)
  • Using * and zip to unzip
# Create a zip object from mutants and powers: z1
z1 = zip(mutants, powers)

# Print the tuples in z1 by unpacking with *
print(*z1)  # ('charles xavier', 'telepathy') ('bobby drake', 'thermokinesis') ('kurt wagner', 'teleportation') ('max eisenhardt', 'magnetokinesis') ('kitty pryde', 'intangibility')

# Re-create a zip object from mutants and powers: z1
z1 = zip(mutants, powers)

# 'Unzip' the tuples in z1 by unpacking with * and zip(): result1, result2
result1, result2 = zip(*z1)

# Check if unpacked tuples are equivalent to the originals
# (zip(*z1) yields tuples, so convert the originals before comparing)
print(result1 == tuple(mutants))
print(result2 == tuple(powers))
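
The zip snippets above rely on lists defined elsewhere in the course. Here is a self-contained sketch with shortened mutants and powers lists (values taken from the output shown earlier), which also shows that a zip object can be traversed only once:

```python
mutants = ['charles xavier', 'bobby drake', 'kurt wagner']
powers = ['telepathy', 'thermokinesis', 'teleportation']

z1 = zip(mutants, powers)
pairs = list(z1)      # consumes z1
leftover = list(z1)   # empty, because the zip object is already exhausted

# 'Unzip' the pairs back into two tuples
names, abilities = zip(*pairs)

print(pairs[0])                   # ('charles xavier', 'telepathy')
print(leftover)                   # []
print(names == tuple(mutants))    # True
print(names == mutants)           # False: a tuple never equals a list
```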

Using iterators to load large files into memory

  • load data in chunks: pd.read_csv(filename, chunksize=)
    • each chunk is a DataFrame
# Import pandas, which provides read_csv
import pandas as pd

# Initialize an empty dictionary: counts_dict
counts_dict = {}

# Iterate over the file chunk by chunk
for chunk in pd.read_csv('tweets.csv', chunksize=10):

    # Iterate over the column in DataFrame
    for entry in chunk['lang']:
        if entry in counts_dict:
            counts_dict[entry] += 1
        else:
            counts_dict[entry] = 1

# Print the populated dictionary
print(counts_dict)
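
To make the chunking idea concrete without needing tweets.csv, here is a standard-library analogue. It is not the pandas API: read_in_chunks is a hypothetical helper built on csv.DictReader and itertools.islice, and the CSV text is made-up sample data.

```python
import csv
import io
from collections import Counter
from itertools import islice

# Made-up in-memory stand-in for a CSV file with a 'lang' column
csv_text = "lang,text\nen,hello\nen,hi\net,tere\nund,x\nen,hey\n"

def read_in_chunks(file_obj, chunk_size):
    """Yield lists of at most chunk_size rows (as dicts) from a CSV file object."""
    reader = csv.DictReader(file_obj)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            break
        yield chunk

counts = Counter()
chunk_sizes = []
for chunk in read_in_chunks(io.StringIO(csv_text), chunk_size=2):
    chunk_sizes.append(len(chunk))
    counts.update(row['lang'] for row in chunk)

print(chunk_sizes)   # [2, 2, 1]
print(dict(counts))  # {'en': 3, 'et': 1, 'und': 1}
```

Only one chunk of rows is held in memory at a time, which is the same reason for passing chunksize to pd.read_csv.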

05 List comprehensions and generators

List comprehensions

  • Collapse for loops for building lists into a single line
  • Components
    • Iterable
    • Iterator variable (represent members of iterable)
    • Output expression
nums = [12, 8, 21, 3, 16]
new_nums = [num + 1 for num in nums]
print(new_nums)
  • Nested list comprehension
pairs = [(num1, num2) for num1 in range(0, 2) for num2 in range(6, 8)]  # [(0, 6), (0, 7), (1, 6), (1, 7)]
# Create a 5 x 5 matrix using a list of lists: matrix
matrix = [[col for col in range(5)] for row in range(5)]

# Print the matrix
for row in matrix:
    print(row)
# Output:
# [0, 1, 2, 3, 4]
# [0, 1, 2, 3, 4]
# [0, 1, 2, 3, 4]
# [0, 1, 2, 3, 4]
# [0, 1, 2, 3, 4]

Advanced comprehensions

  • Conditionals on the iterable
    • [num ** 2 for num in range(10) if num % 2 == 0]
  • Conditionals on the output expression
    • [num ** 2 if num %2 == 0 else 0 for num in range(10)]
  • Dict comprehensions
    • create dictionaries, use {} instead of [] and separate key and value with a colon
    • pos_neg = {num: -num for num in range(9)}
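
The three forms above can be run side by side. This sketch uses a smaller range for the dict comprehension to keep the output short.

```python
# Conditional on the iterable: odd numbers are filtered out
evens_squared = [num ** 2 for num in range(10) if num % 2 == 0]
print(evens_squared)    # [0, 4, 16, 36, 64]

# Conditional on the output expression: every number produces an entry
squares_or_zero = [num ** 2 if num % 2 == 0 else 0 for num in range(10)]
print(squares_or_zero)  # [0, 0, 4, 0, 16, 0, 36, 0, 64, 0]

# Dict comprehension: curly braces and key: value pairs
pos_neg = {num: -num for num in range(4)}
print(pos_neg)          # {0: 0, 1: -1, 2: -2, 3: -3}
```

Note the difference in length: the filter form drops elements, while the if-else form keeps one output per input.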

Intro to generator expressions

  • use () instead of []
  • returns a generator object, no need to store the entire list in the memory
  • can be iterated over

Print values from generator

result = (num for num in range(6))
for num in result:
    print(num)
# Output:
# 0
# 1
# 2
# 3
# 4
# 5
result = (num for num in range(6))
print(list(result)) # [0, 1, 2, 3, 4, 5]
result = (num for num in range(6))
print(next(result)) # 0
print(next(result)) # 1
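
A rough way to see the memory point is to compare the container sizes reported by sys.getsizeof. This is a sketch: the exact byte counts depend on the Python build, so only the comparison is printed.

```python
import sys

squares_list = [num ** 2 for num in range(10000)]
squares_gen = (num ** 2 for num in range(10000))

# The list stores references to all 10000 results; the generator stores only its state
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(gen_size < list_size)  # True

# Values are produced on demand as sum() pulls them
total = sum(squares_gen)
print(total == sum(squares_list))  # True

# A generator can be consumed only once
second_pass = sum(squares_gen)
print(second_pass)  # 0
```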

Generator functions

  • produces generator objects when called
  • Defined with def, like a regular function
  • Yields a sequence of values instead of returning a single value
  • Generates a value with yield keyword
def num_sequence(n):
    """Generate values from 0 up to, but not including, n."""
    i = 0
    while i < n:
        yield i
        i += 1
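
A short usage sketch for the generator function above. The definition is repeated so the snippet runs on its own; gen, first and rest are illustrative names.

```python
def num_sequence(n):
    """Generate values from 0 up to, but not including, n."""
    i = 0
    while i < n:
        yield i
        i += 1

# Calling the function runs none of its body yet; it returns a generator object
gen = num_sequence(3)
print(type(gen).__name__)  # generator

# Each next() runs the body up to the next yield
first = next(gen)
print(first)  # 0

# The remaining values can be collected like any other iterable
rest = list(gen)
print(rest)   # [1, 2]
```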