07 Python Data Science Toolbox - HannaAA17/Data-Scientist-With-Python-datacamp GitHub Wiki
01 Writing functions
02 Default arguments, variable-length arguments and scope
03 Lambda functions and error-handling
04 Using iterators in PythonLand
05 List comprehensions and generators
01 Writing functions
Multiple parameters and return values
- Tuples.
- Like a list.
- Immutable.
- Constructed using parentheses () :
even_nums = (2, 4, 6)
- Unpacking tuples:
even_nums = (2, 4, 6)
a, b, c = even_nums
- accessing tuple elements:
print(even_nums[0])
- an example: Functions that return multiple values
# Define shout_all with parameters word1 and word2
def shout_all(word1, word2):
    # Concatenate word1 with '!!!': shout1
    shout1 = word1 + '!!!'
    # Concatenate word2 with '!!!': shout2
    shout2 = word2 + '!!!'
    # Construct a tuple with shout1 and shout2: shout_words
    shout_words = (shout1, shout2)
    # Return shout_words
    return shout_words  # `return shout1, shout2` would do the same thing
# Pass 'congratulations' and 'you' to shout_all(): yell1, yell2
yell1, yell2 = shout_all('congratulations', 'you')
# Print yell1 and yell2
print(yell1)
print(yell2)
02 Default arguments, variable-length arguments and scope
Global vs. local scope
- Use the keyword global within a function to alter the value of a variable defined in the global scope.
new_val = 10
def square(value):
    """Square the global new_val in place (the value argument is ignored)."""
    global new_val
    new_val = new_val ** 2
    return new_val
square(3) # output: 100
new_val # output: 100
Nested Functions
- Nested functions avoid writing out the same computations within a function repeatedly
- Closure: the nested or inner function remembers the state of its enclosing scope when called
- The keyword nonlocal can be used to alter the value of a variable defined in the enclosing scope
# Define echo
def echo(n):
    """Return the inner_echo function."""
    # Define inner_echo
    def inner_echo(word1):
        """Concatenate n copies of word1."""
        echo_word = word1 * n
        return echo_word
    # Return inner_echo
    return inner_echo
# Call echo: twice
twice = echo(2)
# Call echo: thrice
thrice = echo(3)
# Call twice() and thrice() then print
print(twice('hello'), thrice('hello')) # hellohello hellohellohello
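The echo example shows a closure but never needs nonlocal, because inner_echo only reads n. A minimal sketch of the nonlocal case follows; the names make_counter, increment and count are hypothetical and chosen only for illustration.

```python
def make_counter():
    """Return a function that counts how many times it has been called."""
    count = 0  # lives in the enclosing scope of increment()

    def increment():
        # Without nonlocal, `count += 1` would raise UnboundLocalError,
        # because assignment would make count local to increment().
        nonlocal count
        count += 1
        return count

    return increment


counter = make_counter()
print(counter(), counter(), counter())  # 1 2 3
```

Each call to make_counter() creates a fresh enclosing scope, so two counters built this way do not share their count.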
Default and flexible arguments
- Functions with variable-length arguments (*args)
# Define gibberish
def gibberish(*args):
    """Concatenate strings in *args together."""
    # Initialize an empty string: hodgepodge
    hodgepodge = ''
    # Concatenate the strings in args
    for word in args:
        hodgepodge += word
    # Return hodgepodge
    return hodgepodge
# Call gibberish() with one string: one_word
one_word = gibberish('luke')
# Call gibberish() with five strings: many_words
many_words = gibberish("luke", "leia", "han", "obi", "darth")
# Print one_word and many_words
print(one_word)
print(many_words)
- Functions with variable-length keyword arguments (**kwargs)
kwargs is a dictionary of keyword-value pairs
# Define report_status
def report_status(**kwargs):
    """Print out the status of a movie character."""
    print("\nBEGIN: REPORT\n")
    # Iterate over the key-value pairs of kwargs
    for key, value in kwargs.items():
        # Print out the keys and values, separated by a colon ':'
        print(key + ": " + value)
    print("\nEND REPORT")
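The notes above define functions with *args and **kwargs but show neither a default argument nor a call that passes keyword arguments. The sketch below fills that gap; shout_echo and format_status are hypothetical helper names, not functions from the notes.

```python
def shout_echo(word, echo=1):
    """Repeat word `echo` times and append '!!!'. echo defaults to 1."""
    return word * echo + '!!!'


def format_status(**kwargs):
    """Return one 'key: value' string per keyword argument."""
    return [key + ": " + value for key, value in kwargs.items()]


print(shout_echo('Hey'))          # Hey!!!
print(shout_echo('Hey', echo=3))  # HeyHeyHey!!!

# Keyword arguments arrive inside the function as a dictionary
print(format_status(name="luke", affiliation="jedi", status="missing"))
# ['name: luke', 'affiliation: jedi', 'status: missing']
```

A parameter with a default value can be omitted by the caller, while **kwargs accepts any number of keyword names that the function did not declare in advance.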
03 Lambda functions and error-handling
Lambda functions
- Some function definitions are simple enough that they can be converted to a lambda function.
raise_to_power = lambda x, y: x ** y
raise_to_power(2, 3) # 8
- The best use case for lambda functions, however, is when you want these simple functionalities to be anonymously embedded within larger expressions.
- Use in map(func, seq): map() applies the function to all the elements in the given sequence
# Create a list of strings: spells
spells = ["protego", "accio", "expecto patronum", "legilimens"]
# Use map() to apply a lambda function over spells: shout_spells
shout_spells = map(lambda item: item + '!!!', spells)
# Convert shout_spells to a list: shout_spells_list
shout_spells_list = list(shout_spells)
# Print the result
print(shout_spells_list)
- Use in filter(func, seq): filter() keeps only the elements for which the function returns a true value
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'pippin', 'aragorn', 'boromir', 'legolas', 'gimli', 'gandalf']
# Use filter() to apply a lambda function over fellowship: result
result = filter(lambda x: len(x)>6, fellowship)
# Convert result to a list: result_list
result_list = list(result)
# Print result_list
print(result_list) #['samwise', 'aragorn', 'boromir', 'legolas', 'gandalf']
- Use in reduce(func, seq): reduce() returns a single value as a result.
# Import reduce from functools
from functools import reduce
# Create a list of strings: stark
stark = ['robb', 'sansa', 'arya', 'brandon', 'rickon']
# Use reduce() to apply a lambda function over stark: result
result = reduce(lambda x,y : x+y, stark)
# Print the result
print(result) # robbsansaaryabrandonrickon
Error handling
try-except
def sqrt(x):
    """Returns the square root of a number."""
    try:
        return x ** 0.5
    except:
        print('x must be a float or an integer')
- can also specify the error type
def sqrt(x):
    """Returns the square root of a number."""
    try:
        return x ** 0.5
    except TypeError:
        print('x must be a float or an integer')
- raise an error
def sqrt(x):
    """Returns the square root of a number."""
    if x < 0:
        raise ValueError('x must be non-negative.')
    try:
        return x ** 0.5
    except:
        print('x must be a float or an integer')
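To see how the two mechanisms interact, here is a small sketch that combines them and then calls the function with good and bad input. The name safe_sqrt is hypothetical, and placing the negative check inside the try block is an assumption made here so that a non-numeric argument is also caught.

```python
def safe_sqrt(x):
    """Return the square root of x, or None if x is not a number."""
    try:
        if x < 0:  # comparing a str with 0 raises TypeError
            raise ValueError('x must be non-negative.')
        return x ** 0.5
    except TypeError:
        print('x must be a float or an integer')
        return None


print(safe_sqrt(9))        # 3.0
print(safe_sqrt('hello'))  # prints the message, then None

# ValueError is not handled inside safe_sqrt, so the caller must catch it
try:
    safe_sqrt(-4)
except ValueError as err:
    print('Caught:', err)  # Caught: x must be non-negative.
```

Naming the exception type in except keeps unrelated errors, such as the ValueError raised on purpose, from being silently swallowed.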
04 Using iterators in PythonLand
Iterators vs. Iterables
- Iterable
  - Examples: lists, strings, dictionaries, file connections
  - An object with an associated iter() method
  - Applying iter() to an iterable creates an iterator
- Iterator
  - Produces next value with next()
- Iterating over iterables
# Create a list of strings: flash
flash = ['jay garrick', 'barry allen', 'wally west', 'bart allen']
# Print each list item in flash using a for loop
for person in flash:
    print(person)
# Create an iterator for flash: superhero
superhero = iter(flash)
# Print each item from the iterator
print(next(superhero))
print(next(superhero))
print(next(superhero))
print(next(superhero)) # same output
- Iterating all at once with *: this exhausts the iterator, so to repeat we must define it = iter(word) again
word = 'Data'
it = iter(word)
print(*it) # D a t a
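The point about having to re-create the iterator can be checked directly. The sketch below uses list() in place of * so the results can be inspected; first_pass and second_pass are illustrative names.

```python
word = 'Data'
it = iter(word)

first_pass = list(it)   # consumes every remaining value
second_pass = list(it)  # the same iterator has nothing left
print(first_pass)       # ['D', 'a', 't', 'a']
print(second_pass)      # []

# Calling next() on an exhausted iterator raises StopIteration
try:
    next(it)
except StopIteration:
    print('iterator is exhausted')

# A fresh iterator starts over from the beginning
it = iter(word)
print(next(it))         # D
```

The underlying string is untouched; only the iterator object keeps track of a position, which is why a new call to iter() is enough to start again.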
- Iterating with the range() function: range() doesn't actually create the list; instead, it creates a range object with an iterator that produces the values until it reaches the limit.
# Create an iterator for range(10 ** 100): googol
googol = iter(range(10 ** 100))
# Print the first 5 values from googol
print(next(googol))
print(next(googol))
print(next(googol))
print(next(googol))
print(next(googol))
- Iterating over dictionaries
for key, value in dic.items():
    print(key, value)
- Iterating over file connection
file = open('file.txt')
it = iter(file)
print(next(it))
enumerate() and zip()
enumerate() returns an enumerate object that produces a sequence of tuples; each tuple is an index-value pair
# Create a list of strings: mutants
mutants = ['charles xavier',
           'bobby drake',
           'kurt wagner',
           'max eisenhardt',
           'kitty pryde']
# Create a list of tuples: mutant_list
mutant_list = list(enumerate(mutants))
# Print the list of tuples
print(mutant_list) # [(0, 'charles xavier'), (1, 'bobby drake'), (2, 'kurt wagner'), (3, 'max eisenhardt'), (4, 'kitty pryde')]
# Unpack and print the tuple pairs
for index1, value1 in enumerate(mutants):
    print(index1, value1)
# Change the start index
for index2, value2 in enumerate(mutants, start=1):
    print(index2, value2)
1 charles xavier
2 bobby drake
3 kurt wagner
4 max eisenhardt
5 kitty pryde
zip()
- takes any number of iterables and returns a zip object that is an iterator of tuples.
- To print the values of a zip object, convert it into a list and then print it. Printing just a zip object will not show the values unless you unpack it first.
# Create a list of tuples: mutant_data
# (aliases and powers are assumed to be sequences defined alongside mutants)
mutant_data = list(zip(mutants, aliases, powers))
# Print the list of tuples
print(mutant_data)
# Create a zip object using the three lists: mutant_zip
mutant_zip = zip(mutants, aliases, powers)
# Print the zip object
print(mutant_zip) # <zip object at 0x7f3e2856bfc8>
# Unpack the zip object and print the tuple values
for value1, value2, value3 in mutant_zip:
    print(value1, value2, value3)
- Using * and zip() to unzip
# Create a zip object from mutants and powers: z1
z1 = zip(mutants, powers)
# Print the tuples in z1 by unpacking with *
print(*z1) # ('charles xavier', 'telepathy') ('bobby drake', 'thermokinesis') ('kurt wagner', 'teleportation') ('max eisenhardt', 'magnetokinesis') ('kitty pryde', 'intangibility')
# Re-create a zip object from mutants and powers: z1
z1 = zip(mutants, powers)
# 'Unzip' the tuples in z1 by unpacking with * and zip(): result1, result2
result1, result2 = zip(*z1)
# Check if unpacked tuples are equivalent to original tuples
# (zip() yields tuples, so convert first in case mutants and powers are lists)
print(result1 == tuple(mutants))
print(result2 == tuple(powers))
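Because the snippets above rely on variables defined elsewhere, here is a self-contained sketch of the zip and unzip round trip. It uses shortened copies of mutants and powers stored as tuples; names and abilities are hypothetical variable names.

```python
mutants = ('charles xavier', 'bobby drake', 'kurt wagner')
powers = ('telepathy', 'thermokinesis', 'teleportation')

# Pair the elements up position by position
pairs = list(zip(mutants, powers))
print(pairs)
# [('charles xavier', 'telepathy'), ('bobby drake', 'thermokinesis'), ('kurt wagner', 'teleportation')]

# zip(*pairs) passes each pair as a separate argument, which transposes
# the list of pairs back into two tuples
names, abilities = zip(*pairs)
print(names == mutants)     # True
print(abilities == powers)  # True

# A zip object is an iterator: a second pass over it yields nothing
z = zip(mutants, powers)
print(len(list(z)), len(list(z)))  # 3 0
```

The last line shows why the notes re-create z1 before unzipping: the first print(*z1) has already consumed it.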
Using iterators to load large files into memory
- load data in chunks:
pd.read_csv(filename, chunksize=)
- each chunk is a DataFrame
import pandas as pd
# Initialize an empty dictionary: counts_dict
counts_dict = {}
# Iterate over the file chunk by chunk
for chunk in pd.read_csv('tweets.csv', chunksize=10):
    # Iterate over the column in DataFrame
    for entry in chunk['lang']:
        if entry in counts_dict:
            counts_dict[entry] += 1
        else:
            counts_dict[entry] = 1
# Print the populated dictionary
print(counts_dict)
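The same chunk-by-chunk pattern can be tried without pandas or a file on disk. The sketch below is a standard-library stand-in, not the pandas API: read_in_chunks is a hypothetical helper built on csv.DictReader and itertools.islice, and csv_text is made-up sample data in place of tweets.csv.

```python
import csv
import io
from collections import Counter
from itertools import islice

# Made-up sample data standing in for a large CSV file
csv_text = "lang,text\nen,hello\net,tere\nen,hi\nund,?\nen,hey\n"


def read_in_chunks(file_obj, chunk_size):
    """Yield lists of at most chunk_size rows (as dicts) from a CSV file object."""
    reader = csv.DictReader(file_obj)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


counts = Counter()
for chunk in read_in_chunks(io.StringIO(csv_text), chunk_size=2):
    # Only one chunk is held in memory at a time
    counts.update(row['lang'] for row in chunk)

print(dict(counts))  # {'en': 3, 'et': 1, 'und': 1}
```

Counter replaces the manual if/else bookkeeping; the running totals are the only state that survives from one chunk to the next.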
05 List comprehensions and generators
List comprehensions
- Collapse for loops for building lists into a single line
- Components
- Iterable
- Iterator variable (represents members of the iterable)
- Output expression
nums = [12, 8, 21, 3, 16]
new_nums = [num + 1 for num in nums]
print(new_nums)
- Nested list comprehension
pairs = [(num1, num2) for num1 in range(0, 2) for num2 in range(6, 8)] # [(0, 6), (0, 7), (1, 6), (1, 7)]
# Create a 5 x 5 matrix using a list of lists: matrix
matrix = [[col for col in range(5)] for row in range(5)]
# Print the matrix
for row in matrix:
    print(row)
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
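It can help to read a nested comprehension as the for loops it replaces. A minimal sketch of that correspondence follows; pairs_loop is a hypothetical name for the loop-built result.

```python
# Nested comprehension: the leftmost `for` is the outer loop
pairs = [(num1, num2) for num1 in range(0, 2) for num2 in range(6, 8)]

# Equivalent nested for loops
pairs_loop = []
for num1 in range(0, 2):
    for num2 in range(6, 8):
        pairs_loop.append((num1, num2))

print(pairs)                # [(0, 6), (0, 7), (1, 6), (1, 7)]
print(pairs == pairs_loop)  # True

# A comprehension used as the output expression builds a list of lists
matrix = [[col for col in range(3)] for row in range(2)]
print(matrix)               # [[0, 1, 2], [0, 1, 2]]
```

The two forms differ in shape: chained for clauses produce one flat list, while a comprehension inside the output expression produces one inner list per outer iteration.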
Advanced comprehensions
- Conditionals on the iterable
[num ** 2 for num in range(10) if num % 2 == 0]
- Conditionals on the output expression
[num ** 2 if num % 2 == 0 else 0 for num in range(10)]
- Dict comprehensions
- create dictionaries, use {} instead of []
pos_neg = {num: -num for num in range(9)}
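The three forms above differ in what they filter or transform, which is easiest to see from their output. This sketch evaluates each one; evens_squared and squared_or_zero are illustrative names.

```python
nums = range(10)

# Conditional on the iterable: odd numbers are skipped entirely
evens_squared = [num ** 2 for num in nums if num % 2 == 0]
print(evens_squared)    # [0, 4, 16, 36, 64]

# Conditional on the output expression: every number yields an element
squared_or_zero = [num ** 2 if num % 2 == 0 else 0 for num in nums]
print(squared_or_zero)  # [0, 0, 4, 0, 16, 0, 36, 0, 64, 0]

# Dict comprehension: curly braces and a key: value output expression
pos_neg = {num: -num for num in range(3)}
print(pos_neg)          # {0: 0, 1: -1, 2: -2}
```

A trailing if changes the length of the result, whereas an if/else in the output expression keeps the length and only changes the values.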
Intro to generator expressions
- use () instead of []
- returns a generator object, no need to store the entire list in the memory
- can be iterated over
Print values from generator
result = (num for num in range(6))
for num in result:
    print(num)
0
1
2
3
4
5
result = (num for num in range(6))
print(list(result)) # [0, 1, 2, 3, 4, 5]
result = (num for num in range(6))
print(next(result)) # 0
print(next(result)) # 1
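The memory claim above can be checked with sys.getsizeof, which reports the size of the container object itself. This is a rough sketch: exact byte counts vary between Python versions, so only the comparison is shown. squares_list and squares_gen are illustrative names.

```python
import sys

squares_list = [num ** 2 for num in range(10000)]
squares_gen = (num ** 2 for num in range(10000))

# The list stores references to all 10000 results; the generator only
# stores the state needed to produce the next one
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True

# Generators can be passed straight to functions that consume iterables
total = sum(squares_gen)
print(total)             # 333283335000

# Like any iterator, a generator is exhausted after one pass
print(sum(squares_gen))  # 0
```

If the values are needed more than once, either build a list or create the generator expression again.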
Generator functions
- produces generator objects when called
- Defined like a regular function, with def
- Yields a sequence of values instead of returning a single value
- Generates a value with the yield keyword
def num_sequence(n):
    """Generate values from 0 up to, but not including, n."""
    i = 0
    while i < n:
        yield i
        i += 1
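Calling a generator function does not run its body; it returns a generator object that runs up to the next yield each time a value is requested. A short usage sketch, repeating the definition so the block runs on its own (gen, first, rest and squares are illustrative names):

```python
def num_sequence(n):
    """Yield the integers 0, 1, ..., n - 1."""
    i = 0
    while i < n:
        yield i
        i += 1


gen = num_sequence(3)
print(type(gen).__name__)  # generator

first = next(gen)  # runs the body up to the first yield
rest = list(gen)   # resumes after the yield and collects what is left
print(first)       # 0
print(rest)        # [1, 2]

# Generator functions work anywhere an iterable is expected
squares = [value ** 2 for value in num_sequence(4)]
print(squares)     # [0, 1, 4, 9]
```

Local variables such as i survive between yields, which is what lets the loop pick up where it left off.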