Generators and the yield Statement in Python

Generators

Generators are a special type of iterator that produces a sequence of values lazily. Unlike a list, a generator does not hold all of its values in memory; it computes each value only when the consumer asks for the next one. This keeps memory use small and roughly constant no matter how many values are produced, which matters most when dealing with large datasets.
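To make the memory claim concrete, here is a minimal sketch that compares the size of a fully built list with the size of a generator object producing the same values. The function name squares and the limit of 100,000 are illustrative assumptions, and sys.getsizeof measures only the container object itself, not the integers it refers to.

```python
import sys

def squares(limit):
    """Hypothetical helper: yield n*n for n in 0..limit-1, one value at a time."""
    for n in range(limit):
        yield n * n

# The list holds references to all 100,000 results at once.
squares_list = list(squares(100_000))

# The generator object only holds its paused execution state.
squares_gen = squares(100_000)

list_size = sys.getsizeof(squares_list)  # grows with the number of elements
gen_size = sys.getsizeof(squares_gen)    # small, independent of the limit

print(list_size > gen_size)  # True
```

The exact byte counts vary between Python versions and platforms, so the sketch compares the two sizes rather than printing specific numbers.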

The yield Statement

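The yield statement is what makes a function a generator function. Calling such a function does not run its body; it returns a generator object immediately. The body runs only when a value is requested (for example with next() or a for loop): it executes up to the next yield, hands that value to the caller, and pauses there until the next request.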
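The following sketch illustrates that calling a generator function does not execute its body. The function traced and the events list are hypothetical names used only to record when each part of the body actually runs.

```python
events = []  # records when parts of the generator body execute

def traced():
    """Hypothetical generator that logs its own progress."""
    events.append("body started")
    yield "first"
    events.append("resumed")
    yield "second"

gen = traced()                     # returns a generator object; body has not run
events_after_call = list(events)   # still empty

first = next(gen)                  # runs the body up to the first yield
events_after_first = list(events)

second = next(gen)                 # resumes after the first yield

print(events_after_call)   # []
print(events_after_first)  # ['body started']
print(events)              # ['body started', 'resumed']
```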

Example 1: Simple Generator

Here's a basic example of a generator function that generates a sequence of numbers:

def simple_generator():
    yield 1
    yield 2
    yield 3

# Using the generator
gen = simple_generator()
print(next(gen))  # Output: 1
print(next(gen))  # Output: 2
print(next(gen))  # Output: 3
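The example above stops after the third value. As a small sketch of what happens on a fourth request: an exhausted generator raises StopIteration, and next() accepts an optional default to return instead. The variable names here are illustrative.

```python
def simple_generator():
    yield 1
    yield 2
    yield 3

gen = simple_generator()
values = [next(gen), next(gen), next(gen)]

# A fourth call has nothing left to yield, so StopIteration is raised.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

# Passing a default to next() avoids the exception.
fallback = next(gen, None)

print(values, exhausted, fallback)  # [1, 2, 3] True None
```

A for loop handles StopIteration automatically, which is why the later examples iterate with for instead of calling next() by hand.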

Example 2: Generator for a Range of Numbers

This example demonstrates a generator that behaves like the built-in range function when called with two arguments and the default step of 1:

def my_range(start, end):
    current = start
    while current < end:
        yield current
        current += 1

# Using the generator
for number in my_range(1, 5):
    print(number)  # Prints 1, 2, 3, 4 (one per line)
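One difference from the built-in range is worth sketching: a generator object can be consumed only once, while a range object can be iterated repeatedly. The variable names below are illustrative.

```python
def my_range(start, end):
    current = start
    while current < end:
        yield current
        current += 1

numbers = my_range(1, 5)
first_pass = list(numbers)    # consumes every value
second_pass = list(numbers)   # the generator is already exhausted

builtin = range(1, 5)
builtin_first = list(builtin)
builtin_second = list(builtin)  # range can be iterated again

print(first_pass, second_pass)        # [1, 2, 3, 4] []
print(builtin_first, builtin_second)  # [1, 2, 3, 4] [1, 2, 3, 4]
```

To iterate again, call my_range(1, 5) a second time to get a fresh generator object.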

Example 3: Fibonacci Sequence

Here's a generator that produces the Fibonacci sequence:

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Using the generator
fib = fibonacci()
for _ in range(10):
    print(next(fib))  # Prints 0, 1, 1, 2, 3, 5, 8, 13, 21, 34 (one per line)
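As an alternative to calling next() in a counted loop, the standard library's itertools.islice can take a bounded slice of an infinite generator. This is a sketch of one common approach, not the only way to limit the sequence.

```python
from itertools import islice

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# islice stops after 10 items, so the infinite loop is never run to completion.
first_ten = list(islice(fibonacci(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```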

Example 4: File Processing

Generators are particularly useful for reading large files, because only one line needs to be held in memory at a time. Here’s an example that reads a file line by line:

def read_large_file(file_path):
    with open(file_path) as file:
        for line in file:
            yield line

# Using the generator
for line in read_large_file('large_file.txt'):
    print(line, end='')
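Generators can also be chained so that each stage processes one line at a time. The sketch below feeds read_large_file into a second, hypothetical generator named non_empty that strips whitespace and skips blank lines. It writes a small temporary file so the example is self-contained; the file contents are made up for illustration.

```python
import os
import tempfile

def read_large_file(file_path):
    with open(file_path) as file:
        for line in file:
            yield line

def non_empty(lines):
    """Hypothetical filter stage: yield stripped lines, skipping blank ones."""
    for line in lines:
        stripped = line.strip()
        if stripped:
            yield stripped

# Create a small sample file to stand in for a large one.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\n\nbeta\n   \ngamma\n")
    sample_path = tmp.name

try:
    # Each line flows through both stages before the next line is read.
    cleaned = list(non_empty(read_large_file(sample_path)))
finally:
    os.remove(sample_path)

print(cleaned)  # ['alpha', 'beta', 'gamma']
```

Collecting the result with list() is only for display here; with a genuinely large file you would iterate over the pipeline directly to keep memory use low.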

Key Points

  • Memory Efficiency: Generators use less memory than lists, especially for large datasets, because they produce items on demand instead of building the whole sequence first.
  • Infinite Sequences: Generators can represent infinite sequences (like the Fibonacci sequence) since they yield items one at a time and don’t store the entire sequence in memory.
  • Statefulness: Each time execution reaches a yield, the value is handed to the caller and the function is suspended. Its local variables and its position in the code are preserved, so the next request resumes exactly where it left off.
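The statefulness point can be sketched with two generator objects created from the same function: each keeps its own local variables and its own position, so advancing one does not affect the other. The function countdown is a hypothetical example.

```python
def countdown(start):
    """Hypothetical generator: yield start, start-1, ..., 1."""
    remaining = start
    while remaining > 0:
        yield remaining
        remaining -= 1

a = countdown(3)
b = countdown(10)

# Interleave the two generators; each resumes from its own saved state.
results = [next(a), next(b), next(a), next(b), next(a)]
print(results)  # [3, 10, 2, 9, 1]
```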

Understanding these concepts with examples should make the usage of generators and yield clearer. They are powerful tools for writing efficient and expressive Python code.