Generators and Iterators - CameronAuler/python-devops GitHub Wiki

Generators and iterators allow efficient handling of large datasets by producing values on demand instead of storing them in memory. This is useful for working with large files, infinite sequences, and streaming data.

Iterators

An iterator is an object that implements the __iter__() and __next__() methods. It hands out the elements of a sequence one at a time, so the consumer never needs to hold the whole sequence in memory at once. The built-in iter() creates an iterator from an iterable, and next() retrieves the next item, raising StopIteration when none remain. Iterators are mainly used for lazy evaluation, working with large datasets, and stream processing.

nums = [1, 2, 3]
iter_nums = iter(nums)  # Get an iterator from the list

print(next(iter_nums))  # 1
print(next(iter_nums))  # 2
print(next(iter_nums))  # 3
# print(next(iter_nums))  # Raises StopIteration (end of sequence)
# Output:
1
2
3
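
To make the iter()/next() protocol concrete, the sketch below imitates what a for loop does behind the scenes. The helper name manual_for is hypothetical and exists only for illustration; it is not part of Python or of this wiki's code.

```python
def manual_for(iterable):
    """Collect items the way a for loop would: call next() until StopIteration."""
    iterator = iter(iterable)      # a for loop calls iter() once, up front
    collected = []
    while True:
        try:
            item = next(iterator)  # then calls next() repeatedly
        except StopIteration:      # exhaustion ends the loop quietly
            break
        collected.append(item)
    return collected

result = manual_for([1, 2, 3])
print(result)  # [1, 2, 3]

# next() also accepts a default, returned instead of raising StopIteration
exhausted = iter([])
print(next(exhausted, "no more items"))  # no more items
```

Passing a default to next() is often simpler than wrapping a single call in try/except.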

Creating Custom Iterators

Iterator Classes

Iterator classes are mainly used for streaming data processing, custom numeric sequences, and pagination systems. A custom iterator must define:

  • __iter__() → Returns the iterator object itself.
  • __next__() → Returns the next value in the sequence or raises StopIteration when finished.

class Countdown:
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self  # The iterator object itself

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # Stop iteration when current reaches 0
        self.current -= 1
        return self.current + 1

countdown = Countdown(5)

for num in countdown:
    print(num)  # Counts down from 5 to 1
# Output:
5
4
3
2
1
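
One consequence of __iter__() returning self is that a Countdown object can be looped over only once. The sketch below repeats the class so it runs on its own, then pairs it with a hypothetical ReusableCountdown wrapper (a name invented here for illustration) whose __iter__() builds a fresh iterator each time.

```python
class Countdown:
    """Same iterator as above, repeated so this block is self-contained."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


class ReusableCountdown:
    """Hypothetical iterable: returns a new Countdown on every iter() call."""
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return Countdown(self.start)


one_shot = Countdown(3)
first_pass = list(one_shot)    # [3, 2, 1]
second_pass = list(one_shot)   # [] because the iterator is already exhausted

reusable = ReusableCountdown(3)
pass_a = list(reusable)        # [3, 2, 1]
pass_b = list(reusable)        # [3, 2, 1] again, from a fresh iterator

print(first_pass, second_pass)
print(pass_a, pass_b)
```

Separating the iterable (which holds the configuration) from the iterator (which holds the position) is the same split that lists and their list iterators use.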

Generators

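A generator is a special type of iterator that yields values lazily. It uses yield instead of return, which lets it pause execution and later resume from the same point. Generators are usually shorter to write than iterator classes, and they are more memory-efficient than building a full list because only the current value is held at any time.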

Generator Functions

Instead of storing the entire sequence in memory, generators produce values on demand. They are mainly used for handling large files, infinite sequences, and real-time data streams.

def countdown(n):
    while n > 0:
        yield n  # Pauses execution and returns n
        n -= 1

gen = countdown(5)

print(next(gen))  # 5
print(next(gen))  # 4
print(next(gen))  # 3
# Output:
5
4
3
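
As a sketch of the large-file use case, the generator below filters lines lazily. The function name read_nonblank_lines and the sample text are assumptions made for this example, and io.StringIO stands in for a real file so the block runs without touching disk.

```python
import io

def read_nonblank_lines(file_obj):
    """Yield stripped, non-blank lines one at a time."""
    for line in file_obj:          # file objects are themselves lazy iterators
        stripped = line.strip()
        if stripped:
            yield stripped

# io.StringIO is a stand-in for a file returned by open(path)
fake_log = io.StringIO("start\n\nprocessing\n\ndone\n")
lines = read_nonblank_lines(fake_log)

print(next(lines))   # start
print(list(lines))   # ['processing', 'done']
```

Because both the file object and the generator are lazy, only one line is held in memory at a time regardless of file size.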

Generator Vs. Regular Function

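A regular function computes and returns all of its values at once. A generator function returns a generator object that produces values one at a time, which is more memory-efficient because the values are never all stored together.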

def regular_function():
    return [1, 2, 3]  # Returns all values at once

def generator_function():
    yield 1
    yield 2
    yield 3  # Returns values one at a time

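print(regular_function())          # [1, 2, 3]
print(generator_function())        # <generator object generator_function at 0x...>
print(list(generator_function()))  # [1, 2, 3]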
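
The memory difference can be observed roughly with sys.getsizeof. This is a sketch under assumptions: squares_list and squares_gen are hypothetical names, exact byte counts vary by Python version and platform, and getsizeof measures only the container, not the integers it refers to.

```python
import sys

def squares_list(n):
    return [i * i for i in range(n)]   # builds all n values up front

def squares_gen(n):
    for i in range(n):
        yield i * i                    # produces one value per next() call

big_list = squares_list(100_000)
big_gen = squares_gen(100_000)

list_size = sys.getsizeof(big_list)    # grows with the number of elements
gen_size = sys.getsizeof(big_gen)      # small and independent of n
print(list_size > gen_size)            # True

# Both produce the same values; the generator just never holds them all
print(sum(big_gen) == sum(big_list))   # True
```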

Generator Expressions (Compact Syntax)

Generator expressions are similar to list comprehensions but use parentheses () instead of brackets []. They are more memory-efficient because they generate values on demand rather than building a list. They are mainly used for memory-efficient loops, large datasets, and lazy evaluation.

nums = (x * 2 for x in range(5))  # Generator expression

print(next(nums))  # 0
print(next(nums))  # 2
print(list(nums))  # Remaining values: [4, 6, 8]
# Output:
0
2
[4, 6, 8]
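
A common pattern, sketched below with made-up sample data, is to pass a generator expression straight into a consuming function such as sum() or any(). When it is the only argument, the extra pair of parentheses can be dropped.

```python
# Sum of squares without building an intermediate list
total = sum(x * x for x in range(1_000))
print(total)  # 332833500

# any() stops consuming as soon as one item matches
words = ["alpha", "beta", "gamma"]   # sample data for illustration
has_long_word = any(len(w) > 4 for w in words)
print(has_long_word)  # True

# Like every generator, a generator expression can be consumed only once
evens = (x for x in range(10) if x % 2 == 0)
first_read = list(evens)
second_read = list(evens)
print(first_read)   # [0, 2, 4, 6, 8]
print(second_read)  # []
```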

Infinite Generators

Generators can be used to create infinite sequences without exhausting memory. Infinite generators are mainly used for streaming data, real-time processing, and game loops.

def infinite_counter():
    n = 0
    while True:
        yield n
        n += 1

counter = infinite_counter()
print(next(counter))  # 0
print(next(counter))  # 1
print(next(counter))  # 2
# Output:
0
1
2
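
Calling list() on an infinite generator never returns, so it helps to bound consumption. The sketch below uses the standard library's itertools.islice and itertools.takewhile, and repeats infinite_counter so the block runs on its own.

```python
from itertools import islice, takewhile

def infinite_counter():
    n = 0
    while True:
        yield n
        n += 1

# Take a fixed number of items
first_five = list(islice(infinite_counter(), 5))
print(first_five)  # [0, 1, 2, 3, 4]

# Take items until a condition first fails
below_four = list(takewhile(lambda n: n < 4, infinite_counter()))
print(below_four)  # [0, 1, 2, 3]
```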

Combining Generators (yield from)

yield from delegates part of a generator's work to another iterable, including another generator, and yields each of its items in turn. This is mainly used for flattening nested generators, delegating tasks, and combining generators.

def generator1():
    yield from [1, 2, 3]  # Yielding elements from a list
    yield "Done"

for val in generator1():
    print(val)
# Output:
1
2
3
Done
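
The flattening and combining uses mentioned above can be sketched as follows. The names flatten and chain_all are hypothetical helpers written for this example, and flatten treats only list objects as nested containers.

```python
def flatten(items):
    """Recursively yield the non-list elements of arbitrarily nested lists."""
    for item in items:
        if isinstance(item, list):
            yield from flatten(item)   # delegate to the recursive generator
        else:
            yield item

def chain_all(*iterables):
    """Yield every item of each iterable, one iterable after another."""
    for iterable in iterables:
        yield from iterable

nested = [1, [2, [3, 4]], 5]
flat = list(flatten(nested))
print(flat)  # [1, 2, 3, 4, 5]

combined = list(chain_all([1, 2], (x * 10 for x in range(3)), "ab"))
print(combined)  # [1, 2, 0, 10, 20, 'a', 'b']
```

For plain chaining, the standard library's itertools.chain does the same job as chain_all.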