Advanced Data Structures - CameronAuler/python-devops GitHub Wiki
Table of Contents
Collections Module
The collections
module provides optimized alternatives to Python’s built-in data structures.
namedtuple
(Immutable Lightweight Objects)
namedtuple
is a subclass of tuple with named fields and is useful when working with structured data without creating a class. It is mainly used for replacing lightweight classes for read-only data.
from collections import namedtuple
# Define a namedtuple called 'Point' with fields 'x' and 'y'
Point = namedtuple("Point", ["x", "y"])
p = Point(3, 5) # Creating an instance
print(p.x, p.y) # Accessing values by name
# Output:
3 5
deque
(Fast Append and Pop)
deque
(double-ended queue) allows fast appends and pops from both ends (O(1)
time complexity) and is more efficient than lists for queue-like behavior. It is mainly used for implementing queues, stacks, and sliding window algorithms efficiently.
NOTE: O(1)
time complexity refers to the process speed of methods used at both ends of a data structure. Lists only offer fast appends and pops at the end since poppoing or appending at the beginning shifts the position of every other element.
from collections import deque
dq = deque([1, 2, 3])
dq.append(4) # Add to the right
dq.appendleft(0) # Add to the left
dq.pop() # Remove from the right
dq.popleft() # Remove from the left
print(dq)
# Output:
deque([1, 2])
Counter
(Counting Element in a Collection)
Counter
is a dictionary subclass used for counting hashable objects. It is mainly used for word frequency analysis, histogram generation, and duplicate detection.
from collections import Counter
text = "banana"
counter = Counter(text) # Counts occurrences of each letter
print(counter)
print(counter["a"]) # Access count of 'a'
# Output:
Counter({'a': 3, 'n': 2, 'b': 1})
3
OrderedDict
(Dictionary That Remembers Order)
In Python 3.7+, regular dictionaries maintain order, but OrderedDict
is still useful for compatibility with older versions and extra ordering features. It is mainly used for maintaining ordered key-value pairs where order matters.
from collections import OrderedDict
od = OrderedDict()
od["a"] = 1
od["b"] = 2
od["c"] = 3
print(od) # Maintains insertion order
# Output:
OrderedDict([('a', 1), ('b', 2), ('c', 3)])
defaultdict
(Dictionary with Default Values)
defaultdict
returns a default value when accessing a non-existent key instead of raising a KeyError
. It is mainly used for avoiding manual key existence checks in counting and grouping operations.
from collections import defaultdict
dd = defaultdict(int) # Default value of int() is 0
dd["a"] += 1 # No KeyError!
print(dd["a"]) # 1
print(dd["b"]) # 0 (default value)
# Output:
1
0
heapq
Module)
Heap Queue (heapq
provides an efficient min-heap (priority queue) implementation. The smallest element is always at index 0
.
Pushing and Popping from a Heap
Mainly used for priority queues, scheduling tasks, and graph algorithms (e.g., Dijkstra's Algorithm).
NOTE: A heap is a specialized tree-based data structure that satisfies the heap property. In a min-heap, for any given node, the value of the node is less than or equal to the values of its children, ensuring the smallest element is always at the root.
import heapq
nums = [3, 1, 4, 1, 5]
heapq.heapify(nums) # Convert list into a heap
heapq.heappush(nums, 2) # Push new element
smallest = heapq.heappop(nums) # Remove smallest element
print(smallest) # 1
print(nums) # Remaining heap structure
# Output:
1
[1, 2, 4, 3, 5]
n
th Smallest or Largest Elements
Retrieving the Mainly used for finding top n elements efficiently.
import heapq
nums = [3, 1, 4, 1, 5, 9, 2, 6]
print(heapq.nsmallest(3, nums)) # Get 3 smallest elements
print(heapq.nlargest(3, nums)) # Get 3 largest elements
# Output:
[1, 1, 2]
[9, 6, 5]
bisect
Module)
Binary Search (The bisect
module is used for efficiently searching and inserting in sorted lists (O(log n)
complexity).
bisect_left
& bisect_right
Using bisect_left(arr, x)
finds the leftmost index where x can be inserted, while bisect_right(arr, x)
finds the rightmost index where x can be inserted. They are mainly used for efficiently finding insertion points in sorted data.
import bisect
nums = [1, 3, 3, 7]
index1 = bisect.bisect_left(nums, 3) # Finds first position of 3
index2 = bisect.bisect_right(nums, 3) # Finds last position of 3
print(index1, index2)
# Output:
1 3
Inserting While Maintaining Order
insort_left(arr, x)
inserts x
at the correct position while keeping the list sorted. It is mainly used for maintaining sorted lists while inserting dynamically.
import bisect
nums = [1, 3, 7]
bisect.insort_left(nums, 5) # Insert 5 while maintaining order
print(nums)
# Output:
[1, 3, 5, 7]