Concurrency and Parallelism - CameronAuler/python-devops GitHub Wiki
Table of Contents
- Threading (Concurrency with Shared Memory)
- Multiprocessing (True Parallelism)
- Asyncio (Asynchronous Programming)
Python provides multiple ways to handle concurrent and parallel execution using:
- Threading (for concurrency with shared memory)
- Multiprocessing (for true parallel execution using multiple CPU cores)
- Asyncio (for asynchronous, non-blocking operations)
Technique | Use Case |
---|---|
threading | Best for I/O-bound tasks where multiple operations wait for external resources (e.g., network requests, file I/O, database queries). Uses multiple threads but does not achieve true parallelism due to the Global Interpreter Lock (GIL). |
multiprocessing | Ideal for CPU-bound tasks that require intensive computations (e.g., machine learning, image processing, large data analysis). Runs separate processes, allowing true parallel execution across multiple CPU cores. |
asyncio | Designed for high-performance asynchronous I/O operations where tasks involve waiting (e.g., web scraping, handling thousands of API requests, real-time applications). Uses cooperative multitasking with async and await. |
Threading (Concurrency with Shared Memory)
The threading module allows concurrent execution of tasks within the same process. However, due to Python’s Global Interpreter Lock (GIL), threads don’t run in parallel for CPU-bound tasks but work well for I/O-bound operations.
Creating and Running Threads
Threads run concurrently but not truly in parallel. Threading is mainly used for I/O-bound tasks (e.g., reading files, network requests).
import threading
def print_numbers():
    for i in range(5):
        print(f"Number: {i}")
# Creating threads
thread1 = threading.Thread(target=print_numbers)
thread2 = threading.Thread(target=print_numbers)
# Starting threads
thread1.start()
thread2.start()
# Waiting for threads to finish
thread1.join()
thread2.join()
# Output (Order may vary):
Number: 0
Number: 1
Number: 2
Number: 3
Number: 4
Number: 0
Number: 1
Number: 2
Number: 3
Number: 4
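To illustrate why threads suit I/O-bound work, the sketch below simulates a slow I/O call with `time.sleep` (which releases the GIL while waiting, much like real blocking I/O) and runs three such calls in separate threads. The function name `fake_download` and the 0.2-second delay are hypothetical stand-ins, not part of the original example.

```python
import threading
import time

def fake_download(name, results):
    """Hypothetical stand-in for an I/O-bound call such as a network request."""
    time.sleep(0.2)          # Sleeping releases the GIL, like real blocking I/O
    results[name] = "done"   # Each thread writes to its own key

results = {}
threads = [
    threading.Thread(target=fake_download, args=(f"file{i}", results))
    for i in range(3)
]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The three waits overlap, so the total is close to 0.2 s rather than 0.6 s
print(f"Finished {len(results)} downloads in {elapsed:.2f} s")
```

If the same function did pure computation instead of waiting, the threads would take turns holding the GIL and the total time would not shrink in the same way.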
Using Locks to Prevent Race Conditions
Since threads share memory, unsynchronized updates to the same variable can cause race conditions. A lock ensures that only one thread at a time runs the protected section, which prevents data corruption in shared variables.
import threading
counter = 0
lock = threading.Lock()
def increment():
    global counter
    for _ in range(100000):
        with lock:  # Ensures only one thread updates `counter` at a time
            counter += 1
thread1 = threading.Thread(target=increment)
thread2 = threading.Thread(target=increment)
thread1.start()
thread2.start()
thread1.join()
thread2.join()
print("Final Counter:", counter)  # Expected: 200000
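The `with lock:` form is shorthand for acquiring the lock and releasing it again even if the protected code raises. One possible way to keep a lock next to the data it guards is a small wrapper class. The sketch below assumes a hypothetical `SafeCounter` helper (not part of the standard library) and writes the acquire/release pair out explicitly to show what `with` does.

```python
import threading

class SafeCounter:
    """Hypothetical helper: a counter whose updates are guarded by its own lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Equivalent to `with self._lock:` written out explicitly
        self._lock.acquire()
        try:
            self._value += 1
        finally:
            self._lock.release()  # Always release, even if the update raises

    @property
    def value(self):
        with self._lock:
            return self._value

def worker(counter, times):
    for _ in range(times):
        counter.increment()

counter = SafeCounter()
threads = [threading.Thread(target=worker, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("Final value:", counter.value)  # 4 threads x 10,000 increments = 40000
```

Bundling the lock with the data means callers cannot forget to take it, at the cost of one lock acquisition per call.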
Multiprocessing (True Parallelism)
Unlike threading, the multiprocessing module creates separate processes, allowing true parallel execution across CPU cores.
Running Processes in Parallel
multiprocessing.Process() creates separate processes, utilizing multiple CPU cores. This is ideal for CPU-bound tasks (e.g., heavy computations, data processing).
import multiprocessing
def worker():
    print(f"Process {multiprocessing.current_process().name} is running")
if __name__ == "__main__":
    process1 = multiprocessing.Process(target=worker)
    process2 = multiprocessing.Process(target=worker)
    process1.start()
    process2.start()
    process1.join()
    process2.join()
# Output (processes run in parallel, so order may vary):
Process Process-1 is running
Process Process-2 is running
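For CPU-bound work over many inputs, a worker pool is often more convenient than managing `Process` objects by hand. The following is a minimal sketch using `multiprocessing.Pool`; the function `sum_of_squares`, the input list, and the choice of two workers are illustrative assumptions.

```python
import multiprocessing

def sum_of_squares(n):
    """CPU-bound work: sum i*i for i in range(n)."""
    return sum(i * i for i in range(n))

def main():
    inputs = [10, 100, 1000]
    # Two worker processes; omit `processes` to let Pool pick a default
    with multiprocessing.Pool(processes=2) as pool:
        results = pool.map(sum_of_squares, inputs)  # Results keep input order
    print(results)  # [285, 328350, 332833500]

if __name__ == "__main__":
    main()
```

The `if __name__ == "__main__":` guard matters here for the same reason as in the example above: on platforms that start workers by re-importing the main module, unguarded code would run again in every child.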
Sharing Data Between Processes
Since each process has its own memory, we use queues or shared memory to pass data between processes safely.
import multiprocessing
def worker(queue):
    queue.put("Hello from process")
if __name__ == "__main__":
    q = multiprocessing.Queue()
    process = multiprocessing.Process(target=worker, args=(q,))
    process.start()
    # Retrieve data before joining: joining a process that still has
    # buffered queue items can deadlock
    print(q.get())
    process.join()
# Output:
Hello from process
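The queue example covers message passing. For the shared-memory option, one possibility is `multiprocessing.Value`, which places a single C-typed value in memory visible to all child processes. This is a sketch under assumptions: the helper name `add_many` and the counts are made up for illustration.

```python
import multiprocessing

def add_many(shared_total, times):
    """Increment a shared integer `times` times."""
    for _ in range(times):
        # get_lock() returns the lock that guards this Value
        with shared_total.get_lock():
            shared_total.value += 1

def main():
    total = multiprocessing.Value("i", 0)  # "i" = signed C int, initial value 0
    workers = [
        multiprocessing.Process(target=add_many, args=(total, 1000))
        for _ in range(2)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print("Total:", total.value)  # 2 processes x 1000 increments = 2000

if __name__ == "__main__":
    main()
```

As with threads, `+=` on a shared value is a read-modify-write sequence, so the lock is still needed even though the memory is shared.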
Asyncio (Asynchronous Programming)
asyncio enables non-blocking, cooperative multitasking using async and await.
Asynchronous Functions (async & await)
These are mainly used for network requests, database queries, and real-time applications.
import asyncio
async def say_hello():
    print("Hello")
    await asyncio.sleep(1)  # Non-blocking sleep
    print("World")
asyncio.run(say_hello())  # Runs the async function
# Output (Executes asynchronously):
Hello
World # (After 1 second)
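Two details are easy to miss with `async def`: calling the function only creates a coroutine object, and `await` (or `asyncio.run`) is what actually runs it and hands back its return value. The short sketch below shows both, using a hypothetical `fetch_value` coroutine in place of a real I/O call.

```python
import asyncio

async def fetch_value():
    """Hypothetical coroutine standing in for an awaited I/O call."""
    await asyncio.sleep(0.1)  # Yield control to the event loop while "waiting"
    return 42

async def main():
    coro = fetch_value()              # Creates a coroutine object; nothing runs yet
    print(asyncio.iscoroutine(coro))  # True
    value = await coro                # Runs the coroutine and collects its result
    return value

result = asyncio.run(main())  # asyncio.run() returns whatever main() returns
print(result)                 # 42
```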
Running Multiple Tasks Concurrently
asyncio.gather() runs tasks concurrently without blocking. This is mainly used for I/O-bound operations like web scraping and API requests.
import asyncio
async def task(name, delay):
    print(f"Task {name} starting")
    await asyncio.sleep(delay)  # Simulates non-blocking work
    print(f"Task {name} done")
async def main():
    # Run both tasks concurrently on the same event loop (not in parallel)
    await asyncio.gather(task("A", 2), task("B", 1))
asyncio.run(main())
# Output:
Task A starting
Task B starting
Task B done # (After 1 second)
Task A done # (After 2 seconds)
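`asyncio.gather()` also collects return values. As a rough sketch, the example below assumes a hypothetical `fetch` coroutine and shows that results come back in the order the awaitables were passed, while the total time tracks the longest delay, not the sum.

```python
import asyncio
import time

async def fetch(name, delay):
    """Hypothetical stand-in for an I/O call that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return f"{name} finished"

async def main():
    start = time.perf_counter()
    # Results come back in argument order, not completion order
    results = await asyncio.gather(fetch("A", 0.3), fetch("B", 0.2))
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
print(results)  # ['A finished', 'B finished']
print(f"Elapsed: {elapsed:.2f} s")  # Close to 0.3 s (the longest delay), not 0.5 s
```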
Asyncio with Threading
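Run async tasks in a separate thread. This is mainly used for running an event loop alongside synchronous or blocking code in the main thread. Note that it does not make CPU-bound work parallel, since the GIL still applies to threads.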
import asyncio
import threading
def run_asyncio():
    asyncio.run(async_task())
async def async_task():
    print(f"Running in thread: {threading.current_thread().name}")
    await asyncio.sleep(1)
    print("Async task done")
thread = threading.Thread(target=run_asyncio)
thread.start()
thread.join()
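The combination also works in the opposite direction: async code can push a blocking call onto a worker thread so the event loop stays responsive. A minimal sketch, assuming Python 3.9+ for `asyncio.to_thread` and a hypothetical `blocking_read` function standing in for a library without async support:

```python
import asyncio
import threading
import time

def blocking_read():
    """Hypothetical blocking call (e.g., a library without async support)."""
    time.sleep(0.2)
    return threading.current_thread().name

async def main():
    # to_thread() runs the function in a worker thread and awaits its result,
    # so the event loop stays free for other tasks in the meantime
    worker_name, _ = await asyncio.gather(
        asyncio.to_thread(blocking_read),
        asyncio.sleep(0.2),
    )
    return worker_name

worker_name = asyncio.run(main())
print(f"Blocking call ran in: {worker_name}")
print(f"Event loop ran in: {threading.current_thread().name}")
```

This helps with blocking I/O. For heavy computation, a process pool is usually the better fit, since worker threads still share the GIL.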