Python Notes - 2
GIL
In CPython, the global interpreter lock, or GIL, is a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecode at once. The GIL protects the interpreter's internal state (for example, reference counts) from corruption; note that it does not automatically make your own code thread-safe.
The Global Interpreter Lock (GIL) is a mechanism in Python that allows only one thread to execute at a time in the interpreter. This lock is essential for memory safety in multi-threaded Python programs. However, it limits the effectiveness of multithreading, especially in CPU-intensive tasks, as it prevents multiple threads from running in parallel on multiple cores.
Why Do Python Developers Use the GIL?
What Problem Did the GIL Solve?
The Global Interpreter Lock (GIL) in Python addresses a critical issue in concurrent programming: race conditions. A race condition occurs when multiple threads simultaneously access and modify shared data, leading to unpredictable and incorrect outcomes.
Consider a shared variable n:
n = 5
Suppose two threads, T1 and T2, modify n as follows:
# T1
n = n + 20
# T2
n = n * 10
The final value of n depends on the execution order of T1 and T2. For instance, if T1 executes before T2, the result differs from if T2 goes first. In the worst case, if both threads access n simultaneously, both might read n as 5, leading to inconsistent modifications.
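A runnable sketch of this scenario (a hypothetical illustration; with such tiny operations the interleaving is timing-dependent, which is exactly the problem):

import threading

n = 5

def t1():
    global n
    n = n + 20

def t2():
    global n
    n = n * 10

threads = [threading.Thread(target=t1), threading.Thread(target=t2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Depending on interleaving: (5 + 20) * 10 = 250, or 5 * 10 + 20 = 70,
# or a lost update if both threads read n == 5 before either writes.
print(n)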
The GIL guards against this kind of interference at the interpreter level by ensuring that only one thread runs at a time, making Python's thread management simpler and safer. However, this comes at the cost of reduced efficiency on multi-core systems, as threads cannot run in parallel, limiting the utilization of multiple processors.
Why Was the GIL Chosen as the Solution?
The Global Interpreter Lock (GIL) in Python was chosen as a solution primarily for its simplicity and effectiveness in thread-safe memory management. Many Python libraries are built using C or C++ at the backend, where managing concurrent access to memory is crucial. GIL provides an efficient way to ensure thread safety, allowing only one thread to execute at a time, thereby avoiding complex lock management and potential deadlock scenarios.
Moreover, implementing GIL is straightforward and easily integrated with Python's design. While it does slow down multi-threaded programs by preventing true parallel execution on multi-core processors, GIL significantly boosts the performance of single-threaded programs. This is because it requires managing only one lock, reducing the overhead associated with multiple-thread synchronization. Thus, GIL balances between ease of implementation and maintaining performance efficiency in various Python applications.
The Impact on Multi-Threaded Python Programs
The presence of the Global Interpreter Lock (GIL) in Python significantly influences the performance of multi-threaded programs, which can be broadly categorized as CPU-bound or I/O-bound.
- CPU-bound: These programs, such as image processing or complex mathematical computations, push the CPU to its limits. They do not rely heavily on I/O operations.
- I/O-bound: These involve significant I/O operations, like file or network access, often waiting for external input/output.
For I/O-bound programs, the GIL's impact is minimal. This is because the lock is released while a thread waits for I/O, allowing other threads to execute.
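A small sketch illustrating this (hypothetical names; time.sleep() stands in for a network or disk wait, during which the GIL is released):

import threading
import time

def fake_io_task(name):
    time.sleep(2)  # Simulated I/O wait; the GIL is released while sleeping
    print(f"{name} done")

startTime = time.time()
workers = [threading.Thread(target=fake_io_task, args=(f"task{i}",)) for i in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print("Time taken = ", time.time() - startTime)  # ~2 seconds, not ~4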
However, the scenario is different for CPU-bound programs. This is illustrated with the following examples:
Single-Threaded Program:
# Execution time calculation for a single-threaded Python program
import time
def calculateSum(n):
sum = 0
for i in range(n):
sum += i
n = 100000000
startTime = time.time()
calculateSum(n)
endTime = time.time()
print("Time taken in single thread = ", endTime - startTime)
Output:
Time taken in single thread = 3.406636953353882
Multi-Threaded Program:
# Execution time calculation for a multi-threaded Python program
import time
from threading import Thread
def calculateSum(n):
sum = 0
for i in range(n):
sum += i
n = 100000000
T1 = Thread(target=calculateSum, args=(n//2,))
T2 = Thread(target=calculateSum, args=(n//2,))
startTime = time.time()
T1.start()
T2.start()
T1.join()
T2.join()
endTime = time.time()
print("Time taken in multi-thread = ", endTime - startTime)
Output:
Time taken in multi-thread = 5.456082820892334
These examples demonstrate that a single-threaded CPU-bound program can be faster than its multi-threaded counterpart due to GIL's restrictions. Consequently, GIL can introduce delays in multi-threaded CPU-bound Python programs.
Why hasn't the GIL been Removed Yet?
Despite its limitations, the Global Interpreter Lock (GIL) remains in Python due to concerns about backward compatibility and the stability of existing libraries, many of which are written in C/C++. Removing the GIL risks breaking this foundational code. Attempts to replace it have led to issues like degraded performance in single-threaded and I/O-bound multi-threaded programs. The Python community prioritizes the reliability and efficiency of the language; any alternative that significantly slows down existing applications is not acceptable. Consequently, despite its drawbacks, the GIL is retained to ensure the consistent performance and stability of Python programs.
How to Deal with Python's GIL?
Addressing the challenges posed by Python's Global Interpreter Lock (GIL) can be approached in a couple of effective ways:
- Using multiprocessing: A popular method to circumvent GIL limitations is multiprocessing. By creating multiple processes instead of threads, each process operates with its own Python interpreter and memory space, thereby sidestepping the GIL. However, this approach should be used judiciously: processes are more resource-intensive than threads, and managing multiple processes can introduce significant overhead.
- Choosing a different interpreter: Python has several implementations, each with its own characteristics:
  - CPython: The standard interpreter, written in C; includes the GIL.
  - Jython: A Java-based interpreter.
  - IronPython: Developed in C#, primarily for .NET environments.
  - PyPy: An alternative interpreter focused on performance, implemented in RPython with a JIT compiler.

Jython and IronPython have no GIL, offering potential performance benefits for threaded code (PyPy, like CPython, keeps a GIL). However, compatibility with existing libraries is a critical factor when switching interpreters, as not all Python libraries are available or function identically across implementations.
Conclusion
The GIL in Python is a mutex that gives control of the Python interpreter to only a single thread at a time. It persists largely because of existing CPython internals and libraries written in C/C++. There are various ways to bypass the GIL, such as using multiple processes or a different interpreter altogether.
Python concurrency
Concurrency is working on multiple things at the same time. In Python, concurrency can be reached in several ways:
- With threading, by letting multiple threads take turns.
- With multiprocessing, we’re using multiple processes. This way we can truly do more than one thing at a time using multiple processor cores. This is called parallelism.
- Using asynchronous IO, firing off a task and continuing to do other stuff, instead of waiting for an answer from the network or disk.
- By using distributed computing, which basically means using multiple computers at the same time.
I/O bound vs CPU bound problems
Python concurrency with threads
While waiting for answers from the network or disk, you can keep other parts running using multiple threads. A thread is an independent sequence of execution. Your Python program has, by default, one main thread. But you can create more of them and let Python switch between them. This switching happens so fast that it appears like they are running side by side at the same time.
Unlike in many other languages, however, Python threads don't run bytecode at the same time; they take turns instead. The reason for this is a mechanism in CPython called the Global Interpreter Lock (GIL). In short, the GIL ensures that only one thread runs at any moment, so threads can't interfere with each other's bytecode execution. This, together with the threading library, is explained in detail on the next page.
The takeaway is that threads make a big difference for I/O bound software but are useless for CPU bound software. Why is that? It’s simple. While one thread is waiting for a reply from the network, other threads can continue running. If you make a lot of network requests, threads can make a tremendous difference. If your threads are doing heavy calculations instead, they are just waiting for their turn to continue. Threading would only introduce more overhead: there’s some administration involved in constantly switching between threads.
Python concurrency with asyncio
Asyncio is a relatively new core library in Python. It solves the same problem as threading: it speeds up I/O bound software, but it does so differently.
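A minimal sketch of the asyncio style (hypothetical fetch function; asyncio.sleep() stands in for a network call): one thread runs an event loop, and each await hands control back so other tasks can progress.

import asyncio

async def fetch(name):
    await asyncio.sleep(2)  # Simulated I/O; control returns to the event loop here
    return f"{name} done"

async def main():
    # Both "requests" run concurrently on a single thread: ~2s total, not ~4s
    results = await asyncio.gather(fetch("a"), fetch("b"))
    print(results)

asyncio.run(main())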
Python concurrency with multiprocessing
If your software is CPU-bound, you can often rewrite your code in such a way that you can use more processors at the same time. This way, you can linearly scale the execution speed. This type of concurrency is what we call parallelism, and you can use it in Python too, despite the GIL.
Not all algorithms can be made to run in parallel. Inherently sequential algorithms, where each step depends on the result of the previous one, are hard to parallelize. But there's often an alternative algorithm that can work in parallel just fine.
There are two ways of using more processors:
- Using multiple processors and/or cores within the same machine. In Python, we can do so with the multiprocessing library.
- Using a network of computers to use many processors, spread over multiple machines. We call this distributed computing.

Python's multiprocessing library, unlike the Python threading library, bypasses the Python Global Interpreter Lock. It does so by actually spawning multiple instances of Python. Instead of threads taking turns within a single Python process, you now have multiple Python processes running multiple copies of your code at once.
The multiprocessing library, in terms of usage, is very similar to the threading library. A question that might arise is: why should you even consider threading? You can guess the answer. Threading is ‘lighter’: it requires less memory since it only requires one running Python interpreter. Spawning new processes has its overhead as well. So if your code is I/O bound, threading is likely good enough and more efficient and lightweight.
PyPy
You are probably using the reference implementation of Python, CPython. Most people do. It’s called CPython because it’s written in C. If you are sure your code is CPU bound, meaning it’s doing lots of calculations, you should look into PyPy, an alternative to CPython. It’s potentially a quick fix that doesn’t require you to change a single line of code.
PyPy claims that, on average, it is 4.4 times faster than CPython. It does so by using a technique called just-in-time compilation (JIT). Java and the .NET framework are other notable examples of JIT compilation. In contrast, CPython uses interpretation to execute your code. Although this offers a lot of flexibility, it’s also very slow.
With JIT, your code is compiled while running the program. It combines the speed advantage of ahead-of-time compilation (used by languages like C and C++) with the flexibility of interpretation. Another advantage is that the JIT compiler can keep optimizing your code while it is running. The longer your code runs, the more optimized it will become.
Threading
import threading
import time
def print_numbers(s):
for i in range(10):
time.sleep(0.5)
print(f"{s}: {i}")
thread1 = threading.Thread(target=print_numbers, args=("Thread 1",))
thread2 = threading.Thread(target=print_numbers, args=("Thread 2",))
thread1.start()
time.sleep(2)
thread2.start()
thread1.join()
thread2.join()
Output:
Thread 1: 0
Thread 1: 1
Thread 1: 2
Thread 1: 3
Thread 2: 0
Thread 1: 4
Thread 2: 1
Thread 1: 5
Thread 2: 2
Thread 1: 6
Thread 2: 3
Thread 1: 7
Thread 2: 4
Thread 1: 8
Thread 2: 5
Thread 1: 9
Thread 2: 6
Thread 2: 7
Thread 2: 8
Thread 2: 9
Get thread info
To get thread name and thread id
threading.current_thread().name
threading.current_thread().ident
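For example, a small sketch printing both values from the main thread and from a worker:

import threading

def show_info():
    t = threading.current_thread()
    print(f"name={t.name}, ident={t.ident}")

show_info()  # e.g. name=MainThread, ident=...
worker = threading.Thread(target=show_info, name="Worker-1")
worker.start()
worker.join()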
In Python's threading module, a Lock is a synchronization primitive used to prevent multiple threads from accessing shared resources simultaneously. This is crucial in multi-threaded programs to avoid race conditions and ensure data consistency.
What is a Lock?
A Lock is a low-level mechanism that can be in one of two states:
- Locked (Acquired): When a thread acquires the lock, other threads that try to acquire it will block (wait) until the lock is released.
- Unlocked (Released): When the lock is released, one of the waiting threads can acquire it.
The primary purpose of a lock is to ensure mutual exclusion: only one thread can hold the lock at a time.
How Locks Work
- A thread acquires the lock using the acquire() method.
- While the lock is held, no other thread can acquire it. If a thread attempts to acquire an already-held lock, it will block until the lock is released.
- The thread releases the lock using the release() method.
Using Locks in Python
Here’s a simple example:
import threading
# Shared resource
counter = 0
lock = threading.Lock()
def increment():
global counter
for _ in range(1000000):
with lock: # Acquiring the lock
counter += 1 # Critical section
# Create threads
thread1 = threading.Thread(target=increment)
thread2 = threading.Thread(target=increment)
# Start threads
thread1.start()
thread2.start()
# Wait for threads to finish
thread1.join()
thread2.join()
print(f"Final counter value: {counter}")
Key Points in the Example:
- Shared resource (counter): The counter variable is accessed and modified by multiple threads.
- Lock usage: The with lock: statement acquires the lock at the beginning of the block and releases it at the end, ensuring that only one thread modifies counter at a time. This prevents race conditions where multiple threads modify counter simultaneously.
- Without the lock: If the lock were not used, the final value of counter would likely be incorrect due to thread interference.
Methods of a Lock
- acquire(blocking=True, timeout=-1)
  - Acquires the lock.
  - Parameters:
    - blocking (default True): If True, the thread will block until the lock is available. If False, the call returns immediately if the lock is unavailable.
    - timeout: Specifies a timeout period (in seconds) to wait for the lock.
  - Returns True if the lock was acquired, False otherwise (only possible with blocking=False or when the timeout expires).
Example:
if lock.acquire(blocking=False):  # Non-blocking
    try:
        print("Lock acquired!")
    finally:
        lock.release()
else:
    print("Could not acquire lock.")
- release()
  - Releases the lock.
  - A thread that has acquired the lock must release it; otherwise, other threads cannot proceed.

Example:

lock.acquire()  # Acquire lock
# Critical section
lock.release()  # Release lock
ReentrantLock (RLock)
Python provides another lock type called RLock (Reentrant Lock), which allows a thread to acquire the same lock multiple times. This is useful when the same thread needs to re-enter a critical section.
Example:
import threading
lock = threading.RLock()
def recursive_function(n):
with lock:
if n > 0:
print(f"Lock acquired for n={n}")
recursive_function(n - 1)
print(f"Releasing lock for n={n}")
recursive_function(3)
Difference Between Lock and RLock:
- Lock: A thread cannot re-acquire a lock it already holds without releasing it first.
- RLock: A thread can acquire the same lock multiple times but must release it the same number of times.
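A short sketch of the difference: the nested acquire below works with an RLock, while the commented-out second acquire of a plain Lock would block the thread forever.

import threading

rlock = threading.RLock()
with rlock:
    with rlock:  # Same thread re-acquires without deadlocking
        print("RLock allows nested acquisition")

lock = threading.Lock()
lock.acquire()
# lock.acquire()  # With a plain Lock, this second acquire would block forever
lock.release()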
Common Pitfalls
- Deadlocks:
  - Occur when two or more threads are waiting for each other to release locks.
  - Example of a potential deadlock (the two threads acquire the locks in opposite orders):

import threading

lock1 = threading.Lock()
lock2 = threading.Lock()

def thread1():
    with lock1:
        with lock2:
            print("Thread 1")

def thread2():
    with lock2:
        with lock1:
            print("Thread 2")

t1 = threading.Thread(target=thread1)
t2 = threading.Thread(target=thread2)
t1.start()
t2.start()
t1.join()
t2.join()
- Neglecting to release a lock:
  - If a thread fails to release a lock (e.g., due to an exception), it can block other threads indefinitely.
  - Use the with statement to ensure the lock is always released.
Best Practices
- Use with for lock acquisition and release to avoid errors:

  with lock:
      ...  # Critical section

- Use RLock if reentrant locking is required.
- Avoid nested locks unless necessary, and always acquire locks in a consistent order to prevent deadlocks.
- Consider higher-level synchronization mechanisms (e.g., Semaphore, Condition, Queue) if locks alone are not sufficient for your use case.
In summary, locks are a fundamental tool for thread synchronization in Python, ensuring that shared resources are accessed safely. However, they require careful handling to avoid common issues like deadlocks and improper usage.
Thread lock - blocking parameter
The below example will throw an error at the end.
import threading
from threading import Lock
import time
lock = Lock()
def print_numbers(s):
lock.acquire(blocking=False)
for i in range(10):
time.sleep(0.5)
print(f"{s}: {i}")
lock.release()
thread1 = threading.Thread(target=print_numbers, args=("Thread 1",))
thread2 = threading.Thread(target=print_numbers, args=("Thread 2",))
thread1.start()
time.sleep(2)
thread2.start()
thread1.join()
thread2.join()
Output:
Thread 1: 0
Thread 1: 1
Thread 1: 2
Thread 1: 3
Thread 2: 0
Thread 1: 4
Thread 2: 1
Thread 1: 5
Thread 2: 2
Thread 1: 6
Thread 2: 3
Thread 1: 7
Thread 2: 4
Thread 1: 8
Thread 2: 5
Thread 1: 9
Thread 2: 6
Thread 2: 7
Thread 2: 8
Thread 2: 9
Exception in thread Thread-2 (print_numbers):
Traceback (most recent call last):
File "C:\Python311\Lib\threading.py", line 1038, in _bootstrap_inner
self.run()
File "C:\Python311\Lib\threading.py", line 975, in run
self._target(*self._args, **self._kwargs)
File "e:\Documents\python_threading.py", line 12, in print_numbers
lock.release()
RuntimeError: release unlocked lock
This is caused by a bug in the code. Thread 1 starts first, acquires the lock, and runs. After 2 seconds, Thread 2 starts and tries to acquire the lock, but fails immediately because blocking=False does not wait. The return value of acquire() is never checked, so Thread 2 proceeds into the "locked" section anyway, and both threads end up executing code that was meant to be protected. When Thread 2 finally reaches lock.release(), the lock has already been released by Thread 1; releasing a lock it never acquired raises the exception: Exception in thread Thread-2 (print_numbers): ... RuntimeError: release unlocked lock.
Good code - for blocking=False
import threading
from threading import Lock
import time
lock = Lock()
def print_numbers(s):
if lock.acquire(blocking=False):
for i in range(10):
time.sleep(0.5)
print(f"{s}: {i}")
lock.release()
else:
print("Failed to acquire lock")
thread1 = threading.Thread(target=print_numbers, args=("Thread 1",), name='T1')
thread2 = threading.Thread(target=print_numbers, args=("Thread 2",), name='T2')
thread1.start()
time.sleep(2)
thread2.start()
thread1.join()
thread2.join()
Output
Thread 1: 0
Thread 1: 1
Thread 1: 2
Failed to acquire lock
Thread 1: 3
Thread 1: 4
Thread 1: 5
Thread 1: 6
Thread 1: 7
Thread 1: 8
Thread 1: 9
Daemon thread
A daemon thread in Python is a thread that runs in the background and automatically terminates when the main program (or the main thread) finishes. Unlike regular (non-daemon) threads, daemon threads do not prevent the program from exiting when the main thread completes its execution.
Daemon threads are typically used for background tasks that don't need to complete before the program ends, such as logging, monitoring, or cleanup tasks.
Key Characteristics of Daemon Threads
- Background execution:
  - Daemon threads run in the background and are often used for tasks that run continuously or periodically while the program is active.
- Termination behavior:
  - When the main program exits, all daemon threads are abruptly stopped, regardless of whether they have completed their task.
  - They are not given a chance to clean up or release resources.
- Setting as daemon:
  - A thread can be set as a daemon by setting its daemon attribute to True before calling the start() method.
  - Example: thread.daemon = True
- Default behavior:
  - By default, threads are non-daemon (i.e., they block program termination until they finish execution).
How to Create a Daemon Thread
You can create a daemon thread by either:
- Setting daemon=True in the thread constructor.
- Setting the daemon attribute of the thread object before starting the thread.
Example: Creating and Using Daemon Threads
import threading
import time
def daemon_task():
while True:
print("Daemon thread running...")
time.sleep(1)
def non_daemon_task():
print("Non-daemon thread starting...")
time.sleep(5)
print("Non-daemon thread finished.")
# Create a daemon thread
daemon_thread = threading.Thread(target=daemon_task, daemon=True)
# Create a non-daemon thread
non_daemon_thread = threading.Thread(target=non_daemon_task)
# Start both threads
daemon_thread.start()
non_daemon_thread.start()
# Wait for the non-daemon thread to finish
non_daemon_thread.join()
print("Main program finished.")
Output:
Daemon thread running...
Daemon thread running...
Daemon thread running...
Non-daemon thread starting...
Daemon thread running...
Daemon thread running...
Non-daemon thread finished.
Main program finished.
Explanation:
- The daemon thread (daemon_task) keeps running in the background, printing "Daemon thread running...".
- The non-daemon thread (non_daemon_task) starts, performs its work, and finishes after 5 seconds.
- Once the non-daemon thread finishes, the main program terminates, and the daemon thread is abruptly stopped.
Use Cases for Daemon Threads
- Background services:
  - Performing continuous background tasks like monitoring, logging, or data collection.
- Periodic cleanup tasks:
  - Clearing temporary files, purging caches, or removing unused resources.
- Non-essential operations:
  - Tasks that can be interrupted without impacting the main program.
Important Notes:
- Abrupt termination:
  - Daemon threads are stopped without cleanup when the main program exits. If the thread holds any resources (e.g., files, network connections), they may not be released properly.
- Should not be used for critical tasks:
  - Avoid using daemon threads for tasks that require completion, such as saving data or writing logs, because they may be terminated unexpectedly.
- Alternatives for graceful shutdown:
  - Use non-daemon threads if you need to ensure tasks finish before the program ends.
  - For daemon-like behavior with cleanup, combine a non-daemon thread with proper signal handling or a shutdown mechanism (see the sketch below).
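A minimal sketch of such a shutdown mechanism, using a non-daemon worker that polls a threading.Event (names like stop_event and background_task are illustrative):

import threading
import time

stop_event = threading.Event()

def background_task():
    while not stop_event.is_set():
        print("Working...")
        time.sleep(1)
    print("Cleaning up before exit.")  # Runs, unlike in a killed daemon thread

worker = threading.Thread(target=background_task)  # Non-daemon
worker.start()
time.sleep(3)
stop_event.set()  # Ask the worker to stop
worker.join()     # Wait for it to clean up and finish
print("Main program finished.")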
Comparison Between Daemon and Non-Daemon Threads
Feature | Daemon Thread | Non-Daemon Thread
---|---|---
Purpose | Background tasks that don't block program exit. | Essential tasks that must complete before exit.
Termination | Abruptly stopped when the main program exits. | Keeps the program alive until it completes.
Default | Must be enabled explicitly (daemon=True). | The default for newly created threads.
Use cases | Logging, monitoring, periodic background tasks. | Critical operations like saving data or results.
Daemon threads are a powerful feature for background processing in Python, but they should be used with caution due to their abrupt termination behavior. Proper understanding ensures they are applied in the right context without unintended side effects.
Semaphores
A semaphore in Python threading is a synchronization primitive that allows a fixed number of threads to access a shared resource at the same time. It is used to control access to resources that have a limited capacity, such as a pool of database connections, file handles, or network connections.
How Semaphore Works
- A semaphore maintains an internal counter, which represents the number of threads that can simultaneously access the shared resource.
- When a thread wants to use the resource, it acquires the semaphore, which decreases the counter by 1.
- If the counter reaches 0, additional threads trying to acquire the semaphore will block until one of the threads holding it releases it, increasing the counter.
Types of Semaphores
- Bounded semaphore (threading.BoundedSemaphore):
  - Similar to a regular semaphore but with an upper limit on the counter value.
  - Prevents accidental over-release, which could lead to incorrect behavior (see the sketch after this list).
- Unbounded semaphore (threading.Semaphore):
  - No restriction on the maximum value the counter can reach.
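A quick sketch of the difference on over-release:

import threading

bounded = threading.BoundedSemaphore(1)
bounded.acquire()
bounded.release()
try:
    bounded.release()  # One release too many
except ValueError:
    print("BoundedSemaphore caught the over-release")

plain = threading.Semaphore(1)
plain.release()  # A plain Semaphore silently lets its counter grow to 2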
Methods in a Semaphore
- acquire(blocking=True, timeout=None)
  - Decreases the counter by 1.
  - If blocking=True (default), the thread waits if the counter is 0.
  - If blocking=False, the thread does not wait and immediately returns False if the counter is 0.
  - timeout specifies the maximum time to wait for the semaphore.
- release()
  - Increases the counter by 1, allowing other threads to acquire the semaphore.
Example: Semaphore in Python
Scenario: limiting access to a shared resource. Let's say a shared resource (e.g., a server) can only handle 3 threads at a time.
import threading
import time
# Create a semaphore with a maximum of 3 threads
semaphore = threading.Semaphore(3)
def access_resource(thread_id):
print(f"Thread {thread_id} is trying to acquire the semaphore...")
with semaphore: # Acquires the semaphore
print(f"Thread {thread_id} acquired the semaphore. Accessing resource...")
time.sleep(2) # Simulate resource usage
print(f"Thread {thread_id} is releasing the semaphore.")
# Semaphore is released automatically when exiting the `with` block
# Create multiple threads
threads = []
for i in range(6): # More threads than the semaphore limit
t = threading.Thread(target=access_resource, args=(i,))
threads.append(t)
t.start()
# Wait for all threads to finish
for t in threads:
t.join()
print("All threads completed.")
Output:
Thread 0 is trying to acquire the semaphore...
Thread 0 acquired the semaphore. Accessing resource...
Thread 1 is trying to acquire the semaphore...
Thread 1 acquired the semaphore. Accessing resource...
Thread 2 is trying to acquire the semaphore...
Thread 2 acquired the semaphore. Accessing resource...
Thread 3 is trying to acquire the semaphore...
Thread 4 is trying to acquire the semaphore...
Thread 5 is trying to acquire the semaphore...
Thread 0 is releasing the semaphore.
Thread 3 acquired the semaphore. Accessing resource...
Thread 1 is releasing the semaphore.
Thread 4 acquired the semaphore. Accessing resource...
Thread 2 is releasing the semaphore.
Thread 5 acquired the semaphore. Accessing resource...
Thread 3 is releasing the semaphore.
Thread 4 is releasing the semaphore.
Thread 5 is releasing the semaphore.
All threads completed.
Explanation:
- Semaphore initialization:
  - semaphore = threading.Semaphore(3) allows up to 3 threads to access the critical section simultaneously.
- Thread behavior:
  - Threads 0, 1, and 2 acquire the semaphore first since its initial value is 3.
  - Threads 3, 4, and 5 wait until a slot becomes available (when one of the first three threads releases the semaphore).
- Releasing the semaphore:
  - When a thread finishes using the resource, it releases the semaphore, allowing another waiting thread to acquire it.
Use Cases for Semaphores
- Resource pool management:
  - Controlling access to limited resources like database connections, file handles, or sockets.
- Rate limiting:
  - Restricting the number of concurrent threads or processes performing a specific task.
- Producer-consumer problems:
  - Managing shared buffers where producers add items and consumers remove items.
- Traffic control:
  - Managing thread traffic to avoid overwhelming a shared resource.
Comparison with Other Synchronization Primitives
Primitive | Purpose | Key Difference
---|---|---
Lock | Ensures mutual exclusion (one thread at a time). | Only one thread can access the resource at any time.
RLock | A reentrant lock the same thread can acquire multiple times. | The same thread can repeatedly acquire the lock.
Semaphore | Allows a specific number of threads to access a resource simultaneously. | A counter determines how many threads can proceed.
Condition | Thread coordination based on signaling between threads. | Requires explicit signaling (e.g., notify()).
Semaphores are versatile and powerful synchronization tools, ideal for managing shared resources with a limited capacity. They provide greater flexibility than simple locks by allowing multiple threads to proceed concurrently up to a specified limit.
Race condition
What is a Race Condition?
A race condition occurs in a multi-threaded program when multiple threads access shared resources (e.g., variables, files) simultaneously and at least one thread modifies the resource. The final outcome depends on the timing or sequence of thread execution, leading to unpredictable results.
Why Race Conditions Happen
- In multi-threading, threads execute independently.
- If two or more threads access a shared resource without proper synchronization, their operations can overlap in unintended ways.
- For example, if one thread is writing to a variable while another is reading from or updating it, the result can be inconsistent or incorrect.
Example of a Race Condition
Let's look at an example where multiple threads increment a shared counter:
import threading
# Shared resource
counter = 0
def increment():
global counter
for _ in range(100000):
counter += 1 # Critical section: race condition occurs here
# Create two threads
thread1 = threading.Thread(target=increment)
thread2 = threading.Thread(target=increment)
# Start both threads
thread1.start()
thread2.start()
# Wait for both threads to finish
thread1.join()
thread2.join()
print(f"Final counter value: {counter}")
Expected vs. Actual Output
- Expected output: 200000 (since each thread increments counter 100,000 times).
- Actual output: often a value less than 200000 (due to race conditions).

The issue arises because the counter += 1 operation is not atomic; it involves multiple steps:
- Reading the value of counter.
- Incrementing it by 1.
- Writing the updated value back to counter.
If two threads perform these steps simultaneously, they can overwrite each other’s updates.
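You can see these separate steps in the bytecode (a quick sketch; exact instruction names vary by Python version, and a thread switch can occur between any two instructions):

import dis

dis.dis("counter += 1")
# Roughly:
#   LOAD_NAME  counter
#   LOAD_CONST 1
#   BINARY_OP  += (INPLACE_ADD on older versions)
#   STORE_NAME counter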
How to Prevent Race Conditions
Race conditions can be resolved by using synchronization mechanisms to ensure that only one thread accesses the critical section at a time.

Solution: Using Locks
A Lock ensures mutual exclusion by allowing only one thread to execute the critical section at any given time.
Modified Example with a Lock
import threading
# Shared resource
counter = 0
lock = threading.Lock() # Create a lock
def increment():
global counter
for _ in range(100000):
with lock: # Acquire the lock
counter += 1 # Critical section
# Create two threads
thread1 = threading.Thread(target=increment)
thread2 = threading.Thread(target=increment)
# Start both threads
thread1.start()
thread2.start()
# Wait for both threads to finish
thread1.join()
thread2.join()
print(f"Final counter value: {counter}")
Output:
- Final counter value: 200000
- The lock ensures that only one thread modifies counter at a time.
Other Synchronization Techniques
- Reentrant lock (RLock):
  - Useful when a thread needs to acquire the same lock multiple times.
  - Example: lock = threading.RLock()
- Semaphore:
  - Limits the number of threads that can access a resource simultaneously.
  - Example: semaphore = threading.Semaphore(1)  # Similar to a lock
- Condition variables:
  - Used for more complex synchronization, such as signaling between threads.
  - Example: condition = threading.Condition()
- Thread-safe data structures:
  - Use thread-safe alternatives like queue.Queue or collections.deque (see the sketch below).
Summary
- Race Condition: A situation where the outcome of a program depends on the timing or order of thread execution, leading to inconsistent results.
- Solution: Use synchronization mechanisms like locks, semaphores, or thread-safe data structures to coordinate thread access to shared resources.
- Key Tip: Minimize shared state and use proper synchronization to avoid unpredictable behavior in multi-threaded programs.
Multiprocessing
Multiprocessing in Python refers to a parallel execution framework that allows programs to execute multiple processes concurrently, leveraging multiple CPU cores. This is especially useful for CPU-bound tasks, where the Global Interpreter Lock (GIL) in Python can limit the performance of multithreaded programs.
Python’s multiprocessing
module provides a high-level interface to create and manage processes. It enables parallelism by creating separate processes for different tasks, bypassing the GIL and fully utilizing multicore processors.
Key Features of the multiprocessing Module:
- Process-based parallelism:
  - Each process has its own memory space, avoiding data corruption issues caused by shared memory.
- Support for multiple CPUs:
  - Takes advantage of multiple cores, making it ideal for CPU-intensive tasks.
- Interprocess communication (IPC):
  - Provides tools like Queue, Pipe, and shared memory to enable communication between processes.
- Process synchronization:
  - Includes synchronization primitives like Lock, Event, Semaphore, and Barrier.
- Creating processes:
  - You can create a process using the Process class:

import multiprocessing as mp
import os
import time

def do_work():
    time.sleep(2)
    for i in range(1, 6):
        print(f'{mp.current_process().name}:{os.getpid()} -> {i}')

def main():
    p1 = mp.Process(name='Process 1', target=do_work)
    p2 = mp.Process(name='Process 2', target=do_work)
    p1.start()
    p2.start()
    p1.join()
    p2.join()

if __name__ == '__main__':
    s = time.perf_counter()
    main()
    print(f'It takes {time.perf_counter()-s} secs')

Output:

Process 1:6324 -> 1
Process 1:6324 -> 2
Process 1:6324 -> 3
Process 1:6324 -> 4
Process 1:6324 -> 5
Process 2:6920 -> 1
Process 2:6920 -> 2
Process 2:6920 -> 3
Process 2:6920 -> 4
Process 2:6920 -> 5
It takes 2.2620684000003166 secs
Advantages:
- Overcomes the GIL, enabling true parallelism for CPU-bound tasks.
- Simplifies writing parallel programs compared to managing threads directly.
- Ensures process isolation, reducing risks of shared memory corruption.
Disadvantages:
- Higher memory usage compared to threading since each process has its own memory space.
- Overhead of inter-process communication.
- Debugging can be harder due to separate memory spaces.
When to Use Multiprocessing:
- CPU-Bound Tasks:
- Tasks that involve heavy computations, such as matrix operations, scientific simulations, or image processing.
- When GIL Becomes a Bottleneck:
- Multiprocessing avoids the GIL, unlike threading.
For tasks that are I/O-bound, consider using threading or asynchronous programming, as the GIL does not significantly impact I/O operations.
To find out how many CPU cores there are:
import multiprocessing as mp
print(mp.cpu_count())
# Output
# 8
multiprocessing.Pool
How to Use
The multiprocessing.Pool in Python is a convenient way to parallelize tasks using multiple processes. It manages a pool of worker processes, allowing you to execute tasks concurrently.
1. Import the Library
from multiprocessing import Pool
2. Define a Task Function
def square(x):
return x * x
3. Create a Pool and Apply Tasks
- map(): Applies a function to all items in an iterable.
- apply(): Calls a function once with the given arguments.
- apply_async(): Same as apply(), but returns a result asynchronously.
- starmap(): Like map(), but supports multiple arguments.
def multiply(x, y):
    # Defined at module level so worker processes can find it when the
    # module is re-imported (required on Windows/macOS, which use spawn)
    return x * y

if __name__ == "__main__":
    with Pool(4) as p:  # Create a pool with 4 processes
        # Using map
        numbers = [1, 2, 3, 4, 5]
        results = p.map(square, numbers)
        print("Results using map:", results)

        # Using apply_async
        result = p.apply_async(square, (10,))
        print("Result using apply_async:", result.get())

        # Using starmap (for functions with multiple arguments)
        pairs = [(1, 2), (3, 4), (5, 6)]
        results = p.starmap(multiply, pairs)
        print("Results using starmap:", results)
Key Points:
- Initialization: Use Pool(processes=n) where n is the number of worker processes. If not specified, it defaults to the number of CPU cores.
- Context management: Use with Pool(...) as p to ensure the pool is closed automatically.
- Blocking vs non-blocking: map() and apply() are blocking; apply_async() and starmap_async() are non-blocking, returning a result object that you can query later using .get().
Considerations:
- Use the if __name__ == "__main__": guard to prevent recursive process creation on Windows.
- Ensure tasks are CPU-bound for maximum benefit; I/O-bound tasks work better with concurrent.futures.ThreadPoolExecutor.
Running multiple functions concurrently with Pool
To run multiple functions concurrently using multiprocessing.Pool, you can use:
- Submitting tasks with apply_async()
- Combining starmap() or map() with a helper function
- Using functools.partial() with map() to run different functions in a single call
Example 1: Using apply_async() for Multiple Functions
from multiprocessing import Pool
import time
def a(x):
time.sleep(2)
return x
def b(x):
time.sleep(2)
return x
if __name__ == "__main__":
s = time.perf_counter()
with Pool(4) as pool:
res1 = pool.apply_async( a, args=('A', ))
res2 = pool.apply_async( b, args=('B', ))
print(res1.get())
print(res2.get())
print(f"Time: {time.perf_counter()-s} sec")
Output
A
B
Time: 2.416869699999552 sec
Example 2: Using a Helper Function with starmap()
from multiprocessing import Pool
import time
# Define functions
def process_function(func, x):
return func.__name__, func(x)
def a(x):
time.sleep(2)
return x
def b(x):
time.sleep(2)
return x
if __name__ == "__main__":
s = time.perf_counter()
tasks = [(a, 4), (b, 2), (a, 3), (b, 5)]
with Pool(4) as pool:
results = pool.starmap(process_function, tasks)
for func_name, result in results:
print(f"{func_name} result: {result}")
print(f"Time: {time.perf_counter()-s} sec")
Output
a result: 4
b result: 2
a result: 3
b result: 5
Time: 2.424494899999445 sec
Example 3: Using functools.partial() and map() (note that map() itself blocks until all results are ready)
from multiprocessing import Pool
from functools import partial
import time
def func_a(a):
time.sleep(2)
return a
def func_b(a, b):
time.sleep(2)
return a, b
def map_func(func):
return func()
if __name__ == "__main__":
s = time.perf_counter()
with Pool() as pool:
a = partial(func_a, "A")
b = partial(func_b, "A", "B")
res = pool.map(map_func, [a, b])
print(res)
print(f"Time: {time.perf_counter()-s} sec")
Output
['A', ('A', 'B')]
Time: 2.662100499999724 sec
Data sharing issue in multiprocessing
In multiprocessing, each process has its own memory space; processes do not share memory by default. So the following code will not produce what we might expect:
from multiprocessing import Process
l = [0]
def func():
global l
l.extend([1,2,3])
print('Data inside func: ', l)
if __name__ == "__main__":
p1 = Process(target=func)
p1.start()
p1.join()
print('Data in main', l)
Output
Data inside func: [0, 1, 2, 3]
Data in main [0]
To overcome this issue, we have two options:
- Pipes
- Queues
Pipes
In multiprocessing in Python, a Pipe is a method for two-way communication between processes. It provides a way to send and receive data between processes in a multiprocessing environment.
How It Works
- A pipe is created using multiprocessing.Pipe(), which returns two connection objects: conn1 and conn2.
- Data sent through one connection can be received from the other.
Example: Using Pipe in Multiprocessing
from multiprocessing import Process, Pipe
# Function to run in a child process
def worker(conn):
conn.send("Hello from child process!")
conn.close()
if __name__ == "__main__":
# Create a Pipe
parent_conn, child_conn = Pipe()
# Start a process
p = Process(target=worker, args=(child_conn,))
p.start()
# Receive data from the child process
print("Received:", parent_conn.recv())
p.join()
Key Points
- Bidirectional communication: By default, Pipe() creates a bidirectional pipe.
- Duplex control: Use Pipe(duplex=False) for one-way communication.
- Data types: Pipes can send and receive picklable objects (e.g., strings, numbers, lists).
When to Use Pipe vs Queue
- Pipe: Simple and suitable for two processes.
- Queue: Better for multiple processes and complex data sharing. It's thread- and process-safe.
A sentinel value is a special value used to signal the end of a process, data stream, or loop. It acts as a marker that indicates no more data is expected or that a special condition has been reached.
Why Use a Sentinel Value?
- End of data: To mark the end of data in a stream or loop.
- Termination signal: To signal processes or threads to stop working.
- Default indicator: To represent a value that cannot be confused with valid data.
In multiprocessing, a sentinel value can be used to signal when a process should stop processing data. This is particularly useful when using Pipe for inter-process communication.
from multiprocessing import Process, Pipe
# Define a sentinel value
SENTINEL = "STOP"  # Can be anything, as long as it cannot be confused with valid data
# Worker function
def worker(conn):
while True:
data = conn.recv() # Receive data from the main process
if data == SENTINEL:
print("Worker received the sentinel value, terminating.")
break
print(f"Worker received: {data}")
conn.close()
if __name__ == "__main__":
# Create a Pipe
parent_conn, child_conn = Pipe()
# Start a worker process
p = Process(target=worker, args=(child_conn,))
p.start()
# Send some data
parent_conn.send("Task 1")
parent_conn.send("Task 2")
parent_conn.send("Task 3")
# Send the sentinel value to terminate the worker
parent_conn.send(SENTINEL)
# Wait for the process to finish
p.join()
print("Main process finished.")
Key Considerations
- Unique Choice: Ensure the sentinel value is distinct and cannot be confused with valid data.
- Type Selection: Common sentinel values include None, -1, "", float('inf'), or custom objects.
Queues
A Multiprocessing Queue in Python is a thread- and process-safe queue provided by the multiprocessing module. It allows multiple processes to communicate by sharing data efficiently.
How to Use Multiprocessing Queue
- Import the module:
  from multiprocessing import Process, Queue
- Create a queue:
  q = Queue(maxsize=5)  # Optional maxsize argument
- Add items to the queue (put):
  q.put("Hello")
  q.put(42)
- Retrieve items from the queue (get):
  item = q.get()
  print(item)
Example: Producer-Consumer Model
from multiprocessing import Process, Queue
import time
# Function for adding items to the queue
def producer(q):
for i in range(5):
q.put(i)
print(f"Produced: {i}")
time.sleep(1)
# Function for consuming items from the queue
def consumer(q):
while True:
item = q.get()
print(f"Consumed: {item}")
if item == 4: # Stop after consuming the last item
break
if __name__ == "__main__":
q = Queue()
p1 = Process(target=producer, args=(q,))
p2 = Process(target=consumer, args=(q,))
p1.start()
p2.start()
p1.join()
p2.join()
Key Methods of Multiprocessing Queue
- put(item): Adds an item to the queue.
- get(): Removes and returns an item from the queue.
- empty(): Returns True if the queue is empty.
- full(): Returns True if the queue is full.
- qsize(): Returns the approximate size of the queue (not guaranteed to be accurate).
Considerations
- Avoid relying on empty(), full(), and qsize() in critical sections, as their values can change unexpectedly due to process scheduling.
- Always use process-safe techniques to avoid deadlocks or race conditions.
Like Pipe's recv(), q.get() is a blocking call that will wait indefinitely for an element. If nothing is ever put on the queue but we try to get from it, the call waits forever.
The timeout parameter in Python's multiprocessing.Queue is used with the put() and get() methods to limit the time a process waits when interacting with the queue. If the specified time is exceeded, a queue.Full or queue.Empty exception is raised, depending on the operation.
How timeout Works
- put(item, block=True, timeout=None):
  - block=True (default): The process waits until space is available or the timeout is reached.
  - block=False: Raises queue.Full immediately if the queue is full.
  - timeout=None (default): Waits indefinitely if block=True.
- get(block=True, timeout=None):
  - block=True (default): The process waits until an item is available or the timeout is reached.
  - block=False: Raises queue.Empty immediately if the queue is empty.
  - timeout=None (default): Waits indefinitely if block=True.
Examples
Using put() with timeout
from multiprocessing import Process, Queue
import time
def producer(q):
for i in range(5):
try:
q.put(i, block=True, timeout=2)
print(f"Produced: {i}")
except Exception as e:
print(f"Queue full! Could not produce {i}")
time.sleep(1)
if __name__ == "__main__":
q = Queue(maxsize=2)
p = Process(target=producer, args=(q,))
p.start()
p.join()
Using get() with timeout
from multiprocessing import Process, Queue
import time
def consumer(q):
while True:
try:
item = q.get(block=True, timeout=2)
print(f"Consumed: {item}")
except Exception as e:
print("Queue empty! Exiting...")
break
if __name__ == "__main__":
q = Queue()
q.put(1)
q.put(2)
p = Process(target=consumer, args=(q,))
p.start()
p.join()
Key Points
- Default behavior: If timeout is not specified, the call waits indefinitely.
- When to use: Use timeout to avoid processes getting stuck while waiting for items or space in the queue.
- Exceptions raised:
  - queue.Full: When put() fails due to a full queue.
  - queue.Empty: When get() fails due to an empty queue.
Lock and Semaphore
Python's multiprocessing module provides synchronization primitives like Lock and Semaphore to manage access to shared resources among multiple processes.
1. Lock
A Lock is a synchronization primitive that allows only one process to access a critical section at a time. It prevents race conditions by ensuring exclusive access.
Key Methods
- acquire(block=True, timeout=None): Acquires the lock, optionally with a timeout.
- release(): Releases the lock.
Example: Using Lock
from multiprocessing import Process, Lock
import time
def worker(lock, i):
lock.acquire()
try:
print(f"Process {i} is working")
time.sleep(1)
finally:
print(f"Process {i} is done")
lock.release()
if __name__ == "__main__":
lock = Lock()
processes = [Process(target=worker, args=(lock, i)) for i in range(5)]
for p in processes:
p.start()
for p in processes:
p.join()
Using with statement
from multiprocessing import Process, Lock
import time
def worker(lock, i):
with lock: # Automatically acquires and releases the lock
print(f"Process {i} is working")
time.sleep(1)
print(f"Process {i} is done")
if __name__ == "__main__":
lock = Lock()
processes = [Process(target=worker, args=(lock, i)) for i in range(5)]
for p in processes:
p.start()
for p in processes:
p.join()
2. Semaphore
A Semaphore is a more flexible synchronization primitive that allows a limited number of processes to access a resource simultaneously. It manages a counter, decrementing it when a process acquires the semaphore and incrementing it when released.
Key Methods
- acquire(block=True, timeout=None): Decrements the semaphore counter, blocking if needed.
- release(): Increments the semaphore counter.
Example: Using Semaphore
from multiprocessing import Process, Semaphore
import time
# Allow 2 processes to work simultaneously
sem = Semaphore(2)
def worker(sem, i):
sem.acquire()
try:
print(f"Process {i} is working")
time.sleep(2)
finally:
print(f"Process {i} is done")
sem.release()
if __name__ == "__main__":
processes = [Process(target=worker, args=(sem, i)) for i in range(5)]
for p in processes:
p.start()
for p in processes:
p.join()
Using with statement
from multiprocessing import Process, Semaphore
import time
# Allow 2 processes to work simultaneously
sem = Semaphore(2)
def worker(sem, i):
with sem: # Automatically acquires and releases the semaphore
print(f"Process {i} is working")
time.sleep(2)
print(f"Process {i} is done")
if __name__ == "__main__":
processes = [Process(target=worker, args=(sem, i)) for i in range(5)]
for p in processes:
p.start()
for p in processes:
p.join()
Key Differences Between Lock and Semaphore
Feature | Lock | Semaphore
---|---|---
Access control | One process at a time | Multiple processes allowed
Counter | No counter | Maintains a counter
Use case | Mutual exclusion | Resource pooling
Exceptions raised | Error on releasing an unacquired lock | ValueError if a BoundedSemaphore is released past its initial value
When to Use
- Lock: When only one process should access a resource at a time.
- Semaphore: When a limited number of processes can access a resource simultaneously, like a database connection pool or API rate limiter.
concurrent.futures
Overview
The concurrent.futures module in Python provides a high-level interface for asynchronously executing tasks using threads or processes. It simplifies the implementation of concurrent programming by abstracting much of the complexity involved in managing threads or processes.
Core Concepts
- Executor classes:
  - ThreadPoolExecutor: Manages a pool of threads for executing tasks concurrently.
  - ProcessPoolExecutor: Manages a pool of processes for executing tasks concurrently.
- Future objects:
  - A Future represents the result of an asynchronous computation.
  - You can query a Future object to check if the computation is done, wait for it to complete, or retrieve its result.
- Key methods:
  - submit(fn, *args, **kwargs): Submits a callable (function/method) to the executor for execution and returns a Future object.
  - map(func, *iterables, timeout=None, chunksize=1): A parallelized version of the built-in map function.
  - shutdown(wait=True): Signals the executor to clean up resources when all tasks are done.
- Features:
  - Easy to use with Python's with statement, ensuring proper resource management.
  - Uniform API for both threads and processes.
Key Differences from threading and multiprocessing

Feature | concurrent.futures | threading | multiprocessing
---|---|---|---
Ease of use | High-level abstraction | Low-level, manual | Low-level, manual
Concurrency model | Thread-based or process-based | Thread-based | Process-based
Resource management | Built-in (with statement) | Manual (e.g., join()) | Manual (e.g., join())
Overhead | Slightly higher (abstraction) | Lower | Higher (inter-process communication)
Global Interpreter Lock (GIL) | Affects ThreadPoolExecutor | Affects all threads | Not affected
Parallelism | True parallelism with ProcessPoolExecutor | Limited by GIL | True parallelism
How concurrent.futures Works
Using ThreadPoolExecutor
from concurrent.futures import ThreadPoolExecutor
def square(n):
return n * n
numbers = [1, 2, 3, 4, 5]
with ThreadPoolExecutor(max_workers=3) as executor:
results = executor.map(square, numbers)
print(list(results)) # Output: [1, 4, 9, 16, 25]
Using ProcessPoolExecutor
from concurrent.futures import ProcessPoolExecutor
def factorial(n):
if n == 0:
return 1
return n * factorial(n - 1)
numbers = [5, 10, 15]
with ProcessPoolExecutor(max_workers=2) as executor:
results = executor.map(factorial, numbers)
print(list(results)) # Output: [120, 3628800, 1307674368000]
Working with Future Objects
from concurrent.futures import ThreadPoolExecutor
import time
def sleep_and_return(n):
time.sleep(n)
return f"Slept for {n} seconds"
with ThreadPoolExecutor(max_workers=2) as executor:
future1 = executor.submit(sleep_and_return, 2)
future2 = executor.submit(sleep_and_return, 3)
print(future1.result()) # Waits for the first task to complete
print(future2.result()) # Waits for the second task to complete
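If you want results as each task finishes rather than in submission order, concurrent.futures.as_completed is handy; a brief sketch:

from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def sleep_and_return(n):
    time.sleep(n)
    return f"Slept for {n} seconds"

with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(sleep_and_return, n) for n in (3, 1, 2)]
    for future in as_completed(futures):  # Yields futures as they finish
        print(future.result())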
Advantages of concurrent.futures
- Simplified API for both threads and processes.
- Readable and maintainable code using with for resource management.
- Easily switch between threads and processes by changing the executor class.
- Supports true parallelism with ProcessPoolExecutor.
Limitations of concurrent.futures
- Slightly higher abstraction overhead compared to threading and multiprocessing.
- ThreadPoolExecutor is still constrained by the GIL, limiting its effectiveness for CPU-bound tasks.
When to Use concurrent.futures
- Use ThreadPoolExecutor for I/O-bound tasks like file operations, web scraping, or database queries.
- Use ProcessPoolExecutor for CPU-bound tasks like numerical computations or data processing.
Comparison Example
Using threading:
import threading
def task(n):
print(f"Processing {n}")
threads = []
for i in range(5):
t = threading.Thread(target=task, args=(i,))
threads.append(t)
t.start()
for t in threads:
t.join()
Using concurrent.futures:
from concurrent.futures import ThreadPoolExecutor
def task(n):
print(f"Processing {n}")
with ThreadPoolExecutor(max_workers=5) as executor:
executor.map(task, range(5))
The concurrent.futures version is more concise and manages resources automatically.
Conclusion
- concurrent.futures provides a high-level interface for concurrent programming with threads and processes.
- It abstracts away the complexity of threading and multiprocessing while still offering flexibility.
- The choice between concurrent.futures, threading, and multiprocessing depends on the task's complexity, scalability, and specific requirements.