Synchronization, Part 9: Synchronization Across Processes
Process Synchronization
You thought that because you were using different processes, you didn't have to synchronize? Think again! You may not have race conditions within a single process, but what if your process needs to interact with the system around it? Let's consider a motivating example:
```c
#include <fcntl.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

void write_string(const char *data) {
    // Append so that each process adds its record to the end of the file.
    int fd = open("my_file.txt", O_WRONLY | O_APPEND | O_CREAT, 0644);
    write(fd, data, strlen(data));
    close(fd);
}

int main() {
    if (!fork()) {
        write_string("key1: value1");
    } else {
        write_string("key2: value2");
        wait(NULL); // The parent waits for the child before exiting.
    }
    return 0;
}
```
When the program is compiled and run, if none of the system calls fail, we should get something that looks like this (given the file was empty to begin with):
```
key1: value1
key2: value2
```
or
```
key2: value2
key1: value1
```
Interruption
But there is a hidden nuance: most system calls can be interrupted, meaning that the operating system can stop an ongoing system call because it needs to stop the process. Assuming `fork`, `wait`, `open`, and `close` don't fail -- they typically run to completion -- what happens if `write` fails? If `write` fails before any bytes are written, the file ends up containing only `key1: value1` or only `key2: value2`. That is data loss, which is incorrect, but at least it doesn't corrupt the file. What happens if `write` gets interrupted after a partial write? We get all sorts of madness. For example:
```
key2: key1: value1
```
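The root of the problem is that `write` reports how many bytes it actually wrote, and the naive version above ignores that return value. Here is a minimal, hypothetical sketch of how a partial write shows up; the helper name and the error message are illustrative, not part of the example above:
```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

// Hypothetical helper: perform a single write() and report whether it was
// partial. write() returns the number of bytes actually written, which can be
// fewer than requested, e.g. when a signal interrupts the call part-way through.
void write_once(int fd, const char *data) {
    size_t len = strlen(data);
    ssize_t n = write(fd, data, len);
    if (n >= 0 && (size_t)n < len) {
        fprintf(stderr, "partial write: %zd of %zu bytes\n", n, len);
    }
}
```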
Solution
So what should we do? We should use a shared mutex! Consider the following code.
```c
#include <pthread.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

pthread_mutex_t *pmutex = NULL;
pthread_mutexattr_t attrmutex;

void write_string(const char *data) {
    pthread_mutex_lock(pmutex);
    int fd = open("my_file.txt", O_WRONLY | O_APPEND | O_CREAT, 0644);
    int bytes_to_write = strlen(data), written = 0;
    while (written < bytes_to_write) {
        // Keep calling write until every byte has made it to the file.
        ssize_t ret = write(fd, data + written, bytes_to_write - written);
        if (ret > 0) {
            written += ret;
        }
        // A full program would also check for ret == -1 with a non-EINTR
        // errno and bail out instead of retrying forever.
    }
    close(fd);
    pthread_mutex_unlock(pmutex);
}

int main() {
    pthread_mutexattr_init(&attrmutex);
    pthread_mutexattr_setpshared(&attrmutex, PTHREAD_PROCESS_SHARED);

    // Put the mutex in anonymous memory that is shared between parent and child.
    pmutex = mmap(NULL, sizeof(pthread_mutex_t),
                  PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    pthread_mutex_init(pmutex, &attrmutex);

    if (!fork()) {
        write_string("key1: value1");
    } else {
        write_string("key2: value2");
        wait(NULL); // Clean up only after the child is finished with the mutex.
        pthread_mutex_destroy(pmutex);
        pthread_mutexattr_destroy(&attrmutex);
        munmap((void *)pmutex, sizeof(*pmutex));
    }
    return 0;
}
```
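If you want to compile and run this yourself, note that pthread code is typically built with the `-pthread` flag (for example, `gcc program.c -pthread`); the source filename here is just a placeholder.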
What the code does in `main` is initialize a process-shared mutex using a piece of shared memory. You will find out what this call to `mmap` does later -- for the time being, just assume that it creates memory that is shared between processes. We can initialize a `pthread_mutex_t` in that special piece of memory and use it as normal; the `PTHREAD_PROCESS_SHARED` attribute is what allows a process other than the one that created the mutex to lock it. To counter `write` failing, we have put the `write` call inside a while loop that keeps writing as long as there are bytes left to write. Now, if all the other system calls function correctly, there should be no more race conditions.
Most programs try to avoid this problem entirely by writing to separate files, but it is good to know that mutexes exist across processes and that they are useful.
What else can you do?
You can use all of the primitives that you were taught previously! Barriers, semaphores, and condition variables can all be initialized on a shared piece of memory and used in similar ways to their multithreading counterparts.
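For instance, a process-shared semaphore can live in the same kind of anonymous shared mapping. This is only a minimal sketch, mirroring the mutex example above (same `MAP_SHARED | MAP_ANONYMOUS` memory, error checking omitted), not the only way to do it:
```c
#include <semaphore.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main() {
    // Place the semaphore in memory visible to both parent and child.
    sem_t *sem = mmap(NULL, sizeof(sem_t),
                      PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    sem_init(sem, 1, 0); // A second argument of 1 marks it as process-shared.

    if (!fork()) {
        // Child: do its part of the work, then signal the parent.
        sem_post(sem);
        return 0;
    }

    sem_wait(sem); // Parent blocks here until the child posts.
    wait(NULL);
    sem_destroy(sem);
    munmap(sem, sizeof(sem_t));
    return 0;
}
```
A named semaphore created with `sem_open` is an alternative when the processes are not related by `fork` and cannot easily share an anonymous mapping.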
Okay, so when would I use this?
- You don't have to worry about arbitrary memory addresses becoming race condition candidates. Only the regions you specifically `mmap`, plus outside system resources like files, are ever in danger.
- You get the nice isolation of processes, so if one process fails, the rest of the system can remain intact.
- When you have a lot of threads, creating processes instead might ease the system load.
Do I need to memorize the specifics?
No, you just need to know that mutexes and other synchronization primitives can be shared across processes.