Introduction - fporciel2/Philosophers GitHub Wiki

Concept

The program is a multithreading solution to the dining philosophers problem.

Situation

There is a round table, N philosophers sit around it and each of them brings a fork in its section of the table.

Each philosopher can do three things: eat, think or sleep; but to eat it must pick two forks.

The life of a philosopher consists of those three activities: to eat, to sleep and to think. From the beginning of the dinner, each philosopher has a precise time to die before which it must eat to not die. After each meal, it has again the same time to die before which it must eat to not die: its purpose is to have dinner for an infinite time. Every time it eats, it does for a precise time to eat and, after each meal, it must sleep for a precise time to sleep.

The time to die, the time to sleep and the time to eat do not change their values during the dinner, while the time to think depends on the availability of the forks. When the dinner begins, each philosopher starts its journey autonomously, without communicate or coordinate with other philosophers.

The simulation

The program takes these inputs: the number of philosophers (nop), expressed as a positive integer value; the time to die (ttd), the time to eat (tte) and the time to sleep (tts), expressed in milliseconds as positive integer values; and optionally at the user's choice the number of times each philosopher must eat (nte).

Any state change of a philosopher is formatted and displayed, according to the philosopher's situation, as follows:

tstmpm X has taken a fork
tstmpm X is eating
tstmpm X is sleeping
tstmpm X is thinking
tstmpm X died

; where tstmpm corresponds to the displayed current timestamp expressed in milliseconds and X corresponds to the displayed philosopher's identification number (a value between one and nop inclusive).

The simulation stops, displaying a log message about the event, when the temporal distance between the last meal of a philosopher and the current timestamp is between ttd and ttd + 10 milliseconds inclusive, and the philosopher is not currently eating.

Race condition

In this program, each thread executes the same code of each other thread: however, due to the different temporal contexts, they execute different code's paths, causing a nondeterministic situation, because the value of the time to think is always unexpected, undefined, conditioned by the availability of the forks.

This situation, and so the entire simulation, constitutes an inevitable (imho?) race condition. Since the access to forks is uncontrollable, i.e conditioned by the unpredictable result of scheduling in determining the order in which philosophers take forks, each philosopher will think all the time required by the scarcity and availability of the forks and its thought will be nondeterministic.

The goal of this program is to manage this race condition properly by preventing each philosopher to steal the fork picked up by another philosopher: that does not eliminate the race condition but avoids errors. As explained below, since stealing a fork from another philosopher doesn't involve write operations, this race condition is not a data race.

Data race

'Race condition' and 'data race' are two different concepts. A data race occurs when two or more threads or processes in a concurrent system access the same memory location concurrently, and at least one of the accesses is a write operation, without proper synchronization mechanisms to ensure order and consistency.

Data races can exist in a program without leading to race conditions. This scenario typically involves operations that are inherently safe to perform in parallel, due to the nature of the data access or the operations being performed. For example, if two threads are incrementing separate elements of an array and these elements are not dependent on each other, this situation can be, in some formal models, technically considered a deterministic data race. This kind of data race leads to undefined behavior under certain specifications and architectures not predisposed to atomicity for that kind of operation.

Conversely, a data race leads to a race condition when the order of execution of the concurrent operations affects the correctness of the program. Race conditions manifest as erratic behavior, where the output of the program may vary between executions even if the input remains the same. This unpredictability stems from the lack of coordination between different concurrent entities accessing the common resources, resulting in a violation of the program's intended sequence of operations. A common example is when two threads attempt to read and update a shared variable without adequate synchronization, leading to potential overwrites and lost updates. Must be noted that in 'Philosophers' the memory is not - strictly speaking - shared, because the changes of states are not finalized to communication between threads.

This program oversees only the second type of data race, preventing a philosopher to display a log on the standard output at the same time as another philosopher.

Mutex

The data race, - that does not co-imply a race condition - is managed by adopting a proper mutual exclusion device called 'mutex'.

In this program each resource accessed concurrently, with its critical section, is "protected" by a mutex. While this approach is used in the solution of both race conditions, it automatically solves only the one that involves a data race. In fact, since the expected output of logging messages displayed by threads gives the timestamp of the event logged, it is not important the order in which messages are displayed, but only that the messages are readable. The mutex protections ensures that standard output is modified by only one thread at a time, so that no message is unreadable due to shared modifications, but this occurs without any regard to the order in which messages are displayed.

However, in the case of the race condition without data race, the mutex will only prevent the undefined behavior caused by the theft of a fork, i.e. by accessing a critical section of another thread, but it will not prevent the race condition given by the variability of the time to think. This first race condition is partially solved by the mutex, because the undefined behaviour could be determined by the timing of operations, but to solve the problem of mortality, i.e. to avoid the starvation, other synchronization strategies are needed. Furthermore, the introduction of the mutex can cause another problem related to timing: the deadlock.

Starvation and deadlock

In this context, starvation can occur (1) if the mutual exclusions used to oversee the main race condition exclude one philosopher to access a resource or (2) if the mutual exclusions used to oversee the main race condition do not exclude a philosopher to access a resource, but cause a philosopher to wait for access to a resource for an amount of time greater than the time to die at a certain time of execution. In this context, a deadlock can occur when each philosopher possesses at least one fork.

Starvation and deadlock are strictly related. Starvation occurs, as said above, when the difference between the current timestamp and the timestamp of the last meal of a philosopher is greater or equal than time_to_die and the philosopher is not currently eating. To solve starvation, a sort of scheduling algorithm must be adopted through mutual exclusion devices, such as the mutex. However, in this situation a deadlock can occur because the mutual access to forks can be unsynchronized. In fact, due to the nondeterministic time_to_think confounding variable, even a well-structured algorithm can cause infinite wait for a fork, i.e. a live-lock, or unfair access to resources. Using only mutexes, there is no known completely fair solution.

With an even number of philosophers, the solution is pretty easy. Every couple of philosophers shares the forks mutually and exclusively, so the synchronization is done only between the two philosophers, without considering the eventual other couples. This prevents starvation. To avoid the deadlock, instead, the resource hierarchy solution is applied. So: if the philosopher has an odd id, it first accesses its own fork, then the fork of its partner; otherwise, the order is inverse. This solution works because the routine of each philosopher involves an internal priority and does not allow the fork with minor priority to be taken before the fork with major priority or while the fork with major priority is not possessed by the same philosopher.

When number_of_philosophers is odd, the solution is less easy. The resource hierarchy solution can imply deadlock or starvation, depending on how it is implemented. Here is an explanation of the problem.

Assuming that there are three philosophers, T1, T2 and T3 and each philosopher has a fork F1, F2 and F3. The access to fork can be circular, but to apply the hierarchy different approaches are allowed and so many combinations of this approach.

Approach 1

T1 takes F1 first and then F2; T2 takes F2 first and then F3; T3 takes F3 first and then F1.

Approach 2

T1 takes F3 first and then F1; T2 takes F1 first and then F2; T3 takes F2 first and then F3.

Approach 3

T1 takes F3 first and then F2; T2 takes F1 first and then F3; T3 takes F2 first and then F1.

These approaches can be combined in many ways, also infinite, for example using a nondeterministic switch between approaches or combining consistently part of one approach with part of another.

Due to the race condition, both deadlock and starvation can happen with all the approaches. Deadlock can happen if T1, T2 and T3 take their 'left' fork and each 'right' fork is the 'left' one of another; starvation can happen if the minimum reasonable times plus the confounding factor are passed to the program as arguments, because the ternary threading involves a greater temporal distance between each meal.

Both the deadlock and the main type of starvation, - which involves that at least one philosopher has never access to the first fork -, have a solution, which is to give to at least two threads priority over the same fork and no posteriority over it to any thread. So, for example:

1 - T1 takes F1 first and then F2; T2 takes F1 first and then F2; T3 takes F2 first and then F3;

2 - T1 takes F1 first and then F3; T2 takes F1 first and then F2; T3 takes F1 first and then F3;

3 - T1 takes F1 first and then F3; T2 takes F1 first and then F3; T3 takes F1 first and then F3.

And so on...

There are solutions to both the problems: the starvation and the unfairness, but they will not be explained in this introduction. The problems will be explained and solved in posterior sections of the Wiki. Here, it is important to note that: (1) the general problem of starvation is not strictly related to the number of threads involved: due to the confounding variable time_to_think, starvation can happen at every moment during the execution, and it will predictably happen after a while; (2) while the second type of starvation cannot be avoided, there is a way to reduce it that consists in to calculate a delay condition to apply to the algorithm by taking into account the degradation in communication between events, using some metaprogramming techniques; (3) there is a general way to make the program more fair by reducing unfairness between the third threads to synchronize differently with odd number_of_philosophers; (4) it is possible to make the resource hierarchy solution fair by applying a modified algorithm that uses a synchronization technique that allows you to switch between binary and ternary behavior between threads and uses the aforementioned degradation reversal techniques. These solutions are beyond the scope of this introduction.

In this introduction, and in the code at the state contemporary to its drafting, the algorithm for the three last threads is the number 1.

The death detection

Since there is no way to solve the second type of starvation and since there is no way to avoid time-related starvation when the input is not reasonable (strictly speaking, the problem is not testable, and so the problem has no empirical solution), death must be detected to stop the simulation when a philosopher dies, i.e. starves.

To do so, a timer is needed to check the eventual death of the philosopher. The timer is a thread that generates the philosopher and communicate with it. The timer shares a resource with a philosopher, start_time, and receives constant updates from the philosopher: start_time is updated by the philosopher by assigning it the current timestamp when it starts eating. The timer repeatedly compares the current timestamps against start_time and, if the difference between the current timestamp and start_time is greater than time_to_die - 10, it locks the standard output using a mutex, prints the message logging the death, waits for time_to_die * 4 milliseconds, and finally detaches the philosopher. Since this process will lock the standard output for time_to_die * 4 milliseconds, any other philosopher will starve attempting to access the standard output, and so any other philosopher will starve.

Three precautions are taken (1) to avoid wrong death, i.e. death while the philosopher is eating, (2) to make the philosophers die without communication with other philosophers, and (3) to not avoid wrong death if the philosopher is eating and starving after the death of another philosopher.

1 - The timer will mutually and exclusively share a flag with its philosopher, to check whether the philosopher is eating before entering the killing scope;

2 - The timer will print the death's log only if the difference between the current timestamp and start_time is minor than (time_to_die * 4) - 40;

3 - The timer will not check whether the philosopher is eating if the difference between the current timestamp and start_time is greater or equal than (time_to_die * 4) - 40.