Multi threading simulation - quasylab/sibilla GitHub Wiki

Simulation classes in Sibilla follow a multi-threading approach based on the Java Concurrency API. Different scheduling policies are available for executing simulation tasks, and these can be tailored to the parallelism of the hosting architecture.

A mechanism based on the Factory Method pattern is used to select the appropriate SimulationManager at runtime.

Currently, two kinds of implementation are considered:

  • SequentialSimulationManager: simulation tasks are executed in the order they are submitted; a task is started only when the previous one has finished.
  • ThreadSimulationManager: simulation tasks are executed following a multi-threading approach. Different scheduling algorithms can be used depending on the underlying ExecutorService.
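The factory-method mechanism can be illustrated with the following sketch. Note that the interface and class names below (`SimulationManager`, `SequentialManager`, `ThreadedManager`, `SimulationManagers`) are hypothetical simplifications for illustration; the actual Sibilla interfaces differ.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified sketch of the factory-method pattern used
// to select a simulation manager at runtime. Not the real Sibilla API.
interface SimulationManager {
    void run(List<Runnable> tasks) throws Exception;
}

class SequentialManager implements SimulationManager {
    // Each task starts only when the previous one has finished.
    public void run(List<Runnable> tasks) {
        for (Runnable t : tasks) {
            t.run();
        }
    }
}

class ThreadedManager implements SimulationManager {
    // Tasks are dispatched to an ExecutorService and run concurrently.
    public void run(List<Runnable> tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (Runnable t : tasks) {
            pool.submit(t);
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}

final class SimulationManagers {
    // Factory method: the concrete manager is chosen at runtime
    // from a configuration value, without changing the client code.
    static SimulationManager create(String kind) {
        if (kind.equals("sequential")) {
            return new SequentialManager();
        } else if (kind.equals("threaded")) {
            return new ThreadedManager();
        }
        throw new IllegalArgumentException("Unknown manager: " + kind);
    }
}
```

Client code only depends on the `SimulationManager` interface, so switching between sequential and multi-threaded execution requires no changes beyond the configuration value passed to the factory.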

In the following, the performance of the SEIR scenario with different parameters and simulation managers is considered.
We let the population size vary from 10 to 1000 individuals and the number of simulation runs range over 1, 10, 100, and 1000.

The experiments are conducted on a MacBook Pro with 32 GB of RAM and a 2.3 GHz 8-core Intel Core i9. The tests can be replicated using the class SimulationManagerTest.

Sequential

The time (in seconds) needed to perform the simulation runs with a SequentialSimulationManager is reported below, for each number of runs (rows) and population size (columns):

| Runs \ Population | 10 | 100 | 1000 |
|---|---|---|---|
| 1 | 0.085 | 0.030 | 0.044 |
| 10 | 0.032 | 0.034 | 0.122 |
| 100 | 0.045 | 0.113 | 0.463 |
| 1000 | 0.081 | 0.516 | 4.214 |

We can observe that, with this simulation manager, the execution time grows roughly linearly both with the number of simulation runs and with the size of the population.

ThreadSimulationManager and Fixed Thread Pool

When a ThreadSimulationManager is used, the simulation is performed following a multi-threading approach. In this case, an ExecutorService is needed to handle the scheduling of the different threads. Below, the results of the simulation with a FixedThreadPool of different sizes are reported. This is a thread pool that reuses a fixed number of threads operating off a shared unbounded queue.

We can observe that increasing the size of the pool, while staying within the number of cores of the hosting machine, improves performance.
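A fixed thread pool is created with `Executors.newFixedThreadPool`. The sketch below shows the general pattern of submitting a batch of simulation-like tasks and collecting their results; `runAll` and the dummy task body are illustrative placeholders, not part of Sibilla.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FixedPoolDemo {
    // Runs `runs` dummy simulation tasks on a pool of `poolSize` threads
    // and returns the sum of their results. invokeAll blocks until all
    // tasks have completed.
    static long runAll(int poolSize, int runs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        List<Callable<Long>> tasks = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            final long seed = i;
            tasks.add(() -> seed * seed); // stand-in for one simulation run
        }
        long total = 0;
        for (Future<Long> f : pool.invokeAll(tasks)) {
            total += f.get();
        }
        pool.shutdown();
        return total;
    }
}
```

With this executor, at most `poolSize` tasks run concurrently; the remaining tasks wait in the shared unbounded queue, which is why performance stops improving once the pool size exceeds the number of available cores.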

Thread pool size 2

| Runs \ Population | 10 | 100 | 1000 |
|---|---|---|---|
| 1 | 0.017 | 0.014 | 0.018 |
| 10 | 0.014 | 0.019 | 0.048 |
| 100 | 0.026 | 0.049 | 0.313 |
| 1000 | 0.060 | 0.352 | 2.936 |

Thread pool size 4

| Runs \ Population | 10 | 100 | 1000 |
|---|---|---|---|
| 1 | 0.014 | 0.014 | 0.019 |
| 10 | 0.015 | 0.017 | 0.035 |
| 100 | 0.022 | 0.040 | 0.224 |
| 1000 | 0.045 | 0.225 | 2.024 |

Thread pool size 6

| Runs \ Population | 10 | 100 | 1000 |
|---|---|---|---|
| 1 | 0.014 | 0.014 | 0.018 |
| 10 | 0.015 | 0.017 | 0.033 |
| 100 | 0.021 | 0.038 | 0.199 |
| 1000 | 0.041 | 0.201 | 1.807 |

Thread pool size 8

| Runs \ Population | 10 | 100 | 1000 |
|---|---|---|---|
| 1 | 0.013 | 0.015 | 0.017 |
| 10 | 0.016 | 0.017 | 0.034 |
| 100 | 0.020 | 0.036 | 0.189 |
| 1000 | 0.040 | 0.190 | 1.727 |

Thread pool size 10

| Runs \ Population | 10 | 100 | 1000 |
|---|---|---|---|
| 1 | 0.014 | 0.014 | 0.018 |
| 10 | 0.015 | 0.017 | 0.035 |
| 100 | 0.022 | 0.036 | 0.194 |
| 1000 | 0.041 | 0.198 | 1.862 |

ThreadSimulationManager and Cached Thread Pool

We can consider different executors, such as the CachedThreadPool. This is a thread pool that creates new threads as needed but reuses previously constructed threads when they are available. Such pools typically improve the performance of programs that execute many short-lived asynchronous tasks. We can observe that, for the largest configurations, this setup outperforms both the sequential manager and the ThreadSimulationManager based on a FixedThreadPool.

| Runs \ Population | 10 | 100 | 1000 |
|---|---|---|---|
| 1 | 0.017 | 0.016 | 0.022 |
| 10 | 0.018 | 0.020 | 0.037 |
| 100 | 0.033 | 0.038 | 0.178 |
| 1000 | 0.105 | 0.219 | 1.707 |
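A cached pool is obtained with `Executors.newCachedThreadPool`. The sketch below (the `CachedPoolDemo` class and `countCompleted` helper are illustrative, not part of Sibilla) submits many short-lived tasks, the workload this pool is designed for: idle threads are reused for new tasks instead of a new thread being created per task.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CachedPoolDemo {
    // Submits `tasks` short-lived tasks to a cached pool and waits for
    // them to finish. The pool grows on demand and reuses idle threads.
    static int countCompleted(int tasks) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> { done.incrementAndGet(); }); // short-lived task
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return done.get();
    }
}
```

Because the pool is unbounded, it can create many threads under bursty load; this matches the benchmark above, where it does well on large batches of runs but pays a thread-creation cost on small populations.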

ThreadSimulationManager and Work Stealing Pool

Another option is the WorkStealingPool. This is a thread pool that maintains enough threads to support the parallelism level of the hosting machine, and may use multiple queues to reduce contention. We can observe that, with this kind of executor service, the performance is not better than that obtained in the previous case with the cached thread pool, although the two results are similar.

| Runs \ Population | 10 | 100 | 1000 |
|---|---|---|---|
| 1 | 0.019 | 0.019 | 0.027 |
| 10 | 0.020 | 0.023 | 0.040 |
| 100 | 0.029 | 0.043 | 0.203 |
| 1000 | 0.072 | 0.222 | 1.922 |
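A work-stealing pool is created with `Executors.newWorkStealingPool`, which returns a pool backed by a ForkJoinPool with a parallelism level equal to the number of available processors by default. The sketch below (the `StealingPoolDemo` class and `sum` helper are illustrative placeholders) shows the same batch-submission pattern as before running on this executor.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class StealingPoolDemo {
    // Runs `runs` dummy tasks on a work-stealing pool. Each worker has
    // its own queue, and idle workers steal pending tasks from busy
    // ones, which reduces contention on a single shared queue.
    static long sum(int runs) throws Exception {
        ExecutorService pool = Executors.newWorkStealingPool();
        List<Callable<Long>> tasks = IntStream.range(0, runs)
                .mapToObj(i -> (Callable<Long>) () -> (long) i)
                .collect(Collectors.toList());
        long total = 0;
        for (Future<Long> f : pool.invokeAll(tasks)) {
            total += f.get();
        }
        return total;
    }
}
```

Work stealing pays off mainly when tasks have uneven durations or spawn subtasks; with uniform, independent simulation runs like these, its advantage over a cached or fixed pool is limited, which is consistent with the measurements above.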