Multi threading commands - GreycLab/gmic-community GitHub Wiki
Preface
This guide is intended for filter writers looking to speed up processing by taking advantage of multi-core CPUs. It is only a basic overview for now and will hopefully be expanded later. The parallel commands in G'MIC are a new feature, so their behaviour may change in the future.
1. -apply_parallel
------------------
If you have two or more images in the stack to be processed with the same command or set of commands, this is ideal. A simple example would be a set of images to be blurred:
#Assuming we have multiple images in the stack and
#we want the last 4 processed simultaneously...
#Without parallel:
-blur[-4--1] 5%
#With parallel:
-apply_parallel[-4--1] "-blur 5%"
Note the image list has been removed from the blur command and moved to the parallel instruction - the commands in quotes will be applied to each image in the list at the same time (limited by the number of threads), so they only "see" one image.
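Since the quoted string can hold a set of commands rather than a single one, a minimal sketch of a two-step version might look like the following. It only reuses commands that appear elsewhere in this guide, and it assumes at least four images are in the stack:

```gmic
# Assumption: at least 4 images in the stack.
# Each of the last 4 images is blurred, then mirrored, in its own thread.
# No selection is given inside the quotes because each thread
# only "sees" its own image.
-apply_parallel[-4--1] "-blur 5% -mirror x"
```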
2. -parallel
------------
If you have a set of images but want different commands run on each one at the same time, this is what you need. It is actually the instruction on which the other parallel commands are built; they exist to make common multi-thread tasks easy. Here's an example:
#Assuming we have at least two images in the stack...
#Without parallel:
-blur[0] 5%
-mirror[1] x
#With parallel:
-parallel "-blur[0] 5%","-mirror[1] x"
Note we've left out an image list selection on the -parallel command so the full image list is available to the quoted commands, but obviously you can select a list if you wish (and then the quoted commands "see" this as the image list).
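As a hedged sketch of the variant with a selection: if the quoted commands really do "see" only the selected images as their list, as described above, then the indices inside the quotes are relative to that selection. The choice of the last two images here is just an illustration:

```gmic
# Assumption: at least two images in the stack.
# Only the last two images are passed to the threads, so inside the
# quotes [0] is the second-to-last image and [1] is the last one.
-parallel[-2,-1] "-blur[0] 5%","-mirror[1] x"
```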
3. -apply_parallel_channels/overlap
-----------------------------------
Suppose you have a very slow operation that tends to be run on only a single image. How can it be sped up using multiple cores? Here we present two possible ways of dealing with it:
-apply_parallel_channels
A fairly easy workaround is to process each channel separately at the same time (so long as we aren't using commands that need multiple channels to work properly). An RGB image in G'MIC is stored with each color in a contiguous block of memory, so we can take just the red channel and work on it while the green and blue channels are processed on their own. For example:
# Assuming we have a multi-channel image in the list...
#Without parallel:
-median[0] 9
#With parallel channels:
-apply_parallel_channels[0] "-median 9"
One obvious downside to this approach is that the number of threads you can split into is limited by the number of channels (i.e. this is useless with a single-channel image, and three channels means three threads maximum regardless of how many CPU cores you have). This is where our second option comes in handy:
-apply_parallel_overlap
Another neat trick is to split the image up into sections and work on each part at the same time. This is only possible with certain types of operation, but this is what -apply_parallel_overlap does by splitting images into "tiles" (rectangular chunks).
Many image filters compute one pixel's value from the pixels surrounding it, which causes a problem at the edges of the tiles. A workaround is to have some overlap at the tile edges, where each tile contains some of the pixels from the next. This does mean some extra work because pixels are processed multiple times, but so long as the tiles are big enough the gains from multi-core outweigh it.
Here's a simple example:
#Without parallel:
-median[0] 9
#With parallel tiles:
-apply_parallel_overlap[0] "-median 9",24
The "24" is the amount of overlap in pixels - literally the number of pixels added to each tile at connected edges. The amount required varies with each command: the larger the neighbourhood of surrounding pixels the filter uses, the larger the overlap required. Too large an overlap wastes processing time, so ideally you want to tweak it to be just enough and no more. When the overlap is too small, however, you will notice "artifacts" along the lines where the tile edges were.
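To get a feel for the cost of the overlap, here is a rough estimate. It assumes an interior tile of $w \times h$ pixels that receives $p$ extra pixels on all four sides; the tile sizes below are hypothetical and are not taken from how G'MIC actually chooses its tiles. The ratio of pixels processed to pixels kept is then:

```latex
\[
  \frac{(w + 2p)(h + 2p)}{w\,h}
\]
% Example with p = 24:
\[
  w = h = 512:\quad \frac{560^2}{512^2} \approx 1.20
  \qquad\qquad
  w = h = 128:\quad \frac{176^2}{128^2} \approx 1.89
\]
```

Under these assumptions, a large tile pays roughly 20% extra work for a 24-pixel overlap, while a small tile nearly doubles its workload. This is why the overlap only pays off when the tiles are big enough.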
4. Threads and G'MIC environment
--------------------------------
There are two important things to mention about which parts of the G'MIC environment are shared and which are not, between different running threads.
All threads have the current list of images in common. As a consequence, it may be risky for a thread to deeply modify the structure of the image list (by adding or removing images, for instance) because of possible concurrent accesses. For instance, if two threads run at the same time and each adds one or several new images to the image list, you cannot assume which of the generated images will be appended first. In that case, we strongly suggest that each thread names its output images with a label (with command '-name'), so you can easily find the corresponding images afterwards.
For instance, the execution of
-parallel "-testimage2d 300 -name[-1] Thread1","-testimage2d 300 -name[-1] Thread2"
will append two images to the list of images, sometimes in the order { "Thread1","Thread2" }, sometimes in the reverse order.
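Once the threads have finished, the labels let you address each output by name instead of by position. A minimal sketch, reusing the by-name selection syntax that appears elsewhere in this guide (the follow-up commands are chosen purely for illustration):

```gmic
# The two outputs may land in either order in the list...
-parallel "-testimage2d 300 -name[-1] Thread1","-testimage2d 300 -name[-1] Thread2"
# ...so select them by label rather than by index.
-blur[Thread1] 5%
-mirror[Thread2] x
```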
Each thread has its own variable environment, so passing variable values between running threads is not possible. A thread can access the values of global variables (whose names start with an underscore '_') which have been defined before the thread creation, and can even modify those values, but they remain 'local' to each thread.
When all threads have finished their executions, all global variables used by the first thread are copied back to the mainstream environment.
For instance, the execution of
_global="mainstream"
-parallel "_global=thread1 -echo $_global","_global=thread2 -echo $_global"
-echo $_global
will finally set the global variable '_global' to 'thread1' in the mainstream environment. In the second thread, however, $_global is set to "thread2" for the whole execution of that thread.
So any required communication between running threads must be done through a named image in the image list, created before the threads are started. For instance:
(0,0) -nm[-1] com
-v - -parallel "-repeat 1e5 -=[com] $>,0 -done",\
"-repeat 1e5 -=[com] $>,1 -done",\
"-do 300,100 -text[-1] @{com,0}\",\"@{com,1},5,5,13,1,255 -w[-1] -rm[-1] -wait 200 -while 1"
-v +
5. A final note
---------------
In the examples above we've referred to images and pixels, but G'MIC doesn't make any assumptions about the content of the data. This means you can use parallel for more or less any calculation with any type of data set; so you can just as easily use the parallel instructions for generic number crunching!