Otter Serial - Otter-Taskification/otter GitHub Wiki
As part of the ExCALIBUR cross-cutting task parallelism research theme, the Otter-Serial API is designed to facilitate data-driven parallelisation of serial code. Using the API, developers may annotate regions of code to be parallelised as well as task regions, loops and synchronisation constraints. By re-compiling and linking the annotated executable to the Otter-Serial runtime library, the loop/task-based structure of the annotated code can be traced. The recorded trace data is used by PyOtter to visualise the annotated code as a task graph and to report the recommended parallelisation strategy.
This video explains the vision behind the Otter API.
Doxygen documentation is available for the Otter-Serial API and can be generated from the root of the main Otter repository with:
doxygen .config/doxygen.cfg
The documentation's home page can then be found at doxygen/html/index.html.
The API exposes functionality for tracing the structure of a serial program. Before the API can be used it must be initialised with otterTraceInitialise(), and when tracing is complete it must be finalised with otterTraceFinalise().
The following pairs of functions exist for annotating the start and end of specific code regions:
void otterThreads[Begin|End](...);
void otterTask[Begin|End](...);
void otterLoop[Begin|End]();
void otterLoopIteration[Begin|End]();
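The loop functions are not used in the Fibonacci walkthrough on this page, so the following is a minimal sketch of how a loop might be annotated. So that it compiles without an Otter installation, the four otterLoop* functions are replaced by hypothetical local stand-ins which only count calls. Their empty parameter lists follow the listing above, and `sum_squares` is an illustrative name that is not part of Otter.

```c
/* Hypothetical stand-ins for the Otter-Serial loop functions so that this
 * sketch compiles on its own. With Otter installed, remove these and
 * include <otter-serial.h> instead. */
static int loops_begun = 0, loops_ended = 0;
static int iterations_begun = 0, iterations_ended = 0;
static void otterLoopBegin(void)          { loops_begun++; }
static void otterLoopEnd(void)            { loops_ended++; }
static void otterLoopIterationBegin(void) { iterations_begun++; }
static void otterLoopIterationEnd(void)   { iterations_ended++; }

/* Sum 0^2 + 1^2 + ... + (n-1)^2, annotating the loop and each iteration. */
static long sum_squares(int n) {
    long total = 0;
    otterLoopBegin();
    for (int i = 0; i < n; i++) {
        otterLoopIterationBegin();
        total += (long)i * i;
        otterLoopIterationEnd();
    }
    otterLoopEnd();
    return total;
}
```

Each Begin call is matched by exactly one End call, and the iteration pair sits inside the loop pair. In a real program these calls would appear between otterTraceInitialise() and otterTraceFinalise().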
The otterSynchroniseTasks() function specifies a task synchronisation constraint which synchronises all child or descendant tasks of the innermost enclosing task or parallel region at that point.
The otterSynchroniseDescendantTasks[Begin|End]() functions may be used to annotate a region within which all descendant tasks of the innermost enclosing task or parallel region must be complete before execution can continue. This is analogous to the OpenMP taskgroup construct.
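As an illustration of such a region, the sketch below wraps a recursive tree walk in a descendant-synchronisation region, so that tasks created at every level of the recursion belong to it. The Otter functions are replaced by hypothetical local stand-ins which only count events. Their signatures are assumptions based on the usage shown on this page, and `visit`, `visit_all` and `MAX_NODES` are illustrative names.

```c
/* Hypothetical stand-ins so that this sketch compiles without Otter. */
#define OTTER_SRC_ARGS() __FILE__, __func__, __LINE__  /* assumed expansion */
static int open_regions = 0;     /* descendant-sync regions currently open */
static int tasks_in_region = 0;  /* tasks begun while a region was open */
static void otterTaskBegin(const char *file, const char *func, int line) {
    (void)file; (void)func; (void)line;
    if (open_regions > 0) tasks_in_region++;
}
static void otterTaskEnd(void) {}
static void otterSynchroniseDescendantTasksBegin(void) { open_regions++; }
static void otterSynchroniseDescendantTasksEnd(void)   { open_regions--; }

#define MAX_NODES 64             /* large enough for depth <= 6 */
static int visited[MAX_NODES];

/* Visit a complete binary tree; each node writes only its own slot. */
static void visit(int node, int depth) {
    if (depth == 0) return;
    visited[node] = 1;
    otterTaskBegin(OTTER_SRC_ARGS());
    visit(2 * node, depth - 1);
    otterTaskEnd();
    otterTaskBegin(OTTER_SRC_ARGS());
    visit(2 * node + 1, depth - 1);
    otterTaskEnd();
}

static int visit_all(int depth) {
    otterSynchroniseDescendantTasksBegin();
    visit(1, depth);  /* creates tasks at every level of the recursion */
    otterSynchroniseDescendantTasksEnd();
    /* Past this point all descendant tasks are required to be complete,
     * so it is safe to read every slot they wrote. */
    int nodes = 0;
    for (int i = 0; i < MAX_NODES; i++) nodes += visited[i];
    return nodes;
}
```

A single region around the top-level call is enough here because no task reads another task's result. Where a task needs its children's results immediately, a synchronisation point inside the task itself is the better fit.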
Some functions in the API require arguments specifying the location where they appear in source. The macro function OTTER_SRC_ARGS() can be inserted to substitute these arguments at compile-time.
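The general technique can be sketched with the standard C predefined identifiers. The macro `DEMO_SRC_ARGS()` and the function `demo_region_begin()` below are hypothetical and are not part of Otter. The sketch assumes the location is passed as a file name, a function name and a line number, which may differ from what OTTER_SRC_ARGS() actually expands to.

```c
/* Hypothetical macro: expands at the point of use to the current file,
 * enclosing function and line number. */
#define DEMO_SRC_ARGS() __FILE__, __func__, __LINE__

/* Most recently recorded source location. */
static const char *last_file = 0;
static const char *last_func = 0;
static int last_line = 0;

/* Hypothetical annotation function which records where it was called from. */
static void demo_region_begin(const char *file, const char *func, int line) {
    last_file = file;
    last_func = func;
    last_line = line;
}

/* Returns the line number on which the annotation below appears. */
static int annotated_function(void) {
    int expected = __LINE__ + 1;
    demo_region_begin(DEMO_SRC_ARGS());
    return expected;
}
```

Because the macro is expanded by the preprocessor where it is written, the recorded location is that of the annotation itself and needs no maintenance when the code moves.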
A stub version of the Otter-Serial header, which provides stub/mock versions of the Otter-Serial API declarations, is available here. It is intended for software which itself includes or uses Otter but does not require its own users to install Otter. This file is placed into the public domain (see the license in that file).
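The idea behind stubbing can be sketched as follows. This is not the content of the stub header itself: the build flag `USE_OTTER` and the function `triple` are hypothetical, and only a handful of API names are covered. When the flag is not defined, the annotations expand to nothing and the program builds without Otter.

```c
#ifdef USE_OTTER                 /* hypothetical flag, not defined by Otter */
#include <otter-serial.h>
#else
/* No-op stand-ins: every annotation compiles away to an empty statement. */
#define OTTER_SRC_ARGS()
#define otterTraceInitialise(...) ((void)0)
#define otterTraceFinalise()      ((void)0)
#define otterTaskBegin(...)       ((void)0)
#define otterTaskEnd()            ((void)0)
#endif

/* Annotated code is written once and builds with or without Otter. */
static int triple(int x) {
    otterTaskBegin(OTTER_SRC_ARGS());
    int result = 3 * x;
    otterTaskEnd();
    return result;
}
```

Using macros rather than empty functions means the stubbed build needs no extra library at link time.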
This page explains how to use Otter to annotate and trace the structure of a simple example program which calculates the nth Fibonacci number.
The serial program to be parallelised is:
#include <stdio.h>
#include <stdlib.h>
int fib(int n);
int main(int argc, char *argv[]) {
    if (argc != 2) {
        fprintf(stderr, "usage: %s n\n", argv[0]);
        return 1;
    }
    int n = atoi(argv[1]);
    int fibn = 0;
    // The main calculation which we'd like to parallelise
    fibn = fib(n);
    printf("f(%d) = %d\n", n, fibn);
    return 0;
}
int fib(int n) {
    if (n < 2) return n;
    int i, j;
    // Each call to fib() spawns 2 further calls
    i = fib(n-1);
    j = fib(n-2);
    // Output dependency on i and j
    return i+j;
}
The Otter API is defined in otter-serial.h, which is installed at the same time as the Otter-Serial library.
Before the API can be used it must be initialised with otterTraceInitialise(), and it must be finalised with otterTraceFinalise() immediately before the program exits. All calls to the API must occur between these initialisation and finalisation calls. The API can therefore be initialised in this way:
#include <otter-serial.h>
int main(int argc, char *argv[]) {
    // Argument parsing and the declarations of n and fibn are as in the serial program
    otterTraceInitialise(OTTER_SRC_ARGS());
    // Main body of program
    {
        fibn = fib(n);
    }
    otterTraceFinalise();
    return 0;
}
Each section of code which is a candidate for parallelisation should be wrapped by calls to otterThreads[Begin|End](), e.g.
otterTraceInitialise(OTTER_SRC_ARGS());
otterThreadsBegin(OTTER_SRC_ARGS());
{
    fibn = fib(n);
}
otterThreadsEnd();
otterTraceFinalise();
Use otterTask[Begin|End]() to indicate code which can be considered as a task, such as the first call to fib() which begins the calculation:
otterTraceInitialise(OTTER_SRC_ARGS());
otterThreadsBegin(OTTER_SRC_ARGS());
{
    otterTaskBegin(OTTER_SRC_ARGS());
    fibn = fib(n);
    otterTaskEnd();
}
otterThreadsEnd();
otterTraceFinalise();
In this example, each recursive call to fib() can be considered as a task. In addition, because there is an output dependency on i and j, each call to fib() must wait for the result of the tasks it spawns. This constraint is specified with otterSynchroniseTasks(otter_sync_children):
int fib(int n) {
    if (n < 2) return n;
    int i, j;
    otterTaskBegin(OTTER_SRC_ARGS());
    i = fib(n-1);
    otterTaskEnd();
    otterTaskBegin(OTTER_SRC_ARGS());
    j = fib(n-2);
    otterTaskEnd();
    otterSynchroniseTasks(otter_sync_children);
    return i+j;
}
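To get a feel for how many events this annotation produces, the sketch below runs the same annotated fib() against hypothetical stand-ins which simply count calls. The stand-in signatures, the `demo_sync_mode` type and the counter names are assumptions made for illustration. The counts come from these stand-ins and are not output of the Otter runtime.

```c
/* Hypothetical stand-ins so that this sketch compiles without Otter. */
#define OTTER_SRC_ARGS() __FILE__, __func__, __LINE__  /* assumed expansion */
typedef enum { otter_sync_children } demo_sync_mode;   /* assumed type */
static int tasks_begun = 0, tasks_ended = 0, sync_points = 0;
static void otterTaskBegin(const char *file, const char *func, int line) {
    (void)file; (void)func; (void)line;
    tasks_begun++;
}
static void otterTaskEnd(void) { tasks_ended++; }
static void otterSynchroniseTasks(demo_sync_mode mode) {
    (void)mode;
    sync_points++;
}

/* The annotated fib() from above, unchanged apart from being static. */
static int fib(int n) {
    if (n < 2) return n;
    int i, j;
    otterTaskBegin(OTTER_SRC_ARGS());
    i = fib(n-1);
    otterTaskEnd();
    otterTaskBegin(OTTER_SRC_ARGS());
    j = fib(n-2);
    otterTaskEnd();
    otterSynchroniseTasks(otter_sync_children);
    return i+j;
}
```

Every call with n >= 2 creates two tasks and one synchronisation point, so the number of annotated events grows as quickly as the number of recursive calls. The annotations do not change the value returned by fib().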
The annotated program fib.c can be compiled with:
clang fib.c -lotf2 -lotter-serial -o fib
Use -L/path/to/library to specify the installation directories for OTF2 and Otter-Serial if these were not installed to a standard location. The defaults are /opt/otf2/lib for OTF2 and /usr/local/lib for Otter-Serial.
Running the annotated executable will cause a trace to be generated. By default, trace files are written to trace/. If either OTF2 or Otter-Serial cannot be loaded, use the LD_LIBRARY_PATH environment variable to specify their location at runtime, e.g.:
LD_LIBRARY_PATH=/opt/otf2/lib ./fib 5
The program will report the location of the generated trace file:
OTTER_TRACE_FOLDER=trace/otter_trace.[pid]