offload_api - noma/ham GitHub Wiki
Everything you need for offloading is collected in the namespace ham::offload.
See also the pages on implicit and explicit initialisation and on the compile-time configuration.
node_t
Represents the address of a process. Usually, node 0 is the logical host, while all other processes are offload targets.
node_descriptor
This type aggregates information on nodes. Currently there are the following methods.
- const char* name() const
Returns the name of the host on which the node's process is running.
buffer_ptr<T>
A pointer to a contiguous remote buffer, e.g. memory of an offload target.
- T* get()
Returns the local memory address of the buffer (only valid on the node where the buffer was allocated).
- node_t node()
Returns the node on which the buffer was allocated.
- T& operator[](size_t index)
Array-like access to the buffer. Only usable in code that is executed on the node where the buffer was allocated, e.g. in an offloaded function call.
- buffer_ptr<T> operator+(size_t offset)
Returns a buffer_ptr to a sub-buffer beginning offset elements after this one.
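The following is a minimal sketch of these semantics, not the HAM implementation. The sketch namespace, the integer node_t, and the function third_element_via_sub_buffer are assumptions made for illustration; a plain local array stands in for remote memory, so the sketch only shows how operator+ and operator[] relate.

```cpp
#include <cstddef>

namespace sketch {

using node_t = int; // assumption: a plain integer id stands in for node_t

// Hypothetical stand-in for buffer_ptr<T>: an address plus the owning node.
template <typename T>
class buffer_ptr {
public:
    buffer_ptr(T* ptr, node_t node) : ptr_(ptr), node_(node) {}

    T* get() { return ptr_; }              // address, meaningful only on node()
    node_t node() const { return node_; }  // node that owns the memory
    T& operator[](std::size_t index) { return ptr_[index]; }

    // Sub-buffer starting `offset` elements further in, on the same node.
    buffer_ptr<T> operator+(std::size_t offset) {
        return buffer_ptr<T>(ptr_ + offset, node_);
    }

private:
    T* ptr_;
    node_t node_;
};

} // namespace sketch

// Reads element 0 of a sub-buffer that starts two elements into the storage.
int third_element_via_sub_buffer() {
    static int storage[4] = {10, 20, 30, 40};
    sketch::buffer_ptr<int> whole(storage, 1); // pretend node 1 owns the memory
    sketch::buffer_ptr<int> tail = whole + 2;  // same node, shifted start
    return tail[0];
}
```

In this sketch, index 0 of the sub-buffer refers to the same element as index 2 of the original buffer, and node() is unchanged by the offset.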
future<T>
A future represents the result of type T of an asynchronous operation, which may or may not be available yet. Note: The destructor of a valid instance finishes the protocol by calling get().
- bool test()
Returns true if the result is available and false otherwise.
- T get()
Returns the result and blocks until it is available. Do not call on an invalid future.
- bool valid() const
Returns true if the instance is valid. A future is invalid if it was not constructed from an asynchronous operation, and after its result was fetched using get().
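The life cycle described above can be sketched with a stand-in built on std::future. This is not HAM's future type: the sketch namespace, the lifecycle struct, and run_lifecycle are assumptions for illustration, and a local thread takes the place of an offload.

```cpp
#include <chrono>
#include <future>
#include <utility>

namespace sketch {

// Hypothetical stand-in for future<T>, wrapping std::future for illustration.
template <typename T>
class future {
public:
    future() = default; // invalid: not constructed from an asynchronous operation
    explicit future(std::future<T> f) : f_(std::move(f)) {}
    future(future&&) = default;

    // A still-valid instance finishes the protocol by fetching the result.
    ~future() { if (valid()) get(); }

    // Non-blocking check whether the result has arrived.
    bool test() {
        return valid() &&
               f_.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
    }
    T get() { return f_.get(); }               // blocks; valid() is false afterwards
    bool valid() const { return f_.valid(); }

private:
    std::future<T> f_;
};

} // namespace sketch

struct lifecycle {
    bool default_valid; // valid() of a default-constructed future
    bool valid_before;  // valid() before get()
    int value;          // result fetched by get()
    bool valid_after;   // valid() after get()
};

// Walks one future through its states and records what was observed.
lifecycle run_lifecycle() {
    sketch::future<int> empty;
    sketch::future<int> f(std::async(std::launch::async, [] { return 1 + 2; }));
    lifecycle r{};
    r.default_valid = empty.valid();
    r.valid_before = f.valid();
    r.value = f.get();
    r.valid_after = f.valid();
    return r;
}
```

The guard in the destructor mirrors the rule that get() must not be called on an invalid future: once the result has been fetched, destruction does nothing further.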
Note: macros do not reside in namespaces.
f2f(function_address, function_arguments...)
This macro creates a Function object from a given function address and its arguments. Use auto to store the result in a variable, or use it directly as an expression when calling async().
node_t this_node()
Returns the node where the call is executed.
size_t nodes()
Returns the number of available nodes.
future<Functor::result_type> async(node_t node, Functor&& functor)
Asynchronously offloads a functor to a given node. The node must not be the logical host. To generate a functor for a given function and a set of arguments, use f2f(). The result is a future of the functor's result_type that can be used to resynchronise on the offload. Ignoring the return value implicitly synchronises the offload by calling the future's destructor.
int add(int a, int b); // kernel
// node: a node_t naming an offload target, not the logical host
future<int> result = async(node, f2f(&add, 1, 2));
// or just
auto result = async(node, f2f(&add, 1, 2));
result.get(); // blocks until the result is available
Functor::result_type sync(node_t node, Functor&& func)
Performs a synchronous offload. Basically the same as async(), but blocks until the result is ready and returns it. Equivalent to:
async(...).get(); // call get() on returned future
// or even
async(...); // destructor of the result is called
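The relationship between the two calls can be sketched with std::async as a stand-in, with a local thread in place of an offload target. The names sketch::async, sketch::sync, and add_synchronously are assumptions for illustration and are not part of HAM; the node parameter is left out because nothing is actually offloaded here.

```cpp
#include <future>
#include <utility>

namespace sketch {

// Stand-in for async(): runs the functor on another thread, not another node.
template <typename Functor>
auto async(Functor&& functor) -> std::future<decltype(functor())> {
    return std::async(std::launch::async, std::forward<Functor>(functor));
}

// Stand-in for sync(): start the asynchronous call, then block on its result.
template <typename Functor>
auto sync(Functor&& functor) -> decltype(functor()) {
    return sketch::async(std::forward<Functor>(functor)).get();
}

} // namespace sketch

int add(int a, int b) { return a + b; } // kernel

// Runs the kernel through the synchronous stand-in and returns its result.
int add_synchronously() {
    return sketch::sync([] { return add(1, 2); });
}
```

Only the first form, calling get() on the returned future, hands the result back to the caller; relying on the destructor synchronises but discards the value.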
buffer_ptr<T> allocate(const node_t node, size_t n)
Allocates a buffer with space for n elements of type T on node node. Note: No constructors are called for the buffer elements.
void free(buffer_ptr<T> p)
Frees the buffer referenced by p. Note: No destructors are called for the buffer elements.
future<void> put(T* local_source, buffer_ptr<T>& remote_dest, size_t n)
Host to offload target data transfer. Transfers n elements of type T from a local memory location specified by local_source to a buffer specified by remote_dest. The resulting future can be used to synchronise on the operation. Ignoring the result implies synchronisation.
void put_sync(T* local_source, buffer_ptr<T>& remote_dest, size_t n)
Explicitly synchronous version of put().
future<void> get(buffer_ptr<T> remote_source, T* local_dest, size_t n)
Offload target to host data transfer. Transfers n elements of type T from a remote memory location specified by remote_source to a local buffer specified by local_dest. The resulting future can be used to synchronise on the operation. Ignoring the result implies synchronisation.
void get_sync(buffer_ptr<T> remote_source, T* local_dest, size_t n)
Explicitly synchronous version of get().
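The usual order of the memory and transfer calls (allocate, put, get, free) can be sketched with in-process stand-ins. Everything in the mock namespace is an assumption for illustration that only mirrors the signatures listed on this page; local heap memory takes the place of an offload target, and round_trip_sum is a hypothetical helper.

```cpp
#include <cstddef>
#include <cstring>
#include <new>

namespace mock {

using node_t = int; // assumption: integer ids, with 0 as the logical host

// Hypothetical stand-in for buffer_ptr<T>.
template <typename T>
struct buffer_ptr {
    T* ptr;
    node_t owner;
    T* get() { return ptr; }
    node_t node() const { return owner; }
};

// Raw storage only: no constructors run, matching the note on allocate().
template <typename T>
buffer_ptr<T> allocate(node_t node, std::size_t n) {
    return buffer_ptr<T>{static_cast<T*>(::operator new(n * sizeof(T))), node};
}

// Releases the storage without running destructors.
template <typename T>
void free(buffer_ptr<T> p) { ::operator delete(p.get()); }

// "Host to target" copy, done here as a local memcpy.
template <typename T>
void put_sync(T* local_source, buffer_ptr<T>& remote_dest, std::size_t n) {
    std::memcpy(remote_dest.get(), local_source, n * sizeof(T));
}

// "Target to host" copy, done here as a local memcpy.
template <typename T>
void get_sync(buffer_ptr<T> remote_source, T* local_dest, std::size_t n) {
    std::memcpy(local_dest, remote_source.get(), n * sizeof(T));
}

} // namespace mock

// Sends four values to the pretend target, reads them back, and sums them.
double round_trip_sum() {
    double host_in[4] = {1.0, 2.0, 3.0, 4.0};
    double host_out[4] = {0.0, 0.0, 0.0, 0.0};
    const mock::node_t target = 1; // assumption: node 1 is an offload target

    auto remote = mock::allocate<double>(target, 4);
    mock::put_sync(host_in, remote, 4);  // host -> target
    mock::get_sync(remote, host_out, 4); // target -> host
    mock::free(remote);

    double sum = 0.0;
    for (double v : host_out) sum += v;
    return sum;
}
```

With the asynchronous put() and get(), the same sequence would hold the returned futures and call get() on them before the data is used or the buffer is freed.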
void copy(buffer_ptr<T> source, buffer_ptr<T> dest, size_t n)
Experimental, asynchronous direct copy between two offload targets.
void copy_sync(buffer_ptr<T> source, buffer_ptr<T> dest, size_t n)
Experimental, synchronous direct copy between two offload targets.