offload_api - noma/ham GitHub Wiki

HAM-Offload API documentation

Everything you need for offloading is collected in the namespace ham::offload. See also the pages on implicit and explicit initialisation and on compile-time configuration.

Types


node_t
Represents the address of a process. Usually, node 0 is the logical host, while all other processes are offload targets.


node_descriptor
This type aggregates information on a node. Currently it provides the following method.

  • const char* name() const
    Returns the name of the host on which the node's process is running.

buffer_ptr<T>
A pointer to a contiguous remote buffer, e.g. memory of an offload target.

  • T* get()
    Returns the local memory address of the buffer. The address is only valid on the node where the buffer was allocated.
  • node_t node()
    Returns the node on which the buffer was allocated.
  • T& operator[](size_t index)
    Array-like access to the buffer. Only usable in code that is executed on the node where the buffer was allocated, e.g. in an offloaded function call.
  • buffer_ptr<T> operator+(size_t offset)
    Returns a buffer_ptr to a sub-buffer that begins offset elements after the start of this one.
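
To make the semantics listed above concrete, here is a minimal single-process model of such a handle. The names toy_buffer_ptr and toy_node_t are invented for illustration and the code is not HAM's implementation. It only sketches the idea of a handle that pairs an address with its owning node and supports offset arithmetic.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for illustration only; not HAM's types.
using toy_node_t = int;

template <typename T>
class toy_buffer_ptr {
public:
    toy_buffer_ptr(T* ptr, toy_node_t node) : ptr_(ptr), node_(node) {}

    // Address of the buffer; only meaningful on the owning node.
    T* get() { return ptr_; }

    // Node on which the buffer was allocated.
    toy_node_t node() const { return node_; }

    // Array-like access; only valid where the memory actually lives.
    T& operator[](std::size_t index) { return ptr_[index]; }

    // Sub-buffer starting `offset` elements further in, on the same node.
    toy_buffer_ptr operator+(std::size_t offset) const {
        return toy_buffer_ptr(ptr_ + offset, node_);
    }

private:
    T* ptr_;
    toy_node_t node_;
};

// Builds a handle over local storage and reads through a sub-buffer.
inline int second_half_first_element() {
    static int storage[4] = {10, 20, 30, 40};
    toy_buffer_ptr<int> whole(storage, 1);
    toy_buffer_ptr<int> half = whole + 2;   // begins at storage[2]
    assert(half.node() == whole.node());    // offsetting keeps the node
    return half[0];
}
```

In this model the offset handle refers to the same node as the original, which is why a sub-buffer can be passed to an offloaded function in place of the full buffer.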

future<T>
A future represents the result of type T of an asynchronous operation, which may or may not be available yet. Note: The destructor of a valid instance finishes the protocol by calling get().

  • bool test()
    Returns true if the result is available and false otherwise.
  • T get()
    Returns the result and blocks until it is available. Do not call on an invalid future.
  • bool valid() const
    Returns true if the instance is valid. A future is invalid if it was not constructed from an asynchronous operation, or after its result has been fetched using get().
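
The life cycle described above closely resembles that of std::future in the C++ standard library, so it can be tried out without HAM. The following sketch is an analogy only: it uses std::async and std::future, not ham::offload::future, and the helper is_ready is a hypothetical stand-in for test().

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Analogy using the C++ standard library, not HAM's future<T>.
inline int add(int a, int b) { return a + b; }

// Non-blocking readiness check, comparable in spirit to test().
template <typename T>
bool is_ready(std::future<T>& f) {
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}

inline int future_demo() {
    std::future<int> empty;           // not created by an async operation
    assert(!empty.valid());

    std::future<int> result = std::async(std::launch::async, add, 1, 2);
    assert(result.valid());           // created by an async operation

    while (!is_ready(result)) {
        std::this_thread::yield();    // other work could overlap here
    }

    int value = result.get();         // blocks until ready, returns the result
    assert(!result.valid());          // the result has been fetched
    return value;
}
```

The polling loop marks the place where host-side work can overlap with the pending operation before get() is finally called.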

Macros

Note: macros do not reside in namespaces.


f2f(function_address, function_arguments...)
This macro creates a function object from a given function and its arguments. Use auto to declare a variable that holds the result, or use the macro directly as an expression when calling async().
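
The following sketch illustrates the general idea of packaging a function address together with its arguments into a callable object that exposes a result_type. All names (toy_function, make_toy_function, TOY_F2F) are hypothetical and HAM's actual f2f machinery may be built differently. The sketch assumes a C++14 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical sketch; not HAM's implementation of f2f.
template <typename F, F Fn, typename... Args>
class toy_function {
public:
    using result_type = decltype(Fn(std::declval<Args>()...));

    explicit toy_function(Args... args) : args_(std::move(args)...) {}

    // Calls the stored function with the stored arguments.
    result_type operator()() const {
        return call(std::index_sequence_for<Args...>{});
    }

private:
    template <std::size_t... I>
    result_type call(std::index_sequence<I...>) const {
        return Fn(std::get<I>(args_)...);
    }

    std::tuple<Args...> args_;
};

template <typename F, F Fn, typename... Args>
toy_function<F, Fn, typename std::decay<Args>::type...>
make_toy_function(Args&&... args) {
    return toy_function<F, Fn, typename std::decay<Args>::type...>(
        std::forward<Args>(args)...);
}

// The function address becomes part of the type; the arguments are stored.
#define TOY_F2F(fn, ...) make_toy_function<decltype(fn), fn>(__VA_ARGS__)

inline int add(int a, int b) { return a + b; }

inline int f2f_demo() {
    auto functor = TOY_F2F(&add, 1, 2);   // nothing is executed yet
    static_assert(std::is_same<decltype(functor)::result_type, int>::value,
                  "result_type follows the function's return type");
    return functor();                     // the call happens here
}
```

Separating construction from invocation is what allows such an object to be created on one node and executed on another.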


Functions

Utility


node_t this_node()
Returns the node where the call is executed.


size_t nodes()
Returns the number of available nodes.
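
A common use of the node count is to enumerate the offload targets. The helper below is a hypothetical sketch that takes the count as a parameter, standing in for the value nodes() would return, and assumes the usual convention that node 0 is the logical host.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: node 0 is the logical host, all other nodes are targets.
inline std::vector<int> offload_targets(std::size_t node_count) {
    std::vector<int> targets;
    for (std::size_t n = 1; n < node_count; ++n) {
        targets.push_back(static_cast<int>(n));
    }
    return targets;
}
```

With four nodes this yields the targets 1, 2 and 3, and with a single node the list is empty, so there is nothing to offload to.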


Offload Kernels


future<Functor::result_type> async(node_t node, Functor&& functor)
Asynchronously offloads a functor to a given node. The node must not be the logical host. To generate a functor for a given function and a set of arguments, use f2f(). The result is a future of the functor's result_type that can be used to resynchronise with the offload. Ignoring the return value implicitly synchronises the offload, because the future's destructor is called.

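int add(int a, int b); // kernel
node_t target = 1; // any node other than the logical host
future<int> result = async(target, f2f(&add, 1, 2));
// or just
auto result = async(target, f2f(&add, 1, 2));
int sum = result.get(); // blocks until the result is available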

Functor::result_type sync(node_t node, Functor&& func)
Performs a synchronous offload. Basically the same as async(), but blocks until the result is ready, and returns it. Equivalent to:

async(...).get(); // call get() on returned future
// or even
async(...); // the discarded future's destructor calls get()

Memory


buffer_ptr<T> allocate(const node_t node, size_t n)
Allocate a buffer with space for n elements of type T on node node. Note: No constructors are called for the buffer elements.


void free(buffer_ptr<T> p)
Free the buffer referenced by p. Note: No destructors are called for the buffer elements.
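
For element types with non-trivial constructors or destructors, the two notes above mean that object lifetime has to be managed by hand. The following plain C++ sketch shows the general technique on ordinary local memory, using placement new and explicit destructor calls. The type counted is hypothetical. On a buffer obtained from allocate(), the equivalent steps would presumably have to run on the node that owns the memory, e.g. inside an offloaded function.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical element type that counts its live instances.
struct counted {
    static int live;
    int value;
    explicit counted(int v) : value(v) { ++live; }
    ~counted() { --live; }
};
int counted::live = 0;

// Returns the number of live objects after explicit construction.
inline int construct_and_destroy(std::size_t n) {
    // Raw, uninitialised storage: no constructor has run yet.
    void* raw = ::operator new(n * sizeof(counted));
    counted* elems = static_cast<counted*>(raw);
    assert(counted::live == 0);

    // Construct each element explicitly with placement new.
    for (std::size_t i = 0; i < n; ++i) {
        new (elems + i) counted(static_cast<int>(i));
    }
    int live_after_construct = counted::live;

    // Destroy each element explicitly before releasing the storage.
    for (std::size_t i = 0; i < n; ++i) {
        elems[i].~counted();
    }
    ::operator delete(raw);
    assert(counted::live == 0);
    return live_after_construct;
}
```

For trivially constructible types such as int or double none of this is needed, which is the common case for numerical kernels.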


future<void> put(T* local_source, buffer_ptr<T>& remote_dest, size_t n)
Host to offload target data transfer. Transfers n elements of type T from a local memory location specified by local_source to a buffer specified by remote_dest. The resulting future can be used to synchronise on the operation. Ignoring the result implies synchronisation.


void put_sync(T* local_source, buffer_ptr<T>& remote_dest, size_t n)
Explicitly synchronous version of put().


future<void> get(buffer_ptr<T> remote_source, T* local_dest, size_t n)
Offload target to host data transfer. Transfers n elements of type T from a remote memory location specified by remote_source to a local buffer specified by local_dest. The resulting future can be used to synchronise on the operation. Ignoring the result implies synchronisation.


void get_sync(buffer_ptr<T> remote_source, T* local_dest, size_t n)
Explicitly synchronous version of get().
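
The typical order of the memory operations is allocate, put, offload, get, free. The sketch below models that order in a single process. Everything in the namespace toy is a hypothetical stand-in that merely mimics the signatures listed above with malloc and memcpy. It is not HAM code and it only handles trivially copyable element types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Single-process stand-ins that mirror the call order only; not HAM code.
namespace toy {

template <typename T>
struct buffer {
    T* ptr;
    int node;
};

// Raw storage for n elements; no constructors run.
template <typename T>
buffer<T> allocate(int node, std::size_t n) {
    return buffer<T>{static_cast<T*>(std::malloc(n * sizeof(T))), node};
}

// Releases the storage; no destructors run.
template <typename T>
void free(buffer<T> b) { std::free(b.ptr); }

// Models the host-to-target transfer with a plain memcpy.
template <typename T>
void put_sync(T* local_source, buffer<T>& remote_dest, std::size_t n) {
    std::memcpy(remote_dest.ptr, local_source, n * sizeof(T));
}

// Models the target-to-host transfer with a plain memcpy.
template <typename T>
void get_sync(buffer<T> remote_source, T* local_dest, std::size_t n) {
    std::memcpy(local_dest, remote_source.ptr, n * sizeof(T));
}

}  // namespace toy

inline std::vector<int> round_trip() {
    const std::size_t n = 4;
    std::vector<int> input = {1, 2, 3, 4};
    std::vector<int> output(n, 0);

    toy::buffer<int> remote = toy::allocate<int>(1, n);
    toy::put_sync(input.data(), remote, n);

    // Stands in for an offloaded kernel working on the remote buffer.
    for (std::size_t i = 0; i < n; ++i) remote.ptr[i] *= 2;

    toy::get_sync(remote, output.data(), n);
    toy::free(remote);
    return output;
}
```

With the asynchronous variants the same sequence applies, except that each transfer returns a future that has to be synchronised before the data is used.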


void copy(buffer_ptr<T> source, buffer_ptr<T> dest, size_t n)
Experimental, asynchronous direct copy between two offload targets.


void copy_sync(buffer_ptr<T> source, buffer_ptr<T> dest, size_t n)
Experimental, synchronous direct copy between two offload targets.

