The workflow execution - TrentoCrowdAI/crowdhub-api GitHub Wiki

When a workflow starts its execution, workflow-execution.delegate.js creates a run object. A run is essentially one instance of the execution of a workflow, with its own results and its own state for each block.

Graph structure

The workflow is represented as a directed graph. Each node is a block, such as a do or a lambda block, while the arcs follow the order in which the blocks are executed and determine which outputs are passed as inputs to the next block.
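As an illustration, the graph could be held as a list of blocks plus a list of arcs. This is only a sketch: the field names (id, type, label, from, to) and the helper successorsOf are assumptions made for the example, not the actual crowdhub-api schema.

```javascript
// Hypothetical workflow graph. Field names are illustrative assumptions,
// not the real crowdhub-api data model.
const workflow = {
  blocks: [
    { id: 'b1', type: 'do', label: 'classify' },
    { id: 'b2', type: 'lambda', label: 'filter' },
  ],
  arcs: [
    // The output of "classify" is passed as input to "filter".
    { from: 'b1', to: 'b2' },
  ],
};

// Find the blocks that receive the output of the given block
// by scanning the arcs that leave it.
function successorsOf(graph, blockId) {
  return graph.arcs
    .filter((arc) => arc.from === blockId)
    .map((arc) => graph.blocks.find((block) => block.id === arc.to));
}

console.log(successorsOf(workflow, 'b1').map((block) => block.label));
```

Note that every lookup here has to scan the arcs, which is the cost the preparation step described next avoids.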

The execution

Preparing the structure

First of all, the blocks of the workflow are connected to each other directly with pointers, so there is no need to search for the associated arc and then for the block it connects to. In this way, from a block we can directly access the blocks that follow it and the ones that precede it.
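A minimal sketch of this linking step, assuming blocks carry an id and arcs carry from/to ids. The function name linkBlocks and the parents/children property names are hypothetical.

```javascript
// Hypothetical block and arc shapes; the field names are assumptions.
const blocks = [
  { id: 'b1', type: 'lambda', label: 'prepare' },
  { id: 'b2', type: 'lambda', label: 'classify' },
  { id: 'b3', type: 'lambda', label: 'aggregate' },
];
const arcs = [
  { from: 'b1', to: 'b2' },
  { from: 'b2', to: 'b3' },
];

// Replace arc lookups with direct references between block objects.
function linkBlocks(blocks, arcs) {
  const byId = new Map();
  for (const block of blocks) {
    block.parents = [];
    block.children = [];
    byId.set(block.id, block);
  }
  for (const arc of arcs) {
    const parent = byId.get(arc.from);
    const child = byId.get(arc.to);
    parent.children.push(child); // successive block
    child.parents.push(parent); // previous block
  }
  return blocks;
}

linkBlocks(blocks, arcs);
console.log(blocks[1].parents.map((block) => block.label));
console.log(blocks[1].children.map((block) => block.label));
```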

The do blocks are processed differently from the others. Each one is divided into two separate blocks:

The do-publish block publishes a job on the specified platform with the given parameters
The do-wait block checks whether a job is finished and, if so, retrieves the result from the crowdsourcing platform

These two blocks are directly connected in the workflow graph.
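The split could look roughly like the sketch below. It assumes that arcs entering a do block are redirected to its do-publish half and arcs leaving it start from its do-wait half; that rewiring, the id suffixes and the name splitDoBlocks are assumptions for illustration and are not confirmed by the source.

```javascript
// Split every "do" block into a "do-publish" and a "do-wait" block.
// The id suffixes and the arc rewiring are illustrative assumptions.
function splitDoBlocks(blocks, arcs) {
  const doIds = new Set(
    blocks.filter((block) => block.type === 'do').map((block) => block.id)
  );
  const newBlocks = [];
  const newArcs = [];

  for (const block of blocks) {
    if (!doIds.has(block.id)) {
      newBlocks.push(block);
      continue;
    }
    const publish = { ...block, id: `${block.id}-publish`, type: 'do-publish' };
    const wait = { ...block, id: `${block.id}-wait`, type: 'do-wait' };
    newBlocks.push(publish, wait);
    // The two halves are directly connected in the graph.
    newArcs.push({ from: publish.id, to: wait.id });
  }

  for (const arc of arcs) {
    newArcs.push({
      // Arcs leaving a do block now leave its do-wait half.
      from: doIds.has(arc.from) ? `${arc.from}-wait` : arc.from,
      // Arcs entering a do block now enter its do-publish half.
      to: doIds.has(arc.to) ? `${arc.to}-publish` : arc.to,
    });
  }
  return { blocks: newBlocks, arcs: newArcs };
}

const split = splitDoBlocks(
  [
    { id: 'b1', type: 'do', label: 'classify' },
    { id: 'b2', type: 'lambda', label: 'filter' },
  ],
  [{ from: 'b1', to: 'b2' }]
);
console.log(split.blocks.map((block) => block.id));
console.log(split.arcs);
```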

Prepare the execution

A new run is created at this point and the items of the project are retrieved from the database. We are ready to start the execution of the workflow.

Execute blocks

First we search for the blocks that have no parents, which are the first to be executed. We run them asynchronously without awaiting the end of their execution, but we save the promise in the block object in order to use it later.
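A small sketch of this step, assuming each block already holds a parents array. The names startRootBlocks, runBlock and the promise property are hypothetical.

```javascript
// Start every block without parents and keep its promise on the block.
// runBlock is a hypothetical async function that executes one block.
function startRootBlocks(blocks, runBlock) {
  const roots = blocks.filter((block) => block.parents.length === 0);
  for (const block of roots) {
    // Deliberately not awaited: the promise is stored for later use.
    block.promise = runBlock(block);
  }
  return roots;
}

const prepare = { label: 'prepare', parents: [] };
const classify = { label: 'classify', parents: [prepare] };

const started = startRootBlocks(
  [prepare, classify],
  async (block) => `${block.label} done`
);
console.log(started.map((block) => block.label));
```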

When a block is started it has to retrieve its inputs, so it checks its parent blocks and collects their outputs in an object that uses the label of each parent block as the property name. In this way we can access a specific input of a block through the name of the block that generated it. If a block has no parents, the items of the project are given as its inputs.

If a parent of the block has not finished, the function exits and the block will be started later by another parent.
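The input collection described above can be sketched as follows. The property names finished, output and label, as well as the function name collectInputs and the null return value for "not ready yet", are assumptions for the example.

```javascript
// Build the inputs of a block from the outputs of its parents.
// Returns null when some parent has not finished yet (hypothetical convention).
function collectInputs(block, projectItems) {
  if (block.parents.length === 0) {
    // A block without parents receives the items of the project.
    return projectItems;
  }
  if (block.parents.some((parent) => !parent.finished)) {
    return null;
  }
  const inputs = {};
  for (const parent of block.parents) {
    // The label of the parent block is used as the property name.
    inputs[parent.label] = parent.output;
  }
  return inputs;
}

const items = [{ id: 1 }, { id: 2 }];
const source = { label: 'prepare', parents: [], finished: true, output: ['a', 'b'] };
const pending = { label: 'slow', parents: [], finished: false, output: undefined };
const ready = { label: 'filter', parents: [source] };
const blocked = { label: 'merge', parents: [source, pending] };

console.log(collectInputs(source, items));
console.log(collectInputs(ready, items));
console.log(collectInputs(blocked, items));
```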

The workflow-executor now checks whether the toCache property of the block is set to true and whether a cached value of the block from a previous run is stored in the db. If both conditions hold, the output value of the block is simply retrieved from the db.

If these conditions are not satisfied, the block is executed by calling the js file specific to that block.

Now the output value of the block is stored in the db as a cache entity and the block is flagged as finished.

Once it has finished its execution, a block tries to start all its children, that is, all the blocks that follow it in the execution flow.
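Putting the steps of this section together, one possible shape of the per-block execution is sketched below. It is not the actual workflow-executor code: the in-memory cacheStore stands in for the db, handlers stands in for the block-specific js files, and the started flag (used so that a block with several parents runs only once) is an added assumption.

```javascript
// In-memory stand-in for the cache stored in the db (assumption).
const cacheStore = new Map();

// handlers maps a block type to an async function, standing in for
// the js file specific to each block (assumption).
async function executeBlock(block, projectItems, handlers) {
  if (block.started) return;
  // Exit if a parent is not finished; another parent will retry later.
  if (block.parents.some((parent) => !parent.finished)) return;
  block.started = true;

  const inputs =
    block.parents.length === 0
      ? projectItems
      : Object.fromEntries(block.parents.map((p) => [p.label, p.output]));

  if (block.toCache && cacheStore.has(block.id)) {
    // Reuse the value cached by a previous run.
    block.output = cacheStore.get(block.id);
  } else {
    block.output = await handlers[block.type](inputs);
    cacheStore.set(block.id, block.output);
  }
  block.finished = true;

  // A finished block tries to start all of its children.
  await Promise.all(
    block.children.map((child) => executeBlock(child, projectItems, handlers))
  );
}

// Tiny demo graph: "doubler" feeds "total".
const doubler = { id: 'a', type: 'double', label: 'doubler', parents: [], children: [] };
const total = { id: 'b', type: 'sum', label: 'total', parents: [doubler], children: [] };
doubler.children.push(total);

const handlers = {
  double: async (items) => items.map((x) => x * 2),
  sum: async (inputs) => inputs.doubler.reduce((acc, x) => acc + x, 0),
};

const done = executeBlock(doubler, [1, 2, 3], handlers).then(() => {
  console.log(total.output);
});
```

Because the readiness check and the started flag are set synchronously before the first await, two parents finishing close together cannot both run the same child in this sketch.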

Check the state of a run

The run has a property for each block that specifies the state of the block (not started, running, finished) and the id of the cache entry related to the block, which is useful for accessing the output of the block.
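For illustration, a run object along these lines might look like the sketch below. The exact property names (blocks, state, cacheId), the ids and the helper isRunFinished are assumptions; only the three states and the presence of a cache id per block come from the description above.

```javascript
// The three block states named in the text.
const BlockState = Object.freeze({
  NOT_STARTED: 'not started',
  RUNNING: 'running',
  FINISHED: 'finished',
});

// Hypothetical run object: one entry per block, keyed by block id.
const run = {
  id: 'run-1',
  blocks: {
    b1: { state: BlockState.FINISHED, cacheId: 'cache-1' },
    b2: { state: BlockState.RUNNING, cacheId: null },
    b3: { state: BlockState.NOT_STARTED, cacheId: null },
  },
};

// A run is complete once every block has finished.
function isRunFinished(run) {
  return Object.values(run.blocks).every(
    (block) => block.state === BlockState.FINISHED
  );
}

// Ids of the cache entries that already hold a block output.
function availableCacheIds(run) {
  return Object.values(run.blocks)
    .filter((block) => block.state === BlockState.FINISHED)
    .map((block) => block.cacheId);
}

console.log(isRunFinished(run));
console.log(availableCacheIds(run));
```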