Concepts - salmito/leda GitHub Wiki

Basic Concepts

What is Leda?

Leda is a reference implementation intended to add support for multi threaded processing based on [SEDA] 1 principles to the Lua programming language.

It provides Lua interfaces to explicitly define the event workflow of an application and exposes runtime statistics of its internal structures making easy to see where is the bottleneck.

Each workflow's step is defined by a Stage and is implemented by an event handler. Each event is tied to a stage through generic interfaces called Connectors. Stages implement part of the application logic and decides which are the next events fired from there. Thus, workflow is defined as a stage graph which are bonded together through connectors.

Stages

The staged concurrency model introduces a modular approach for the design of concurrent systems combining events and threads. In this approach, processing tasks are divided into a series of self-contained modules (stages) connected by event queues. The concept of stages was introduced by the work on SEDA.

Each stage has its own event handler – which implements the stage functionality – and a thread pool. Threads in the pool remove events from stage's input queue and execute parallel instances of its event handler. The processing event may generate one or more events which are dispatched through output ports bound, in a later step, to asynchronous channels (connectors) that connect communicating stages by delivering event to the appropriate event queue. The stage's event-handler runs on a user-level thread abstraction until the processing is completed. The figure bellow illustrates the basic components of a stage.

Systems using the staged approach can explore its modularity to tune task scheduling and resource sharing. Scheduling parameters and concurrency level of each stage are adjusted on runtime by an external entity called controller, thus allowing application control the level of concurrency at a finer grain and employ appropriate scheduling policies for specific characteristics of each stage. Stages can also be convenient barriers to control access to memory or other resources.

In Leda, an application is composed of a set of stages. A stage consists of a:

  • Event handler
  • A user-level thread pool
  • An event queue
  • Collection of output ports

Each user-level thread has a local memory, which contains the stage's instructions and its private data.

Connectors

Stages interact by sending messages through channels called connectors.

A connector is a message-passing interface that connects one stage's output port with another stage's event queue. Data values appear in the destination event queue in the same order in which they are placed in the connector.

  • A stage can send local data values to other stages via output ports, which are delivered by the connector bound to it.
  • A stage receives data values from other stages only via its event queue.

After binding stages through connectors, the application developer must verify the communication patterns between stages and determine whether the corresponding connectors may carry information that cannot be exchanged between different OS processes (such as file or socket descriptors). Such connectors are then marked as local.

There are two types of connectors:

  • Decoupled connectors: This type of connector can only send events that can be serialized (such as primitive Lua values).
  • Local connectors: This type of connector may send events with resources that cannot be serialized, such as file descriptors and memory pointers, thus both communicating stages must run locally.

Stage Graph

The computations in the Leda's multitasking model can be viewed as a stage-dependency diagram, where dependencies result from communications between two stages. It requires the definition of a stage-interaction graph, which also captures other interactions between stages such as data sharing, thus expressing a task-level parallelism for individual stages. The figure below depicts a example of a stage-interaction graph.

In this model, a stream of data is passed through a succession of stages, each of which performs some task on it. The stage-interaction graph exposes a stream-level parallelism while acting as a pipeline. With the exception of the process initiating the work for the pipeline, the arrival of new data triggers the execution of a new task by a stage in the pipeline. Each stage can be viewed as a consumer of the data items produced by the stage preceding it (or, alternatively, each stage in the pipeline can be viewed as a producer of data for the stage following it). Therefore, the pipeline is a chain of producers and consumers.

The pipeline does not need to be a linear chain. Instead, it is a directed graph. Stages could form pipelines in form of:

  • Linear or multidimensional arrays
  • Trees
  • General graphs with or without cycles

Possible stage-interaction graph topologies:

Clustering and Mapping

The Leda's multitasking model is closely related to Ian Foster's Task/Channel Model described in his book [Designing and Building Parallel Programs] (http://www.mcs.anl.gov/~itf/dbpp/).

Task/channel model of Ian Foster describes an algorithm design methodology for a general style of computation; i.e., a computational model. This methodology structures the design process as four distinct phases: partitioning, communication, agglomeration, and mapping. It encourages the development of scalable algorithms by delaying machine-dependent considerations until the later steps.

The first two steps look for parallelism in the problem. However, the design obtained at this point probably doesn't map well onto a real machine. If the number of tasks greatly exceed the number of processors, the overhead will be strongly affected by how the tasks are assigned to the processors. How must we combine tasks in order to map them effectively onto processors?

In the third step of the application design, analog to Foster's agglomeration, the stages in the application graph are partitioned into clusters, which will be mapped to independent execution units. This partition must comply with the communication patterns defined before, i.e., stages communicating through local connectors must reside in the same cluster. This requirement allows the programmer to rely on the guarantee that the communicating stages may freely exchange process-related information.

The fourth and final step, mapping, occurs at the moment of running the application, when each cluster is mapped to a Leda OS process running on an arbitrary host. The fact that this mapping is delayed to this step allows the developer to run the same partitioned graph over different distribution scenarios.