SO 5.7 Tutorials Under The Hood - Stiffstream/sobjectizer GitHub Wiki
In this document, we'll try to look under the SObjectizer's hood and see from which parts SObjectizer consists and how it works.
We'll start from that thing as SObjectizer Environment (or SOEnv for short). SOEnv is a container inside that all SObjectizer-related entities are created and are working: agents, coops, dispatchers, message boxes, timers, and so on. Like on that picture:
To start work with SObjectizer it is necessary to create and run an instance of SOEnv. For example, in that code, a developer creates an instance of SOEnv manually as an instance of so_5::wrapped_env_t
class:
int main() {
so_5::wrapped_env_t sobj{...};
... // Some application logic.
return 0;
}
This instance starts working automatically and finishes its work when an object of type so_5::wrapped_env_t
will be destroyed.
It is possible to have several instances of SOEnv in an application. All allows to run several separate SObjectizer's instances in one program:
int main() {
so_5::wrapped_env_t first_soenv{...};
so_5::wrapped_env_t second_soenv{...};
...
so_5::wrapped_env_t another_soenv{...};
... // Some application logic.
return 0;
}
It allows to get the following picture:
SOEnv not only holds various things like coops and dispatchers, but it also controls them.
For examples, at the start of SOEnv the following entities should be run:
- the timer that will handle all delayed and periodic messages;
- the default dispatcher for agents those won't be bound to any specific dispatchers.
And all those entities should be stopped at SObjectizer's shutdown.
When a user wants to add agents to a SOEnv, the SOEnv should perform the registrations of a coop with new agents. And when a user wants to remove agents the SOEnv should perform the deregistration of the coop.
SOEnv holds a repository of registered coops. During the registration and deregistration procedures, SOEnv modifies that repository.
So, as a resume:
- SOEnv owns such entities as the timer, the default dispatcher and coops;
- SOEnv starts the timer and the default dispatchers;
- SOEnv performs the registration and the deregistration of coops;
- at the shutdown SOEnv deregister all live coops, then stops the default dispatcher, and then stops the timer.
There is an interesting thing inside a SOEnv that makes SOEnv more complex than it seems. It is SObjectizer Environment Infrastructure (or env_infrastructure for short). To explain what is it and what it for it is necessary to tell about some conditions in which SObjectizer was used sometimes.
When SObjectizer-5 was born, SOEnv used multithreading for everything: the timer was implemented as a separated thread; there was a separate thread for the deregistration of coops; the default dispatcher was a separate thread too.
Because SObjectizer is intended for simplification of the development of multithreaded applications it was considered (and is considered now) as an appropriate approach.
But with time SObjectizer was used in conditions, we didn't think before. And usage of multiple threads inside SOEnv was too expensive for such conditions.
As an example: a small application that sleeps most of the time. It awakens periodically, checks the presence of some information, handles this information if it present and then sends that information to some destination. All the work can be done on a single worker thread. We wanted that application been as lightweight as possible, because of that we wanted to avoid the creation of any additional worker threads.
Another example: a small application that works with the network via Asio, but the application logic is expressed via SObjectizer agents. We wanted to share the working context between Asio and SObjectizer. We also wanted to avoid the duplication of features: if Asio supports timers then there is no need to run another timer thread inside SObjectizer. It is better to teach SObjectizer to use Asio timers for serving delayed and periodic messages.
To allow usage of SObjectizer in such specific conditions we implemented such thing as env_infrastructure. In C++ the env_infrastructure is an interface with some methods. An object that implements that interface is created at the start of a SOEnv, and then SOEnv uses that object performing its work.
There are several different implementations of env_infrastructure in SObjectizer: ordinary multithreaded one; single threaded that is not thread-safe; single threaded that is thread-safe. Plus there are a couple of env_infrastructure implementations in so5extra project: they are based on Asio, one thread-safe, but the second not.
If a user wants he/she can implement own env_infrastructure, but it is not an easy task. And we, developers of SObjectizer, can't guarantee that your env_infrastructure implementation will be compatible with future releases of SObjectizer. It is because env_infrastructure is integrated with SObjectizer's internals so tight.
Working with SObjectizer a developer usually uses the following things:
- agents, where the application logic is implemented (or part of that logic);
- dispatchers, that schedules execution of agent's events;
- messages and message-boxes, those used for information interchange between agents.
In that section, we'll speak about agents and dispatchers. And in the next one, we'll speak about message-boxes.
Let's start from dispatchers because the understanding of dispatcher allows us to understand the vinaigrette of agents, coops, and disp_binders.
The key moment in SObjectizer is the fact that SObjectizer delivers messages to agents by itself. There is no need to call some receive
method to receive a message. Instead, an agent creates a subscription to a message. And when the appropriate message is sent it will be delivered to the subscribed agent by calling appropriate agent's method.
However, there is the main question: where the method of agent-receiver is called? On the context of which worker thread does agent handle subscribed messages?
A dispatcher is a thing that provides a working context for agents. Roughly speaking a dispatcher owns one or more worker threads on those calls of agent's methods are performed.
There are eight ready to use dispatcher types in SObjectizer: from very simple ones (like one_thread or thread_pool) to much more sophisticated (like adv_thread_pool or prio_dedicated_threads::one_per_prio). A developer can create as many dispatchers as he/she wants.
Let's assume that we have to write an application that acquires some information from attached devices, handles that information somehow, stores it into a DB, broadcasts that information to the outside world via some MQ-broker. And the interaction with attached devices will be synchronous, data processing can be rather complex and multilevel.
We can create a separate one_thread dispatcher to work with every attached device. In that case, all operations with the device will be performed on a separate thread and blocking of that thread won't have an influence on other threads. Another one_thread dispatcher can be created for working with a DB. And thread_pool dispatcher can be used for all other tasks:
It means that if a developer chooses SObjectizer as a tool then it is one of the major tasks of the developer to create a required set of dispatchers and bind agents to those dispatchers.
An agent has no need to check the presence of incoming messages. The dispatcher to that the agent is bound will call the agent when a new message for that agent arrives.
But there is a question: how will a dispatcher know that some message is sent to an agent?
It is a good question because in "the classical" Actor Model every actor has own message queue. In first versions of SObjectizer-5 we used the same approach: every agent had own message queue. When a message was sent to an agent it was stored in the agent's queue and agent's dispatcher received a special notification about the presence of a new message for the agent. So every "send" operation required the update of two queues: agent's event queue and dispatcher's notification queue.
This scheme had some benefits but all of them was ruled out by a huge drawback: it was inefficient. Because of that, we refused from agent's queues quickly. Now a dispatcher owns a message queue, not to an agent.
There is very simple logic: if a dispatcher decided where and when an agent handles its messages then let the dispatcher owns the message queue. So now we have the following picture:
An event_queue is a binding point between an agent and a dispatcher. An event_queue is an object with a specific interface, that object pushes an agent's message to the dispatcher's demands queue.
A dispatcher owns an event_queue object. A dispatcher determines the implementation of an event_queue, how much event_queue objects will be created, will an event_queue object be unique for every agent or several agents will work with the same event_queue objects, and so on.
An agent isn't bound to a dispatcher initially, the binding is done later, as a part of an agent's registration procedure. After binding to a dispatcher the agent receives and stores a reference to an event_queue. When a message is sent to the agent the message will be passed to that event_queue instance and event_queue takes responsibility for pushing that message to the dispatcher's queue.
However, there are several moments in an agent's life where the agent hasn't a reference to an event_queue. The first moment is the time between the creation of an agent and the binding of the agent to a dispatcher. The second moment is the time in the deregistration procedure when the agent is already unbound from the dispatcher but not destroyed yet. If a message is sent to an agent in some of those moments then the message will be ignored because there is no event_queue for the agent.
The run of an agent is performed in four steps.
On the first step, a developer creates an empty coop (some details below).
On the second step, a developer creates an instance of an agent. Agents in SObjectizer are represented by instances of ordinary C++ classes, so the creation of an agent looks like the creation of C++ object of the corresponding type.
On the third step, the agent from step two should be added to the coop from step one. A coop is a unique thing that is present only in SObjectizer (AFAIK). A coop is a group of agents which should be added to and removed from SObjectizer in a transaction manner. It means that if a coop contains three agents then all those agents should be added to SObjectizer successfully or no one of them should be added. Similarly, all three agents should be removed from SObjectizer or all three agents should continue their work.
The need in coops was discovered soon after the start of SObjectizer-5 life. It became obvious that agents would be created by groups, not by single instances. Coops were invented to simplify the life of a developer: there is no need to control the creation of the next agent and remove previously created agents if the creation of a new agent fails.
Thus, on the third step, a developer fills the coop with agents. After that, the fourth step is performed: coop is being registered. It can look like that:
so_5::environment_t & env = ...; // SOEnv for a new coop.
// Step 1: create a coop.
auto coop = env.make_coop();
// Step 2: create a new agent for the coop.
auto a = std::make_unique<my_agent>(... /*args for agent's constuctor*/);
// Step 3: move the agent into the coop.
coop->add_agent(std::move(a));
...
// Step 4: register the coop.
env.register_coop(std::move(coop));
But usually it is done in a more compact form:
so_5::environment_t & env = ...; // SOEnv for a new coop.
env.introduce_coop([](so_5::coop_t & coop) { // Step 1 is done automatically.
// Perform steps 2 and 3.
coop.make_agent<my_agent>(... /*args for agent's constructor*/);
...
}); // Step 4 is done automatically.
During the registration procedure, a developer passes a filled coop to SOEnv and SOEnv performs several actions. SOEnv asks resources from dispatchers for new agents, calls so_define_agent
methods on agents, binds agents to dispatchers, sends a special message for every agent (it is necessary for a call to so_evt_start
). Naturally, with the rollback of previously performed actions in case of errors.
When a coop is registered the agents from that coop are in SObjectizer (in a particular SOEnv to be precise) and can do their work.
Maybe the most important part of the registration of a coop is binding of agents to dispatchers. Only after that binding, an agent receives an actual reference to event_queue. And this makes delivery of messages to the agent possible.
After a successful registration of a coop we'll have the following picture:
"Binding of agents to dispatcher" was mentioned several times above. But what is it? How SObjectizer understand which agent should be bound to which dispatcher?
Special objects called disp_binders play an important role here. They serve the task of binding agents to dispatchers during the registration of a coop. And also the task of unbinding during the deregistration procedure.
There is a special interface in SObjectizer. It should be implemented by every disp_binder. A specific implementation of disp_binder depends on dispatcher type. Every dispatcher implements own disp_binder.
To bind an agent to a dispatcher a developer has to create a disp_binder and specify that binder during the addition of an agent to the coop. In principle the code of filling a coop should look like:
auto & disp = ...; // A reference to a dispatcher to be used.
env.introduce_coop([&](so_5::coop_t & coop) {
// Create an agent with a specific dispatcher binder.
coop.make_agent_with_binder<my_agent>(disp.binder(),
... /* args for agent's constructor */);
...
});
An important nuance: coop owns disp_binders and knows which disp_binders are used by which agents. So the real picture of a registered coop will look like that:
Yet another key moment in SObjectizer that should be discussed briefly is message boxes (or mboxes in SObjectizer's terminology).
The presence of mboxes distinguishes SObjectizer from other frameworks that implement "the classical" Actor Model. In "the classical" Actor Model messages are sent directly to destination actors. Because of that, a sender should have a reference to the receiver.
SObjectizer has roots not only in Actor Model, but also in a Publish-Subscribe approach. Because of that a send in 1:N mode is initially implemented in SObjectizer. And because of that messages in SObjectizer are sent not to agents but to mboxes. There can be just one receive behind a mbox. Or there can be many (many hundreds of thousands) of receivers. Or no one.
Because messages are sent to mboxes but not directly to receivers, we have to add another term that is not present in "the classical" Actor Model, but is a cornerstone in Publish-Subscribe: a subscription to a message from a mbox. If an agent wants to receive messages in SObjectizer it has to create a subscription to an appropriate message type. No subscription -- no messages will be delivered.
There are two types of mboxes in SObjectizer. The first one is multi-producer/multi-consumer (MPMC). This type is used in M:N interaction mode. The second type is multi-producer/single-consumer (MPSC). This type was added later and intended to be used for efficient interaction in M:1 mode.
Initially, there were only MPMC mboxes, because M:N mode is enough for solving most of the tasks. MPMC can be used even for M:1 mode (additional mboxes can be created for every receiver in that case). But usage of MPMC mboxes in M:1 is not efficient. Because of that MPSC mboxes were added to SObjectizer-5.
MPMC-mbox delivers a message to all subscribed agents. How many those agents (just one, several thousand or no one) is just a detail of work. MPMC-mbox holds a map of message types and lists of subscribers. A working scheme of MPMC-mbox can be expressed this way:
Here Msg1, Msg2, ..., MsgN are the types of messages on that agents are subscribed.
MPSC-mbox is much simpler than MPMC-mbox, and because of that, it works more efficient. Only a reference to the target agent is in MPSC-mbox:
If we will try to tell about message delivery in SObjectizer we end up with the following picture:
A message is sent to a mbox. Mbox selects a receiver (in the case of MPMC-mbox all subscribers of that message type, in the case of MPSC-mbox the owner of mbox) and gives the message to the receiver (receivers).
The receiver looks for an actual reference to an event_queue. If such reference is present then the message is pulled to the event_queue. Otherwise, the message will be ignored.
If the message is pulled to an event_queue then event_queue stores the message in a dispatcher queue. What it will be the queue depends on the type of the dispatcher.
The dispatcher takes messages from its queues. When the message will be extracted from a queue the receiver of the message will be called on the context of the dispatcher (yet again: a dispatcher provides worker thread for calling event-handlers of agents).