How to scale the demo stack
Once the Demo Stack is running, a common question is how it can be scaled. Here is a quick overview of how this can be achieved. Note that if you want to test the material presented here, you will need at least 4 machines with multiple cores to play with.
Scaling
RTBkit is built to be scalable. Most of the work that needs to scale is limited to the following 3 types of components: routers, agents and augmenters. These are the scalable components.
Every other component is currently designed as a singleton since it normally deals with a small amount of traffic. Except for the master banker, these components will eventually be migrated to become scalable.
Servers
One possible way to set up multiple servers is to create a machine containing all singleton processes along with a variable number of nodes containing the scalable components.
The idea is then simply to load balance incoming bid requests across these nodes.
Note that as the number of requests increases, some components will put a significant load on the Carbon process that records metrics. We found that the best way to work around this is to create one instance of Carbon per server and point every process to its localhost instance. This avoids saturating a central database and enables you to use Graphite's Federated Storage with minimal fuss.
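For example, each node can carry its own copy of the bootstrap file with Carbon pointed at its local instance. Below is a minimal sketch, assuming the carbon-uri field used by the sample.bootstrap.json shipped with RTBkit; the remaining bootstrap settings stay as in the sample file and should be adjusted to your installation.
{
    "installation": "rtb-test",
    "location": "mtl",
    "carbon-uri": ["127.0.0.1:2003"]
}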
Launch Sequence
Here is an example of a launch sequence that achieves this setup. Note that we use the launcher here for illustration purposes only, so that it's easier to play with; a serious production environment will probably include more robust administration tools.
The launch sequence presented here contains 3 nodes. The node named st1 is for singleton processes, while st2 and st3 are identical scalable nodes, each containing a router and an agent.
{
"nodes": [
{
"name": "st1",
"root": ".",
"tasks": [
{
"name": "monitor",
"root": ".",
"path": "build/x86_64/bin/monitor_service_runner",
"arg": [
"-N", "st1.monitor",
"-B", "sample.bootstrap.json"
],
"log": true
},
{
"name": "mock-ad-server",
"root": ".",
"path": "build/x86_64/bin/ad_server_connector_ex",
"arg": [
"-N", "st1.mock-ad-server",
"-B", "sample.bootstrap.json"
],
"log": true
},
{
"name": "logger",
"root": ".",
"path": "build/x86_64/bin/data_logger_ex",
"arg": [
"-N", "st1.logger",
"-B", "sample.bootstrap.json", "--log-dir", "./logs/data/"
],
"log": true
},
{
"name": "agent-configuration",
"root": ".",
"path": "build/x86_64/bin/agent_configuration_service_runner",
"arg": [
"-N", "st1.agent-configuration",
"-B", "sample.bootstrap.json"
],
"log": true
},
{
"name": "banker",
"root": ".",
"path": "build/x86_64/bin/banker_service_runner",
"arg": [
"-N", "st1.banker",
"-B", "sample.bootstrap.json", "-r", "localhost:6379"
],
"log": true
},
{
"name": "augmentor",
"root": ".",
"path": "build/x86_64/bin/augmentor_ex_runner",
"arg": [
"-N", "st1.augmentor",
"-B", "sample.bootstrap.json"
],
"log": true
},
{
"name": "post-auction",
"root": ".",
"path": "build/x86_64/bin/post_auction_runner",
"arg": [
"-N", "st1.pal",
"-B", "sample.bootstrap.json"
],
"log": true
}
]
},
{
"name": "st2",
"root": ".",
"tasks": [
{
"name": "router",
"root": ".",
"path": "build/x86_64/bin/router_ex",
"arg": [
"-N", "st2.router",
"-B", "sample.bootstrap.json"
],
"log": true
},
{
"name": "fixed-price-agent",
"root": ".",
"path": "build/x86_64/bin/bidding_agent_ex",
"arg": [
"-N", "st2.agent",
"-B", "sample.bootstrap.json"
],
"log": true
}
]
},
{
"name": "st3",
"root": ".",
"tasks": [
{
"name": "router",
"root": ".",
"path": "build/x86_64/bin/router_ex",
"arg": [
"-N", "st3.router",
"-B", "sample.bootstrap.json"
],
"log": true
},
{
"name": "fixed-price-agent",
"root": ".",
"path": "build/x86_64/bin/bidding_agent_ex",
"arg": [
"-N", "st3.agent",
"-B", "sample.bootstrap.json"
],
"log": true
}
]
}
]
}
This file is available in the RTBkit repository and is named sample.launch.scale.json.
Notice that the process names were updated to contain the node name. This is not strictly required, but processes need unique names to work together in the same installation. Thus, instead of having something like router1 and router2 as router names, the launch sequence now has st2.router and st3.router. This has the nice side effect of grouping metrics reports in Graphite.
Note that the launch sequence uses the sample bootstrap file, which points to a local ZooKeeper. Unless ZooKeeper is configured as an ensemble on each machine, the bootstrap file needs to be updated to point to your instance of ZooKeeper.
Also, make sure Redis is properly configured on st1 so that it is accessible by the master banker.
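As an illustration, every node's bootstrap file would then reference the shared ZooKeeper instance (or ensemble) instead of localhost. This is a hedged sketch assuming the zookeeper-uri field of the sample bootstrap; the host names are hypothetical. The Redis address itself is passed to the banker with the -r argument shown in the launch sequence above.
{
    "installation": "rtb-test",
    "location": "mtl",
    "zookeeper-uri": "zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181",
    "carbon-uri": ["127.0.0.1:2003"]
}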
Router
If the router uses one of the HTTP exchange connectors, it can use multiple cores by spawning multiple worker threads. Thus, a single router can easily saturate a whole server. Adjust the number of workers depending on the incoming traffic and the network latency that you observe.
For example, you can specify 8 threads for the mock exchange connector like so:
[
{
"exchangeType": "mock",
"listenPort": 12339,
"bindHost": "0.0.0.0",
"auctionVerb": "POST",
"auctionResource": "/auctions",
"performNameLookup": false,
"numThreads": 8
}
]
Mock Exchange
The last piece of the puzzle is the mock exchange. Unless you have an existing endpoint that produces bid requests, it is required to make the stack run. Note that the mock exchange has a built-in ad server that sends an occasional win. If the stack doesn't receive wins for some time, it will enter slow mode and process fewer than 100 bid requests per second. When using a bid request source other than the mock exchange, make sure there is a way to send win notifications.
Generating bid requests can be intensive. We'll use a fourth node, st4, for that.
{
"name": "st4",
"root": ".",
"tasks": [
{
"name": "mock-exchange",
"root": ".",
"path": "build/x86_64/bin/mock_exchange_runner",
"arg": [
"-b", "st2.staging:12339",
"-b", "st3.staging:12339",
"-w", "st1.staging:12340",
"-t", "8"
],
"log": true
}
]
}
For convenience, this node is also present in the file sample.launch.scale.json.
Note that here, the URLs of the routers and the ad server need to be specified explicitly to the mock exchange. To avoid the need for a load balancer, the mock exchange supports publishing bid requests to multiple endpoints. Those URLs should be edited to reflect your situation.
The last parameter represents the number of worker threads that the mock exchange will use. Increasing this number will increase the maximum QPS.
Does it work?
The only way to "see" that the stack is doing the right thing is to fire up Graphite and look at the metrics reported by the various processes. Of course, the logs are still a valuable source of information but they tend to be mostly useful when something fails to start or simply crashes.
I suggest reading how to do Monitoring using Graphite and the list of Graphite Keys. Knowing how to navigate those metrics and learning what to expect from them takes a bit of time to get used to, but it ends up being a precious debugging tool for developers and for operations.
Multiple Data Centers
At this point, one can ask how this can be taken a step further to scale across multiple instances of the stack. This is typical of multiple data centers. In such a scenario, each data center is almost autonomous but needs to communicate some information globally, at low volume and with tolerance for high latency.
For example, the master banker holds the global budget for accounts and needs to be shared. The system is therefore designed so that each data center is able to reach it.
The location of processes is specified in the bootstrap file using the location property. So, you'll need one bootstrap file per location and use it for all processes of that location.
{
"installation": "rtb-test",
"location": "mtl",
...
}
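A second data center would then get its own copy of the bootstrap file with a different location value; the location name nyc below is purely hypothetical.
{
    "installation": "rtb-test",
    "location": "nyc",
    ...
}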
Currently, the connection scope of each kind of component is hard-coded.
Only the router and the post auction loop will try to connect to the master banker regardless of its location. Every other component will only connect to components that publish the same location. New components can select the desired behaviour by specifying the local parameter when connecting to a service class using the various Zmq classes.
ZooKeeper
A quick note about ZooKeeper.
Since our service discovery uses ZooKeeper, it needs to be shared across all data centers. We found that the latency can cause stability issues and render the stack almost unusable: clients receive the session timeout message, get dropped, and cause services to reconnect frequently. This can be prevented by simply setting the tickTime to something very large, like 15000 ms.
Since we use ephemeral nodes, this means that a service that needs to be restarted (in case of a crash, for example) will have to wait for ZooKeeper to detect that it's no longer connected before it is permitted to recreate those nodes.
Therefore, if you see the launcher sleeping away the ZooKeeper timeout, don't worry: that is what this message is about.