Using nsm - rallytac/pub GitHub Wiki

Using Network State Machine (nsm)

nsm is a Python tool we've put together to coordinate responsibility for resources in a shared network.

That's it! Probably all you need to know about nsm. Isn't that awesome ... ??!! . . . . .

OK, that was a dreadful attempt at humor, so let's just drop that attitude and get a little more serious...

What's It Do?

Alright, the best way to explain the need for and operation of nsm is by way of example. So, let's go with a typical scenario where something like EBS acts as a gateway between a MANET radio network and the organization's enterprise network. This utility can be found in our public repository.

Something like this:

Now, for purposes of, say, redundancy, we set up two EBS instances so that if one of them falls over (actually that would never happen because our software has absolutely no bugs and has NEVER EVER EVER been known to crash!), we'd want the other one to automatically take over. Also, we need to be absolutely sure that only one EBS is active at any one time because if two are active, we'd end up with a packet loop and nasty things would ensue.

This situation would create a loop and kill the system pretty quickly as both instances of EBS will be passing the SAME traffic between the MANET and the enterprise network - and therefore EACH OTHER!

We want something like the following where only one EBS is active (green) while the other is idle (orange).

Of course we could modify EBS to check in with its "partner" and coordinate which one is active. But say we're not using EBS. Maybe we're using Rallypoints, or Engage Activity Recorders, or someone else's software that we have no hope of changing. Even if we could change the software to support this kind of thing, we'd need to change all of those components. And for those that cannot be changed, we're S-O-L! So we need something else to take care of this very specialized task and which can be used in a generic fashion for all situations where we want to apply the "Highlander Principle" ("There can be only one").

OK, we made up that business of "Highlander Principle". It's not a thing. But it does sound cool - no?

Anyway ... let's just add a little to our requirement. Perhaps, instead of just 2 instances of EBS in our example, we want 3. Maybe 4? Maybe 137? (Yeah, we've seen that kind of thing before.) And ... let's say that instead of just EBS, we have a combination of stuff on each gateway machine - let's say an instance of EBS and an RP on each of our 137 machines. Modifying all those components for each machine, and having each instance on each machine coordinate with other instances of itself on the other 136 machines, will become quite painful and messy very quickly.

Here's an idea of that - gnarly huh?

So, we figured that we'd build something outside of all this that is solely responsible for this hairy scary coordination business and leave the other goodies to perform the jobs they're specialized in.

nsm works by communicating with other instances of itself on other machines on the network to determine who is responsible for a resource. Now, a resource is simply something you give a name to. The term doesn't mean anything special to nsm; it's just a name that has meaning for your setup. In our example, we'll name our resource "ebs_gateway" to represent EBS. We want nsm to determine which ebs_gateway is to be active and, when that determination is made, take some form of action.

We'll get to the action stuff and dive more into multiple resources and other fun stuff further on. Right now we'll take a look at how nsm operates.

How's It Work?

In our example where we have two EBS instances, we'll add nsm to each computer housing an EBS instance. Then, we configure nsm to communicate with other nsm instances over the MANET - and exclusively over the MANET. We say exclusively because what we're fundamentally interested in is having only one EBS at a time connected to the ... MANET!

It looks something like this.

Here we've installed nsm on each machine, told it to use the network interface card that connects to the MANET, and given it a shared address inside the MANET to chat on - this is typically IP multicast but you can also tell nsm to use IP broadcast.

And there we have it: the little blue boxes talk to each other over the shared blue "pipe" inside the MANET, coordinating between each other who's in charge of what.

It's really important to understand that the "pipe" inside the MANET cannot be a point-to-point connection. Rather, it has to be a point-to-multipoint connection like multicast or broadcast. The reason for this is twofold: first, with point-to-point connections you'd have to know the IP addresses of the machines running nsm and keep those updated on each nsm instance as things change. Second, if we have more than two instances of nsm, it quickly becomes more and more cumbersome to manage that IP addressing business. So, simply using multicast (or broadcast if you really have to) solves the problem.

Also, and probably most important frankly, is that nsm speaks to its peers over the MANET - not the enterprise network. That's because what we're trying to figure out is who is connected to the MANET, not who's connected to the enterprise network. Hopefully that makes sense.

States

As its name hopefully conveys, nsm is a state machine that operates over a network. Those states are idle, goingActive, and active.

  • idle means that nsm is just that - idle. It's not doing anything other than listening to the network (that shared pipe on the MANET).

  • goingActive means that nsm has decided it should activate soon - but it's not active yet.

  • active, of course, means that it's active.

When nsm starts up, it starts in the idle state. It attaches to the shared pipe and listens for a little while to see if there are any declarations from other nsm instances about their state(s). If it doesn't hear any declarations, nsm transitions to the goingActive state and begins sending out declarations of its own indicating its intent to go active. This goes on for a little while and, assuming nsm doesn't receive declarations from other nsm instances about them either going active or active, it finally transitions to the active state.

Now, in the idle state, nsm does not transmit any traffic - it just listens. When it goes into the goingActive state, it begins sending traffic (the declaration) asserting that it is going active. When it transitions to the active state, it continues sending traffic but, this time, the content of the traffic is an indication that the nsm is now active. Pretty straightforward.
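The startup sequence just described can be sketched roughly like this - a shell-flavored illustration of the logic only, not nsm's actual internals (nsm is a Python program, and the variable names here are made up):

```shell
#!/bin/bash
# Rough sketch of nsm's startup behavior for a single resource.
# In real life, heard_peer would be set by listening on the shared
# pipe for a while; here it's hardcoded for illustration.
state="idle"
heard_peer="no"    # did we hear goingActive/active declarations while idle?

if [ "$heard_peer" = "no" ]; then
    state="goingActive"   # start sending our own declarations
    # ... keep declaring for a while; if still no competing declarations ...
    state="active"        # now declare that we're active
fi

echo "$state"
```

If a competing declaration does arrive during the listening or goingActive phases, the instance simply stays in (or returns to) idle instead.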

These declarations - be they for goingActive or active - are typically sent every second. You might think this will use up a ton of bandwidth. And you'd be right depending on what you think a ton is. For us, a ton of traffic is anything more than 1kbps. nsm typically uses 1-3kbps bandwidth. So, on networks that have megabits of bandwidth or even hundreds of kilobits of bandwidth, nsm's minor traffic load is not even noise. But here at RTS we don't like using bandwidth that much, so we're always trying to get that traffic profile down as much as we can. We'll just agree that we're being paranoid and leave it at that.
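As a rough sanity check on that claim: at one declaration per second and around 200 bytes per packet (the unencrypted figure mentioned later in this document), the steady-state load works out to well under that 1-3kbps range:

```shell
#!/bin/bash
# 200 bytes/packet * 8 bits/byte * 1 packet/second = 1600 bits/second
bps=$(( 200 * 8 * 1 ))
echo "${bps} bps"   # i.e. about 1.6kbps
```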

Anyway ... once an nsm instance has gone active, all the other nsm instances go into the idle state and stop sending any traffic. So the bottom line is that 99% of the time, only the active nsm instance is sending traffic on the network. That remaining 1% covers the situation where the MANET is not delivering packets quickly enough (around 5 seconds of delay) to the other nsm instances which, in turn, will start their transitioning process and begin sending their own traffic. However, as soon as an instance that is goingActive (or even active) sees a declaration from another instance that it's either going active or is, in fact, already active with a higher election token, the receiving instance shuts up immediately and goes back to an idle state. It's kinda cool!

Election Token

We mentioned the term election token above. There's nothing particularly magical about it. The token is just a random number that nsm dynamically calculates for each resource it's responsible for (we'll talk about multiple resources shortly). The token is used to determine who should actually go active in the event two or more instances of nsm figure they're the one to go active. Imagine this: two nsm instances - A and B. Imagine they start up on their respective machines at the same time, attach to the MANET at the same time, and start their timing counters at the same time. In practice there's about a 0.000001% (not an empirical number by the way) chance of that actually happening. But, imagine it does.

In that situation, A and B are going to get into a situation where they'll simultaneously enter goingActive and begin transmitting their declarations. Of course, each peer will ostensibly receive the other peer's packet at around the same time. Without a token of some sort, neither end will know whether to continue going active or to back down to an idle state. So we have a token. Let's say A calculates a token of 55 and B calculates a token of 137. When A receives B's declaration, it compares B's token of 137 to its own token of 55. It's obvious that A at that point determines that B must win and A (being the good citizen it is) acknowledges B winning the "election" and backs down to an idle state. B, on the other hand, does the same thing and determines that it should continue on its path: keep declaring its intention to go active for a little while and, eventually, transition to an active state.
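A's side of that decision boils down to a simple comparison. Here's a sketch of the logic (not nsm's actual code) showing what A does when B's declaration arrives:

```shell
#!/bin/bash
# A's own token vs. the token carried in B's declaration: higher token wins.
my_token=55
peer_token=137

if [ "$peer_token" -gt "$my_token" ]; then
    next_state="idle"         # peer wins; back down
else
    next_state="goingActive"  # we win; keep heading toward active
fi

echo "$next_state"
```

Run with the values from the example, this prints idle - A backs down, exactly as described.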

All the while B stays online and active, A will just listen for declarations and stay in its idle state. However, if B goes away, A will reach a timeout point and transition to the goingActive state. At that point it recalculates its token, tossing away the previous value of 55. From then on - at least for the duration of goingActive and (potentially) active - the new token value is used. Let's say it's 7721.

OK, let's assume that in this scenario, B didn't actually "go away"; A just stopped receiving B's declarations - maybe because the network wasn't feeling well, tons of packets got corrupted, the MANET briefly partitioned, or whatever. So now we have both A and B active!

That would be bad if nsm were stupid software. But it's not (we think). Rather, A and B, upon receiving the other's declaration, both do the same thing as before - looking at that token to decide who wins. Well, A will see B's value of 137 and compare it to its own token of 7721. A won't do anything - it'll just stay active. B, on the other hand, does the same thing and, obviously, figures out right away it's lost an "election" it wasn't even part of and has effectively been ousted from office so to speak. (This is sounding too political right now, isn't it?)

Anyway, the point is that nsm works really hard to ensure that only one instance is active at a time and in the event of two (or more) active instances, quickly resolves the dispute. Given timing issues (such as network propagation delay and other nasties), there's always a chance in those situations that there will briefly be two or more active instances of nsm but that kind of "glaring" is rare and very short-lived - generally no longer than a second or two if it ever happens at all.

So ... What Now?

Right, we've covered how nsm works and how it talks to its peers. But, so what? What does nsm actually do with this terrific capability? Well, the answer is ... nothing!

Actually, it doesn't do nothing. It does something. But something generic in nature. We didn't want to have custom versions of nsm for every customer and for every scenario. So, we've kept the core nsm logic simple and straightforward (and unchanged) and put in place the ability for nsm to run operating system command-lines every time a state transition happens. Those command-lines are defined by you rather than nsm so you can do anything you want when your command-line (usually a script of some sort) is invoked.

When nsm enters its idle state, it runs a command-line specified as onIdle in its configuration. Likewise, when it transitions to goingActive or active, it'll run the onGoingActive and onActive command-lines respectively. The contents of those command-lines/scripts/whatever is entirely up to you to define.

In our example of ensuring that only one EBS is active, you can imagine that we'd want to start EBS running when nsm goes active, and stop EBS running otherwise.

Here's some examples where we have EBS on a Linux machine running under systemd:

onIdle.sh

#!/bin/bash
sudo systemctl stop engagebridged

onGoingActive.sh

#!/bin/bash
# Nothing to be done for EBS during goingActive

onActive.sh

#!/bin/bash
sudo systemctl start engagebridged

These scripts are only invoked during state transitions - so you don't need to be concerned about a script being called multiple times for the same state. However, to be safe, you may want to do some checking of your own for the state of your goodies. In our example of using systemd, it's no big problem as we can call systemd to start EBS as often as we want - it'll just ignore a start request if EBS is already running. Same goes for stopping it.

Configuration

Configuring nsm - like pretty much all our software - is done using a JSON file. By default, nsm will look for nsm_conf.json in the directory where the nsm executable is located but you can override this with a command-line option (see below).

Here's an abbreviated example of a bare-minimum configuration. We'll look at the full configuration later on.

{
    "id": "nsm-kit-01",

    "networking": {
        "interfaceName": "en0",
        "address": "239.17.18.19",
        "port": 8513
    },

    "resources":[        
        "ebs-gateway"
    ],

    "run": {
        "onIdle": "./onIdle.sh",
        "beforeGoingActive": "./beforeGoingActive.sh",
        "onGoingActive": "./onGoingActive.sh",
        "beforeActive": "./beforeActive.sh",
        "onActive": "./onActive.sh"
    }
}

The id field must UNIQUELY identify the instance of nsm. If you screw this up and have duplicate ids in your environment, bad things will happen. In the example we've given the instance an ID but, if you don't, nsm will automatically generate one every time it starts. Actually, unless you specifically want to set your own IDs, just leave the id blank and let nsm take care of it.

Next, the interfaceName is important because it tells nsm which of the computer's network interfaces are to be used for connectivity to the MANET.

The address and port are the multicast address and port to use - all instances need to use the same address and port obviously. (If you want to use broadcast, set the address to 255.255.255.255.)

Next, there's an array/list of resource names. For our initial example, we've just got one resource name: ebs_gateway.

Finally, the run section tells nsm what commands to run at various times. We already spoke about onIdle, onGoingActive, and onActive. But what about those others? Well, beforeGoingActive will be invoked before nsm even tries to enter the goingActive state. Similarly beforeActive will be invoked before nsm actually tries to go active.

In the case of beforeGoingActive, nsm expects the invocation to return a string that defines/overrides the range within which its random token is to be calculated. By default, this range is from 1000000 to 2000000, meaning that the token will be calculated to fall inside this range. But, let's say that you've got some cool logic on your box which you use to determine that the local machine is more "important" than any other "common" machine. If that's the case, your script for beforeGoingActive would, say, return something like 2000001-3000000.

Now, your local nsm's token is totally going to win whatever election dispute there is because its token has tipped the scales in its own favor. (Here we go again with the political stuff ...!)

Whatever logic you want to use to determine your token range is entirely up to you. Some thoughts, though, on this:

  • The machine is a beefed-up core server system vs a mobile kit machine.
  • The machine has better networking capabilities than anyone else.
  • CPU or memory or network utilization on the machine has historically trended low.
  • Organizational security policy dictates the machine must be used if available.
  • Whatever else takes your fancy.

The beforeActive script, on the other hand, is pretty much a sanity check where nsm gets final permission to enter the active state. So, let's say that at the last minute we want our machine to back off for whatever reason; we can cancel the operation. All that nsm cares about here is whether the script returns 1 or 0. If it returns 1, nsm will proceed to active. If the script returns 0, nsm will back off and return to an idle state.

Here's some examples:

beforeGoingActive.sh

#!/bin/bash
echo "2000001-3000000"

beforeActive.sh

#!/bin/bash
# "some_magic_condition..." is a placeholder - substitute whatever
# last-minute check makes sense for your setup.
if [ some_magic_condition_is_satisfied_and_we_can_continue_to_active ]; then
    echo "1"
else
    echo "0"
fi

Detailed Configuration

Finally, the delicious geeky stuff. Here's a full example of nsm_conf.json:

{
        "id": "",
        "priority": 0,
        "networking": {
                "interfaceName": "en0",
                "address": "239.17.18.19",
                "port": 8513,
                "ttl": 64,
                "tos": 56,
                "cryptoPassword": "",
                "txOversend": 0,
                "rxLossPercentage": 0,
                "rxErrorPercentage": 0,
                "txLossPercentage": 0,
                "txErrorPercentage": 0
        },
        "resources": [
                "ebs_gateway"
        ],
        "run": {
                "onIdle": "./onIdle.sh",
                "beforeGoingActive": "./beforeGoingActive.sh",
                "onGoingActive": "./onGoingActive.sh",
                "beforeActive": "./beforeActive.sh",
                "onActive": "./onActive.sh"
        },
        "timing": {
                "txIntervalSecs": 1,
                "transitionWaitSecs": 5,
                "internalMultiplier": 1
        },
        "electionToken": {
                "start": 1000000,
                "end": 2000000
        },
        "logging": {
                "level": 1,
                "dashboard": false,
                "logCommandOutput": false
        }
}

And here's the breakdown:

  • id As explained earlier, this is the unique ID for this instance of nsm. Unless you have a specific need to track instances, just leave this field empty.

  • priority A priority level from 0 - 255 indicating the overall priority of the nsm instance. The default is 0. Nodes with lower priorities will always lose the election to higher priority nodes. Nodes within the same priority level will elect between each other based on their calculated tokens. This means that if a lower priority node is going active or is already active, that state will be taken away from them if a higher priority node joins the infrastructure for the same resource(s).

  • networking.interfaceName The name of the network card/device that connects the machine to the MANET.

  • networking.address The multicast IP address that all instances of nsm will communicate on via the MANET. Specify 255.255.255.255 if you'd rather have nsm use broadcast than multicast.

  • networking.port The multicast/broadcast IP port that nsm must use.

  • networking.ttl The Time-To-Live value that nsm will set on the UDP packets it transmits. Make sure this value is sufficiently large for packets to traverse the entire MANET.

  • networking.tos The Type Of Service value set in UDP packets by the transmitter to hint to the network about the importance of nsm's packets. Ideally we want these packets to get the highest possible priority so nsm defaults to a value of 56 which means "network control packet". However, your mileage may vary based on the operating system that nsm is running on and/or whether all the components in your network pathway support TOS packet marking. See the Wikipedia article for more information about the values you could use.

  • networking.cryptoPassword If specified, this string will be used as a baseline to generate an AES128 key with which to encrypt UDP traffic. Be aware that the encryption logic that nsm uses is more than simple AES128. Rather, each packet contains not just the encrypted data but some additional goodies to help protect against intrusion, replay attacks, timing attacks, and so on. It's probably a little bit of overkill but we're kinda paranoid around here. Of course, each nsm instance needs to use the same key. And one more thing, the encryption increases bandwidth utilization by around 75%. That sounds like a lot - and it is - but it's that way because nsm's payload is so small that the overhead added by the encryption will take it from around 200 bytes per packet (depending on resource count) to around 350 bytes per packet.

  • networking.txOversend Tells nsm how many additional packets to send every time it transmits. A zero value will only result in a single packet being sent each time. A value of 1 will cause 1 extra to be sent each time, 2 for two extra packets, and so on. This is useful if you're experiencing significant packet loss on your MANET - in which case you should probably sort that problem out as soon as possible.

  • networking.rxLossPercentage Specifies the percentage of received packets that nsm should throw away. This is only really useful if you're troubleshooting your system and want to see how nsm will deal with packet loss. For example, if you want to simulate 15% packet loss on your MANET, plug in 15 here. Generally, though, leave this at 0.

  • networking.rxErrorPercentage Sort of like networking.rxLossPercentage but, in this case, simulates corruption of packets on your MANET.

  • networking.txLossPercentage Pretty much the same as networking.rxLossPercentage but simulates loss at the transmitting end.

  • networking.txErrorPercentage Same as networking.rxErrorPercentage but, in this case, simulates corruption at the transmitting end.

  • resources This is an array/list of resources that you want nsm to manage. All instances of nsm managing the same resources need to have the same list - obviously. Now, the reason this is a list and not just a single value is that nsm can handle multiple resources simultaneously. We'll delve into that in the next section.

  • run.onIdle You know what this is from earlier - run this command when nsm transitions to an idle state.

  • run.beforeGoingActive Run this before transitioning to goingActive. See above for more.

  • run.onGoingActive Run this once transitioned to goingActive.

  • run.beforeActive Explained before. If this command returns 1 continue to transitioning to active. Otherwise, revert to idle.

  • run.onActive Run this once nsm has transitioned to active.

  • timing.txIntervalSecs How often to transmit packets when in goingActive or active state. Generally you should leave this at 1 or you run the risk of causing some network-wide race conditions.

  • timing.transitionWaitSecs How long to wait before transitioning to a new state. The default of 5 is a pretty good number but feel free to play with it. However, if you try to make it less than 3 seconds, nsm will bark at you and refuse to operate because there's all kinds of race conditions across your network that can be created if you mess this up.

  • timing.internalMultiplier By default this is set to 1 - and you should leave it that way. However, if you want to play with nsm or want transitions, packet transmissions, and all that happen faster or slower, set a value here that will be multiplied with timing.txIntervalSecs and timing.transitionWaitSecs. For example: if you want to artificially slow things down - say to half the normal speed - set a value of 2. If you want to speed things up - say to double the normal pace - set a value of 0.5. Be careful with this setting (and the other timing values for that matter) as misconfiguration and mismatched timings on different nsm instances can have serious ramifications for the network-wide state machine.

  • electionToken.start The start of the election token range. Override this at runtime with the first value returned from run.beforeGoingActive as described earlier.

  • electionToken.end The end of the election token range. Override this at runtime with the second value returned from run.beforeGoingActive as described earlier.

  • logging.level Levels are 0 (FATAL), 1 (ERROR), 2 (WARNING), 3 (INFO), and 4 (DEBUG). Generally you'd want to leave this at 1.

  • logging.dashboard Presents a cute little dashboard where you can see your resources, their states, packet details, and so on. Only useful when you're running nsm in a terminal. Of no use when nsm is running in the background - where it usually would live. Generally leave this at false.

  • logging.logCommandOutput If NOT in dashboard mode, shows the output from the commands invoked during state transitions. Useful if you're debugging your commands. Generally leave this at false.

IMPORTANT: This is where we absolve ourselves of any responsibility for things going south on you if you configure nsm incorrectly. Consider your settings changes VERY carefully when you make them.
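Pulling the priority and electionToken pieces together: the election comparison described above amounts to something like the following. This is a sketch of the logic only, with made-up function and variable names - not nsm's actual code:

```shell
#!/bin/bash
# Returns success (0) if "we" win the election against a peer.
# Higher priority always wins; equal priorities fall back to the token.
wins_election() {
    local my_prio="$1" my_token="$2" peer_prio="$3" peer_token="$4"
    if [ "$my_prio" -ne "$peer_prio" ]; then
        [ "$my_prio" -gt "$peer_prio" ]
    else
        [ "$my_token" -gt "$peer_token" ]
    fi
}

# Two default-priority (0) nodes: the bigger token wins.
wins_election 0 7721 0 137 && echo "stay active"

# A priority-5 node beats a priority-0 node regardless of tokens.
wins_election 5 1000001 0 1999999 && echo "higher priority wins"
```

Note how the second case captures the behavior described for the priority field: a higher-priority node joining the network takes the active role away from a lower-priority one even if the latter drew a bigger token.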

Multiple Resources

As promised, let's take a quick look at the resources section of the configuration file. In our examples we've only shown a single named resource (ebs_gateway) being managed. But you can specify a whole lot of resources if that makes sense. For example:

.
.
.
    "resources": [
                "ebs_gateway",
                "pli_monitor",
                "netmonitor",
                "system_database",
                "video_stream1",
                "video_stream2"
        ],
.
.
.

The logic described in this document applies to each resource in this list, independently. So, let's say that on the machines in the network, we have EBS (as in our examples), and we also have an application that is handling Position Location Information (PLI), a network monitor, some sort of distributed database, and two video streams. And we want to make sure that only one instance of each of those resources is active at any one time.

Now, we could just do this on a machine-by-machine basis and use a single resource - say primary_server - and when it goes active, turn on our EBS, PLI, network monitor, database, and the video streams. Everything will, of course, happen on that one machine.

But if we'd like nsm to effectively distribute those resource management tasks across the network according to its election algorithm, we split our resources out as shown in the list above. In that case, an instance of nsm might be active for the EBS resource, while another is handling the database and the first video stream, while a third instance is handling the PLI business and the second video stream.

That is hopefully pretty straightforward to understand, but there's the question of those command lines ... how do they know which resource they're being invoked for?

Well, that's simple - nsm will pass the resource name as the first parameter to the command line. So, say for instance we specify ./onActive.sh for the run.onActive element in the configuration, the actual command that nsm will execute will be:

./onActive.sh <resource_name>

So, if the ebs_gateway resource goes active, the command invoked will be:

./onActive.sh ebs_gateway

If the video_stream1 resource goes active, the command invoked will be:

./onActive.sh video_stream1

... and so on.

Also, these commands are invoked separately for each resource's state transition - nsm does not batch up command-line invocations for multiple resources as that wouldn't really make much sense given that each resource can transition at different times from the others.
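So a single onActive.sh can just branch on that first parameter. Here's a hypothetical sketch - the echo lines are placeholders where you'd put real start commands (like the systemctl calls shown earlier):

```shell
#!/bin/bash
# Hypothetical onActive.sh handling several resources. nsm invokes this as
# "./onActive.sh <resource_name>", so the resource name arrives in $1.
handle_active() {
    case "$1" in
        ebs_gateway)
            echo "starting EBS" ;;       # e.g. sudo systemctl start engagebridged
        video_stream1|video_stream2)
            echo "starting $1" ;;        # whatever starts your video streams
        *)
            echo "no action for $1" ;;   # unknown resource - do nothing
    esac
}

handle_active "$1"
```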

Asymmetric Resource Lists

By the way, not all nsm instances have to have the same resource list. So you might have instance A have:

.
.
.
    "resources": [
                "ebs_gateway",
                "netmonitor",
                "system_database",
        ],
.
.
.

And instance B have:

.
.
.
    "resources": [
                "ebs_gateway",
                "pli_monitor",
                "video_stream1",
                "video_stream2"
        ],
.
.
.

In that example, A will always "win" for netmonitor and system_database, while B will always win for pli_monitor, video_stream1, and video_stream2. Responsibility for the shared resource - ebs_gateway - would be duked out by A and B.

Many, Many Instances

Our examples have just shown two instances: A and B. Don't let that fool you, nsm will (should?) scale as much as you need. If you have 10 instances or 100 or 1,000; its core logic remains unchanged. And, because nsm uses multicast (or broadcast) and because only the active (or going active) instance actually transmits traffic on the network, there's very little bandwidth being used no matter how many instances you have.

In our internal testing we scaled an nsm setup to just shy of 100 instances and all worked well, with network bandwidth seldom exceeding 2kbps. When we started forcing errors and purposely mismatching timings, we did have some spikes - but those spikes seldom went above 10kbps. And that was at 65% packet loss, all kinds of corruption, and so on.

Other Areas Of Use

Hopefully you think that nsm is just the coolest toy you've ever seen and are in awe of the spectacular software we develop here at RTS. (Please say you are!)

Regardless, you're likely thinking that this thing can be used for other purposes than just MANET interaction. You're right - nsm doesn't really care if you're using a MANET. All it wants to do is communicate with other instances of itself and automatically elect an active instance. It doesn't even need to connect to a MANET for its shared pipe - you can happily use nsm on your regular network for all kinds of situations where you need to apply the Highlander Principle (OK, now that term is a thing - we like it too!)

For example, use nsm to build out hot-standby systems for redundancy purposes. Or use it to solve issues with routing and ingress/egress on your enterprise network. Maybe you want to do something like have a guarantee that only a single controller entity will be active for your sexy new distributed logic engine (nah, we don't even know what that means but it sounds awesome).

Have fun!