Handling network updates - Nymeria25/DCIT GitHub Wiki

Food for thought: expect node failure.

Every node in a P2P network is both a server and a client. Because the servers are stateful and keep track of the nodes in the network, we must make sure they stay in sync with the clients. In other words, for each node in the network, the list of nodes on the server side should be the same as the list on the client side. When does the network change?

join operations
signoff operations
node failure

Current architecture:

The join and signoff operations start as a request to one of the servers and then spread throughout the entire network. While the list of nodes is updated on every server, each client updates its own list of nodes in parallel by querying "its own server". Node failures, however, are discovered on the client side, whenever an RPC call to a failed node raises an error. The failure is then propagated so that the list of nodes is updated on every server in the network and, subsequently, on the clients. (We implemented this with try/catch blocks.)
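The failure path described above can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual code: the `Node` class, the `remove_node` and `ping` methods, and the `InMemoryNetwork` transport are all hypothetical names, and a raised `ConnectionError` is assumed to be how a dead node shows up on the client side.

```python
class Node:
    """One peer: holds a node list and reaches other peers through `rpc`."""

    def __init__(self, address, rpc):
        self.address = address
        self.nodes = {address}
        self._rpc = rpc  # hypothetical transport: rpc(target, method, *args)

    # --- server side: methods other nodes may invoke remotely ---
    def ping(self):
        return "pong"

    def remove_node(self, address):
        self.nodes.discard(address)

    # --- client side: every outgoing call is guarded ---
    def call(self, target, method, *args):
        try:
            return self._rpc(target, method, *args)
        except ConnectionError:
            # The failure is discovered here, on the client side.
            self.handle_failure(target)
            return None

    def handle_failure(self, failed):
        # Drop the dead node locally, then tell every remaining peer.
        # A peer that turns out to be dead as well is queued and handled
        # the same way.
        pending = [failed]
        while pending:
            dead = pending.pop()
            self.nodes.discard(dead)
            for peer in sorted(self.nodes - {self.address}):
                try:
                    self._rpc(peer, "remove_node", dead)
                except ConnectionError:
                    self.nodes.discard(peer)
                    pending.append(peer)


class InMemoryNetwork:
    """Stand-in for the real RPC layer (assumption, for illustration only)."""

    def __init__(self):
        self.peers = {}    # address -> Node
        self.down = set()  # addresses of crashed nodes

    def rpc(self, target, method, *args):
        if target in self.down or target not in self.peers:
            raise ConnectionError(f"{target} is unreachable")
        return getattr(self.peers[target], method)(*args)


def build_network(addresses):
    """Create a fully connected network in which every node knows all others."""
    net = InMemoryNetwork()
    for address in addresses:
        node = Node(address, net.rpc)
        node.nodes = set(addresses)
        net.peers[address] = node
    return net


net = build_network(["a", "b", "c", "d"])
net.down.add("c")                   # "c" crashes silently
net.peers["a"].call("c", "ping")    # "a" notices and informs the others
print(sorted(net.peers["b"].nodes))  # ['a', 'b', 'd']
```

Keeping the guard in a single `call` helper means every RPC goes through the same failure handling, instead of repeating the try/catch at each call site.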

Tests:

some nodes fail
some nodes sign off
some nodes fail and others sign off
the master node fails
the master node signs off
some nodes fail (including master) and others sign off
some nodes sign off (including master) and others fail

++ all of the above mixed with join operations and reconnections of previous nodes.
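
One way to phrase these scenarios as automated checks is to model each node's view of the network as a set and assert that all live nodes agree after every step. The sketch below is a simplified in-memory model in Python, not the project's test suite: the helper names (`in_sync`, `join`, `sign_off`, `fail`, `detect_failure`) and the node names are made up for illustration, and the master node is not modelled separately.

```python
def in_sync(views):
    """True when every live node's list equals the set of live nodes."""
    live = set(views)
    return all(view == live for view in views.values())


def join(views, new):
    # A join is announced to every existing node, then the new node
    # receives the full list.
    for view in views.values():
        view.add(new)
    views[new] = set(views) | {new}


def sign_off(views, leaving):
    # A signoff is announced, so the remaining nodes update immediately.
    del views[leaving]
    for view in views.values():
        view.discard(leaving)


def fail(views, crashed):
    # A crash is silent: the node vanishes but the other lists are stale.
    del views[crashed]


def detect_failure(views, crashed):
    # Models the client-side discovery followed by network-wide removal.
    for view in views.values():
        view.discard(crashed)


# Scenario: one node fails, another signs off, the failed node reconnects.
names = ("n1", "n2", "n3", "n4")
views = {name: set(names) for name in names}

fail(views, "n1")
assert not in_sync(views)   # stale lists until the failure is noticed
detect_failure(views, "n1")
assert in_sync(views)

sign_off(views, "n3")
assert in_sync(views)

join(views, "n1")           # reconnection of a previous node
assert in_sync(views)
print(sorted(views))        # ['n1', 'n2', 'n4']
```

The same helpers can be combined in any order to cover the mixed cases in the list above.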