Handling network updates - Nymeria25/DCIT GitHub Wiki
Food for thought: expect node failure.
Every node in a P2P network acts as both server and client. Because the servers are stateful and keep track of the nodes in the network, we must make sure they stay in sync with the clients. In other words, for each node in the network, the list of nodes must be the same on the server side and on the client side. When does the network change?
- join operations
- signoff operations
- node failure
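As a minimal sketch of the sync requirement, the three kinds of change can be modelled as events applied to a node list. The names `NodeList`, `apply` and `in_sync`, and the example addresses, are assumptions made for illustration and are not taken from the DCIT code.

```python
from dataclasses import dataclass, field


@dataclass
class NodeList:
    """Hypothetical membership view held by one server or one client."""
    nodes: set = field(default_factory=set)

    def apply(self, event: str, address: str) -> None:
        # A join adds the node; a sign-off or a detected failure removes it.
        if event == "join":
            self.nodes.add(address)
        elif event in ("signoff", "failure"):
            self.nodes.discard(address)
        else:
            raise ValueError(f"unknown event: {event}")


def in_sync(server_view: NodeList, client_view: NodeList) -> bool:
    # The invariant from the text: both sides hold the same list of nodes.
    return server_view.nodes == client_view.nodes


server_view = NodeList()
client_view = NodeList()
events = [
    ("join", "10.0.0.1:8080"),
    ("join", "10.0.0.2:8080"),
    ("join", "10.0.0.3:8080"),
    ("signoff", "10.0.0.2:8080"),
    ("failure", "10.0.0.3:8080"),
]
for event, address in events:
    # Both views receive every event, so they cannot drift apart.
    server_view.apply(event, address)
    client_view.apply(event, address)
```

The views stay in sync here only because both receive every event. The hard part in practice is the failure case, where no event is ever announced.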
Current architecture:
The join and signoff operations start as a request to one of the servers and then spread throughout the entire network. While the list of nodes is updated on every server, each client updates its own list of nodes in parallel by querying its own server. Node failures, however, are discovered on the client side, whenever an RPC call is performed. A detected failure results in the lists of nodes being updated on every server in the network and subsequently on the clients. (We implemented this with try/catch blocks.)
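The failure path described above could look roughly like the following sketch. It simulates the transport with an in-memory dictionary instead of a real RPC library, and uses Python's try/except in place of try/catch. All names here (`Node`, `NodeUnreachable`, `rpc`, `call`, `handle_remove`, `report_failure`, `refresh`) are hypothetical and do not come from the project.

```python
class NodeUnreachable(Exception):
    """Stands in for whatever error the real RPC layer raises."""


class Node:
    def __init__(self, address, network):
        self.address = address
        self.network = network          # address -> Node, simulates the transport
        self.alive = True
        self.server_nodes = {address}   # list kept by the server half
        self.client_nodes = {address}   # list kept by the client half

    # Server side.
    def ping(self):
        return "pong"

    def handle_remove(self, address):
        self.server_nodes.discard(address)

    # Client side.
    def refresh(self):
        # The client updates its list by asking its own server.
        self.client_nodes = set(self.server_nodes)

    def rpc(self, target, method, *args):
        peer = self.network.get(target)
        if peer is None or not peer.alive:
            raise NodeUnreachable(target)
        return getattr(peer, method)(*args)

    def call(self, target, method, *args):
        # Every outgoing call is guarded, so a failure is detected on use.
        try:
            return self.rpc(target, method, *args)
        except NodeUnreachable:
            self.report_failure(target)
            return None

    def report_failure(self, failed):
        # Drop the failed node on our own server, then on every other server.
        self.handle_remove(failed)
        for address in sorted(self.server_nodes - {self.address}):
            try:
                self.rpc(address, "handle_remove", failed)
            except NodeUnreachable:
                pass  # that node's failure is left for a later call to discover
        self.refresh()


network = {}
for addr in ("A", "B", "C"):
    network[addr] = Node(addr, network)
for node in network.values():
    node.server_nodes = set(network)
    node.refresh()

network["C"].alive = False                # C crashes without signing off
result = network["A"].call("C", "ping")   # A notices on its next RPC call
network["B"].refresh()                    # B's client re-reads its own server
```

In this sketch the other clients only see the change once they refresh from their own server, which mirrors the "subsequently for the clients" step in the text.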
Tests:
- some nodes fail
- some nodes sign off
- some nodes fail and others sign off
- the master node fails
- the master node signs off
- some nodes fail (including master) and others sign off
- some nodes sign off (including master) and others fail
++ all of the above mixed with join operations and reconnections of previous nodes.
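One way to drive scenarios like these is a small simulation that replays a list of steps and then checks that every surviving node holds the same list. This is only a sketch under simplifying assumptions: `run_scenario` and `consistent` are hypothetical helpers, failure discovery is idealized as a single pass at the end rather than happening on individual RPC calls, and "master" is treated as an ordinary node name because its special role is not described here.

```python
def run_scenario(initial, steps):
    """Apply steps to a simulated network and return each live node's view."""
    views = {name: set(initial) for name in initial}
    for event, name in steps:
        if event == "join":
            # Existing members learn about the newcomer, which then copies
            # the list of some existing member (possibly a stale one).
            for view in views.values():
                view.add(name)
            seed = next(iter(views.values()), set())
            views[name] = set(seed) | {name}
        elif event == "signoff":
            # A sign-off is announced, so every view is updated at once.
            views.pop(name, None)
            for view in views.values():
                view.discard(name)
        elif event == "fail":
            # A crash is silent: the other views stay stale for now.
            views.pop(name, None)
        else:
            raise ValueError(f"unknown event: {event}")
    # Idealized discovery: every survivor drops the nodes it cannot reach.
    for view in views.values():
        view.intersection_update(views.keys())
    return views


def consistent(views):
    # Every live node must list exactly the set of live nodes.
    return all(view == set(views) for view in views.values())


initial = ["master", "n1", "n2", "n3"]
scenarios = {
    "some nodes fail": [("fail", "n1")],
    "master fails": [("fail", "master")],
    "master signs off and others fail": [("signoff", "master"), ("fail", "n2")],
    "fail, then rejoin": [("fail", "n1"), ("join", "n4"), ("join", "n1")],
}
results = {label: run_scenario(initial, steps) for label, steps in scenarios.items()}
```

Each entry in `scenarios` corresponds loosely to one bullet above. Further combinations can be added as extra step lists and checked with `consistent`.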