Coordinator nodes cluster - radumarias/rfs GitHub Wiki

Master cluster

We will have a cluster of coordinator nodes in Raft.

Main functionality is to know which nodes are up so we can send those to clients so they will do load balancing or if we need to do it. We can keep a Set of up nodes, health check between all and modify that Set.

Masterless cluster

If we don't want the penalty of a single master at a time we might explore CRDTs with Redis Sets to enforce constraints like uniqueness of filenames in same folder.

We could keep a modifiable (add, remove) Set or Map of nodes, health check between all and modify that structure and converge that.

For example we can have a Map like

  • key: node_key: String, can be the node IP
  • value: state: bool

We do healthcheck between all nodes and we send messages like (node_key, state). We rely on the fact that after a node goes down it will take at least few minutes to be replaced. We will process those messages and set the state for each node and the structure will converge to same state.
This could differ in the case of temporarily network issues, for that when we receive a state == false we will check again the node real state to avoid cases where an initial node down message was sent which is now obsolete as the node might be up again.

Load balancer

Ideally client would interact with a load balancer cluster (again with master kr masterless) which will contain the sharding distribution logic and will serve the client requests by directly communicating with available nodes.

Viewstamped Replication Revisited

https://pmg.csail.mit.edu/papers/vr-revisited.pdf