zookeeper - seven/seven.github.io GitHub Wiki

http://zookeeper.apache.org/doc/current/recipes.html

https://cwiki.apache.org/confluence/display/ZOOKEEPER/ZooKeeperArticles

ZooKeeper is a high-performance coordination service for distributed applications.

ZooKeeper replicates its data to multiple servers, which makes the data highly reliable and available.

It exposes common services, such as configuration management and synchronization, so you don’t have to write them from scratch.

You can use it to implement leader election and presence protocols.

Architecture of ZAB – ZooKeeper Atomic Broadcast protocol

Definitions

**leader and followers**: in a ZooKeeper cluster, one node has the leader role and the rest have follower roles. The leader is responsible for accepting all incoming state changes from clients and replicating them to itself and to the followers. Read requests are load balanced between all followers and the leader.

**transactions**: client state changes that a leader propagates to its followers.

**‘e’**: the epoch of a leader. An epoch is an integer that a leader generates when it starts to lead; it should be larger than the epochs of all previous leaders.

**‘c’**: a sequence number generated by the leader, starting at 0 and increasing. It is used together with the epoch to order incoming client state changes.

**‘F.history’**: a follower’s history queue, used for committing incoming transactions in the order they arrived.

**outstanding transactions**: the set of transactions in F.history that have a sequence number smaller than the current COMMIT sequence number.
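The ordering given by e and c, and the notion of outstanding transactions, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ZooKeeper's code: `Zxid` and `outstanding` are hypothetical names, and the history queue is modeled as a plain list that holds only uncommitted transactions.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Zxid:
    """Hypothetical transaction id, compared by epoch first, then counter."""
    epoch: int    # 'e': fixed while one leader is leading
    counter: int  # 'c': starts at 0 and increases under that leader


def outstanding(history, commit_id):
    """Return the transactions in history that are ordered before commit_id."""
    return [zxid for zxid in history if zxid < commit_id]


# Assumed example history: two transactions from epoch 1, one from epoch 2.
history = [Zxid(1, 0), Zxid(1, 1), Zxid(2, 0)]

# The first transaction of a newer epoch is ordered after every transaction
# of an older epoch, even though its counter restarted at 0.
assert Zxid(2, 0) > Zxid(1, 1)

# Everything ordered before the COMMIT for Zxid(2, 0) is outstanding.
print(outstanding(history, Zxid(2, 0)))
```

Comparing the epoch before the counter is what lets a new leader restart its counter at 0 without colliding with transactions from an earlier leader.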

ZAB Implementation

Clients read from any of the ZooKeeper nodes. Clients can write state changes to any of the ZooKeeper nodes, and these state changes are forwarded to the leader node.

ZooKeeper uses a variation of the two-phase commit protocol to replicate transactions to followers. When the leader receives a change update from a client, it generates a transaction with sequence number c and the leader’s epoch e (see definitions) and sends the transaction to all followers. A follower adds the transaction to its history queue and sends an ACK to the leader. When the leader receives ACKs from a quorum, it sends that quorum a COMMIT for the transaction. A follower that receives a COMMIT commits the transaction, unless its history queue still holds earlier transactions (with smaller sequence numbers) that are not yet committed. In that case it waits to receive COMMITs for all its earlier transactions (outstanding transactions) before committing.
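The propose, ACK, and COMMIT flow described above can be simulated in memory. The sketch below is a simplification, not ZooKeeper's implementation: the `Follower` and `Leader` classes, the `up` flag, and the in-process method calls standing in for network messages are all assumptions made for illustration. It also assumes that a quorum means a majority of the ensemble and that the leader counts itself toward it.

```python
class Follower:
    """Hypothetical follower: queues proposals and commits them in order."""

    def __init__(self, up=True):
        self.up = up                  # assumed flag: False simulates an unreachable node
        self.history = []             # proposed, not yet committed, in arrival order
        self.committed = []           # transactions applied so far
        self.pending_commits = set()  # COMMITs waiting on earlier transactions

    def propose(self, zxid, change):
        if not self.up:
            return None
        self.history.append((zxid, change))
        return "ACK"

    def commit(self, zxid):
        self.pending_commits.add(zxid)
        # Apply from the head of the queue only, so a transaction is never
        # committed before the outstanding transactions ahead of it.
        while self.history and self.history[0][0] in self.pending_commits:
            head = self.history.pop(0)
            self.pending_commits.discard(head[0])
            self.committed.append(head)


class Leader:
    """Hypothetical leader: stamps changes with (e, c) and broadcasts them."""

    def __init__(self, epoch, followers):
        self.epoch = epoch
        self.counter = 0
        self.followers = followers

    def broadcast(self, change):
        zxid = (self.epoch, self.counter)
        self.counter += 1
        acked = [f for f in self.followers if f.propose(zxid, change) == "ACK"]
        ensemble = len(self.followers) + 1  # followers plus the leader
        # Assumption: the leader counts as one vote toward the majority.
        if len(acked) + 1 > ensemble // 2:
            for f in acked:
                f.commit(zxid)
            return zxid
        return None


followers = [Follower(), Follower(up=False)]
leader = Leader(epoch=1, followers=followers)
print(leader.broadcast("set /a 1"))  # committed with one follower down
print(followers[0].committed)
```

With three nodes, one reachable follower plus the leader already forms a majority, so the change commits even though the second follower is down. If both followers were down, `broadcast` would return `None` and nothing would be committed.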