Zookeeper status is inactive in a cluster

Problem

contrail-status shows zookeeper as inactive.

Traceback in the zookeeper docker logs:

2020-07-23 18:07:17,242 [myid:] - INFO  [main:QuorumPeerConfig@136] - Reading configuration from: /conf/zoo.cfg
2020-07-23 18:07:17,265 [myid:] - INFO  [main:QuorumPeer$QuorumServer@185] - Resolved hostname: 172.168.1.79 to address: /172.168.1.79
2020-07-23 18:07:17,265 [myid:] - INFO  [main:QuorumPeer$QuorumServer@185] - Resolved hostname: 172.168.1.69 to address: /172.168.1.69
2020-07-23 18:07:17,266 [myid:] - INFO  [main:QuorumPeer$QuorumServer@185] - Resolved hostname: 172.168.1.46 to address: /172.168.1.46
2020-07-23 18:07:17,266 [myid:] - INFO  [main:QuorumPeerConfig@398] - Defaulting to majority quorums
2020-07-23 18:07:17,269 [myid:4] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2020-07-23 18:07:17,270 [myid:4] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
2020-07-23 18:07:17,270 [myid:4] - INFO  [main:DatadirCleanupManager@101] - Purge task is not scheduled.
2020-07-23 18:07:17,282 [myid:4] - INFO  [main:QuorumPeerMain@130] - Starting quorum peer
2020-07-23 18:07:17,289 [myid:4] - INFO  [main:ServerCnxnFactory@117] - Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory
2020-07-23 18:07:17,292 [myid:4] - INFO  [main:NIOServerCnxnFactory@89] - binding to port /172.168.1.79:2181
2020-07-23 18:07:17,299 [myid:4] - INFO  [main:QuorumPeer@1159] - tickTime set to 2000
2020-07-23 18:07:17,299 [myid:4] - INFO  [main:QuorumPeer@1205] - initLimit set to 5
2020-07-23 18:07:17,299 [myid:4] - INFO  [main:QuorumPeer@1179] - minSessionTimeout set to -1
2020-07-23 18:07:17,299 [myid:4] - INFO  [main:QuorumPeer@1190] - maxSessionTimeout set to -1
2020-07-23 18:07:17,306 [myid:4] - ERROR [main:QuorumPeer@294] - Setting LearnerType to PARTICIPANT but 4 not in QuorumPeers.
2020-07-23 18:07:17,307 [myid:4] - INFO  [main:QuorumPeer@1470] - QuorumPeer communication is not secured!
2020-07-23 18:07:17,307 [myid:4] - INFO  [main:QuorumPeer@1499] - quorum.cnxn.threads.size set to 20
2020-07-23 18:07:17,309 [myid:4] - INFO  [main:FileSnap@86] - Reading snapshot /data/version-2/snapshot.100000028
2020-07-23 18:07:17,513 [myid:4] - ERROR [main:QuorumPeerMain@92] - Unexpected exception, exiting abnormally
java.lang.RuntimeException: My id 4 not in the peer list
    at org.apache.zookeeper.server.quorum.QuorumPeer.startLeaderElection(QuorumPeer.java:719)
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:638)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
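
To confirm that a node is hitting this particular failure, the rejected ID can be pulled out of the log text. The following is a minimal sketch, not part of the original procedure: the function name `rejected_myid` is hypothetical, and it assumes the exception message has the exact wording shown in the traceback above.

```shell
# Read ZooKeeper log text on stdin and print the myid that was rejected.
# Prints nothing if the "not in the peer list" exception is absent.
# The function name is illustrative only.
rejected_myid() {
    sed -n 's/.*My id \([0-9][0-9]*\) not in the peer list.*/\1/p' | head -n 1
}
```

It could be fed from the container logs, for example `docker logs configdatabase_zookeeper_1 2>&1 | rejected_myid`.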

Reason

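The myid file on the node is not getting updated to the ID assigned to that node in the zookeeper conf file. In the log above, the node starts with myid 4, which is not among the server IDs in /conf/zoo.cfg, so the quorum peer exits.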
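
The check that ZooKeeper fails here can be approximated from the shell. This is a rough sketch under the assumption that zoo.cfg declares its peers with standard `server.N=host:port:port` lines; the function name `myid_in_peer_list` is made up for illustration.

```shell
# Succeed (exit 0) when the ID in $2 is declared as a server.N entry
# in the zoo.cfg file given as $1; fail otherwise.
# Assumes the usual "server.N=host:port:port" layout.
myid_in_peer_list() {
    grep -q "^server\.$2=" "$1"
}
```

With a local copy of the conf file, `myid_in_peer_list zoo.cfg 4` would fail for the node in the traceback, since ID 4 is not in the peer list.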

Workaround

1]. Check the zookeeper conf file to get the myids of the nodes in the cluster.

docker exec -it configdatabase_zookeeper_1 cat /conf/zoo.cfg

2]. Compare the myid on the node with the ID specified for that node in the conf file.

cat /var/lib/docker/volumes/configdatabase_config_zookeeper/_data/myid

3]. If there is a mismatch in the myid, edit it to the correct value based on the conf file.

4]. Restart the zookeeper docker.

docker restart configdatabase_zookeeper_1
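
The comparison in steps 1 to 3 can be sketched as a small script. This is only an illustration under stated assumptions: the function names `expected_myid` and `check_myid` are hypothetical, the conf file is assumed to use `server.N=host:port:port` lines, and the node is identified by the address that appears in its own `server.N` entry.

```shell
# Print the ID that the zoo.cfg file ($1) assigns to the host/IP ($2).
# Assumes lines of the form server.N=HOST:port:port (illustrative helper).
expected_myid() {
    awk -v h="$2" '
        /^server\.[0-9]+=/ {
            split($0, kv, "=")        # kv[1] = "server.N", kv[2] = "host:port:port"
            split(kv[2], addr, ":")   # addr[1] = host or IP
            if (addr[1] == h) {
                sub(/^server\./, "", kv[1])
                print kv[1]
                exit
            }
        }' "$1"
}

# Compare the myid file ($2) with the ID zoo.cfg ($1) expects for host ($3).
# Returns 0 on match, 1 on mismatch, 2 if the host has no server.N entry.
check_myid() {
    want=$(expected_myid "$1" "$3")
    have=$(tr -d '[:space:]' < "$2")
    if [ -z "$want" ]; then
        echo "no server.N entry for $3 in $1"
        return 2
    fi
    if [ "$want" = "$have" ]; then
        echo "OK: myid $have matches zoo.cfg"
    else
        echo "MISMATCH: myid is $have, zoo.cfg expects $want"
        return 1
    fi
}
```

One way to use it is to save the conf file with `docker exec configdatabase_zookeeper_1 cat /conf/zoo.cfg > /tmp/zoo.cfg`, then run `check_myid /tmp/zoo.cfg /var/lib/docker/volumes/configdatabase_config_zookeeper/_data/myid 172.168.1.79`, substituting the node's own address. The script only reports the mismatch; editing the myid file and restarting the container remain manual steps.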