CloudNative: etcd - x893675/note GitHub Wiki

etcd

etcd is a distributed key-value database.

  • Uses the Raft protocol to achieve strong consistency

Use cases:

  • Service discovery
  • Distributed locks
  • Message publish/subscribe
  • Leader election
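
As a rough illustration of how these use cases map onto `etcdctl` (v3 API) subcommands, here is a minimal sketch. The `/services/<service>/<instance>` key layout, the function names, the 30-second TTL, and the `ENDPOINT` default are all assumptions made for this example, not something the notes prescribe.

```shell
#!/bin/sh
# Sketch only: assumes etcdctl (v3 API) is on PATH and etcd listens on $ENDPOINT.
ENDPOINT="${ENDPOINT:-http://127.0.0.1:2379}"

# Hypothetical key layout: /services/<service>/<instance>
service_key() {
  printf '/services/%s/%s\n' "$1" "$2"
}

# Service discovery: register an instance under a 30-second lease, so the key
# disappears on its own if the lease is not kept alive.
register_service() {
  # `lease grant` prints "lease <id> granted with TTL(30s)"; field 2 is the ID.
  lease_id=$(etcdctl --endpoints "$ENDPOINT" lease grant 30 | awk '{print $2}')
  etcdctl --endpoints "$ENDPOINT" put --lease="$lease_id" "$(service_key "$1" "$2")" "$2"
}

# Service discovery: list all registered instances of a service.
list_service() {
  etcdctl --endpoints "$ENDPOINT" get --prefix "/services/$1/"
}

# Publish/subscribe: stream every change under the /services/ prefix.
watch_services() {
  etcdctl --endpoints "$ENDPOINT" watch --prefix /services/
}

# Distributed lock: run a command while holding the lock named by $1.
with_lock() {
  name="$1"; shift
  etcdctl --endpoints "$ENDPOINT" lock "$name" "$@"
}

# Election: campaign for leadership of election $1 with proposal value $2.
campaign() {
  etcdctl --endpoints "$ENDPOINT" elect "$1" "$2"
}
```

For example, `register_service web 10.0.0.1:8080` followed by `list_service web` would show the registered instance until its lease expires.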

etcd cluster

Three ways to create an etcd cluster:

  • static
  • dns discovery
  • etcd discovery

Commonly used commands:

docker run --rm -it \
--net host \
k8s.gcr.io/etcd:3.4.3-0 etcdctl \
--endpoints http://192.168.234.137:2379,http://192.168.234.137:3379,http://192.168.234.137:4379 --write-out="table" endpoint status
# Alternative arguments (use one in place of the last line above):
#--endpoints http://192.168.234.137:2379 --write-out="table" member list
#--endpoints http://192.168.234.137:2379 endpoint health --cluster
#--endpoints http://192.168.234.137:2379 member add infra2 --peer-urls=http://192.168.234.137:4380

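Creating an etcd cluster with the static method

This is the approach kubeadm uses when creating a highly available k8s cluster: first create a single-node etcd cluster, then grow it by adding members at runtime.

Environment (all members run on the same host; ports are client, peer, metrics):

  • infra0: 192.168.234.137, 2379, 2380, 2381
  • infra1: 192.168.234.137, 3379, 3380, 3381
  • infra2: 192.168.234.137, 4379, 4380, 4381

Steps:

  • Start a single etcd container locally: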

    docker run -d \
    --net host \
    k8s.gcr.io/etcd:3.4.3-0 etcd \
    --name infra0 \
    --advertise-client-urls http://192.168.234.137:2379 \
    --initial-advertise-peer-urls http://192.168.234.137:2380 \
    --listen-peer-urls http://192.168.234.137:2380 \
    --listen-client-urls http://192.168.234.137:2379,http://127.0.0.1:2379 \
    --listen-metrics-urls http://127.0.0.1:2381 \
    --initial-cluster=infra0=http://192.168.234.137:2380 \
    --snapshot-count=10000
    
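  • Add a second member. Note the initial-cluster parameter: it lists the new member together with the existing one, and initial-cluster-state is set to existing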

    docker run --rm -it \
    --net host \
    k8s.gcr.io/etcd:3.4.3-0 etcdctl \
    --endpoints http://192.168.234.137:2379 member add infra1 --peer-urls=http://192.168.234.137:3380
    
    docker run -d \
    --net host \
    k8s.gcr.io/etcd:3.4.3-0 etcd \
    --name infra1 \
    --advertise-client-urls http://192.168.234.137:3379 \
    --initial-advertise-peer-urls http://192.168.234.137:3380 \
    --listen-peer-urls http://192.168.234.137:3380 \
    --listen-client-urls http://192.168.234.137:3379,http://127.0.0.1:3379 \
    --listen-metrics-urls http://127.0.0.1:3381 \
    --initial-cluster=infra1=http://192.168.234.137:3380,infra0=http://192.168.234.137:2380 \
    --initial-cluster-state=existing \
    --snapshot-count=10000
    
  • Add the third member. Note the initial-cluster parameter, which now lists all three members

    docker run --rm -it \
    --net host \
    k8s.gcr.io/etcd:3.4.3-0 etcdctl \
    --endpoints http://192.168.234.137:2379 member add infra2 --peer-urls=http://192.168.234.137:4380
    
    docker run -d \
    --net host \
    k8s.gcr.io/etcd:3.4.3-0 etcd \
    --name infra2 \
    --advertise-client-urls http://192.168.234.137:4379 \
    --initial-advertise-peer-urls http://192.168.234.137:4380 \
    --listen-peer-urls http://192.168.234.137:4380 \
    --listen-client-urls http://192.168.234.137:4379,http://127.0.0.1:4379 \
    --listen-metrics-urls http://127.0.0.1:4381 \
    --initial-cluster=infra2=http://192.168.234.137:4380,infra1=http://192.168.234.137:3380,infra0=http://192.168.234.137:2380 \
    --initial-cluster-state=existing \
    --snapshot-count=10000
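
Because the `--initial-cluster` value grows with every member added, it is easy to get wrong. The sketch below builds the value from a list of `name=peer-url` pairs; `initial_cluster` is a hypothetical helper introduced only for this illustration. `etcdctl member add` also prints an `ETCD_INITIAL_CLUSTER` value that can be checked against it.

```shell
#!/bin/sh
# Join "name=peer-url" pairs with commas into a value for --initial-cluster.
initial_cluster() {
  result=""
  for member in "$@"; do
    # Prepend a comma only when result already holds an earlier member.
    result="${result:+$result,}$member"
  done
  printf '%s\n' "$result"
}

# Value for the third member: the new member plus both existing ones.
initial_cluster \
  infra2=http://192.168.234.137:4380 \
  infra1=http://192.168.234.137:3380 \
  infra0=http://192.168.234.137:2380
```

The printed string can be passed directly as `--initial-cluster="$(initial_cluster ...)"` when starting the new member.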
    

dns discovery

TODO..

etcd discovery

TODO..

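etcd cannot write data

etcd does not compact automatically by default; you need to set a startup flag or run compaction manually from the command line. If keys change frequently, enabling compaction is recommended, otherwise disk space and memory are wasted and errors can occur. The default backend quota in etcd v3 is 2GB; without compaction, once the boltdb file grows past this limit, writes fail with the error: Error: etcdserver: mvcc: database space exceeded.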
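
One way to set this at startup is sketched below, reusing the infra0 addresses from the static example. It uses etcd's `--auto-compaction-mode`, `--auto-compaction-retention`, and `--quota-backend-bytes` flags; the one-hour retention and 8 GiB quota are example values chosen for illustration, and `start_infra0_with_compaction` is a hypothetical wrapper name.

```shell
#!/bin/sh
# Backend quota of 8 GiB expressed in bytes (an example value, not a recommendation).
QUOTA_BYTES=$((8 * 1024 * 1024 * 1024))

# Start the single-node infra0 member with hourly auto-compaction and a larger quota.
start_infra0_with_compaction() {
  docker run -d \
    --net host \
    k8s.gcr.io/etcd:3.4.3-0 etcd \
    --name infra0 \
    --advertise-client-urls http://192.168.234.137:2379 \
    --initial-advertise-peer-urls http://192.168.234.137:2380 \
    --listen-peer-urls http://192.168.234.137:2380 \
    --listen-client-urls http://192.168.234.137:2379,http://127.0.0.1:2379 \
    --listen-metrics-urls http://127.0.0.1:2381 \
    --initial-cluster=infra0=http://192.168.234.137:2380 \
    --auto-compaction-mode=periodic \
    --auto-compaction-retention=1h \
    --quota-backend-bytes="$QUOTA_BYTES"
}
```

Auto-compaction only discards old revisions so that space inside the database file can be reused; returning that space to the filesystem still requires a defrag.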

Resolution (the revision passed to compact in step 4 is the value printed by step 3):

  1. etcdctl --endpoints http://localhost:2379 --write-out=table endpoint status
  2. etcdctl --endpoints http://localhost:2379 alarm list
  3. etcdctl --endpoints http://localhost:2379 --write-out=json endpoint status | egrep -o '"revision":[0-9]*' | egrep -o '[0-9]*'
  4. etcdctl --endpoints http://localhost:2379 compact 495214
  5. etcdctl --endpoints http://localhost:2379 defrag
  6. etcdctl --endpoints http://localhost:2379 --write-out=table endpoint status
  7. etcdctl --endpoints http://localhost:2379 alarm disarm
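
The steps above can be combined into a small script so the revision does not have to be copied by hand. This is a minimal sketch that assumes `etcdctl` is on PATH and a single endpoint; `extract_revision` and `reclaim_space` are hypothetical helper names, and the default endpoint is taken from the commands above.

```shell
#!/bin/sh
# Read `endpoint status --write-out=json` output on stdin and print the first
# "revision" value found in it.
extract_revision() {
  grep -Eo '"revision":[0-9]+' | grep -Eo '[0-9]+' | head -n 1
}

# Compact up to the current revision, defragment, then clear the alarm.
# $1 is the endpoint; it defaults to http://localhost:2379.
reclaim_space() {
  endpoint="${1:-http://localhost:2379}"
  rev=$(etcdctl --endpoints "$endpoint" --write-out=json endpoint status | extract_revision)
  if [ -z "$rev" ]; then
    echo "could not read current revision from $endpoint" >&2
    return 1
  fi
  etcdctl --endpoints "$endpoint" compact "$rev" &&
    etcdctl --endpoints "$endpoint" defrag &&
    etcdctl --endpoints "$endpoint" alarm disarm
}
```

Running `reclaim_space http://localhost:2379` performs steps 3 to 5 and 7; checking `endpoint status` before and after (steps 1 and 6) shows whether the database size actually dropped.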
