Cilium multi cluster POC on AWS EKS
- Connect two Kubernetes clusters so that they can reach each other through Pod IPs and back each other up.
- Prepare for Istio multi-cluster.
PodCIDR ranges in all clusters must be non-conflicting.
- Challenge: EKS does not support "PodCIDR allocation".
- Solution: Combine Cilium with the AWS VPC CNI
- Create two EKS clusters with different VPC CIDRs (10.248.0.0/18; 192.168.0.0/16); see the eksctl sketch below.
- Use the "AWS VPC CNI" to assign Pod IP addresses, which therefore belong to the VPC CIDR.
- Install and use Cilium with "chainingMode" set to aws-cni.
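For reference, a minimal eksctl sketch for creating the two clusters (cluster names, region and node count are examples; Terraform works equally well):
```
# Cluster names are placeholders; pick any two non-overlapping VPC CIDRs.
eksctl create cluster --name tryc2 --region us-west-2 --vpc-cidr 10.248.0.0/18 --nodes 3
eksctl create cluster --name tryc3 --region us-west-2 --vpc-cidr 192.168.0.0/16 --nodes 3
```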
Etcd needs to be managed by Cilium using etcd-operator. Use a TLS-protected etcd cluster with Cilium.
- Solution: Install Cilium with managed etcd.
```
# For EKS creation, you can use Terraform or eksctl, with the VPC range specified.
# For the Cilium installation, you can use the commands below.
$ curl -LO https://github.com/cilium/cilium/archive/v1.6.8.tar.gz
$ tar xzvf v1.6.8.tar.gz
$ cd cilium-1.6.8/
$ cd install/kubernetes/
$ helm3 template cilium --namespace kube-system --version 1.6.8 --set global.etcd.enabled=true --set global.etcd.managed=true --set global.cni.chainingMode=aws-cni --set global.masquerade=false --set global.tunnel=disabled --set global.nodeinit.enabled=true > cilium.yaml
$ cat cilium.yaml
$ kubectl create -f cilium.yaml
$ kubectl get pods -A -o wide
```
Nodes in all clusters must have IP connectivity to each other. The network between the clusters must allow inter-cluster communication.
- Solution:
- Create an AWS VPC Peering Connection between the two VPCs.
- Change the route tables of both VPCs so that they point to each other through the peering connection.
- Change the inbound rules of the security groups bound to the nodes of both clusters to allow "All Traffic" for internal access (10.248.0.0/18; 192.168.0.0/16). An AWS CLI sketch of these steps follows this list.
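A rough AWS CLI sketch of the three steps above; all IDs (vpc-, pcx-, rtb-, sg-) are placeholders, and the route has to be added to every route table used by the cluster subnets:
```
# Peer the two VPCs and accept the peering request.
aws ec2 create-vpc-peering-connection --vpc-id vpc-aaaa1111 --peer-vpc-id vpc-bbbb2222
aws ec2 accept-vpc-peering-connection --vpc-peering-connection-id pcx-12345678

# Route each VPC's CIDR to the other side via the peering connection.
aws ec2 create-route --route-table-id rtb-aaaa1111 --destination-cidr-block 192.168.0.0/16 --vpc-peering-connection-id pcx-12345678
aws ec2 create-route --route-table-id rtb-bbbb2222 --destination-cidr-block 10.248.0.0/18 --vpc-peering-connection-id pcx-12345678

# Allow all traffic from the peer VPC CIDR on the node security groups.
aws ec2 authorize-security-group-ingress --group-id sg-aaaa1111 --ip-permissions 'IpProtocol=-1,IpRanges=[{CidrIp=192.168.0.0/16}]'
aws ec2 authorize-security-group-ingress --group-id sg-bbbb2222 --ip-permissions 'IpProtocol=-1,IpRanges=[{CidrIp=10.248.0.0/18}]'
```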
Cilium interacts with the Linux kernel to install BPF programs which will then perform networking tasks and implement security rules. In order to install BPF programs system-wide, CAP_SYS_ADMIN privileges are required. These privileges must be granted to cilium-agent.
- Solution: Edit the cilium DaemonSet and add the SYS_ADMIN capability.
- Command: kubectl edit ds -n kube-system cilium
Make sure SYS_ADMIN is added in the capabilities below:
```
securityContext:
  capabilities:
    add:
    - NET_ADMIN
    - SYS_ADMIN
    - SYS_MODULE
  privileged: true
```
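A quick way to verify the edit (this assumes the agent is the first container in the DaemonSet spec):
```
# Print the capabilities added to the first container of the cilium DaemonSet.
kubectl -n kube-system get ds cilium \
  -o jsonpath='{.spec.template.spec.containers[0].securityContext.capabilities.add}'
```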
Cilium requires access to the host networking namespace.
- Solution: Edit the cilium DaemonSet and make sure it uses the host network namespace.
- Command: kubectl edit ds -n kube-system cilium
Make sure the following exists in the Pod spec:
```
hostNetwork: true
```
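A quick check after editing; the output should be true:
```
# Print whether the cilium DaemonSet pods run in the host network namespace.
kubectl -n kube-system get ds cilium -o jsonpath='{.spec.template.spec.hostNetwork}'
```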
For each cluster, make the cluster name and ID unique.
```
kubectl -n kube-system edit cm cilium-config
[ ... add/edit ... ]
cluster-name: cluster1
cluster-id: "1"
```
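If you prefer not to edit interactively, the same change can be applied with a merge patch (the values shown are for cluster1; use a different name/ID in the second cluster):
```
# Set the cluster name/ID in the cilium-config ConfigMap non-interactively.
kubectl -n kube-system patch cm cilium-config --type merge \
  -p '{"data":{"cluster-name":"cluster1","cluster-id":"1"}}'
```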
Create an external etcd service in each cluster by applying the YAML file below.
```
apiVersion: v1
kind: Service
metadata:
  name: cilium-etcd-external
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0
spec:
  type: LoadBalancer
  ports:
  - port: 2379
  selector:
    app: etcd
    etcd_cluster: cilium-etcd
    io.cilium/app: etcd-operator
```
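After applying it, you can check that the internal load balancer has been provisioned (the service is assumed to live in kube-system, next to the managed etcd):
```
# The EXTERNAL-IP column should show the internal ELB hostname once provisioned.
kubectl -n kube-system get svc cilium-etcd-external
```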
- Clone the cilium/clustermesh-tools repository.
```
$ git clone https://github.com/cilium/clustermesh-tools.git
$ cd clustermesh-tools
```
- Extract the TLS certificate, key and root CA authority for each cluster. Repeat this step for all clusters and copy the results into the same folder.
```
$ ./extract-etcd-secrets.sh
```
- Generate a single Kubernetes secret from all the keys and certificates extracted.
```
$ ./generate-secret-yaml.sh > clustermesh.yaml
```
- Ensure that the etcd service names can be resolved.
```
$ ./generate-name-mapping.sh > ds.patch
# The ds.patch will look like:
# spec:
#   template:
#     spec:
#       hostAliases:
#       - ip: "10.138.0.18"
#         hostnames:
#         - cluster1.mesh.cilium.io
#       - ip: "10.138.0.19"
#         hostnames:
#         - cluster2.mesh.cilium.io
# Apply it
$ kubectl -n kube-system patch ds cilium -p "$(cat ds.patch)"
```
- Apply clustermesh.yaml with the prepared secrets/keys.
```
$ kubectl -n kube-system apply -f clustermesh.yaml
```
- Restart all pods for cilium, cilium-operator, cilium-etcd, etcd-operator, and coredns.
```
$ kubectl -n kube-system delete pods --all
```
- Wait 10 - 40 minutes, until all pods are working and no longer restarting.
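To double-check that the clustermesh secret was applied, you can list it (cilium-clustermesh is the name the generate-secret-yaml.sh script is expected to produce):
```
# The secret holds the etcd TLS material for all remote clusters.
kubectl -n kube-system get secret cilium-clustermesh
```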
- Check
- cilium node list can get the full list of nodes discovered.
```
tryc2@ip-172-31-0-31:~/istio-install/istio-1.3.8$ k exec -ti cilium-scwh8 -n ks -- cilium node list
Name                                                 IPv4 Address     Endpoint CIDR   IPv6 Address   Endpoint CIDR
ctryc2/ip-10-248-11-160.us-west-2.compute.internal   10.248.11.160    10.160.0.0/16
ctryc2/ip-10-248-5-152.us-west-2.compute.internal    10.248.5.152     10.136.0.0/16
ctryc2/ip-10-248-7-35.us-west-2.compute.internal     10.248.7.35      10.35.0.0/16
ctryc3/ip-192-168-36-48.us-west-2.compute.internal   192.168.36.48    10.48.0.0/16
ctryc3/ip-192-168-4-254.us-west-2.compute.internal   192.168.4.254    10.254.0.0/16
ctryc3/ip-192-168-67-107.us-west-2.compute.internal  192.168.67.107   10.107.0.0/16
ctryc3/ip-192-168-84-58.us-west-2.compute.internal   192.168.84.58    10.58.0.0/16
```
- A pod in one cluster can access a pod in the other cluster directly through its Pod IP.
```
# Pod IPs in one cluster (tryc2)
tryc2@ip-172-31-0-31:~/istio-install/istio-1.3.8$ k get pods -n legacy -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP               NODE                                          NOMINATED NODE   READINESS GATES
httpbin-5446f4d9b4-x4jfw   1/1     Running   0          18h   10.248.0.25      ip-10-248-5-152.us-west-2.compute.internal    <none>           <none>
sleep-5bbf6b4f77-5zp2t     1/1     Running   0          18h   10.248.6.155     ip-10-248-7-35.us-west-2.compute.internal     <none>           <none>

# Pod IPs in the other cluster (tryc3)
tryc3@ip-172-31-0-31:~/clustermesh-tools$ k get pods -n legacy -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP               NODE                                          NOMINATED NODE   READINESS GATES
httpbin-5446f4d9b4-6hvnb   1/1     Running   0          18h   192.168.13.185   ip-192-168-4-254.us-west-2.compute.internal   <none>           <none>
sleep-5bbf6b4f77-v7dfq     1/1     Running   0          18h   192.168.68.184   ip-192-168-84-58.us-west-2.compute.internal   <none>           <none>

# The pod in cluster tryc2 can access the pod IP in cluster tryc3 directly
tryc2@ip-172-31-0-31:~/istio-install/istio-1.3.8$ k exec -ti sleep-5bbf6b4f77-5zp2t -n legacy -- curl 192.168.13.185/ip
{
  "origin": "10.248.7.35"
}

# The pod in cluster tryc3 can access the pod IP in cluster tryc2 directly
tryc3@ip-172-31-0-31:~/clustermesh-tools$ k exec -ti sleep-5bbf6b4f77-v7dfq -n legacy -- curl 10.248.0.25/ip
{
  "origin": "192.168.84.58"
}
```
- Global services are supported, pointing to pods across the two clusters. You can implement this by adding io.cilium/global-service: "true" to the service annotations (a sketch of such an annotation follows the example below).
```
# Deploy in cluster1
kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/1.7.1/examples/kubernetes/clustermesh/global-service-example/cluster1.yaml
# Deploy in cluster2
kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/1.7.1/examples/kubernetes/clustermesh/global-service-example/cluster2.yaml

# Access the service; you get responses from pods in different clusters
tryc3@ip-172-31-0-31:~/clustermesh-tools$ k exec -ti x-wing-5db7fc5c8f-2xhxj -- curl rebel-base
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
tryc3@ip-172-31-0-31:~/clustermesh-tools$ k exec -ti x-wing-5db7fc5c8f-2xhxj -- curl rebel-base
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
```
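For your own services, the only change needed is the annotation. A minimal sketch modeled on the rebel-base example; the service name, port and selector are placeholders for whatever your identical service in each cluster uses:
```
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: rebel-base                       # placeholder service name
  annotations:
    io.cilium/global-service: "true"     # marks the service as global across the cluster mesh
spec:
  ports:
  - port: 80
  selector:
    name: rebel-base                     # placeholder selector
EOF
```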
- Refer to http://docs.cilium.io/en/stable/gettingstarted/clustermesh/#troubleshooting
Use a command like the one below to make sure each cilium pod works well.
```
tryc2@ip-172-31-0-31:~/istio-install/istio-1.3.8$ kubectl get pods -n kube-system -l k8s-app=cilium |grep cilium|awk '{print $1}'|xargs -i sh -c 'echo "\n";kubectl logs {} -n kube-system|grep " remote "'

level=info msg="New remote cluster configuration" clusterName=ctryc3 config=/var/lib/cilium/clustermesh/ctryc3 kvstoreErr="<nil>" kvstoreStatus= subsys=clustermesh
level=info msg="Connection to remote cluster established" clusterName=ctryc3 config=/var/lib/cilium/clustermesh/ctryc3 kvstoreErr="<nil>" kvstoreStatus= subsys=clustermesh
level=info msg="Established connection to remote etcd" clusterName=ctryc3 config=/var/lib/cilium/clustermesh/ctryc3 kvstoreErr="<nil>" kvstoreStatus="etcd: 1/1 connected, lease-ID=1cd771353dce2904, lock lease-ID=1cd771353dce2906, has-quorum=true: https://ctryc3.mesh.cilium.io:2379 - 3.3.12 (Leader)" subsys=clustermesh

level=info msg="New remote cluster configuration" clusterName=ctryc3 config=/var/lib/cilium/clustermesh/ctryc3 kvstoreErr="<nil>" kvstoreStatus= subsys=clustermesh
level=info msg="Connection to remote cluster established" clusterName=ctryc3 config=/var/lib/cilium/clustermesh/ctryc3 kvstoreErr="<nil>" kvstoreStatus= subsys=clustermesh
level=info msg="Established connection to remote etcd" clusterName=ctryc3 config=/var/lib/cilium/clustermesh/ctryc3 kvstoreErr="<nil>" kvstoreStatus="etcd: 1/1 connected, lease-ID=6da571353d29d95b, lock lease-ID=6da571353d29d95d, has-quorum=true: https://ctryc3.mesh.cilium.io:2379 - 3.3.12 (Leader)" subsys=clustermesh

level=info msg="New remote cluster configuration" clusterName=ctryc3 config=/var/lib/cilium/clustermesh/ctryc3 kvstoreErr="<nil>" kvstoreStatus= subsys=clustermesh
level=info msg="Connection to remote cluster established" clusterName=ctryc3 config=/var/lib/cilium/clustermesh/ctryc3 kvstoreErr="<nil>" kvstoreStatus= subsys=clustermesh
level=info msg="Established connection to remote etcd" clusterName=ctryc3 config=/var/lib/cilium/clustermesh/ctryc3 kvstoreErr="<nil>" kvstoreStatus="etcd: 1/1 connected, lease-ID=73cd71353d503821, lock lease-ID=73cd71353d503823, has-quorum=true: https://ctryc3.mesh.cilium.io:2379 - 3.3.12" subsys=clustermesh
```
- If you find one cilium pod is not healthy, restart it manually (see below).
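Restarting an unhealthy agent is just a pod deletion; the DaemonSet recreates it (the pod name is an example taken from the output above):
```
# Delete the unhealthy cilium pod; the DaemonSet controller recreates it.
kubectl -n kube-system delete pod cilium-scwh8
```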
I think this setup is mainly focused on the POC and still needs improvement for a production environment.
- etcd nodes operated by the etcd-operator do not use persistent storage. Once the etcd cluster loses quorum, the etcd cluster is automatically re-created by the cilium-etcd-operator. Cilium will automatically recover and re-create all state in etcd. This operation can take a couple of seconds (John: it may take even 10 minutes to recover after all kube-system pods restart) and may cause minor disruptions as ongoing distributed locks are invalidated and security identities have to be re-allocated. (http://docs.cilium.io/en/stable/gettingstarted/k8s-install-etcd-operator/#limitations)
- The cilium pod may become unhealthy, or fail to join the clustermesh sometimes, until you restart it manually.
- http://docs.cilium.io/en/stable/gettingstarted/k8s-install-etcd-operator/#k8s-install-etcd-operator
- http://docs.cilium.io/en/stable/gettingstarted/clustermesh/
- https://docs.cilium.io/en/v1.6/gettingstarted/k8s-install-eks/
- https://cilium.io/blog/2019/03/12/clustermesh/
- http://docs.cilium.io/en/stable/install/system_requirements/#firewall-requirements