K8s Cluster Setup Guide with Alcor

Instructions to deploy a new Kubernetes cluster with Alcor

This Kubernetes cluster setup guide shows how to deploy Alcor (a.k.a. Mizar MP) Control Plane and Mizar Data Plane to a new Kubernetes cluster or an existing cluster for container network provisioning.

Tested version:

  • mizar-mp:7e4d0c3 + mizar:856d5e9

I. System Requirement

Before testing, the following are required:

  • Pure Ubuntu 18.04 (Bionic), or VirtualBox + Ubuntu 18.04 (Bionic), is recommended. Ubuntu WSL on top of MS Windows is not recommended.
  • Check the VPC limit per region – e.g., 5 VPCs per region in Futurewei’s current AWS environment.

NOTE: To deploy on an existing Kubernetes cluster, skip to the "Build and Deployment" section.

II. Setup Testing Environment – AWS and Kubernetes CLI

One way to deploy K8s on AWS is to follow this instruction web page: https://medium.com/containermind/how-to-create-a-kubernetes-cluster-on-aws-in-few-minutes-89dda10354f4

  1. Based on the instructions from the web page, install the AWS CLI and the Kubernetes CLI.

  2. Preparation: an AWS account and its credentials (Access Key ID & Secret Access Key)

  3. Follow steps 1–5 of the instruction web page.

  4. Create an AWS S3 bucket. Without "--create-bucket-configuration LocationConstraint=us-west-2", you may get a "Location error".

    aws s3api create-bucket --bucket ${bucket_name} --region us-west-2 --create-bucket-configuration LocationConstraint=us-west-2

    aws s3api put-bucket-versioning --bucket ${bucket_name} --versioning-configuration Status=Enabled

  5. Follow steps 7–8 of the instruction web page.

  6. Create a Kubernetes cluster definition. In this example, the cluster name is ericli2.k8s.local; you can change it.

    kops create cluster \
    --master-count=1 \
    --node-count=3 \
    --node-size=t2.xlarge \
    --zones=us-west-2a \
    --name=ericli2.k8s.local \
    --image=ami-0ff280954dd7aafd5 \
    --networking flannel \
    --yes
    
  7. Create a public & private key pair. When you enter a key file name (e.g. mizar_mp_test), ssh-keygen generates a public key (mizar_mp_test.pub) and a private key (mizar_mp_test). Upload the public key as follows. This command may store the key under the name ${KOPS_CLUSTER_NAME} plus a fingerprint value, so check the actual key name and rename the private key file accordingly.

    ssh-keygen

    kops create secret sshpublickey admin -i ~/mizar_mp_test.pub --name ${KOPS_CLUSTER_NAME} --yes

  8. kops update cluster --name ${KOPS_CLUSTER_NAME} // add --yes to apply the changes (without it this is only a preview)

  9. kops validate cluster // It takes time, so wait and retry several times until every node is ready
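
The kops commands above read the cluster name (and state store) from environment variables, and validation can be polled instead of re-run by hand. A minimal sketch, assuming the example cluster name from step 6 and the S3 bucket from step 4:

    # Environment assumed by the kops commands in this section (values are the examples above)
    export KOPS_CLUSTER_NAME=ericli2.k8s.local
    export KOPS_STATE_STORE=s3://${bucket_name}

    # Poll validation until every node reports Ready (kops validate exits non-zero until then)
    until kops validate cluster --name ${KOPS_CLUSTER_NAME}; do
      echo "cluster not ready yet, retrying in 30s ..."
      sleep 30
    done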

III. Build and Deployment

Data Plane Deployment

To deploy Mizar simply do the following:

  1. kubectl apply -f https://raw.githubusercontent.com/futurewei-cloud/mizar/master/etc/k8s/kube-mizar.yaml
  2. To validate that the deployment is complete, run
    1. kubectl get pods -o wide --all-namespaces | grep transit | awk '{print $4}'
    2. And verify that all statuses are "Running"
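
For convenience, the two validation sub-steps above can be collapsed into a single shell check; a minimal sketch reusing the same pipeline (column 4 of kubectl get pods -o wide is the STATUS column):

    # Print a warning if any transit pod is not yet in Running state
    kubectl get pods -o wide --all-namespaces | grep transit | awk '{print $4}' | grep -v Running && echo "some transit pods are not ready yet" || echo "all transit pods are Running"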

Host Control Agent Deployment

For each K8s node of a VPC (e.g. 3 nodes and 1 master in the sample VPC at section II-6):

  0. View the EXTERNAL-IP of the 3 nodes:
    • kubectl get nodes -o wide
  1. Login to each K8s node and get the latest code on the machine:
    • git clone --recurse-submodules -j8 https://github.com/futurewei-cloud/mizar-mp.git ~/mizar-mp
  2. Build a Control Agent on each K8s node (may take about 10 minutes to build the dependencies); a loop sketch covering all nodes follows this list:
    • cd ~/mizar-mp/AlcorControlAgent/etc/k8s/images/scripts && sudo ./aca-node-init.sh
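
Steps 1–2 above can also be scripted from the machine with kubectl set up instead of logging in to each node by hand. A minimal sketch, assuming the kops SSH private key from section II-7 is at ./kops and the node AMI's SSH user is <user> (both placeholders, adjust for your environment):

    # Run the clone and node-init on every node over SSH; EXTERNAL-IP is column 7 of `kubectl get nodes -o wide`
    for ip in $(kubectl get nodes -o wide --no-headers | awk '{print $7}'); do
      ssh -i ./kops <user>@${ip} 'git clone --recurse-submodules -j8 https://github.com/futurewei-cloud/mizar-mp.git ~/mizar-mp && cd ~/mizar-mp/AlcorControlAgent/etc/k8s/images/scripts && sudo ./aca-node-init.sh'
    done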

On the machine with kubectl setup, start the Control Agent DaemonSet:

  • kubectl apply -f https://raw.githubusercontent.com/futurewei-cloud/alcor-control-agent/master/etc/k8s/aca-daemonset.yaml
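
To check that the DaemonSet came up on every node, list its pods. A small sketch; the name filter "aca" is an assumption based on the YAML file name, so adjust it to whatever name aca-daemonset.yaml actually defines:

    # Expect one Control Agent pod in Running state per node
    kubectl get daemonset --all-namespaces | grep -i aca
    kubectl get pods -o wide --all-namespaces | grep -i aca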

Controller Deployment

  1. On the machine with kubectl setup, change directory to AlcorController

    1. cd ~/mizar-mp/AlcorController/
  2. Deploy Redis

    1. kubectl apply -f ./kubernetes/db/redis-deployment.yaml
    2. kubectl apply -f ./kubernetes/db/redis-service.yaml
    3. kubectl get svc redis-master
  3. Update controller with latest redis configuration

    1. RedisClusterIP=$(kubectl get service redis-master -o jsonpath="{.spec.clusterIP}")
    2. RedisPort=$(kubectl get service redis-master -o jsonpath="{.spec.ports[*]['port']}")
    3. sed -e "s/\${redis_host}/${RedisClusterIP}/" -e "s/\${redis_port}/${RedisPort}/" ./src/resources/application-k8s-template.properties > ./src/resources/application-k8s.properties
  4. Update controller with latest host IP and MAC

  • Manual update

    1. Get nodes' list

      kubectl get node -o wide | grep mizar

    2. Update ./config/machine.json. For each node whose kernel-version contains "mizar", find its host IP (i.e. the Internal-IP) and its host MAC address by ssh'ing into the node (as <user>@<node-external-ip>) and inspecting eth0, e.g. (a gathering-loop sketch follows the Automatic update note below):

      ssh -t -i ./kops <user>@<node-external-ip> sudo ip addr show eth0 | grep "ether\b" | awk '{print $2}'

  • Automatic update

    If you want to update it automatically, please refer to the following wiki page.

    creating machine.json
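
For the manual path, the per-node values can also be collected from the machine with kubectl set up. A minimal sketch, again assuming the kops SSH key at ./kops and an SSH user <user> (both placeholders); the printed values still have to be copied into ./config/machine.json by hand:

    # For each node whose kernel version contains "mizar", print: node name, Internal-IP, eth0 MAC
    for node in $(kubectl get nodes -o wide --no-headers | grep mizar | awk '{print $1}'); do
      internal_ip=$(kubectl get node ${node} -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}')
      external_ip=$(kubectl get node ${node} -o jsonpath='{.status.addresses[?(@.type=="ExternalIP")].address}')
      mac=$(ssh -i ./kops <user>@${external_ip} "ip addr show eth0" | grep -w ether | awk '{print $2}')
      echo "${node} ${internal_ip} ${mac}"
    done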

  5. Build a docker container for controller and push to registry
    1. ./scripts/build.sh
      1. For Mac, run the corresponding commands instead, that is:
        1. brew install maven
        2. brew tap AdoptOpenJDK/openjdk
        3. mvn clean
        4. mvn compile
        5. mvn install -DskipTests
    2. sed -e "s/\${DevEnv}/k8s/" -i Dockerfile
      1. For Mac, the -i option doesn't work the same way (BSD sed requires a suffix argument), so just replace ${DevEnv} with k8s manually in the Dockerfile.
    3. MyCtrlImage=<your docker id>/<Controller image name of your choice>,
      e.g., MyCtrlImage=fwnetworking/controller
    4. sudo docker build -t $MyCtrlImage .
    5. sudo docker tag ${MyCtrlImage}:latest ${MyCtrlImage}:0.1
    6. sudo docker login
    7. sudo docker push ${MyCtrlImage}:0.1
  6. Clean up before deployment
    1. kubectl delete svc alcor
    2. kubectl delete deployment alcor // If this is your first deployment, these commands will return a "Not Found" error, which can be ignored.
  7. Deploy Controller app
    1. sed -e "s|\${ControllerImage}|${MyCtrlImage}|" -i kubernetes/app/controller-deployment.yaml
      1. For Mac, the -i option doesn't work the same way, so just replace ${ControllerImage} manually.
    2. kubectl apply -f kubernetes/app/controller-deployment.yaml
    3. kubectl get deployments -o wide
  8. Deploy Controller service
    1. kubectl expose deployment alcor --type=LoadBalancer --name=alcor
    2. kubectl get svc -o wide
    3. kubectl get po -A
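
Once the service is exposed, you can confirm the rollout finished and the load balancer endpoint was assigned; a small sketch using standard kubectl commands (the deployment and service are both named alcor, per the steps above):

    # Wait for the alcor deployment rollout, then show the service and its EXTERNAL-IP
    kubectl rollout status deployment/alcor
    kubectl get svc alcor -o wide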

Sanity Test and Create default VPC/Subnet

  1. Get the controller service ip:
    • AlcorSvcIp=$(kubectl get svc | grep alcor | awk '{print $4}')
  2. Confirm the controller is up:
    • curl ${AlcorSvcIp}:8080/actuator/health
  3. Deploy a sample VPC with one subnet and one port:
    • ./scripts/sampleVpcTest.sh $AlcorSvcIp 8080 false

CNI-plugin Deployment

Deploy as Kubernetes DaemonSet

See CNI plugin daemonset deployment for details.

Manual Deployment from Source Code

  1. You need golang (go version 1.12.x verified & recommended) for the build. For example, you can install version 1.12.16 from https://golang.org/doc/install:
    1. wget https://dl.google.com/go/go1.12.16.linux-amd64.tar.gz
    2. sudo tar -C /usr/local -xzf go1.12.16.linux-amd64.tar.gz
    3. export PATH=$PATH:/usr/local/go/bin
  2. Build cni plugin from source code, if have not done so before
    1. git clone https://github.com/futurewei-cloud/mizar-mp.git ~/mizar-mp
    2. cd ~/mizar-mp/Plugins/cniplugin/cmd
    3. go build -o mizarmp
  3. copy the built mizarmp binary to /opt/cni/bin/
    1. sudo cp mizarmp /opt/cni/bin
  4. Put a mizarmp.conf file under /etc/cni/net.d/, and remove any other files there. Two configurations relate to the Alcor Controller:
    1. mpurl: This is the controller service URL, use ${AlcorSvcIp}:8080
    2. hostId: This is the host id. Use kubectl get nodes -o wide | grep mizar | awk '{print $1}' and find the matching id with current host
    3. An example of mizarmp.conf:
    {
      "cniVersion": "0.3.1",
      "name": "mizarmp-test",
      "type": "mizarmp",
      "mpurl": "http://ab78f6402365042a9b60db57287e3bf3-1721552294.ca-central-1.elb.amazonaws.com:8080",
      "subnet": "a87e0f87-a2d9-44ef-9194-9a62f178594e",
      "project": "3dda2801-d675-4688-a63f-dcda8d327f50",
      "hostId": "ip-172-20-38-125.ca-central-1.compute.internal"
    }
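
After the file is in place on a node, a quick sanity check can confirm that mpurl answers the controller health probe and that hostId matches a real node name. A small sketch (run the kubectl line wherever kubectl is configured; the grep/cut extraction assumes the one-key-per-line layout shown above):

    # Pull mpurl and hostId back out of the installed conf and check them
    MPURL=$(grep -o '"mpurl": *"[^"]*"' /etc/cni/net.d/mizarmp.conf | cut -d'"' -f4)
    HOSTID=$(grep -o '"hostId": *"[^"]*"' /etc/cni/net.d/mizarmp.conf | cut -d'"' -f4)
    curl -s ${MPURL}/actuator/health        # expect the controller health response
    kubectl get nodes | grep ${HOSTID}      # hostId should appear in the node list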
    

IV. Testing and Monitoring

Sanity Testing

This shows how to deploy containers in the subnet and verify connectivity between them.

  1. kubectl run nginx --image=nginx --replicas=3
  2. kubectl apply -f https://raw.githubusercontent.com/luksa/kubernetes-in-action/master/Chapter08/curl.yaml
  3. kubectl get pod -o wide
  4. Assuming two of the nginx pods created have IP addresses ip-a and ip-b, respectively, run (a loop over all nginx pods is sketched below)
    1. kubectl exec curl -- ping ip-a
    2. kubectl exec curl -- curl http://ip-b
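
The same checks can be looped over every nginx pod from the machine with kubectl set up. A minimal sketch, assuming kubectl run labeled the nginx pods with run=nginx (the default for this form of kubectl run) and that the curl pod from curl.yaml is named curl:

    # Test ICMP and HTTP connectivity from the curl pod to every nginx pod
    for ip in $(kubectl get pods -l run=nginx -o jsonpath='{.items[*].status.podIP}'); do
      kubectl exec curl -- ping -c 3 ${ip}
      kubectl exec curl -- curl -s http://${ip} | head -n 5
    done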

Netdata monitoring integration

  1. On the machine with kubectl setup, install helm https://helm.sh/docs/intro/install/
  2. Use Netdata Helm chart for kubernetes deployments: https://github.com/netdata/helmchart
    1. git clone https://github.com/netdata/helmchart ~/helmchart
    2. helm install ~/helmchart -g --set service.type=LoadBalancer -f ~/mizar-mp/Monitoring/netdata/values.yaml
  3. Access netdata dashboard by:
    1. kubectl get services to find the "EXTERNAL-IP" of "netdata" service
    2. Web browser URL: [EXTERNAL-IP of netdata]:19999
  4. Here is how it would look: https://github.com/futurewei-cloud/mizar-mp/wiki/Monitoring:-Netdata