K8s Cluster Setup Guide with Alcor

Instructions to deploy a new Kubernetes cluster with Alcor

This Kubernetes cluster setup guide shows how to deploy Alcor (a.k.a. Mizar MP) Control Plane and Mizar Data Plane to a new Kubernetes cluster or an existing cluster for container network provisioning.

Tested version:

  • mizar-mp:7e4d0c3 + mizar:856d5e9

I. System Requirement

Before testing, the following are required:

  • Pure Ubuntu 18.04 (Bionic), or VirtualBox + Ubuntu 18.04 (Bionic), is recommended. Ubuntu WSL on top of MS Windows is not recommended.
  • Check the VPC limit per region – e.g., 5 VPCs per region in Futurewei’s current AWS environment.

NOTE: To deploy on an existing Kubernetes cluster, skip to the "Build and Deployment" section.

II. Setup Testing Environment – AWS and Kubernetes CLI

One way to deploy K8s on AWS is to follow this instruction web page: https://medium.com/containermind/how-to-create-a-kubernetes-cluster-on-aws-in-few-minutes-89dda10354f4

  1. Based on the instructions from the web page, install the AWS CLI and the Kubernetes CLI.

  2. Preparation: an AWS account and its credentials (Access Key ID & Secret Access Key)

  3. Follow steps 1–5 of the instruction web page.

  4. Create an AWS S3 bucket. Without "--create-bucket-configuration LocationConstraint=us-west-2", you may get a "Location error".

    aws s3api create-bucket --bucket ${bucket_name} --region us-west-2 --create-bucket-configuration LocationConstraint=us-west-2

    aws s3api put-bucket-versioning --bucket ${bucket_name} --versioning-configuration Status=Enabled

  5. Follow steps 7–8 of the instruction web page.

  6. Create a Kubernetes cluster definition. In this example, the cluster name is ericli2.k8s.local; you can change it.

    kops create cluster \
    --master-count=1 \
    --node-count=3 \
    --node-size=t2.xlarge \
    --zones=us-west-2a \
    --name=ericli2.k8s.local \
    --image=ami-0ff280954dd7aafd5 \
    --networking flannel \
    --yes
    
  7. Create a public & private key pair. When you enter a key file name (e.g. mizar_mp_test), ssh-keygen generates a public key (mizar_mp_test.pub) and a private key (mizar_mp_test). Upload the public key as follows. This command may store the key under the name ${KOPS_CLUSTER_NAME} plus a fingerprint value, so check the actual key name and rename the private key file accordingly.

    ssh-keygen

    kops create secret sshpublickey admin -i ~/mizar_mp_test.pub --name ${KOPS_CLUSTER_NAME} --yes

  8. kops update cluster --name ${KOPS_CLUSTER_NAME} // add --yes to apply the changes (without it this is only a preview)

  9. kops validate cluster // It takes time, so wait and retry several times until every node is ready
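
The kops commands above read the cluster name (and state store) from environment variables, and validation can be polled instead of re-run by hand. A minimal sketch, assuming the example cluster name from step 6 and the S3 bucket from step 4:

    # Environment assumed by the kops commands in this section (values are the examples above)
    export KOPS_CLUSTER_NAME=ericli2.k8s.local
    export KOPS_STATE_STORE=s3://${bucket_name}

    # Poll validation until every node reports Ready (kops validate exits non-zero until then)
    until kops validate cluster --name ${KOPS_CLUSTER_NAME}; do
      echo "cluster not ready yet, retrying in 30s ..."
      sleep 30
    done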

III. Build and Deployment

Data Plane Deployment

To deploy Mizar simply do the following:

  1. kubectl apply -f https://raw.githubusercontent.com/futurewei-cloud/mizar/master/etc/k8s/kube-mizar.yaml
  2. To validate that the deployment is complete, run
    1. kubectl get pods -o wide --all-namespaces | grep transit | awk '{print $4}'
    2. And verify that all statuses are "Running"
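
For convenience, the two validation sub-steps above can be collapsed into a single shell check; a minimal sketch reusing the same pipeline (column 4 of kubectl get pods -o wide is the STATUS column):

    # Print a warning if any transit pod is not yet in Running state
    kubectl get pods -o wide --all-namespaces | grep transit | awk '{print $4}' | grep -v Running && echo "some transit pods are not ready yet" || echo "all transit pods are Running"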

Host Control Agent Deployment

For each K8s node of a VPC (e.g. 3 nodes and 1 master in the sample VPC at section II-6):

  0. View the EXTERNAL-IP of the 3 nodes:
    • kubectl get nodes -o wide
  1. Login to each K8s node and get the latest code on the machine:
    • git clone --recurse-submodules -j8 https://github.com/futurewei-cloud/mizar-mp.git ~/mizar-mp
  2. Build a Control Agent on each K8s node (may take about 10 minutes to build the dependencies); a loop sketch covering all nodes follows this list:
    • cd ~/mizar-mp/AlcorControlAgent/etc/k8s/images/scripts && sudo ./aca-node-init.sh
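
Steps 1–2 above can also be scripted from the machine with kubectl set up instead of logging in to each node by hand. A minimal sketch, assuming the kops SSH private key from section II-7 is at ./kops and the node AMI's SSH user is <user> (both placeholders, adjust for your environment):

    # Run the clone and node-init on every node over SSH; EXTERNAL-IP is column 7 of `kubectl get nodes -o wide`
    for ip in $(kubectl get nodes -o wide --no-headers | awk '{print $7}'); do
      ssh -i ./kops <user>@${ip} 'git clone --recurse-submodules -j8 https://github.com/futurewei-cloud/mizar-mp.git ~/mizar-mp && cd ~/mizar-mp/AlcorControlAgent/etc/k8s/images/scripts && sudo ./aca-node-init.sh'
    done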

On the machine with kubectl setup, start the Control Agent DaemonSet:

  • kubectl apply -f https://raw.githubusercontent.com/futurewei-cloud/alcor-control-agent/master/etc/k8s/aca-daemonset.yaml
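
To check that the DaemonSet came up on every node, list its pods. A small sketch; the name filter "aca" is an assumption based on the YAML file name, so adjust it to whatever name aca-daemonset.yaml actually defines:

    # Expect one Control Agent pod in Running state per node
    kubectl get daemonset --all-namespaces | grep -i aca
    kubectl get pods -o wide --all-namespaces | grep -i aca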

Controller Deployment

  1. On the machine with kubectl setup, change directory to AlcorController

    1. cd ~/mizar-mp/AlcorController/
  2. Deploy Redis

    1. kubectl apply -f ./kubernetes/db/redis-deployment.yaml
    2. kubectl apply -f ./kubernetes/db/redis-service.yaml
    3. kubectl get svc redis-master
  3. Update controller with latest redis configuration

    1. RedisClusterIP=$(kubectl get service redis-master -o jsonpath="{.spec.clusterIP}")
    2. RedisPort=$(kubectl get service redis-master -o jsonpath="{.spec.ports[*]['port']}")
    3. sed -e "s/\${redis_host}/${RedisClusterIP}/" -e "s/\${redis_port}/${RedisPort}/" ./src/resources/application-k8s-template.properties > ./src/resources/application-k8s.properties
  4. Update controller with latest host IP and MAC

  • Manual update

    1. Get nodes' list

      kubectl get node -o wide | grep mizar

    2. Update ./config/machine.json. For each node whose kernel-version contains "mizar", find its host IP (i.e. the Internal-IP) and its host MAC address by ssh'ing into the node (as <user>@<node-external-ip>) and inspecting eth0, e.g. (a gathering-loop sketch follows the Automatic update note below):

      ssh -t -i ./kops <user>@<node-external-ip> sudo ip addr show eth0 | grep "ether\b" | awk '{print $2}'

  • Automatic update

    If you want to update it automatically, please refer to the following wiki page.

    creating machine.json
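
For the manual path, the per-node values can also be collected from the machine with kubectl set up. A minimal sketch, again assuming the kops SSH key at ./kops and an SSH user <user> (both placeholders); the printed values still have to be copied into ./config/machine.json by hand:

    # For each node whose kernel version contains "mizar", print: node name, Internal-IP, eth0 MAC
    for node in $(kubectl get nodes -o wide --no-headers | grep mizar | awk '{print $1}'); do
      internal_ip=$(kubectl get node ${node} -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}')
      external_ip=$(kubectl get node ${node} -o jsonpath='{.status.addresses[?(@.type=="ExternalIP")].address}')
      mac=$(ssh -i ./kops <user>@${external_ip} "ip addr show eth0" | grep -w ether | awk '{print $2}')
      echo "${node} ${internal_ip} ${mac}"
    done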

  5. Build a docker container for controller and push to registry
    1. ./scripts/build.sh
      1. For Mac, run the corresponding commands instead, that is:
        1. brew install maven
        2. brew tap AdoptOpenJDK/openjdk
        3. mvn clean
        4. mvn compile
        5. mvn install -DskipTests
    2. sed -e "s/\${DevEnv}/k8s/" -i Dockerfile
      1. For Mac, the -i option doesn't work the same way (BSD sed requires a suffix argument), so just replace ${DevEnv} with k8s manually in the Dockerfile.
    3. MyCtrlImage=<your docker id>/<Controller image name of your choice>,
      e.g., MyCtrlImage=fwnetworking/controller
    4. sudo docker build -t $MyCtrlImage .
    5. sudo docker tag ${MyCtrlImage}:latest ${MyCtrlImage}:0.1
    6. sudo docker login
    7. sudo docker push ${MyCtrlImage}:0.1
  6. Clean up before deployment
    1. kubectl delete svc alcor
    2. kubectl delete deployment alcor // If this is your first deployment, these commands will return a "Not Found" error, which can be ignored.
  7. Deploy Controller app
    1. sed -e "s|\${ControllerImage}|${MyCtrlImage}|" -i kubernetes/app/controller-deployment.yaml
      1. For Mac, the -i option doesn't work the same way, so just replace ${ControllerImage} manually.
    2. kubectl apply -f kubernetes/app/controller-deployment.yaml
    3. kubectl get deployments -o wide
  8. Deploy Controller service
    1. kubectl expose deployment alcor --type=LoadBalancer --name=alcor
    2. kubectl get svc -o wide
    3. kubectl get po -A
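
Once the service is exposed, you can confirm the rollout finished and the load balancer endpoint was assigned; a small sketch using standard kubectl commands (the deployment and service are both named alcor, per the steps above):

    # Wait for the alcor deployment rollout, then show the service and its EXTERNAL-IP
    kubectl rollout status deployment/alcor
    kubectl get svc alcor -o wide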

Sanity Test and Create default VPC/Subnet

  1. Get the controller service ip:
    • AlcorSvcIp=$(kubectl get svc | grep alcor | awk '{print $4}')
  2. Confirm the controller is up:
    • curl ${AlcorSvcIp}:8080/actuator/health
  3. Deploy a sample VPC with one subnet and one port:
    • ./scripts/sampleVpcTest.sh $AlcorSvcIp 8080 false

CNI-plugin Deployment

Deploy as Kubernetes DaemonSet

See CNI plugin daemonset deployment for details.

Manual Deployment from Source Code

  1. You need golang (go version 1.12.x verified & recommended) for the build. For example, you can install version 1.12.16 from https://golang.org/doc/install:
    1. wget https://dl.google.com/go/go1.12.16.linux-amd64.tar.gz
    2. sudo tar -C /usr/local -xzf go1.12.16.linux-amd64.tar.gz
    3. export PATH=$PATH:/usr/local/go/bin
  2. Build cni plugin from source code, if have not done so before
    1. git clone https://github.com/futurewei-cloud/mizar-mp.git ~/mizar-mp
    2. cd ~/mizar-mp/Plugins/cniplugin/cmd
    3. go build -o mizarmp
  3. copy the built mizarmp binary to /opt/cni/bin/
    1. sudo cp mizarmp /opt/cni/bin
  4. Put a mizarmp.conf file under /etc/cni/net.d/, and remove any other files there. Two configurations relate to the Alcor Controller:
    1. mpurl: This is the controller service URL, use ${AlcorSvcIp}:8080
    2. hostId: This is the host id. Use kubectl get nodes -o wide | grep mizar | awk '{print $1}' and find the matching id with current host
    3. An example of mizarmp.conf:
    {
      "cniVersion": "0.3.1",
      "name": "mizarmp-test",
      "type": "mizarmp",
      "mpurl": "http://ab78f6402365042a9b60db57287e3bf3-1721552294.ca-central-1.elb.amazonaws.com:8080",
      "subnet": "a87e0f87-a2d9-44ef-9194-9a62f178594e",
      "project": "3dda2801-d675-4688-a63f-dcda8d327f50",
      "hostId": "ip-172-20-38-125.ca-central-1.compute.internal"
    }
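
After the file is in place on a node, a quick sanity check can confirm that mpurl answers the controller health probe and that hostId matches a real node name. A small sketch (run the kubectl line wherever kubectl is configured; the grep/cut extraction assumes the one-key-per-line layout shown above):

    # Pull mpurl and hostId back out of the installed conf and check them
    MPURL=$(grep -o '"mpurl": *"[^"]*"' /etc/cni/net.d/mizarmp.conf | cut -d'"' -f4)
    HOSTID=$(grep -o '"hostId": *"[^"]*"' /etc/cni/net.d/mizarmp.conf | cut -d'"' -f4)
    curl -s ${MPURL}/actuator/health        # expect the controller health response
    kubectl get nodes | grep ${HOSTID}      # hostId should appear in the node list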
    

IV. Testing and Monitoring

Sanity Testing

This shows how to deploy containers in the subnet and verify connectivity between them.

  1. kubectl run nginx --image=nginx --replicas=3
  2. kubectl apply -f https://raw.githubusercontent.com/luksa/kubernetes-in-action/master/Chapter08/curl.yaml
  3. kubectl get pod -o wide
  4. Assuming two of the nginx pods created have IP addresses ip-a and ip-b, respectively, run (a loop over all nginx pods is sketched below)
    1. kubectl exec curl -- ping ip-a
    2. kubectl exec curl -- curl http://ip-b
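
The same checks can be looped over every nginx pod from the machine with kubectl set up. A minimal sketch, assuming kubectl run labeled the nginx pods with run=nginx (the default for this form of kubectl run) and that the curl pod from curl.yaml is named curl:

    # Test ICMP and HTTP connectivity from the curl pod to every nginx pod
    for ip in $(kubectl get pods -l run=nginx -o jsonpath='{.items[*].status.podIP}'); do
      kubectl exec curl -- ping -c 3 ${ip}
      kubectl exec curl -- curl -s http://${ip} | head -n 5
    done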

Netdata monitoring integration

  1. On the machine with kubectl setup, install helm https://helm.sh/docs/intro/install/
  2. Use Netdata Helm chart for kubernetes deployments: https://github.com/netdata/helmchart
    1. git clone https://github.com/netdata/helmchart ~/helmchart
    2. helm install ~/helmchart -g --set service.type=LoadBalancer -f ~/mizar-mp/Monitoring/netdata/values.yaml
  3. Access netdata dashboard by:
    1. kubectl get services to find the "EXTERNAL-IP" of "netdata" service
    2. Web browser URL: [EXTERNAL-IP of netdata]:19999
  4. Here is how it would look: https://github.com/futurewei-cloud/mizar-mp/wiki/Monitoring:-Netdata