1.5.2 Load balanced deployment

Deploying a load-balanced (High Availability) setup

The days when we prayed for our single monolithic service to have endless uptime are over! Today we build microservice architectures made up of dozens of services, each running multiple replicas behind reliable load balancing to guarantee uptime to the desired extent. Many companies have fought hard to build infrastructure for this, and few have succeeded; for a medium-sized IT business it is an expensive project that requires hiring some of the best brains in the business...

As you may have guessed by now, GKE solves this problem more or less by nature, and you don't need to pay anything extra for it. GKE keeps track of your so-called replicas, the identical pods that make up a deployment, checks that they are healthy, and automatically takes action when they are not.

To enable load balancing we need to add an extra component to our solution: a Service.

Assuming that you're still in the single-deployment folder of the cloned repo do a delete of the current deployment:

kustomize build . | kubectl delete -f -
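
If the delete succeeds, kubectl reports each object it removes, something like (the exact list depends on what the kustomization produced):

namespace "pingpong" deleted
deployment.apps "pingpong-deployment" deleted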

This will wipe out the namespace and everything in it, but that's OK because we're going to put it all back in a better state!

Now change directory over to the load-balanced-deployment directory.

cd ../load-balanced-deployment

List the files to see what's in there. You should see something like this:

-rw-r--r--  1 user  group  1119 Dec  8 16:25 deployment.yaml
-rw-r--r--  1 user  group   148 Dec  8 16:34 kustomization.yaml
-rw-r--r--  1 user  group   139 Dec  8 14:54 namespace.yaml
-rw-r--r--  1 user  group   113 Dec  8 14:55 service.yaml

One new file: service.yaml. The deployment file now contains a replicas: 2 property, which means that Kubernetes should start and maintain two identical instances of the service at all times. We've also added an entry in kustomization.yaml that includes service.yaml in the YAML blob to be produced.
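
For reference, the relevant part of deployment.yaml looks roughly like this. This is a sketch, not the verbatim file: the image reference is a placeholder, and pod labels/selectors are left out on purpose because kustomize adds them, as we'll see below.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pingpong-deployment
spec:
  replicas: 2                            # start and maintain two identical pods
  template:
    spec:
      containers:
        - name: pingpong-service
          image: <your-pingpong-image>   # placeholder, use the image from the repo
          ports:
            - containerPort: 8080        # the port the Service will target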

The service specification is relatively simple; it just points out the port that it targets on the pods inside the deployment. The tie between the Service and the pods is handled by kustomize, which connects the different objects together using labels and selectors (see the kustomization.yaml sketch after the spec below).

apiVersion: v1
kind: Service
metadata:
  name: pingpong-service
spec:
  type: ClusterIP
  ports:
    - port: 8080
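
So how does the Service find the pods when the spec above declares no selector? kustomize fills that in at build time. A kustomization.yaml along these lines (a sketch; the actual label key and value in the repo may differ) declares a commonLabels entry that gets stamped onto every object, both as labels on the pods and as selectors on the Service and Deployment:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: pingpong
commonLabels:
  app: pingpong          # injected as pod labels AND as Service/Deployment selectors
resources:
  - namespace.yaml
  - deployment.yaml
  - service.yaml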

Go ahead and run kustomize to deploy this setup:

kustomize build . | kubectl apply -f -
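
If everything goes well, kubectl reports each object it created; expect something like:

namespace/pingpong created
deployment.apps/pingpong-deployment created
service/pingpong-service created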

Inspection of the environment

Type the commands below to view the deployment, pods and services respectively.

Deployment

kubectl get deployments -n pingpong

Should give you something like:

NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
pingpong-deployment   2/2     2            2           4m13s

Already here you can see an indication that you have two replicas (instances) of the pingpong-service running.

Pods

kubectl get pods -n pingpong

Should give you something like:

NAME                                  READY   STATUS    RESTARTS   AGE
pingpong-deployment-5fd978c7b-25k9x   1/1     Running   0          4m58s
pingpong-deployment-5fd978c7b-62sw8   1/1     Running   0          4m58s

Yeah, there are two of them!
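
If you want to see the self-healing mentioned earlier in action, delete one of the pods by name and list the pods again right away. Kubernetes notices that a replica is missing and starts a replacement, which shows up with a new random suffix:

kubectl delete pod -n pingpong <name-of-pod>
kubectl get pods -n pingpong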

Services

kubectl get services -n pingpong

Should give you something like:

NAME               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
pingpong-service   ClusterIP   172.21.76.101   <none>        8080/TCP   6s

If all this looks more or less the same in your console(s), then we're OK!
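
As an extra check you can list the endpoints that the service balances over; you should see two pod IPs, each on port 8080 (the IPs below are made-up placeholders):

kubectl get endpoints -n pingpong

NAME               ENDPOINTS                        AGE
pingpong-service   10.52.1.7:8080,10.52.2.9:8080    1m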

Verify load balancing

Since we still haven't exposed the service on the Internet, we can't reach it directly from a browser, or from a command line tool like curl, across the Internet. Port-forwarding may sound like a good idea, but unfortunately we would end up on one of the two pods and stay there, even if we port-forward to the service object.
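
To illustrate the problem: the command below is how you would port-forward to the service, but kubectl pins the tunnel to a single pod behind it, so every request to localhost:8080 hits the same replica:

kubectl port-forward -n pingpong service/pingpong-service 8080:8080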

BUT, don't worry: the pingpong-service Docker image has been prepared for this scenario, and we will also take the opportunity to learn how to "get into" a running container. This can be useful when trying to figure out problems of various sorts.

To attach to the pingpong-service container (yes, the container inside the pod) we are going to use the kubectl exec command with the name of one of the pods; it doesn't matter which one...

kubectl exec -it -n pingpong <name-of-pod> -- /bin/sh

kubectl exec means execute something inside a container of the pod in the specified namespace. What is executed is specified after --. In this case we start a shell in an Alpine Linux container. The -i flag keeps stdin open and -t allocates a terminal, which together make the shell interactive. Had we not specified -it, the exec would have terminated immediately.
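
As a side note, you don't need an interactive shell to run a single command; exec can run it directly and return. The example below assumes, like the rest of this section, that curl is available inside the image:

kubectl exec -n pingpong <name-of-pod> -- curl -s http://pingpong-service:8080/ping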

After this I recommend that you attach to the logs of each of the pods to see load balancing in action. Do this in two separate consoles so that the shell inside the container stays open. Refer to the previous section to learn how to attach to the logs of a pod.
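
As a reminder, following the logs of a pod looks like this:

kubectl logs -f -n pingpong <name-of-pod>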

Now, in the interactive container shell type;

curl http://pingpong-service:8080/ping

Run this a couple of times and you should see some action in the logs of BOTH pods. The split may not be exactly 50/50, but the incoming requests are shared between the replicas.
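
If you get tired of repeating the command by hand, a small loop in the container's shell fires off a burst of requests. This is plain POSIX sh, so it works in the Alpine shell:

i=0; while [ $i -lt 10 ]; do curl -s http://pingpong-service:8080/ping; i=$((i+1)); done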

In the next section you'll learn how to expose your service to the Internet.
