1.5.2 Load balanced deployment
The days when we prayed for our single monolith service to have endless uptime are over! Today we build microservice architectures made up of dozens of services, each with multiple replicas behind reliable load balancing, to guarantee uptime to a desired extent. Many companies have fought hard to build infrastructure for this, but very few have succeeded. For a medium-sized IT business it is an expensive project that requires you to hire some of the best brains in the business...
As you may have guessed by now, GKE solves this problem more or less by nature and you don't need to pay anything extra for it. GKE keeps track of your so-called replicas, which make up the same deployment, checks that they are healthy, and automatically takes action when they are not.
To enable load balancing we need to add an extra component to our solution, called a Service.
Assuming that you're still in the single-deployment folder of the cloned repo, delete the current deployment:
kustomize build . | kubectl delete -f -
This will wipe out the namespace and everything in it, but that's OK because we're going to put it back again in a better state!
Now change directory over to the load-balanced-deployment directory.
cd ../load-balanced-deployment
List the files to see what's in there. You should see something like this:
-rw-r--r-- 1 user group 1119 Dec 8 16:25 deployment.yaml
-rw-r--r-- 1 user group 148 Dec 8 16:34 kustomization.yaml
-rw-r--r-- 1 user group 139 Dec 8 14:54 namespace.yaml
-rw-r--r-- 1 user group 113 Dec 8 14:55 service.yaml
One new file: service.yaml.
The deployment file now contains a replicas: 2 property, which means that Kubernetes should start and maintain two identical instances of the service at all times. We've also added an entry to kustomization.yaml that includes service.yaml in the YAML blob to be produced.
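For reference, the relevant part of deployment.yaml might look roughly like the sketch below. The image name is illustrative, and the labels and selector are left out here because, just as for the service, kustomize stamps them on; check the actual file in the repo for the exact values.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pingpong-deployment
spec:
  # Two identical instances of the service, kept alive by Kubernetes
  replicas: 2
  template:
    spec:
      containers:
        - name: pingpong-service
          image: grzzboot/pingpong-service
          ports:
            - containerPort: 8080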
The service specification is relatively simple; it just points out the port that it targets on the pods inside the deployment. The tie between the service and the pods is handled by kustomize, which connects the different objects together using selectors.
apiVersion: v1
kind: Service
metadata:
  name: pingpong-service
spec:
  type: ClusterIP
  ports:
    - port: 8080
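A kustomization.yaml that ties these objects together could look something like the sketch below, using commonLabels so that kustomize adds a matching app label and selector to both the deployment and the service. The label key is illustrative; the actual file in the repo may differ.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# Labels added to all objects, and to the selectors that connect them
commonLabels:
  app: pingpong
resources:
  - namespace.yaml
  - deployment.yaml
  - service.yaml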
Go ahead and run kustomize to deploy this setup:
kustomize build . | kubectl apply -f -
Type the commands below to view the deployment, pods and services respectively.
kubectl get deployments -n pingpong
Should give you something like:
NAME READY UP-TO-DATE AVAILABLE AGE
pingpong-deployment 2/2 2 2 4m13s
Already here you can see indications that you have two replicas (instances) of the pingpong-service running.
kubectl get pods -n pingpong
Should give you something like:
NAME READY STATUS RESTARTS AGE
pingpong-deployment-5fd978c7b-25k9x 1/1 Running 0 4m58s
pingpong-deployment-5fd978c7b-62sw8 1/1 Running 0 4m58s
Yeah, there are two of them!
kubectl get services -n pingpong
Should give you something like:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
pingpong-service ClusterIP 172.21.76.101 <none> 8080/TCP 6s
If all this looks more or less the same in your console(s) then we're ok!
Since we still haven't exposed the service on the Internet we can't reach it directly from a browser, or from a command line tool like curl, across the Internet. Port-forwarding may sound like a good idea, but unfortunately we would end up on one of the two pods and stay there, even if we port-forward to the service object.
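For completeness, a port-forward against the service would look like the command below, but kubectl pins the forward to a single pod behind the service, so all requests would land on that one pod:
kubectl port-forward -n pingpong service/pingpong-service 8080:8080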
BUT, don't worry, the pingpong-service Docker image has been prepared for this scenario and we will also take the opportunity to learn how to "get into" a running container. This can be useful when trying to figure out problems of various sorts.
To attach to the pingpong-service container (yes, the container inside the pod) we are going to use the kubectl exec command with the name of one of the pods; it doesn't matter which one...
kubectl exec -it -n pingpong <name-of-pod> -- /bin/sh
kubectl exec means execute something in a pod in the specified namespace. What is executed is specified after --. In this case we start a shell in an Alpine Linux container. Adding the flag -it makes the shell interactive; had we not specified -it, the exec would have terminated immediately.
After this I recommend that you attach to the logs of each of the pods to see load balancing in action. Do this in two separate consoles so that the shell inside the container stays open. Refer to the previous section to learn how to attach to the logs of a pod.
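As a reminder, following the logs of a pod looks something like this; run it once per pod, each in its own console:
kubectl logs -f -n pingpong <name-of-pod>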
Now, in the interactive container shell, type:
curl http://pingpong-service:8080/ping
Run this a couple of times and you should be able to see some action in the logs of BOTH pods. It may not be exactly 50/50, but they share the incoming requests to some extent.
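If you want to fire off a small batch of requests in one go, a plain sh loop inside the container shell does the trick (assuming curl is available in the image, as above):
for i in 1 2 3 4 5 6 7 8 9 10; do curl http://pingpong-service:8080/ping; done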
In the next section you'll learn how to expose your service to the Internet.