1.9 Hybrid connectivity

We have lots of On-Prem stuff...

... Cloud is not doable for us...

On the contrary, there is nothing undoable in combining Cloud with On-Prem. If that problem had not been addressed by the big Cloud providers then the concept of Cloud computing wouldn’t have survived day one.

All major Cloud providers offer a range of solutions for interconnecting your Cloud resources with your On-Prem resources. In GCP these methods are known as Hybrid Connectivity.

There are two main variants available at present (2020-09-19):

  • Cloud Interconnect
  • Cloud VPN

Cloud Interconnect

Cloud Interconnect comes in two flavors: a dedicated solution and a partner solution. The dedicated solution is basically you operating your stuff inside a datacenter that already has a network connection to Google's network. The partner solution is about connecting to a partner that in turn has a dedicated connection. You can see a schematic view of the two variants here. The image is taken from the Cloud Interconnect page in the Cloud Console.

Cloud Interconnect variants

Since setting up either of those two requires really(!) complicated and expensive work, they are not solutions that we can explore here at all. You can read more about them in the GCP documentation.

If you require very high bandwidth between Cloud and On-Prem this may be the way to go for you. If your bandwidth requirements are more ordinary then continue reading about Cloud VPN.

Cloud VPN

The Cloud VPN variant is the one that we are going to explore in this example. It is also the only one of the two that you can set up by yourself, given of course that you have a VPN gateway On-Prem to connect to.

For many scenarios Cloud VPN is good enough but, as mentioned above, if you have very high bandwidth requirements then maybe the Cloud Interconnect solutions are better!

The VPN solution comes in two variants: Classic VPN, which uses static routes and is not highly available, and HA VPN, which uses dynamic routing and is highly available.

The SLA of Classic (non-HA) VPN is 99.9% and for HA VPN it is 99.99%. Note however that this is the guarantee that you get from GCP. The weakest link will be the limiting factor, so the VPN gateway on the On-Prem side must also live up to this.

In this example we are going to explore both VPN variants by connecting two different GCP project sites, one being the pingpong-site1 that we’ve already worked on.

Prepare the existing project site

We’re gonna keep each project site simple in terms of our own services and focus on the interconnect part.

Start by removing any resources still present in the existing pingpong-site1-gcp-demo project, except the cluster if you still have it. This includes databases and caches created previously; we won't need them anymore.

If you've deleted the cluster used previously, refer to the GKE section to set it up again. You only need a single-node cluster for this.

Create a second project site

Remember the previous section where we set up the VPC for the pingpong-site1-project?

We specified the network ranges of our VPC, and the reason given for that was to be able to connect to another Cloud project. Actually, whether it is a Cloud project in GCP, AWS, Azure or your own On-Prem network makes little difference. All we need are two different networks with distinct network ranges so that we can refer to internal addresses/ranges without conflicts.

So the network that we currently use in pingpong-site1-gcp-demo has the following network range:

Entire network:
172.20.0.0/15

Nodes:
172.20.0.0/16

Pods:
172.21.128.0/17

Services:
172.21.0.0/17

It means that we consume all addresses between 172.20.0.0 and 172.21.255.255. So those addresses can't exist in any network that we connect to; that would cause chaos.

We're going to create a new GCP project with the following VPC network setup:

Entire network:
172.22.0.0/15

Nodes:
172.22.0.0/16

Pods:
172.23.128.0/17

Services:
172.23.0.0/17

This ensures no collisions!

Go ahead and create a new project, just like you created the first one, and name it pingpong-site2-gcp-demo. You can give it another name, but again, the images referred to in the k8s-parts will need to be changed accordingly.

Before we can perform any SDK actions against the new project we must set our config to point to it!

gcloud config set project pingpong-site2-gcp-demo

Now, SDK commands will execute against the proper project. Create a new VPC network and subnets in the new project with the following commands:

gcloud compute networks create pingpong-site2-net \
  --subnet-mode=custom
gcloud compute networks subnets create pingpong-site2-subnet \
  --network=pingpong-site2-net \
  --region=europe-west3 \
  --range=172.22.0.0/16 \
  --secondary-range=pods=172.23.128.0/17,services=172.23.0.0/17
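
If you want to sanity check that the subnet and its secondary ranges turned out as intended, something like this will do (the --format projection is optional and just trims the output):

gcloud compute networks subnets describe pingpong-site2-subnet \
  --region=europe-west3 \
  --format="yaml(ipCidrRange,secondaryIpRanges)"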

Once that's created, go ahead and create a cluster just like the one in the first project:

gcloud beta container clusters create "pingpong-site2-cluster" \
  --zone "europe-west3-a" \
  --no-enable-basic-auth \
  --release-channel "regular" \
  --machine-type "n1-standard-1" \
  --image-type "COS" \
  --disk-type "pd-standard" \
  --disk-size "100" \
  --scopes "https://www.googleapis.com/auth/devstorage.read_only","https://www.googleapis.com/auth/logging.write","https://www.googleapis.com/auth/monitoring","https://www.googleapis.com/auth/servicecontrol","https://www.googleapis.com/auth/service.management.readonly","https://www.googleapis.com/auth/trace.append" \
  --num-nodes "1" \
  --enable-cloud-logging \
  --enable-cloud-monitoring \
  --enable-ip-alias \
  --network "projects/pingpong-site2-gcp-demo/global/networks/pingpong-site2-net" \
  --subnetwork "projects/pingpong-site2-gcp-demo/regions/europe-west3/subnetworks/pingpong-site2-subnet" \
  --cluster-secondary-range-name "pods" \
  --services-secondary-range-name "services" \
  --default-max-pods-per-node "110" \
  --addons HorizontalPodAutoscaling,HttpLoadBalancing \
  --enable-autoupgrade \
  --enable-autorepair \
  --enable-tpu

Also, since we're in a different Cloud project than the old one, we don't have access to the Docker images built there, and hence we must again perform gcloud builds submit --config=cloudbuild.yaml to produce the images that we're going to need in pingpong-site2-gcp-demo as well.
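
For reference, the build is submitted from the folder containing the service's cloudbuild.yaml, just like before, and afterwards you can list the resulting images in the new project (this assumes the cloudbuild.yaml pushes to the project's default gcr.io registry):

gcloud builds submit --config=cloudbuild.yaml
gcloud container images list --repository=gcr.io/pingpong-site2-gcp-demo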

There are ways to avoid this, for instance by creating a common shared project for images and then configuring a service account for pulling images from there, but that's too much out of scope for this exercise.

Create static IP internal endpoints

Since Kubernetes-internal pod and service IPs are quite unreliable, and can't be predicted in any way, we're going to allocate a static internal IP address inside each project that we can tie to Kubernetes services and use as targets when communicating between services inside the different networks.

Start by allocating one in pingpong-site2-gcp-demo:

gcloud compute addresses create pingpong-site2-static-internal-ip \
  --region europe-west3 \
  --subnet pingpong-site2-subnet \
  --addresses 172.22.1.0

Then switch your gcloud config to point at pingpong-site1-gcp-demo (gcloud config set project pingpong-site1-gcp-demo) and do the same there:

gcloud compute addresses create pingpong-site1-static-internal-ip \
  --region europe-west3 \
  --subnet pingpong-site1-subnet \
  --addresses 172.20.1.0

Note that the IP addresses above are hard-coded! You SHOULD be able to obtain these addresses unless you've really been playing around and allocated one of them to some other resource. If so, either wipe out the other resource or adapt your config during the remainder of this example.
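
If you're unsure whether one of them is already taken, you can list the addresses reserved in the currently configured project before creating the new one; the output shows each address together with its region and status:

gcloud compute addresses list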

Let's deploy

For this example we're, again, going to deploy the pingpong-simple-service but with a slightly different config. The simple service has been equipped with two profiles where it uses a Spring @Scheduled-bean to ping another service at a given hostname/IP address every 5 seconds. To no one's surprise, those profiles point to the IP addresses that we just allocated above!

Make sure your gcloud config points at pingpong-site1-gcp-demo and that you have made a handshake with the pingpong-site1-cluster:

gcloud container clusters get-credentials pingpong-site1-cluster --zone=europe-west3-a

Navigate to the k8s/hybrid-connectivity/site1 folder and run kustomize to deploy.
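
What "run kustomize to deploy" boils down to is roughly this, assuming a kubectl recent enough to have built-in kustomize support (otherwise pipe the output of kustomize build into kubectl apply -f -):

cd k8s/hybrid-connectivity/site1
kubectl apply -k .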

Switch project, handshake with the pingpong-site2-cluster, navigate to the k8s/hybrid-connectivity/site2 folder and run kustomize to deploy on site-2.
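
Spelled out, and assuming you're still standing in the site1 folder from the previous step, that could look something like this:

gcloud config set project pingpong-site2-gcp-demo
gcloud container clusters get-credentials pingpong-site2-cluster --zone=europe-west3-a
cd ../site2
kubectl apply -k .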

Now, since your SDK can only be connected to one project at a time, it is not going to be possible to tail the logs of both services at once, but you can use the web console and navigate to Workloads under the Kubernetes Engine section for each project.

Looking into the logs of either one of the services you'll see that neither of them is able to speak to the other, because we have no VPN in place yet. Their logs should be full of Failed to Ping service: <Url> messages.
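
If you prefer the terminal over the web console, you can check the cluster that your kubectl is currently pointing at. The deployment name below is an assumption; adjust it to whatever the kustomize config in the repo actually creates:

kubectl get deployments
kubectl logs deployment/pingpong-simple-service --tail=20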

Create a VPN

Continue this showcase by setting up either of the variants of VPN:
