openshift 4.7 hybrid small topology 1.4 ahr manual
[ ] WIP: 0%
?. Define CIDRs
- Machine Network: 10.0.0.0/16
  - Machine Subnet: 10.0.0.0/20
  - Control Plane Subnet: 10.0.16.0/20
  - Compute Subnet: 10.0.32.0/20
- Cluster Network (pods): 10.4.0.0/14
- Service Network: 10.8.0.0/16
?. Prepare environment variables configuration
cat <<"EOF" > ~/ocp-network.env
export OPENSHIFT_SA=ocp-sa
export BASE_DOMAIN=exco-ocp.com
export NETWORK=ocp
export REGION=us-central1
export ZONE_1=us-central1-a
export ZONE_2=us-central1-c
export NETWORK_CIDR=10.0.0.0/9
export BASTION_NETWORK=ocp
#export BASTION_REGION=europe-west1
#export BASTION_SUBNET=machine-subnet
#export BASTION_ZONE=europe-west1-b
export BASTION_REGION=us-central1
export BASTION_ZONE=us-central1-a
export BASTION_SUBNET=machine-subnet
# last: 10.0.255.255 65,536
export MACHINE_NETWORK_CIDR=10.0.0.0/16
export MACHINE_SUBNET=machine-subnet
# last: 10.0.15.255 4,096
export MACHINE_SUBNET_CIDR=10.0.0.0/20
# The IP address pools for pods.
# last: 10.7.255.255 262,144
export CLUSTER_NETWORK_CIDR=10.4.0.0/14
export CLUSTER_NETWORK_HOST_PREFIX=23
# The IP address pools for services.
# last: 10.8.255.255, 65,536
export SERVICE_NETWORK_CIDR=10.8.0.0/16
export CONTROL_PLANE_SUBNET=control-plane-subnet
# last: 10.0.31.255 4,096
export CONTROL_PLANE_SUBNET_CIDR=10.0.16.0/20
export COMPUTE_SUBNET=compute-subnet
# last: 10.0.47.255 4,096
export COMPUTE_SUBNET_CIDR=10.0.32.0/20
EOF
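As a quick sanity check of the sizing above, here is a minimal shell-arithmetic sketch; the numbers follow directly from the CIDRs and host prefix defined in the env file:
```sh
source ~/ocp-network.env
# /14 pod network split into /23 per-node ranges:
echo "max nodes:     $(( 2 ** (CLUSTER_NETWORK_HOST_PREFIX - 14) ))"      # 512
echo "pods per node: $(( 2 ** (32 - CLUSTER_NETWORK_HOST_PREFIX) - 2 ))"  # ~510
# /16 service network:
echo "service IPs:   $(( 2 ** (32 - 16) ))"                               # 65,536
```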
?. Define PROJECT environment variable
export PROJECT=<project-id>
?. Source environment variables
source ~/ocp-network.env
?. Create ocp network and firewall rules
gcloud compute networks create $NETWORK --project=$PROJECT --subnet-mode=custom
gcloud compute firewall-rules create $NETWORK-allow-internal --network $NETWORK --allow tcp,udp,icmp --source-ranges $NETWORK_CIDR
gcloud compute firewall-rules create $NETWORK-allow-access --network $NETWORK --allow tcp:22,tcp:3389,icmp
?. Create subnets
gcloud compute networks subnets create $MACHINE_SUBNET --project=$PROJECT --range=$MACHINE_SUBNET_CIDR --network=$NETWORK --region=$BASTION_REGION
gcloud compute networks subnets create $CONTROL_PLANE_SUBNET --project=$PROJECT --range=$CONTROL_PLANE_SUBNET_CIDR --network=$NETWORK --region=$REGION
gcloud compute networks subnets create $COMPUTE_SUBNET --project=$PROJECT --range=$COMPUTE_SUBNET_CIDR --network=$NETWORK --region=$REGION
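Optionally, list the subnets to confirm they were created with the expected ranges (a standard gcloud listing; the filter assumes the $NETWORK variable sourced above):
```sh
gcloud compute networks subnets list --project=$PROJECT \
    --filter="network:$NETWORK" \
    --format="table(name,region,ipCidrRange)"
```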
?. Create Cloud NAT instance
gcloud compute routers create ocp-router \
--network $NETWORK \
--region=$REGION
gcloud compute routers nats create ocp-nat \
--region=$REGION \
--router=ocp-router \
--auto-allocate-nat-external-ips \
--nat-all-subnet-ip-ranges
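Optionally, confirm the router and NAT configuration:
```sh
gcloud compute routers describe ocp-router --project=$PROJECT --region=$REGION --format="value(name)"
gcloud compute routers nats describe ocp-nat --router=ocp-router --project=$PROJECT --region=$REGION
```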
?. Create an installer service account and a bastion host. WARNING: For simplicity we make this SA a project owner. That is still better than using your own credentials on a shared VM, but consider a more granular list of roles before doing this in production.
export INSTALLER_SA_ID=installer-sa
gcloud iam service-accounts create $INSTALLER_SA_ID
roles='roles/owner'
for r in $roles; do
gcloud projects add-iam-policy-binding $PROJECT \
--member=serviceAccount:$INSTALLER_SA_ID@$PROJECT.iam.gserviceaccount.com \
--role=$r
done
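Optionally, confirm the role binding took effect (a standard get-iam-policy filter):
```sh
gcloud projects get-iam-policy $PROJECT \
    --flatten="bindings[].members" \
    --filter="bindings.members:serviceAccount:$INSTALLER_SA_ID@$PROJECT.iam.gserviceaccount.com" \
    --format="value(bindings.role)"
```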
gcloud compute instances create bastion \
--network $BASTION_NETWORK \
--subnet $BASTION_SUBNET \
--zone=$BASTION_ZONE \
--machine-type=e2-standard-2 \
--service-account $INSTALLER_SA_ID@$PROJECT.iam.gserviceaccount.com \
--scopes cloud-platform
?. SSH into bastion host
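One way to do this, assuming you use gcloud from a machine with access to the project (the bastion was created with an external IP):
```sh
gcloud compute ssh bastion --zone=$BASTION_ZONE --project=$PROJECT
```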
?. Make sure required utilities are installed
sudo apt install -y mc dnsutils git jq
?. Repeat the `cat <<"EOF" > ~/ocp-network.env` command from the Network planning section on the bastion host.
cat <<"EOF" > ~/ocp-network.env
...
?. Define PROJECT environment variable again. This time in the current shell
export PROJECT=<project-id>
?. Source Environment Variables
source ~/ocp-network.env
?. Set current project
gcloud config set project $PROJECT
?. Enabling required API services in GCP
gcloud services enable \
compute.googleapis.com \
cloudapis.googleapis.com \
cloudresourcemanager.googleapis.com \
dns.googleapis.com \
iamcredentials.googleapis.com \
iam.googleapis.com \
servicemanagement.googleapis.com \
serviceusage.googleapis.com \
storage-api.googleapis.com \
storage-component.googleapis.com
?. Create an SSH key (no passphrase) for OCP node troubleshooting
ssh-keygen -t ed25519 -N '' -f ~/.ssh/id_openshift
?. Start ssh-agent
eval "$(ssh-agent -s)"
?. Add key to the session
ssh-add ~/.ssh/id_openshift
?. Define directories for downloads, OpenShift configuration files, and OpenShift installation artifacts and logs.
mkdir ~/_downloads
mkdir -p ~/openshift-config
mkdir -p ~/openshift-install
?. Create OpenShift Service Account with required roles
gcloud iam service-accounts create $OPENSHIFT_SA \
--description="OpenShift SA" \
--display-name="OpenShift SA"
?. Define roles
export roles='roles/owner'
for r in $roles; do
gcloud projects add-iam-policy-binding $PROJECT \
--member=serviceAccount:$OPENSHIFT_SA@$PROJECT.iam.gserviceaccount.com \
--role=$r
done
?. Download SA's key in JSON format
gcloud iam service-accounts keys create ~/openshift-config/$OPENSHIFT_SA.json --iam-account=$OPENSHIFT_SA@$PROJECT.iam.gserviceaccount.com
?. Download your pull secret and upload it to the bastion host. You can find it in your Red Hat account.
Keep the file name as ~/pull-secret.txt
for compatibility with the commands below.
Use the Settings cog/Upload file option if you have the file on your local file system, or copy-paste its contents.
?. Download OCP Installer
REF: https://cloud.redhat.com/openshift/install
cd ~/_downloads
curl -LO https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.7.52/openshift-install-linux.tar.gz
curl -LO https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.7.52/openshift-client-linux.tar.gz
?. Untar the oc, kubectl, and openshift-install utilities into the ~/bin directory
mkdir ~/bin
source ~/.profile
tar xvf ~/_downloads/openshift-install-linux.tar.gz -C ~/bin
tar xvf ~/_downloads/openshift-client-linux.tar.gz -C ~/bin
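Verify the binaries are on the PATH and report the expected 4.7 version:
```sh
openshift-install version
oc version --client
kubectl version --client
```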
?. Define SSH_KEY and PULL_SECRET variables
export SSH_KEY=$(cat ~/.ssh/id_openshift.pub)
export PULL_SECRET=$(cat ~/pull-secret.txt)
?. Validate non-empty values of SSH_KEY and PULL_SECRET variables
echo $SSH_KEY
echo $PULL_SECRET
?. Define install-config.yaml
in the ~/openshift-config folder.
Remember, openshift-install consumes and deletes this file once it creates its own configuration files in the ~/openshift-install directory. So if you want to start from scratch, keep the original copy of the install config file in the ~/openshift-config directory.
cat <<EOF > ~/openshift-config/install-config.yaml
apiVersion: v1
baseDomain: $BASE_DOMAIN
controlPlane:
  hyperthreading: Enabled
  name: master
  platform:
    gcp:
      type: e2-standard-4
      zones:
      - $ZONE_1
      - $ZONE_2
      osDisk:
        diskType: pd-ssd
        diskSizeGB: 48
  replicas: 3
compute:
- hyperthreading: Enabled
  name: worker
  platform:
    gcp:
      type: e2-standard-4
      zones:
      - $ZONE_1
      - $ZONE_2
      osDisk:
        diskType: pd-standard
        diskSizeGB: 96
  replicas: 2
metadata:
  name: hybrid-cluster
networking:
  clusterNetwork:
  - cidr: $CLUSTER_NETWORK_CIDR
    hostPrefix: $CLUSTER_NETWORK_HOST_PREFIX
  machineNetwork:
  - cidr: $MACHINE_NETWORK_CIDR
  networkType: OpenShiftSDN
  serviceNetwork:
  - $SERVICE_NETWORK_CIDR
platform:
  gcp:
    projectID: $PROJECT
    region: $REGION
    network: $NETWORK
    controlPlaneSubnet: $CONTROL_PLANE_SUBNET
    computeSubnet: $COMPUTE_SUBNET
pullSecret: '$PULL_SECRET'
fips: false
sshKey: "$SSH_KEY"
publish: Internal
EOF
?. Use the GOOGLE_CREDENTIALS variable to supply the SA credentials to the OCP installer
export GOOGLE_CREDENTIALS=~/openshift-config/$OPENSHIFT_SA.json
?. Copy the `install-config.yaml` file into the ~/openshift-install folder.
cp ~/openshift-config/install-config.yaml ~/openshift-install
?. Launch the OCP installation. It can take 30-40 minutes to complete.
cd ~/openshift-install
time openshift-install create cluster --dir .
Sample Output:
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/home/student-00-0a66f0db2708/openshift-install/auth/kubeconfig'
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.hybrid-cluster.exco-ocp.com
INFO Login to the console with user: "kubeadmin", and password: "D6HT6-bN7hu-yJq9k-FGN4h"
INFO Time elapsed: 36m44s
real 36m44.574s
user 0m51.641s
sys 0m3.263s
Make a note of the kubectl credentials file as well as the OCP Console URL and user/password.
The API server serves requests at the configured internal load balancer $INFRA_ID-api-internal,
or on port 6443 of any master node.
?. Configure openshift-install's kubeconfig as account default
mkdir -p ~/.kube
cp ~/openshift-install/auth/kubeconfig ~/.kube/config
?. Configure kubectl auto-completion
sudo apt install -y bash-completion
source ~/.profile
source <(kubectl completion bash)
?. Get infrastructure ID
export OCP_INFRA_ID=$(oc get -o jsonpath='{.status.infrastructureName}{"\n"}' infrastructure cluster)
echo $OCP_INFRA_ID-master
?. Identify IP Address for api server's internal LB
gcloud compute forwarding-rules describe $OCP_INFRA_ID-api-internal --region $REGION --format="value(IPAddress)"
10.0.16.3
?. A sample request to get an OCP version from a bastion host looks like:
curl -k https://10.0.16.3:6443/version
Sample Output:
{
"major": "1",
"minor": "20",
"gitVersion": "v1.20.0+7d0a2b2",
"gitCommit": "7d0a2b269a27413f5f125d30c9d726684886c69a",
"gitTreeState": "clean",
"buildDate": "2021-04-16T13:08:35Z",
"goVersion": "go1.15.7",
"compiler": "gc",
"platform": "linux/amd64"
}
As we deliberately installed a private-only cluster, you will need to use your preferred technique to access it from your computer.
It is trivial to configure a GCLB to expose it.
Alternatively, an easy and secure way is to use IAP TCP forwarding via the gcloud command.
?. Bastion: Verify correct OCP cluster infrastructure ID value
echo $OCP_INFRA_ID-master
?. Bastion: Create a firewall rule
gcloud compute firewall-rules create $NETWORK-apiserver-allow-ingress-from-iap \
--direction=INGRESS \
--action=allow \
--rules=tcp:6443 \
--network=$NETWORK \
--target-tags=$OCP_INFRA_ID-master,$OCP_INFRA_ID-worker \
--source-ranges=35.235.240.0/20
?. Bastion: Grant $USER permission to use IAP
gcloud projects add-iam-policy-binding $PROJECT \
--member=user:$USER@qwiklabs.net \
--role=roles/iap.tunnelResourceAccessor
At your PC/laptop, open a new terminal window. We need to keep it open while the tunnel is running.
?. Local Terminal: at the terminal of your PC, log into correct account
gcloud auth login
?. Local Terminal: Set the environment variable values as appropriate
export OCP_INFRA_ID=<populate>
export HOST=$OCP_INFRA_ID-master-0
export PROJECT=<populate>
export ZONE=us-central1-a
?. Local Terminal: Start the tunnel to API Server
gcloud compute start-iap-tunnel $HOST 6443 \
--local-host-port=localhost:6443 \
--zone=$ZONE \
--project $PROJECT
Expected output:
Testing if tunnel connection works.
Listening on port [6443].
?. Another Local Terminal: Open another terminal window.
?. Another Local Terminal: Sudo-edit /etc/hosts and append the following lines.
NOTE: Make sure the correct cluster domain is used:
127.0.0.1 api.hybrid-cluster.exco-ocp.com
127.0.0.1 console-openshift-console.apps.hybrid-cluster.exco-ocp.com
127.0.0.1 oauth-openshift.apps.hybrid-cluster.exco-ocp.com
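A minimal sketch that appends all three entries in one go (it assumes the hybrid-cluster.exco-ocp.com domain used throughout this walkthrough; adjust to your cluster name and base domain):
```sh
sudo tee -a /etc/hosts <<'EOF'
127.0.0.1 api.hybrid-cluster.exco-ocp.com
127.0.0.1 console-openshift-console.apps.hybrid-cluster.exco-ocp.com
127.0.0.1 oauth-openshift.apps.hybrid-cluster.exco-ocp.com
EOF
```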
?. Another Local Terminal: Download the kubectl config file and set the session KUBECONFIG variable pointing to it
export KUBECONFIG=$PWD/config
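One way to fetch the file, assuming the bastion instance name and install directory used earlier in this manual, and that the $ZONE and $PROJECT variables from the previous local-terminal step are also set in this window:
```sh
gcloud compute scp bastion:~/openshift-install/auth/kubeconfig ./config \
    --zone=$ZONE --project=$PROJECT
```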
?. Another Local Terminal: Verify connection setup
kubectl get nodes
The list of nodes of your cluster will be displayed.
The OCP Console is available via an internal load balancer, configured by the OCP cluster for the router-default service in the openshift-ingress namespace.
?. Another Local Terminal: In another local terminal window, forward the router's HTTPS port to a local port.
sudo `which kubectl` --kubeconfig $KUBECONFIG port-forward -n openshift-ingress service/router-default 443:443
Expected Output:
Forwarding from 127.0.0.1:443 -> 443
Forwarding from [::1]:443 -> 443
Keep this window open as long as you are working with OCP Console
?. Open the https://console-openshift-console.apps.hybrid-cluster.exco-ocp.com
URL in your browser.
Ignore the certificate warning and proceed.
Use kubeadmin and its password from the install log output to log in.
We need around 6.6 vCPUs to install the given Apigee hybrid instance topology, which means 3 worker nodes. For a typical default OCP installation (3 masters, 3 workers), and considering that we want a bastion host in the same region (otherwise the openshift-install process gets trickier), we are constrained by the default GCP regional CPU quota of 24 vCPUs. Taking into account that the bootstrap node is removed after installation, but not before it prevents the second worker node from coming up, there are two cluster operations we want to perform to have enough compute (see the quota check sketch after this list):
- add another worker node (4 vCPUs);
- configure masters to be able to run workloads.
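To see how much of the regional CPU quota is currently consumed, a quick check (uses jq, which we installed on the bastion earlier):
```sh
gcloud compute regions describe $REGION --format=json \
  | jq '.quotas[] | select(.metric == "CPUS")'
```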
?. Display list of machine sets
oc get machinesets -n openshift-machine-api
?. Scale a worker-a machine set to two replicas
oc scale --replicas=2 machineset $OCP_INFRA_ID-worker-a -n openshift-machine-api
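You can watch the new machine being provisioned and then joining the cluster as a node:
```sh
oc get machines -n openshift-machine-api
oc get nodes
```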
?. Configure master nodes as schedulable
oc patch schedulers.config.openshift.io cluster --type json \
-p '[{"op": "add", "path": "/spec/mastersSchedulable", "value": true}]'
This command removes the NoSchedule taint from the master nodes and also adds the worker label to them.
?. Clone ahr repository
cd ~
git clone https://github.com/apigee/ahr.git
?. Add ahr utilities directory to the PATH
export AHR_HOME=~/ahr
export PATH=$AHR_HOME/bin:$PATH
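Optionally, persist both variables so that new shell sessions on the bastion pick them up as well (a sketch appending to ~/.profile):
```sh
cat <<'EOF' >> ~/.profile
export AHR_HOME=~/ahr
export PATH=$AHR_HOME/bin:$PATH
EOF
```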
?. Define the HYBRID_HOME installation folder
export HYBRID_HOME=~/apigee-hybrid-install
mkdir -p $HYBRID_HOME
?. Define the install environment file location
export HYBRID_ENV=$HYBRID_HOME/hybrid-1.4.env
?. Clone hybrid 1.4 template configuration
mkdir -p $HYBRID_HOME
cp $AHR_HOME/examples/hybrid-sz-s-1.4.sh $HYBRID_ENV
?. Source the configuration environment variables
source $HYBRID_ENV
?. Enable required APIs
ahr-verify-ctl api-enable
?. Configure an SCC rule for the gke-connect Connect Agent containers [Anthos prerequisite]. Create a manifest file.
cat <<EOF > $HYBRID_HOME/gke-connect-scc.yaml
# Connect Agent SCC
apiVersion: v1
kind: SecurityContextConstraints
metadata:
  name: gke-connect-scc
allowPrivilegeEscalation: false
# This is redundant with non-root + disallow privilege escalation,
# but we can provide it for defense in depth.
requiredDropCapabilities:
- ALL
runAsUser:
  type: MustRunAsNonRoot
seLinuxContext:
  type: RunAsAny
supplementalGroups:
  type: MustRunAs
  ranges:
  - min: 1
    max: 65535
fsGroup:
  type: MustRunAs
  ranges:
  - min: 1
    max: 65535
volumes:
- secret
readOnlyRootFilesystem: true
seccompProfiles:
- docker/default
users:
groups:
- system:serviceaccounts:gke-connect
EOF
?. Apply the manifest
oc create -f $HYBRID_HOME/gke-connect-scc.yaml
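Verify the SCC was created:
```sh
oc get scc gke-connect-scc
```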
?. Register Cluster with GCP Anthos hub
ahr-cluster-ctl anthos-hub-register
?. Create a KSA (Kubernetes service account) for anthos-user, whose credentials [token] we will use to log into the cluster.
ahr-cluster-ctl anthos-user-ksa-create
?. Use the commands from the output of the previous command to get the token and log into the registered cluster via the GCP Console Kubernetes Engine/Clusters web page
You can now use the Anthos UI to view the state and status of your OCP cluster.
?. Install cert manager [Apigee hybrid prerequisite]
kubectl apply --validate=false -f $CERT_MANAGER_MANIFEST
?. Grant the anyuid SCC to the istio-system service accounts in order to install ASM.
oc adm policy add-scc-to-group anyuid system:serviceaccounts:istio-system
?. Create a regional static IP address that we will use for the istio ingress gateway configuration.
gcloud compute addresses create runtime-ip --region $REGION
?. Fetch the IP address of the runtime-ip reservation and persist it in the $HYBRID_ENV config file.
export RUNTIME_IP=$(gcloud compute addresses describe runtime-ip --region $REGION --format='value(address)')
sed -i -E "s/^(export RUNTIME_IP=).*/\1$RUNTIME_IP/g" $HYBRID_ENV
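Confirm the value was captured and persisted:
```sh
echo $RUNTIME_IP
grep '^export RUNTIME_IP=' $HYBRID_ENV
```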
?. Define variables to generate istio-operator.yaml manifest file from provided template
export ASM_PROFILE=asm-multicloud
sed -i -E "s/^(export ASM_PROFILE=).*/\1$ASM_PROFILE/g" $HYBRID_ENV
export ASM_RELEASE=$(echo "$ASM_VERSION"|awk '{sub(/\.[0-9]+-asm\.[0-9]+/,"");print}')
?. Make a local copy of the istio operator template in case we need to customise it.
```sh
cp $AHR_HOME/templates/istio-operator-$ASM_RELEASE-$ASM_PROFILE.yaml $HYBRID_HOME/istio-operator-ocp-template.yaml
```
?. Resolve template into a final manifest state.
ahr-cluster-ctl template $HYBRID_HOME/istio-operator-ocp-template.yaml > $ASM_CONFIG
The $ASM_CONFIG file now contains the IstioOperator spec we are going to use for ASM installation.
?. Download the ASM distribution. It contains the istioctl binary. This source syntax will also add the istioctl location to the PATH value.
source <(ahr-cluster-ctl asm-get $ASM_VERSION)
?. Install ASM/istio
istioctl install -f $ASM_CONFIG
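Verify that ASM came up and that the ingress gateway picked up the reserved runtime IP (this assumes the standard istio-ingressgateway service in the istio-system namespace):
```sh
kubectl get pods -n istio-system
kubectl get svc istio-ingressgateway -n istio-system \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}{"\n"}'
```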
?. Fetch the apigeectl distribution and set up the PATH variable
source <(ahr-runtime-ctl get-apigeectl $HYBRID_HOME/$APIGEECTL_TARBALL)
?. Provision the Apigee org, env, and env group
ahr-runtime-ctl install-profile small asm-gcp -c apigee-org
?. Apigee hybrid SCCs. Apply the following before proceeding with any apigeectl commands. We need permissions that allow the apigee logger containers to access the local file system.
cat <<EOF > $HYBRID_HOME/apigee-scc.yaml
# Apigee SCC
apiVersion: v1
kind: SecurityContextConstraints
metadata:
  name: apigee-scc
allowedCapabilities:
- 'IPC_LOCK'
- 'SYS_RESOURCE'
priority: 9
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: RunAsAny
fsGroup:
  type: RunAsAny
users:
groups:
- system:serviceaccounts:apigee
---
apiVersion: v1
kind: SecurityContextConstraints
metadata:
  name: apigee-system-scc
priority: 9
runAsUser:
  type: MustRunAsRange
  uidRangeMin: 999
  uidRangeMax: 999
seLinuxContext:
  type: MustRunAs
supplementalGroups:
  type: MustRunAs
  ranges:
  - min: 998
    max: 998
fsGroup:
  type: MustRunAs
  ranges:
  - min: 998
    max: 998
users:
groups:
- system:serviceaccounts:apigee-system
EOF
oc create -f $HYBRID_HOME/apigee-scc.yaml
?. Make RUNTIME_CONFIG file
ahr-runtime-ctl install-profile small asm-gcp -c runtime-config
This file is important. When we need to change hybrid instance settings, we will do it via this manifest. Keep it safe.
?. Install hybrid runtime
ahr-runtime-ctl install-profile small asm-gcp -c runtime
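To watch the hybrid runtime components come up (apigee-system and apigee are the namespaces a standard hybrid install uses):
```sh
kubectl get pods -n apigee-system
kubectl get pods -n apigee
```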
?. Use provided script to import and deploy a ping proxy
$AHR_HOME/proxies/deploy.sh
?. In the same terminal session, execute the following command to call the ping API
curl --cacert $RUNTIME_SSL_CERT https://$RUNTIME_HOST_ALIAS/ping -v --resolve "$RUNTIME_HOST_ALIAS:443:$RUNTIME_IP" --http1.1
?. For another terminal session, you would need to initialize these variables:
export RUNTIME_IP=<as-appropriate>
export RUNTIME_HOST_ALIAS=<as-appropriate>
export RUNTIME_SSL_CERT=<as-appropriate>
?. Copy the RUNTIME_SSL_CERT file to the other VM, or use `-k` to skip certificate verification.
```sh
curl --cacert $RUNTIME_SSL_CERT https://$RUNTIME_HOST_ALIAS/ping -v --resolve "$RUNTIME_HOST_ALIAS:443:$RUNTIME_IP" --http1.1
```
When you endeavour to change the installation process, here are some commands to help you if things go wrong.
- Collect logs from VMs created during bootstrap phase
openshift-install --dir ~/openshift-install gather bootstrap --key ~/.ssh/id_openshift
- [After you make any changes], wait till the bootstrap phase completes
openshift-install wait-for bootstrap-complete --log-level debug --dir ~/openshift-install
- [After you make any changes], wait till the install phase completes
openshift-install wait-for install-complete --log-level debug --dir ~/openshift-install
- [When you reach the bootstrap-installed point] Get the list of deployments
oc --kubeconfig=$HOME/openshift-install/auth/kubeconfig --namespace=openshift-machine-api get deployments
- SSH into a bootstrap or master node for troubleshooting
ssh core@<node-ip> -i ~/.ssh/id_openshift
- Nuke the cluster
openshift-install destroy cluster --dir ~/openshift-install
Until OCP 4.8, if the constraints/storage.uniformBucketLevelAccess constraint
is enforced in a GCP org, openshift-install will not be able to proceed with the setup and fails virtually right after start.
See the bug report for openshift-install in GCP projects: https://bugzilla.redhat.com/show_bug.cgi?id=1936375 "Error: googleapi: Error 412: Request violates constraint 'constraints/storage.uniformBucketLevelAccess', conditionNotMet"