IT: HOWTO: Install Kolla Ansible
Kolla-Ansible used to also deploy Ceph. This is no longer the case.
I'm using ceph-ansible to deploy Ceph. There are some mismatched configurations to resolve before this layering works.
HOWTO: Install Ceph
Your API network must be configured on all hosts, and each host's hostname must resolve to its IP on that network. This creates a problem if you access the hosts from a different network, like the Control Plane, and their names point to that other network. For example: my hosts all have names pointing to their ControlPlane IPs on 192.168.127.X, and they are all configured this way in /etc/hosts. I need this naming scheme to continue, because I boot these hosts to an administrative partition which isn't configured on the API Network, and I expect them to still be able to reach each other in that state. But the kolla-ansible installer looks up each host's hostname and checks that it resolves to that host's own IP on the API Network.
To deal with this, I change each host's own hostname to xxx-api.feralcoder.org before installation, and all hosts have each other's API names in /etc/hosts as well.
/etc/hosts:
...
172.17.0.218    yoda-api.feralcoder.org        yoda-api
192.168.127.218 yoda.feralcoder.org            yoda
172.17.0.220    dumbledore-api.feralcoder.org  dumbledore-api
192.168.127.220 dumbledore.feralcoder.org      dumbledore
172.17.0.222    luke-api.feralcoder.org        luke-api
192.168.127.222 luke.feralcoder.org            luke
172.17.0.224    ben-api.feralcoder.org         ben-api
192.168.127.224 ben.feralcoder.org             ben
...
Be aware that on CentOS 8 hostname changes don't stick the way they used to - it's not sufficient to run 'hostname xxx.yyy.zzz' or edit /etc/hostname anymore. Now you must run "hostnamectl set-hostname xxx.yyy.zzz", the systemd way.
hostnamectl set-hostname merlin-api.feralcoder.org
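A quick sanity check that the requirement above is satisfied - each host's new hostname should now resolve to its own API-network IP (my own habit, not a kolla-ansible step):

# Run on each stack host
getent hosts $(hostname)    # e.g. 172.17.0.218   yoda-api.feralcoder.org   yoda-api
hostname -i                 # should report the API-network address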
My setup has hosts configured like this in inventory:
[control]
strange-api ansible_user=stack ansible_become=true
...

[compute]
kerrigan-api ansible_user=stack ansible_become=true
...

[deployment]
localhost ansible_connection=local become=true
I have 3 control nodes for HA, and 5 compute nodes.
Before installation I add a 'stack' user on every host and configure this to be the ansible_user. On the ansible controller I just use my own user.
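Creating that user is just a couple of commands per target node. A rough sketch, assuming root ssh access and using yoda as the example host (the password is a placeholder):

# Create the stack user on a target node - repeat for every stack host
ssh root@yoda "useradd -m -s /bin/bash stack"
ssh root@yoda "echo 's3cr3t' | passwd --stdin stack"    # --stdin is available on CentOS/RHEL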
I grant passwordless sudo to myself on the ansible controller:
echo "$USER ALL=(root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/$USER

For the stack user I grant passwordless sudo on the target nodes, and passwordless ssh from my ansible controller:

ssh root@yoda "echo 'stack ALL=(root) NOPASSWD:ALL' > /etc/sudoers.d/stack"
ADMIN_KEY=`cat ~/.ssh/id_rsa.pub`
ssh stack@yoda "ssh-keygen -P '' -t rsa -f ~/.ssh/id_rsa"
ssh stack@yoda "echo '$ADMIN_KEY' >> ~/.ssh/authorized_keys && chmod 644 ~/.ssh/authorized_keys"
After deployment, I remove passwordless ssh and sudo.
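That cleanup is just the reverse of the setup above; a sketch, again using yoda as the example (adjust the key-matching pattern to your own key):

# Revoke passwordless sudo for stack on each target node
ssh root@yoda "rm -f /etc/sudoers.d/stack"
# Drop the controller's public key from stack's authorized_keys
# ('mykey@controller' is a stand-in for your key's comment)
ssh stack@yoda "sed -i '/mykey@controller/d' ~/.ssh/authorized_keys"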
If you don't want passwordless sudo or ssh, you can be prompted for the sudo and ssh passwords by adding '--ask-become-pass' and '--ask-pass' to the command line, or store them in inventory like so:
strange-api ansible_user=stack ansible_sudo_pass=s3cr3t ansible_password=s3cr3t ansible_become=true
...
localhost ansible_connection=local ansible_sudo_pass=s3cr3t become=true
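With the prompt flags instead, a one-off ad-hoc test looks like this (these are standard ansible options, shown here assuming $INVENTORY points at your inventory file):

ansible -i $INVENTORY all -m ping --ask-pass --ask-become-pass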
If you're building with network isolation then it's possible you'll run into asymmetric routing issues. CentOS 8, by default, drops traffic when replies would egress a different interface than the one packets ingressed on.

For example: all your stack networks live on the same switch, and the switch has IPs on several of those networks. If that switch does your routing then it may deliver traffic directly onto any of your stack's networks. Let's say you connect from your desktop machine, on a network completely disconnected from the stack, to an interface on the control plane. If the default route for your stack hosts isn't via a gateway on the control plane network, then packets will ingress on the control plane, replies will egress on a different interface, and the traffic will be dropped.
One way to avoid this is to make sure your switch only has a router IP on one subnet, the same one the stack hosts use for their default route.

Another way is to set up routing rules on the router to send traffic to the stack only via that one subnet, even when the router is directly connected to the destination network. This assumes you've got enterprise-grade network gear.

Another way is to turn off reverse path filtering on all hosts:
grep -q net.ipv4.conf.all.rp_filter /etc/sysctl.conf && sed -i 's/.*net.ipv4.conf.all.rp_filter.*/net.ipv4.conf.all.rp_filter=0/g' /etc/sysctl.conf || echo net.ipv4.conf.all.rp_filter=0 >> /etc/sysctl.conf
sysctl -p
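To flip the setting on every stack host in one shot, without running the above by hand everywhere, an ansible ad-hoc command works; a sketch, assuming $INVENTORY points at the inventory shown earlier:

# Apply at runtime on all hosts; the sysctl.conf edit above makes it persistent
ansible -i $INVENTORY all -b -m shell -a "sysctl -w net.ipv4.conf.all.rp_filter=0"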
I create bonds on my interfaces to enable consistent naming across the nodes. This allows me to have a simple globals.yml to drive kolla-ansible:
network_interface: "bond1"
neutron_external_interface: "bond2"
api_interface: "bond3"
storage_interface: "bond4"
swift_storage_interface: "bond4"
tunnel_interface: "bond6"
Notice there's no bond5. This is my storage management network, and isn't seen by kolla-ansible. It is referenced in my ceph-ansible configs.
Modprobe must be configured to load the bonding module for these bonds:
# cat /etc/modprobe.d/bond.conf
alias bond1 bonding
options bond1 mode=balance-rr
alias bond2 bonding
options bond2 mode=balance-rr
alias bond3 bonding
options bond3 mode=balance-rr
alias bond4 bonding
options bond4 mode=balance-rr
alias bond5 bonding
options bond5 mode=balance-rr
alias bond6 bonding
options bond6 mode=balance-rr
My bond setup on each host looks something like this:
# cd /etc/sysconfig/network-scripts && cat ifcfg-bond1 ifcfg-ens5f0
BONDING_MASTER=yes
BONDING_OPTS='mode=0 miimon=100'
BOOTPROTO=none
BROWSER_ONLY=no
DEFROUTE=yes
DEVICE=bond1
DNS1=8.8.8.8
GATEWAY=192.168.127.241
IPADDR=192.168.127.214
IPV4_FAILURE_FATAL=no
NAME=bond1
ONBOOT=yes
PREFIX=24
PROXY_METHOD=none
TYPE=BOND
UUID=6d0a6acc-b984-4c2d-b4df-ab23d479413a

BOOTPROTO=none
DEVICE=ens5f0
MASTER=bond1
NAME=ens5f0
ONBOOT=yes
SLAVE=yes
TYPE=Ethernet
To activate each bond, run:
ifdown ens5f0
ifup bond1
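A quick way to confirm the bond actually came up with its slave attached:

# Kernel's view of the bond and its slaves
cat /proc/net/bonding/bond1
# Addresses and link state at a glance
ip -br addr show bond1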
Before you go to production, you really should figure out exactly which openings you need and configure them in your firewall.
Until I do that, I'm doing this on all stack hosts:
systemctl disable firewalld
systemctl stop firewalld
On your ansible controller, set up your deployment environment:
sudo dnf -y install python3-devel libffi-devel gcc openssl-devel python3-libselinux
mkdir -p ~/CODE/venvs/kolla-ansible
python3 -m venv ~/CODE/venvs/kolla-ansible
source ~/CODE/venvs/kolla-ansible/bin/activate
pip install -U pip
pip install 'ansible==2.9.*'
pip install kolla-ansible
sudo mkdir -p /etc/kolla
sudo chown $USER:$USER /etc/kolla
cp -r ~/CODE/venvs/kolla-ansible/share/kolla-ansible/etc_examples/kolla/* /etc/kolla
# Install sshpass for ansible to use
sudo dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
sudo dnf -y install sshpass
sudo dnf config-manager --set-disabled epel-modular epel
# Place your customized globals.yml file in /etc/kolla (see below for my own details)
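A quick sanity check that the virtualenv and config skeleton are in place before moving on (just my habit, not a required step):

# Inside the activated venv
pip show ansible kolla-ansible | grep -E '^(Name|Version)'
ls /etc/kolla    # should contain globals.yml and passwords.yml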
Set up /etc/ansible/ansible.cfg, or ~/.ansible.cfg:
[defaults]
host_key_checking=False
pipelining=True
forks=10
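To confirm ansible actually picked these up, ansible-config can dump the non-default settings:

# Should report host_key_checking, pipelining, and forks with the values above
ansible-config dump --only-changed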
I had problems with podman, so I'm using docker. The problem is, kolla-ansible assumes podman for CentOS 8, so some local mods to the kolla-ansible code are needed.
On all stack hosts run:
sudo dnf -y erase buildah podman
If you're going to use Ceph, set that up now. Also on the Ceph HOWTO are integration instructions for Kolla-Ansible OpenStack.
IT: HOWTO: Install Ceph
Random passwords can be generated by running kolla-genpwd, which fills in the empty entries in /etc/kolla/passwords.yml. You can pre-populate that file with any passwords you wish to set yourself.
cp ~/CODE/venvs/kolla-ansible/share/kolla-ansible/etc_examples/kolla/passwords.yml /etc/kolla/passwords.yml
vi /etc/kolla/passwords.yml
source ~/CODE/venvs/kolla-ansible/bin/activate
kolla-genpwd
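For example, to pin a password of your own choosing in the edit step and let kolla-genpwd fill in the rest (keystone_admin_password is the one you'll use to log into Horizon; 'MySuperSecret' is obviously a placeholder):

# kolla-genpwd only fills in empty entries, so anything you set first is kept
sed -i 's/^keystone_admin_password:.*/keystone_admin_password: MySuperSecret/' /etc/kolla/passwords.yml
kolla-genpwd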
The last step here, pull, isn't strictly necessary - the deploy step will also pull containers. However, skipping it exposes the deployment to long-running steps which may time out or suffer network failures.
Pulling ahead of time is especially important if you haven't localized the containers and are pulling the latest from Docker Hub. Even so, the containers are updated so frequently that you may still pull new containers during deployment, even after a successful pull step.
ansible -i $INVENTORY all -m ping
kolla-ansible -i $INVENTORY bootstrap-servers
kolla-ansible -i $INVENTORY prechecks
kolla-ansible -i $INVENTORY pull
Once you've got containers pulled, you're ready to deploy.
kolla-ansible -i $INVENTORY deploy
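Once deploy succeeds, the usual follow-up (not feralcoder-specific) is to have kolla-ansible generate the admin credentials and poke the APIs to make sure the cloud answers:

kolla-ansible -i $INVENTORY post-deploy    # writes /etc/kolla/admin-openrc.sh
pip install python-openstackclient         # into the same venv
. /etc/kolla/admin-openrc.sh
openstack service list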
You may iterate through deploy / destroy steps until you've got a working stack. That's the ideal situation, at least. I've found there are times when a full environment wipe and rebuild is required. I hope you're building scripts for all this...
kolla-ansible -i $INVENTORY destroy --yes-i-really-really-mean-it