OpenStack AIO in Azure
[!WARNING]
I currently use my own single-script setup; see https://github.com/hpaluch/osfs for details. Sometimes I also use DevStack; see DevStack 2 interfaces for details.
How to deploy an experimental OpenStack All-in-One (AIO) in Azure.
Requirements:
- a suitable Azure account: it must have access to VMs with nested virtualization enabled (a Visual Studio Professional Subscription is fine)
- verify that your Azure Subnet (and possibly your VPN Gateway Subnet) does not collide with the OpenStack AIO subnets. According to https://docs.openstack.org/openstack-ansible/latest/user/network-arch/example.html OpenStack AIO uses the following subnets (a few more networks are actually reserved):
Network | CIDR |
---|---|
Management Network | 172.29.236.0/22 |
Overlay Network | 172.29.240.0/22 |
Storage Network | 172.29.244.0/22 |
My Azure Subnet starts with 10., so it is fine.
My Azure VPN Gateway Subnet (click on your VPN Gateway -> Point-to-site Configuration and look up the Address Pool) starts with 172.16., so it is OK.
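The collision check above can also be scripted. This is a minimal sketch in plain Bash and not part of the original setup: the helper names `ip_to_int`, `cidr_overlap` and `check_against_aio` are made up for illustration, and only the three AIO networks from the table are checked (the upstream guide reserves a few more).

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  # 10# forces base 10 so that an octet like "08" is not read as octal
  echo $(( (10#$a << 24) | (10#$b << 16) | (10#$c << 8) | 10#$d ))
}

# Succeed (exit 0) when the two CIDR blocks share at least one address.
cidr_overlap() {
  local ip1=${1%/*} len1=${1#*/} ip2=${2%/*} len2=${2#*/}
  # Two blocks overlap exactly when they agree on the shorter prefix.
  local len=$(( len1 < len2 ? len1 : len2 ))
  local mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip1") & mask )) -eq $(( $(ip_to_int "$ip2") & mask )) ]
}

# Check one candidate subnet against the three AIO networks from the table.
check_against_aio() {
  local net
  for net in 172.29.236.0/22 172.29.240.0/22 172.29.244.0/22; do
    if cidr_overlap "$1" "$net"; then
      echo "COLLISION: $1 overlaps $net"
      return 1
    fi
  done
  echo "OK: $1"
}
```

For example, `check_against_aio 10.101.0.0/24` prints `OK: 10.101.0.0/24`, while a wide pool such as 172.16.0.0/12 is reported as a collision because it contains all three AIO networks.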
Setup
The official OpenStack AIO guide is at:
- https://docs.openstack.org/openstack-ansible/latest/user/aio/quickstart.html
- PDF: https://docs.openstack.org//openstack-ansible/latest/doc-openstack-ansible.pdf
  - go to page 71 for the start of the AIO Guide
General recommended requirements are listed at:
- https://docs.openstack.org/openstack-ansible/latest/user/aio/quickstart.html
  - 8 cores (VS Professional subscribers have only 4)
  - 80GB free disk space
  - 16GB RAM
In Azure we have to select a VM size that supports nested virtualization:
- https://azure.microsoft.com/en-us/blog/nested-virtualization-in-azure/
- The closest match is Standard E2s v3 (4 vcpus, 16 GiB memory)
Here is an example script that you can run in Azure Shell. However, you have to change:
- /subscriptions/ID/resourceGroups/ID/providers/Microsoft.Network/virtualNetworks/ID/subnets/ID to your Subnet ID
- hp_vm2.pub to your PUBLIC SSH key, which you have to upload to your Azure Shell
Here is the script:
#!/bin/bash
set -ue -o pipefail
# Your SubNet ID
subnet=/subscriptions/ID/resourceGroups/ID/providers/Microsoft.Network/virtualNetworks/ID/subnets/ID
ssh_key_path=`pwd`/hp_vm2.pub
rg=OsAioRG
loc=germanywestcentral
vm=openstack-aio
# URN from command:
# az vm image list --all -l germanywestcentral -f 0001-com-ubuntu-server-focal -p canonical -s 20_04-lts-gen2 -o table
image=Canonical:0001-com-ubuntu-server-focal:20_04-lts-gen2:latest
set -x
az group create -l $loc -n $rg
az vm create -g $rg -l $loc \
--image $image \
--nsg-rule NONE \
--subnet $subnet \
--public-ip-address "" \
--storage-sku Premium_LRS \
--size Standard_E2s_v3 \
--os-disk-size-gb 80 \
--ssh-key-values $ssh_key_path \
--admin-username azureuser \
-n $vm
exit 0
Then you can log in with your private SSH key as user azureuser.
Once you are logged in:
- verify that KVM is available:
ls -l /dev/kvm
crw-rw---- 1 root kvm 10, 232 Nov 17 08:19 /dev/kvm
- verify that the main disk is 50GB+:
df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/root 78G 1.4G 77G 2% /
- now prepare the system for the OpenStack/Ansible install:
df -h /
sudo apt-get update
sudo apt-get dist-upgrade
# reboot if any system component (library or kernel)
# was upgraded using:
sudo init 6
- ensure that the firewall is inactive:
$ sudo ufw status
Status: inactive
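The KVM and disk checks above could be wrapped into a small preflight helper. This is only a sketch: the function names are hypothetical, the 50GB threshold is taken from the list above, and `root_size_gb` assumes GNU `df` as shipped with Ubuntu.

```shell
#!/bin/bash
# Succeed when the given path (default /dev/kvm) is a character device.
have_kvm() { [ -c "${1:-/dev/kvm}" ]; }

# Print the size of the root filesystem in whole GiB (GNU df only).
root_size_gb() { df -BG --output=size / | tail -n 1 | tr -dc '0-9'; }

# Succeed when size $1 (GiB) reaches the minimum $2 (default 50).
enough_disk() { [ "$1" -ge "${2:-50}" ]; }

# Run both checks and report every failure instead of stopping at the first.
preflight() {
  local rc=0
  have_kvm || { echo "FAIL: /dev/kvm is missing (no nested virtualization?)"; rc=1; }
  enough_disk "$(root_size_gb)" 50 || { echo "FAIL: root disk is below 50GB"; rc=1; }
  return "$rc"
}
```

Calling `preflight` on the fresh VM should print nothing and return 0 when both conditions hold. The firewall state is intentionally left out, because `ufw status` needs root.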
Finally we can follow the installation guide (the official AIO quickstart linked above).
Invoke:
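sudo bash # do everything as root
# clone the repository first (command taken from the official AIO quickstart)
git clone https://opendev.org/openstack/openstack-ansible /opt/openstack-ansible
cd /opt/openstack-ansible
# find latest tag
git checkout master
git describe --abbrev=0 --tags
23.0.0.0rc1
# Hmm, I do not like rc1 - prefer to look up the last major release - 22
git tag -l | grep 22
22.3.3
git checkout 22.3.3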
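Picking the tag by eye works, but `grep 22` also matches any other tag that merely contains "22". A possible stricter filter is sketched below; `latest_stable_tag` is a hypothetical helper, and it assumes that stable releases are tagged with purely numeric, dot-separated versions while pre-releases carry a suffix such as rc1.

```shell
#!/bin/bash
# Read tag names on stdin and print the highest purely numeric tag.
# Tags with a suffix (for example 23.0.0.0rc1) are skipped.
# Optional $1 restricts the search to one major series, e.g. 22.
latest_stable_tag() {
  local major=${1:-[0-9]+}
  grep -E "^${major}(\.[0-9]+)+\$" | sort -V | tail -n 1
}

# Usage inside /opt/openstack-ansible:
#   git tag -l | latest_stable_tag 22
```

`sort -V` orders version strings numerically, so 22.3.10 would be ranked above 22.3.3.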
NOTE: The up-to-date version of the AIO User Guide is in:
/opt/openstack-ansible/doc/source/user/aio/quickstart.rst
Now I strongly recommend that you:
- exit the root session with Ctrl-d
- run tmux
- inside tmux run:
sudo bash
cd /opt/openstack-ansible/
- first bootstrap the Ansible provisioning tool:
scripts/bootstrap-ansible.sh
- evaluate Bootstrap options from file
- possible example - local Squid proxy:
- https://opendev.org/openstack/openstack-ansible/commit/c73091967d8db449cd5f0fea4472d6a8e1b2ab22
- https://opendev.org/openstack/openstack-ansible/commit/9b1f331d9fff362cad74ea46b67e2ea8e162ddeb
# run local squid proxy/cache
export SCENARIO="aio_proxy"
# NOT YET TESTED!
- now bootstrap the OpenStack AIO playbooks:
scripts/bootstrap-aio.sh
- it should end with something like:
EXIT NOTICE [Playbook execution success] **
And now the tough part: actually deploy OpenStack AIO using the Ansible playbooks:
cd /opt/openstack-ansible/playbooks
openstack-ansible setup-hosts.yml
openstack-ansible setup-infrastructure.yml
openstack-ansible setup-openstack.yml
WARNING!
- Sometimes there is a transient repository error. Running the same playbook again usually helps.
- Sometimes it is necessary to run the whole sequence of the three playbooks above again.
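Since re-running a playbook usually cures the transient errors, the retries can be automated. The `retry` function below is a generic sketch, not something provided by OpenStack-Ansible; the attempt count and delay are arbitrary assumptions.

```shell
#!/bin/bash
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as the command succeeds, 1 when all attempts failed.
retry() {
  local attempts=$1 delay=$2 n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    echo "attempt $n failed, retrying in ${delay}s: $*" >&2
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage (as root):
#   cd /opt/openstack-ansible/playbooks
#   for pb in setup-hosts.yml setup-infrastructure.yml setup-openstack.yml; do
#     retry 3 30 openstack-ansible "$pb" || break
#   done
```

This only covers the first warning. If the whole sequence has to be repeated, wrap the three calls in a function and pass that function to `retry`.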
Please note that Ubuntu is quite confusing concerning containers:
- the lxc command is used for LXD containers (Ubuntu's container flavour)
- the lxc-* commands are used for original LXC containers
- OpenStack AIO uses regular LXC, because those are available on most Linux distributions
Now we should verify our AIO environment using:
Here is the list of all LXC containers on my Host VM (for reference):
# Run as root!
sudo lxc-ls -f
NAME STATE AUTOSTART GROUPS IPV4 IPV6 UNPRIVILEGED
aio1_cinder_api_container-b5f7b5c8 RUNNING 1 onboot, openstack 10.255.255.45, 172.29.236.247, 172.29.247.65 - false
aio1_galera_container-3b4ffdc8 RUNNING 1 onboot, openstack 10.255.255.155, 172.29.237.175 - false
aio1_glance_container-5b698614 RUNNING 1 onboot, openstack 10.255.255.252, 172.29.237.60, 172.29.247.23 - false
aio1_horizon_container-65a14e29 RUNNING 1 onboot, openstack 10.255.255.163, 172.29.238.255 - false
aio1_keystone_container-af53eca3 RUNNING 1 onboot, openstack 10.255.255.46, 172.29.236.236 - false
aio1_memcached_container-75d7321a RUNNING 1 onboot, openstack 10.255.255.234, 172.29.236.78 - false
aio1_neutron_server_container-a3543bd4 RUNNING 1 onboot, openstack 10.255.255.57, 172.29.237.108 - false
aio1_nova_api_container-17f37a69 RUNNING 1 onboot, openstack 10.255.255.127, 172.29.237.23 - false
aio1_placement_container-1415e982 RUNNING 1 onboot, openstack 10.255.255.149, 172.29.237.98 - false
aio1_rabbit_mq_container-7e2a2fbb RUNNING 1 onboot, openstack 10.255.255.84, 172.29.238.145 - false
aio1_repo_container-fff27c8a RUNNING 1 onboot, openstack 10.255.255.134, 172.29.236.110 - false
aio1_utility_container-671f4ded RUNNING 1 onboot, openstack 10.255.255.227, 172.29.237.191 - false
Now we have to follow the guide and log in to the utility container (the last one):
# run as ordinary user, all LXC commands must be run as root
sudo lxc-attach -n `sudo lxc-ls -1 | grep utility | head -n 1`
# now we are in LXC container:
source ~/openrc
openstack user list --domain default
+----------------------------------+-----------+
| ID | Name |
+----------------------------------+-----------+
| 262e2c58ceb047b6af3d4a0d17eb2833 | admin |
| 04f9e30e9692428b9592f5a203b59abe | placement |
| 146daea75bbe497d9979f00118a0db93 | glance |
| 07c43dc44ab04f0286cfd0a5a1af3d3e | cinder |
| 5e052a22e1cd4f37993c90fceac31a80 | nova |
| f03f59b39d7d430e8302eea7bd2e5457 | neutron |
| ac1babb3f5f948e4b2bb61cac2ff8367 | demo |
| 9921be52d2f445b3897c9cf13e54db70 | alt_demo |
+----------------------------------+-----------+
openstack endpoint list
# should dump a lot of data
openstack compute service list
# must also list `nova-compute`, the service that will run VMs in OpenStack:
+----+----------------+----------------------------------+----------+---------+-------+----------------------------+
| ID | Binary | Host | Zone | Status | State | Updated At |
+----+----------------+----------------------------------+----------+---------+-------+----------------------------+
| 3 | nova-conductor | aio1-nova-api-container-17f37a69 | internal | enabled | up | 2021-11-17T15:05:06.000000 |
| 4 | nova-scheduler | aio1-nova-api-container-17f37a69 | internal | enabled | up | 2021-11-17T15:05:00.000000 |
| 5 | nova-compute | aio1 | nova | enabled | up | 2021-11-17T15:05:00.000000 |
+----+----------------+----------------------------------+----------+---------+-------+----------------------------+
openstack network agent list
+--------------------------------------+--------------------+------+-------------------+-------+-------+---------------------------+
| ID | Agent Type | Host | Availability Zone | Alive | State | Binary |
+--------------------------------------+--------------------+------+-------------------+-------+-------+---------------------------+
| 51621e1c-df19-4072-bec7-03628a499fcc | DHCP agent | aio1 | nova | :-) | UP | neutron-dhcp-agent |
| 67bedb81-2764-4fa4-87ac-f6d09b6d1b75 | L3 agent | aio1 | nova | :-) | UP | neutron-l3-agent |
| 6e1bd23e-fe56-48f0-ab67-73ff4dcc112e | Linux bridge agent | aio1 | None | :-) | UP | neutron-linuxbridge-agent |
| 8bf8555b-b714-4d22-8118-fb282b6769c2 | Metering agent | aio1 | None | :-) | UP | neutron-metering-agent |
| bc626b1a-e41d-4366-afe6-9601dd02d342 | Metadata agent | aio1 | None | :-) | UP | neutron-metadata-agent |
+--------------------------------------+--------------------+------+-------------------+-------+-------+---------------------------+
openstack volume service list
+------------------+------------------------------------+------+---------+-------+----------------------------+
| Binary | Host | Zone | Status | State | Updated At |
+------------------+------------------------------------+------+---------+-------+----------------------------+
| cinder-volume | aio1@lvm | nova | enabled | up | 2021-11-17T15:06:32.000000 |
| cinder-scheduler | aio1-cinder-api-container-b5f7b5c8 | nova | enabled | up | 2021-11-17T15:06:36.000000 |
+------------------+------------------------------------+------+---------+-------+----------------------------+
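Reading the State column by eye is fine for one run. For repeated checks, a small filter can do it; the sketch below uses a hypothetical helper `all_services_up` and assumes the machine-readable `-f value -c ...` output of the OpenStack client, which prints the selected columns separated by spaces.

```shell
#!/bin/bash
# Read "<binary> <state>" lines on stdin. Print every service whose state
# is not "up" and fail if there is at least one, or if the input is empty.
all_services_up() {
  awk '
    { seen = 1 }
    $2 != "up" { print "NOT UP: " $1; bad = 1 }
    END { exit (bad || !seen) ? 1 : 0 }
  '
}

# Usage in the utility container (after `source ~/openrc`):
#   openstack compute service list -f value -c Binary -c State | all_services_up
#   openstack volume service list -f value -c Binary -c State | all_services_up
```

Empty input is treated as a failure on purpose, so that a broken client call does not look like a healthy cloud.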
Launching our 1st instance
Ensure that you are still logged in to the utility container as described in the previous chapter.
Now we will follow:
- https://docs.openstack.org/mitaka/install-guide-ubuntu/launch-instance-provider.html
To launch our first OpenStack VM:
- first we have to find the available VM flavours (a combination of CPU, RAM and possibly a local temporary disk):
root@aio1-utility-container-671f4ded:/# openstack flavor list
+-----+----------+-----+------+-----------+-------+-----------+
| ID | Name | RAM | Disk | Ephemeral | VCPUs | Is Public |
+-----+----------+-----+------+-----------+-------+-----------+
| 201 | tempest1 | 256 | 1 | 0 | 1 | True |
| 202 | tempest2 | 512 | 1 | 0 | 1 | True |
+-----+----------+-----+------+-----------+-------+-----------+
- note the flavour name tempest2
- now we have to find the available images:
root@aio1-utility-container-671f4ded:/# openstack image list
+--------------------------------------+--------+--------+
| ID | Name | Status |
+--------------------------------------+--------+--------+
| 646f2a2f-fa73-4915-82ed-08c56399bb04 | cirros | active |
+--------------------------------------+--------+--------+
- note cirros (this is a very small Linux image automatically installed by OpenStack AIO)
- now we have to find the available networks:
root@aio1-utility-container-671f4ded:/# openstack network list
+--------------------------------------+---------+--------------------------------------+
| ID | Name | Subnets |
+--------------------------------------+---------+--------------------------------------+
| 09336f60-f8af-42fe-9dba-40ca51d23842 | private | c28f9ff5-bb04-40b7-a164-e82897aed1e6 |
| e3c5a29b-0826-407d-a010-b7b55d4642ae | public | 61694ee8-622b-4c26-bf33-103afe78a84b |
+--------------------------------------+---------+--------------------------------------+
- note the public network
- now we have to find our OpenStack username and project_name:
echo "OS_USERNAME=$OS_USERNAME OS_PROJECT_NAME=$OS_PROJECT_NAME"
OS_USERNAME=admin OS_PROJECT_NAME=admin
- note username admin and project admin
- now we have to find the ID of project admin (this step is necessary for the admin user only, because it can see all security groups):
openstack project show -c id -f value admin
9ff16514f2bd4346b9c0c5f50253a172
- note project_id 9ff16514f2bd4346b9c0c5f50253a172
- now we have to find the security group in our project:
openstack security group list --project 9ff16514f2bd4346b9c0c5f50253a172
+--------------------------------------+---------+------------------------+----------------------------------+------+
| ID | Name | Description | Project | Tags |
+--------------------------------------+---------+------------------------+----------------------------------+------+
| 034ca1f9-6e15-4236-9e2f-e1fd8fe1edef | default | Default security group | 9ff16514f2bd4346b9c0c5f50253a172 | [] |
+--------------------------------------+---------+------------------------+----------------------------------+------+
- note default (or 034ca1f9-6e15-4236-9e2f-e1fd8fe1edef) as the security group ID
- NOW the hackish part: allow SSH connections in the default security group
  - do NOT do this in production!
  - from: https://docs.openstack.org/ocata/user-guide/cli-nova-configure-access-security-for-instances.html#create-and-manage-security-group-rules
openstack security group rule \
  create 034ca1f9-6e15-4236-9e2f-e1fd8fe1edef \
  --protocol tcp --dst-port 22:22 --remote-ip 0.0.0.0/0
- finally we must list and/or create the SSH keypair that will be used for login:
openstack keypair list
# oops no keypair
- now, still in the container, create an SSH key pair:
ssh-keygen
# press ENTER to answer all questions
- back in the utility container:
# source this so you can use TAB to expand OpenStack commands
source /etc/bash_completion
# register our new keypair
openstack keypair create --public-key ~/.ssh/id_rsa.pub admin_kp
- finally we should be able to launch the VM!
openstack server create --flavor tempest2 --image cirros \
  --nic net-id=e3c5a29b-0826-407d-a010-b7b55d4642ae \
  --security-group 034ca1f9-6e15-4236-9e2f-e1fd8fe1edef \
  --key-name admin_kp admin-vm1
- now poll the command openstack server list until the VM is ACTIVE:
openstack server list
+--------------------------------------+-----------+--------+-----------------------+--------+----------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+-----------+--------+-----------------------+--------+----------+
| 18e921c9-df58-47db-ad81-c6e00239a072 | admin-vm1 | ACTIVE | public=172.29.249.183 | cirros | tempest2 |
+--------------------------------------+-----------+--------+-----------------------+--------+----------+
- now you can try to log in from the utility container using:
ssh -i ~/.ssh/id_rsa cirros@172.29.249.183
# you should be logged in without any password, using the key
- the same SSH connection should also work from your HOST; however, you need to copy the private SSH key ~/.ssh/id_rsa there from the LXC container.
  - on the host you can find that key at a path like this (you must be root):
/var/lib/lxc/aio1_utility_container-671f4ded/rootfs/root/.ssh/id_rsa
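The manual polling of `openstack server list` in the steps above can be replaced by a loop. The following is a sketch with a hypothetical helper `wait_for_status`; the poll count and interval in the usage line are arbitrary assumptions, and the early exit on ERROR is an addition that is not part of the original steps.

```shell
#!/bin/bash
# Poll a status command until it prints the wanted value.
# $1 = wanted status, $2 = maximum number of polls, $3 = seconds between
# polls, remaining arguments = command that prints the current status.
# Returns 0 on success, 1 on timeout, 2 when the status becomes ERROR.
wait_for_status() {
  local want=$1 tries=$2 delay=$3 status i
  shift 3
  for ((i = 1; i <= tries; i++)); do
    status=$("$@")
    [ "$status" = "$want" ] && return 0
    # A server that failed to build stays in ERROR; waiting will not help.
    [ "$status" = "ERROR" ] && { echo "server entered ERROR state" >&2; return 2; }
    sleep "$delay"
  done
  echo "timed out, last status: $status" >&2
  return 1
}

# Usage in the utility container:
#   wait_for_status ACTIVE 30 5 openstack server show -f value -c status admin-vm1
```

Passing the status command as arguments keeps the loop independent of the OpenStack client, so it can be tried out with any command that prints a single word.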
A dirty trick to get a console to a VM without an IP address (good for diagnostics):
- on your HOST run this command:
sudo virsh list --all
Id Name State
-----------------------------------
2 instance-00000002 running
- connect from your host directly to the console:
sudo virsh console instance-00000002
Connected to domain instance-00000002
Escape character is ^]
login as 'cirros' user. default password: 'gocubsgo'. use 'sudo' for root.
admin-vm1 login: cirros
Password:
$ ip a
- use Ctrl-] to escape the console
That's it!
Limitations
- currently VMs can be accessed from the Host only
- even if you succeed in configuring a bridge for the Host's eth0, it will not help you much, because Azure (and AWS, GCE...) drops all sent packets with an unknown MAC address and/or IP address (and every VM in OpenStack has a new MAC address and IP address). The only way to make it work is to register each such MAC/IP in Azure (which is tedious work), or to set up some kind of VPN access to the OpenStack Host.
Bonus: Deploying Juju to OpenStack AIO
Here we will show how to deploy Juju and its sample applications (called Charms) to OpenStack AIO.
What is Juju? It is a universal container deployer - it supports public clouds as well as selected private clouds and bare metal (using MAAS).
WARNING! This is a work in progress.
We will basically follow
First connect to your OpenStack AIO HOST and install Juju using snap:
sudo snap install juju --classic
Here is my version:
$ snap list juju
Name Version Rev Tracking Publisher Notes
juju 2.9.18 17625 latest/stable canonical✓ classic
Now you need to find your Public API endpoint, using this command:
$ ip a s dev eth0 | fgrep 'inet '
inet 10.101.0.8/24 brd 10.101.0.255 scope global eth0
So my public API endpoint should be this one:
- https://10.101.0.8:5000
In my case I can verify it using:
# install jq to have fancy output of json
sudo apt-get install jq
curl -fsS --cacert /etc/ssl/certs/haproxy.cert https://10.101.0.8:5000/v3 | jq
It should produce something like:
{
"version": {
"id": "v3.14",
"status": "stable",
"updated": "2020-04-07T00:00:00Z",
"links": [
{
"rel": "self",
"href": "https://10.101.0.8:5000/v3/"
}
],
"media-types": [
{
"base": "application/json",
"type": "application/vnd.openstack.identity-v3+json"
}
]
}
}
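Deriving the endpoint URL from the eth0 address can be automated as well. This is a small sketch with a hypothetical helper `keystone_url`; it assumes the `ip` output format shown above and port 5000 as used in this setup.

```shell
#!/bin/bash
# Read `ip a s dev eth0` output on stdin and print the Keystone URL
# built from the first IPv4 address found.
keystone_url() {
  awk '$1 == "inet" { sub(/\/.*/, "", $2); print "https://" $2 ":5000"; exit }'
}

# Usage on the host:
#   url=$(ip a s dev eth0 | keystone_url)
#   curl -fsS --cacert /etc/ssl/certs/haproxy.cert "$url/v3" | jq
```

Only lines whose first field is exactly `inet` are considered, so IPv6 (`inet6`) addresses are ignored.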
Now you need to extract the OpenStack admin credentials and other settings. The easiest way is to cat or copy this file as root:
sudo bash
# you must be root before invoking this command, otherwise the '*' expansion does not take place
cat /var/lib/lxc/aio1_utility_container-*/rootfs/root/openrc
Here is example snippet:
export OS_AUTH_URL=http://172.29.236.101:5000/v3
export OS_USERNAME=admin
export OS_PASSWORD=SOME_HARD_TO_GUESS_PASSWORD
export OS_REGION_NAME=RegionOne
Then create Juju cloud:
juju add-cloud --local
Select cloud type: openstack
Enter a name for your openstack cloud: os-aio
### replace 10.101.0.8 with your eth0 IP address:
Enter the API endpoint url for the cloud []: https://10.101.0.8:5000
### Notice this CA path - specific for OpenStack AIO!!!
Enter a path to the CA certificate for your cloud if one is required to access it.
(optional) [none]: /etc/ssl/certs/haproxy.cert
Select one or more auth types separated by commas: userpass
Enter region name: RegionOne
Enter the API endpoint url for the region [use cloud api url]:
Enter another region? (y/N):
Now we have to create Juju credentials for our cloud os-aio:
juju credentials
juju add-credential os-aio
Enter credential name: os-aio-creds
Select region [any region, credential is not region specific]: RegionOne
Using auth-type "userpass".
Enter username: admin
Enter password: XXXXXXXX
Enter tenant-name (optional): admin
Enter tenant-id (optional):
Enter version (optional):
Enter domain-name (optional):
Enter project-domain-name (optional):
Enter user-domain-name (optional):
Now the tricky part - try to bootstrap:
juju bootstrap os-aio
ERROR cannot determine available auth versions
auth options fetching failed
caused by: request available auth options:
failed executing the request https://10.101.0.8:5000/
caused by: Get "https://10.101.0.8:5000/":
x509: cannot validate certificate for 10.101.0.8 because it doesn't contain any IP SANs
This is because the Go language is very strict here (it ignores the CN part of the certificate subject and accepts SAN only - Google enjoys breaking things). Please see the Troubleshooting section at the end of this document for the solution.
Then try again:
juju bootstrap os-aio
Creating Juju controller "os-aio-regionone" on os-aio/RegionOne
Looking for packaged Juju agent version 2.9.18 for amd64
Located Juju agent version 2.9.18-ubuntu-amd64 at
https://streams.canonical.com/juju/tools/agent/2.9.18/juju-2.9.18-ubuntu-amd64.tgz
Launching controller instance(s) on os-aio/RegionOne...
ERROR failed to bootstrap model: cannot start bootstrap instance:
no metadata for "focal" images in RegionOne with arches [amd64]
It seems that the solution is here:
- https://discourse.charmhub.io/t/how-to-deploy-openstack-on-openstack-and-bootstrap-a-juju-env-on-top-of-it/4189
TODO: Test it.
Troubleshooting
Golang (Juju) refuses HAProxy certificate
When you try juju bootstrap you will likely encounter a fatal error saying that the HAProxy certificate does not contain a SAN for its IP address.
To fix it you have to apply the following patch:
--- /etc/ansible/roles/haproxy_server/tasks/haproxy_ssl_key_create.yml.orig 2021-11-18 16:07:52.062451169 +0000
+++ /etc/ansible/roles/haproxy_server/tasks/haproxy_ssl_key_create.yml 2021-11-18 16:09:09.255140413 +0000
@@ -30,6 +30,7 @@
openssl req -new -nodes -sha256 -x509 -subj
"{{ haproxy_ssl_self_signed_subject }}"
-days 3650
+ -addext "subjectAltName = IP.1:{{ external_lb_vip_address }}"
-keyout {{ haproxy_ssl_key }}
-out {{ haproxy_ssl_cert }}
-extensions v3_ca
And run as root on HOST:
cd /opt/openstack-ansible/playbooks
openstack-ansible -e "haproxy_ssl_self_signed_regen=true" haproxy-install.yml
Ansible should report changed=4 or something similar.
To verify that the HAProxy certificate contains the SAN, try:
openssl x509 -in /etc/ssl/certs/haproxy.cert -text | \
fgrep -A 1 'X509v3 Subject Alternative Name'
X509v3 Subject Alternative Name:
IP Address:10.101.0.8
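The same verification can be turned into a yes/no check. The helper `has_ip_san` below is hypothetical; it assumes the text layout printed by `openssl x509 -text`, where the SAN entries follow on the line after the extension name and are separated by commas.

```shell
#!/bin/bash
# Read `openssl x509 -text` output on stdin. Succeed when the
# Subject Alternative Name extension lists exactly the IP address $1.
has_ip_san() {
  grep -A 1 'X509v3 Subject Alternative Name' |
    tr ',' '\n' |          # one SAN entry per line
    tr -d ' \t' |          # "IP Address:x" becomes "IPAddress:x"
    grep -qxF "IPAddress:$1"
}

# Usage on the host:
#   openssl x509 -in /etc/ssl/certs/haproxy.cert -noout -text | has_ip_san 10.101.0.8
```

The whole-line, fixed-string match (`grep -xF`) avoids false positives such as 10.101.0.80 matching a search for 10.101.0.8.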
The original certificate has the name only in the Subject, which is not enough for Golang.
More information available on:
- https://bugs.launchpad.net/openstack-ansible/+bug/1783121
- https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/584857/
- https://github.com/levitte/openssl/blame/43890d7fc963d8c5ec3084dcf6c6c20b5efcaa7f/doc/man1/req.pod#L603
LXC network fails with error
You will find this error in /var/log/syslog:
Nov 21 15:37:37 aio1 dnsmasq[2906]: failed to create listening socket
for 10.255.255.1: Cannot assign requested address
Nov 21 15:37:37 aio1 dnsmasq[2906]: FAILED to start up
The cause was surprising:
- a broken link /etc/resolv.conf
It helped to restart this systemd service:
sudo systemctl restart systemd-resolved
# check that /etc/resolv.conf links to an existing file
cat /etc/resolv.conf
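To detect the broken link without reading the file, a simple test can be used. The function name `is_broken_symlink` is made up for this sketch.

```shell
#!/bin/bash
# Succeed when $1 is a symbolic link whose target does not exist.
is_broken_symlink() { [ -L "$1" ] && [ ! -e "$1" ]; }

# Usage on the host:
#   if is_broken_symlink /etc/resolv.conf; then
#     sudo systemctl restart systemd-resolved
#   fi
```

`-L` checks the link itself while `-e` follows it, so the combination is true only for a dangling link.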
And then, to be safe, I rebuilt the whole AIO network with:
sudo bash
/usr/local/bin/lxc-system-manage system-force-rebuild
Tips
For reboot there are special instructions: