OpenStack AIO in Azure - hpaluch/hpaluch.github.io GitHub Wiki

OpenStack AIO in Azure

[!WARNING]

I'm currently using my single-script setup; see https://github.com/hpaluch/osfs for details. Sometimes I also use DevStack; see DevStack 2 interfaces for details.

How to deploy an experimental OpenStack All-in-One (AIO) in Azure.

Requirements:

  • a suitable Azure account - it must have access to VMs with nested virtualization enabled

    • a Visual Studio Professional Subscription is fine
  • verify that your Azure Subnet (and possibly VPN Gateway Subnet) does not collide with the OpenStack AIO subnets. According to https://docs.openstack.org/openstack-ansible/latest/user/network-arch/example.html these subnets are used by OpenStack AIO (a few more networks are actually reserved):

Network             CIDR
Management Network  172.29.236.0/22
Overlay Network     172.29.240.0/22
Storage Network     172.29.244.0/22

My Azure Subnet starts with 10., so it is fine. My Azure VPN Gateway Subnet (click on your VPN Gateway -> Point-to-site Configuration and look up the Address Pool) starts with 172.16., so it is OK as well, provided its prefix does not extend into 172.29. (a 172.16.0.0/12 pool would).
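Comparing prefixes by eye works for simple cases, but the check can also be scripted. The following is a minimal sketch in plain bash that tests whether two IPv4 CIDR blocks share any address. The function names and the ranges in `my_nets` are illustrative assumptions (10.101.0.0/24 matches the eth0 address shown later on this page, the VPN pool is hypothetical); replace them with your own values.

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local IFS=.
    local -a o
    read -r -a o <<< "$1"
    echo $(( (o[0] << 24) | (o[1] << 16) | (o[2] << 8) | o[3] ))
}

# Print "first last" (as integers) for a CIDR block such as 10.0.0.0/8.
cidr_range() {
    local ip=${1%/*} bits=${1#*/}
    local base mask
    base=$(ip_to_int "$ip")
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    echo "$(( base & mask )) $(( (base & mask) | (~mask & 0xFFFFFFFF) ))"
}

# Succeed (exit 0) when the two CIDR blocks share at least one address.
cidr_overlap() {
    local a_first a_last b_first b_last
    read -r a_first a_last <<< "$(cidr_range "$1")"
    read -r b_first b_last <<< "$(cidr_range "$2")"
    (( a_first <= b_last && b_first <= a_last ))
}

# The three AIO networks listed above.
aio_nets="172.29.236.0/22 172.29.240.0/22 172.29.244.0/22"
# Hypothetical Azure-side ranges; replace them with your own.
my_nets="10.101.0.0/24 172.16.0.0/24"

for mine in $my_nets; do
    for aio in $aio_nets; do
        if cidr_overlap "$mine" "$aio"; then
            echo "COLLISION: $mine overlaps $aio"
        fi
    done
done
echo "check finished"
```

Two ranges overlap exactly when each one starts before the other ends, which is why the comparison needs only the first and last address of every block.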

Setup

Official OpenStack AIO guide is on:

In Azure we have to select VM that has Nested Virtualization Support:

Here is an example script that you can run in Azure Shell. However, you have to change:

  • /subscriptions/ID/resourceGroups/ID/providers/Microsoft.Network/virtualNetworks/ID/subnets/ID to your Subnet ID
  • hp_vm2.pub - to your PUBLIC ssh key that you have to upload to your Azure Shell

Here is that script:

#!/bin/bash

set -ue -o pipefail
# Your Subnet ID
subnet=/subscriptions/ID/resourceGroups/ID/providers/Microsoft.Network/virtualNetworks/ID/subnets/ID
# Path to your PUBLIC SSH key (uploaded to Azure Shell)
ssh_key_path="$(pwd)/hp_vm2.pub"

rg=OsAioRG
loc=germanywestcentral
vm=openstack-aio
# URN from command:
# az vm image list --all -l germanywestcentral -f 0001-com-ubuntu-server-focal -p canonical -s 20_04-lts-gen2 -o table
image=Canonical:0001-com-ubuntu-server-focal:20_04-lts-gen2:latest

# print every command before it is executed
set -x
az group create -l "$loc" -n "$rg"
# no public IP and no NSG rule: the VM is reachable from the subnet only
az vm create -g "$rg" -l "$loc" \
    --image "$image" \
    --nsg-rule NONE \
    --subnet "$subnet" \
    --public-ip-address "" \
    --storage-sku Premium_LRS \
    --size Standard_E2s_v3 \
    --os-disk-size-gb 80 \
    --ssh-key-values "$ssh_key_path" \
    --admin-username azureuser \
    -n "$vm"
exit 0
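The Subnet ID used in the script has a fixed shape, so it can be assembled from its four parts instead of being copied from the portal. This is only a sketch: the function name `subnet_id` and all four sample values are hypothetical placeholders.

```shell
#!/bin/bash
# Build an Azure subnet resource ID from its four components.
# Arguments: subscription ID, resource group, virtual network, subnet name.
subnet_id() {
    printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Network/virtualNetworks/%s/subnets/%s\n' \
        "$1" "$2" "$3" "$4"
}

# Hypothetical placeholder values; replace them with your own.
subnet=$(subnet_id 00000000-0000-0000-0000-000000000000 MyNetRG MyVnet MySubnet)
echo "$subnet"

# Alternatively, Azure CLI can print the ID of an existing subnet
# (same placeholder names as above):
# az network vnet subnet show -g MyNetRG --vnet-name MyVnet -n MySubnet --query id -o tsv
```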
After the VM is created, you can log in with your private SSH key as azureuser.

Once you are logged in:

  • verify that KVM is available:
    ls -l /dev/kvm
    
    crw-rw---- 1 root kvm 10, 232 Nov 17 08:19 /dev/kvm
    
  • verify that main disk is 50GB+:
    df -h /
    
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/root        78G  1.4G   77G   2% /
    
  • now we will prepare system for OpenStack/Ansible install:
    df -h /
    sudo apt-get update
    sudo apt-get dist-upgrade
    # if any system component (library or kernel) was upgraded,
    # reboot using:
    sudo init 6
    
  • ensure that firewall is inactive:
    $ sudo ufw status
    
    Status: inactive
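The KVM and disk checks above can be combined into one small preflight script. This is a sketch under assumptions: the function names are made up for illustration, the 50 GB threshold comes from the check above, and `df --output` requires GNU df (as shipped with Ubuntu).

```shell
#!/bin/bash
# Succeed when the given size in GB meets the minimum (default 50 GB).
disk_big_enough() {
    local size_gb=$1 min_gb=${2:-50}
    (( size_gb >= min_gb ))
}

# Print the size of the root filesystem in whole GB (GNU df only).
root_size_gb() {
    df -BG --output=size / | tail -n 1 | tr -dc '0-9'
}

# Report each prerequisite; nothing is changed on the system.
preflight() {
    if [ -c /dev/kvm ]; then
        echo "kvm: ok"
    else
        echo "kvm: MISSING (is nested virtualization available on this VM size?)"
    fi
    if disk_big_enough "$(root_size_gb)"; then
        echo "disk: ok"
    else
        echo "disk: too small"
    fi
}
```

Call `preflight` on the freshly created VM before starting the installation; it only reads system state.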
    

Finally we can follow installation guide from:

Invoke:

sudo bash # do everything as root


# find latest tag
git checkout master
git describe --abbrev=0 --tags

23.0.0.0rc1

# Hmm, I do not like rc1 - I prefer to look up the last release of the previous major version (22)
git tag -l | grep 22

22.3.3

git checkout 22.3.3
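Picking the newest non-rc tag can be automated. The sketch below filters a tag list on stdin and keeps only purely numeric versions; the function name `latest_stable_tag` is an illustrative assumption, and it expects the repository to be already cloned to /opt/openstack-ansible (the path used later on this page).

```shell
#!/bin/bash
# Read tag names on stdin and print the highest one that consists only of
# dot-separated numbers, which skips pre-releases such as 23.0.0.0rc1.
# An optional argument restricts the search to one major version.
latest_stable_tag() {
    local major=${1:-[0-9]+}
    grep -E "^${major}(\.[0-9]+)+\$" | sort -V | tail -n 1
}

# Usage inside the cloned repository (not executed here):
# tag=$(git -C /opt/openstack-ansible tag -l | latest_stable_tag 22)
# git -C /opt/openstack-ansible checkout "$tag"
```

`sort -V` (GNU coreutils) compares version numbers field by field, so 22.3.10 sorts after 22.3.3, which a plain text sort would get wrong.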

NOTE: The up-to-date version of the AIO User Guide is in:

/opt/openstack-ansible/doc/source/user/aio/quickstart.rst

Now I strongly recommend

EXIT NOTICE [Playbook execution success] **

And now the tough part - actually deploying OpenStack AIO using Ansible playbooks:

cd /opt/openstack-ansible/playbooks
openstack-ansible setup-hosts.yml
openstack-ansible setup-infrastructure.yml
openstack-ansible setup-openstack.yml

WARNING!

  • Sometimes there is a transient repository error. Running the same playbook again usually helps.
  • Sometimes it is necessary to rerun the whole sequence of the three playbooks above.
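Since rerunning a failed playbook usually helps, the reruns can be wrapped in a small retry helper. This is a sketch only: the `retry` function, the attempt count and the delay are assumptions, not part of OpenStack-Ansible.

```shell
#!/bin/bash
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 on the first success, 1 when every attempt has failed.
retry() {
    local attempts=$1 delay=$2 n
    shift 2
    for (( n = 1; n <= attempts; n++ )); do
        "$@" && return 0
        echo "attempt $n/$attempts failed: $*" >&2
        (( n < attempts )) && sleep "$delay"
    done
    return 1
}

# Usage with the three playbooks (not executed here):
# cd /opt/openstack-ansible/playbooks
# for pb in setup-hosts.yml setup-infrastructure.yml setup-openstack.yml; do
#     retry 3 30 openstack-ansible "$pb" || break
# done
```

The loop stops at the first playbook that keeps failing, because the later playbooks depend on the earlier ones.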

Please note that Ubuntu is quite confusing concerning containers:

  • the lxc command is used for LXD containers (Ubuntu's container flavour)
  • the lxc-* commands are used for original LXC containers
  • OpenStack AIO uses regular LXC, because those are available on most Linux distributions

Now we should verify our AIO environment using:

Here is list of all LXC containers on my Host VM (for reference):

# Run as root!
sudo lxc-ls -f
NAME                                   STATE   AUTOSTART GROUPS            IPV4                                         IPV6 UNPRIVILEGED
aio1_cinder_api_container-b5f7b5c8     RUNNING 1         onboot, openstack 10.255.255.45, 172.29.236.247, 172.29.247.65 -    false
aio1_galera_container-3b4ffdc8         RUNNING 1         onboot, openstack 10.255.255.155, 172.29.237.175               -    false
aio1_glance_container-5b698614         RUNNING 1         onboot, openstack 10.255.255.252, 172.29.237.60, 172.29.247.23 -    false
aio1_horizon_container-65a14e29        RUNNING 1         onboot, openstack 10.255.255.163, 172.29.238.255               -    false
aio1_keystone_container-af53eca3       RUNNING 1         onboot, openstack 10.255.255.46, 172.29.236.236                -    false
aio1_memcached_container-75d7321a      RUNNING 1         onboot, openstack 10.255.255.234, 172.29.236.78                -    false
aio1_neutron_server_container-a3543bd4 RUNNING 1         onboot, openstack 10.255.255.57, 172.29.237.108                -    false
aio1_nova_api_container-17f37a69       RUNNING 1         onboot, openstack 10.255.255.127, 172.29.237.23                -    false
aio1_placement_container-1415e982      RUNNING 1         onboot, openstack 10.255.255.149, 172.29.237.98                -    false
aio1_rabbit_mq_container-7e2a2fbb      RUNNING 1         onboot, openstack 10.255.255.84, 172.29.238.145                -    false
aio1_repo_container-fff27c8a           RUNNING 1         onboot, openstack 10.255.255.134, 172.29.236.110               -    false
aio1_utility_container-671f4ded        RUNNING 1         onboot, openstack 10.255.255.227, 172.29.237.191               -    false

Now we have to follow the guide and log in to the utility container (the last one):

# run as an ordinary user; sudo is used because all LXC commands must be run as root
sudo lxc-attach -n "$(sudo lxc-ls -1 | grep utility | head -n 1)"
# now we are inside the LXC container:
source ~/openrc
openstack user list --domain default

+----------------------------------+-----------+
| ID                               | Name      |
+----------------------------------+-----------+
| 262e2c58ceb047b6af3d4a0d17eb2833 | admin     |
| 04f9e30e9692428b9592f5a203b59abe | placement |
| 146daea75bbe497d9979f00118a0db93 | glance    |
| 07c43dc44ab04f0286cfd0a5a1af3d3e | cinder    |
| 5e052a22e1cd4f37993c90fceac31a80 | nova      |
| f03f59b39d7d430e8302eea7bd2e5457 | neutron   |
| ac1babb3f5f948e4b2bb61cac2ff8367 | demo      |
| 9921be52d2f445b3897c9cf13e54db70 | alt_demo  |
+----------------------------------+-----------+

openstack endpoint list
# should dump a lot of data

openstack compute service list
# must also list `nova-compute`, which runs the VMs in OpenStack:
+----+----------------+----------------------------------+----------+---------+-------+----------------------------+
| ID | Binary         | Host                             | Zone     | Status  | State | Updated At                 |
+----+----------------+----------------------------------+----------+---------+-------+----------------------------+
|  3 | nova-conductor | aio1-nova-api-container-17f37a69 | internal | enabled | up    | 2021-11-17T15:05:06.000000 |
|  4 | nova-scheduler | aio1-nova-api-container-17f37a69 | internal | enabled | up    | 2021-11-17T15:05:00.000000 |
|  5 | nova-compute   | aio1                             | nova     | enabled | up    | 2021-11-17T15:05:00.000000 |
+----+----------------+----------------------------------+----------+---------+-------+----------------------------+


openstack network agent list
+--------------------------------------+--------------------+------+-------------------+-------+-------+---------------------------+
| ID                                   | Agent Type         | Host | Availability Zone | Alive | State | Binary                    |
+--------------------------------------+--------------------+------+-------------------+-------+-------+---------------------------+
| 51621e1c-df19-4072-bec7-03628a499fcc | DHCP agent         | aio1 | nova              | :-)   | UP    | neutron-dhcp-agent        |
| 67bedb81-2764-4fa4-87ac-f6d09b6d1b75 | L3 agent           | aio1 | nova              | :-)   | UP    | neutron-l3-agent          |
| 6e1bd23e-fe56-48f0-ab67-73ff4dcc112e | Linux bridge agent | aio1 | None              | :-)   | UP    | neutron-linuxbridge-agent |
| 8bf8555b-b714-4d22-8118-fb282b6769c2 | Metering agent     | aio1 | None              | :-)   | UP    | neutron-metering-agent    |
| bc626b1a-e41d-4366-afe6-9601dd02d342 | Metadata agent     | aio1 | None              | :-)   | UP    | neutron-metadata-agent    |
+--------------------------------------+--------------------+------+-------------------+-------+-------+---------------------------+


openstack volume service list
+------------------+------------------------------------+------+---------+-------+----------------------------+
| Binary           | Host                               | Zone | Status  | State | Updated At                 |
+------------------+------------------------------------+------+---------+-------+----------------------------+
| cinder-volume    | aio1@lvm                           | nova | enabled | up    | 2021-11-17T15:06:32.000000 |
| cinder-scheduler | aio1-cinder-api-container-b5f7b5c8 | nova | enabled | up    | 2021-11-17T15:06:36.000000 |
+------------------+------------------------------------+------+---------+-------+----------------------------+
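Instead of reading the State column by eye, the check can be scripted. The sketch below assumes the machine-friendly `-f value -c ...` output of the openstack client, which prints one "name state" pair per line; the helper name `all_up` is hypothetical.

```shell
#!/bin/bash
# Read "<name> <state>" pairs on stdin and succeed only when at least one
# line was seen and every state is "up" (compared case-insensitively).
all_up() {
    awk '
        { seen = 1 }
        tolower($NF) != "up" { print "not up: " $0; bad = 1 }
        END { if (!seen || bad) exit 1 }
    '
}

# Usage inside the utility container (not executed here):
# openstack compute service list -f value -c Binary -c State | all_up
# openstack volume service list -f value -c Binary -c State | all_up
```

An empty list counts as a failure on purpose, so a broken API call is not mistaken for a healthy deployment.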

Launching our 1st instance

Ensure that you are still logged in to the utility container as described in the previous chapter.

Now we will follow

  • https://docs.openstack.org/mitaka/install-guide-ubuntu/launch-instance-provider.html to launch our first OpenStack VM

  • first we have to find the available VM flavours (a combination of CPU, RAM and possibly a local temporary disk):

    root@aio1-utility-container-671f4ded:/#
    openstack flavor list
    
    +-----+----------+-----+------+-----------+-------+-----------+
    | ID  | Name     | RAM | Disk | Ephemeral | VCPUs | Is Public |
    +-----+----------+-----+------+-----------+-------+-----------+
    | 201 | tempest1 | 256 |    1 |         0 |     1 | True      |
    | 202 | tempest2 | 512 |    1 |         0 |     1 | True      |
    +-----+----------+-----+------+-----------+-------+-----------+
    
  • note flavour name tempest2

  • now we have to find available images:

    root@aio1-utility-container-671f4ded:/#
    openstack image list
    
    +--------------------------------------+--------+--------+
    | ID                                   | Name   | Status |
    +--------------------------------------+--------+--------+
    | 646f2a2f-fa73-4915-82ed-08c56399bb04 | cirros | active |
    +--------------------------------------+--------+--------+
    
  • note cirros (this is very small Linux image automatically installed by OpenStack AIO)

  • now we have to find available networks:

    root@aio1-utility-container-671f4ded:/#
    openstack network list
    +--------------------------------------+---------+--------------------------------------+
    | ID                                   | Name    | Subnets                              |
    +--------------------------------------+---------+--------------------------------------+
    | 09336f60-f8af-42fe-9dba-40ca51d23842 | private | c28f9ff5-bb04-40b7-a164-e82897aed1e6 |
    | e3c5a29b-0826-407d-a010-b7b55d4642ae | public  | 61694ee8-622b-4c26-bf33-103afe78a84b |
    +--------------------------------------+---------+--------------------------------------+
    
  • notice public network

  • now we have to find our openstack username and project_name:

    echo "OS_USERNAME=$OS_USERNAME OS_PROJECT_NAME=$OS_PROJECT_NAME"
    
    OS_USERNAME=admin OS_PROJECT_NAME=admin
    
  • note username admin and project admin

  • now we have to find ID of project admin: (this step is necessary for admin user only, because it can see all security groups)

    openstack project show -c id -f value admin
    
    9ff16514f2bd4346b9c0c5f50253a172
    
  • note 9ff16514f2bd4346b9c0c5f50253a172 project_id

  • now we have to find security group in our project:

    openstack security group list --project 9ff16514f2bd4346b9c0c5f50253a172
    
    +--------------------------------------+---------+------------------------+----------------------------------+------+
    | ID                                   | Name    | Description            | Project                          | Tags |
    +--------------------------------------+---------+------------------------+----------------------------------+------+
    | 034ca1f9-6e15-4236-9e2f-e1fd8fe1edef | default | Default security group | 9ff16514f2bd4346b9c0c5f50253a172 | []   |
    +--------------------------------------+---------+------------------------+----------------------------------+------+
    
  • note default (or 034ca1f9-6e15-4236-9e2f-e1fd8fe1edef) as security group id

  • now the hackish part - allow SSH connections in the default security group:

    openstack security group rule \
      create 034ca1f9-6e15-4236-9e2f-e1fd8fe1edef \
      --protocol tcp --dst-port 22:22 --remote-ip 0.0.0.0/0
    
  • finally we must list and/or create SSH keypair that will be used for login:

    
    openstack keypair list
    # oops no keypair
    
  • so, still in the container, create an SSH key pair:

    ssh-keygen
    # press ENTER to answer all questions
    
  • then, in the same utility container, register the key:

    # source this so you can use TAB to expand OpenStack commands
    source /etc/bash_completion
    # register our new keypair
    openstack keypair create --public-key ~/.ssh/id_rsa.pub admin_kp
    
  • finally we should be able to launch VM!

    openstack server create --flavor tempest2 --image cirros \
      --nic net-id=e3c5a29b-0826-407d-a010-b7b55d4642ae \
      --security-group 034ca1f9-6e15-4236-9e2f-e1fd8fe1edef \
      --key-name admin_kp admin-vm1
    
  • now poll the openstack server list command until the VM is ACTIVE:

    openstack server list
    
    +--------------------------------------+-----------+--------+-----------------------+--------+----------+
    | ID                                   | Name      | Status | Networks              | Image  | Flavor   |
    +--------------------------------------+-----------+--------+-----------------------+--------+----------+
    | 18e921c9-df58-47db-ad81-c6e00239a072 | admin-vm1 | ACTIVE | public=172.29.249.183 | cirros | tempest2 |
    +--------------------------------------+-----------+--------+-----------------------+--------+----------+
    
  • now you can try to log in from the utility container using:

    ssh -i ~/.ssh/id_rsa cirros@172.29.249.183
    # you should be logged in without any password, using the key
    
  • the same SSH connection should also work from your HOST, but you need to copy the private SSH key ~/.ssh/id_rsa there from the LXC container.

    • on the host you can find that key at a path like this (you must be root):
      /var/lib/lxc/aio1_utility_container-671f4ded/rootfs/root/.ssh/id_rsa
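Polling by hand, as in the step that waits for the ACTIVE status, can be replaced by a small loop. The sketch below is generic: `wait_for_status` is a hypothetical helper that repeatedly runs any command printing a status. The attempt count and delay in the usage line are arbitrary assumptions.

```shell
#!/bin/bash
# Poll a status command until it prints the wanted status.
# Arguments: wanted status, max attempts, delay in seconds, then the command.
# Returns 0 on success, 2 if the status becomes ERROR, 1 on timeout.
wait_for_status() {
    local want=$1 attempts=$2 delay=$3 status n
    shift 3
    for (( n = 1; n <= attempts; n++ )); do
        status=$("$@") || status=""
        [ "$status" = "$want" ] && return 0
        [ "$status" = "ERROR" ] && return 2
        sleep "$delay"
    done
    return 1
}

# Usage inside the utility container (not executed here):
# wait_for_status ACTIVE 60 5 openstack server show -f value -c status admin-vm1
```

Stopping early on ERROR avoids waiting for the full timeout when the instance has already failed to build.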
      

Dirty trick to get a console to a VM without an IP address (good for diagnostics):

  • on your HOST run this command:
    sudo virsh list --all
    
    Id   Name                State
    -----------------------------------
    2    instance-00000002   running
    
  • connect from your host directly to console:
    sudo virsh console instance-00000002
    
    Connected to domain instance-00000002
    Escape character is ^]
    
    login as 'cirros' user. default password: 'gocubsgo'. use 'sudo' for root.
    admin-vm1 login: cirros
    Password:
    $ ip a
    
  • use Ctrl-] to escape console

That's it!

Limitations

  • currently VMs can be accessed from the Host only
  • even if you succeed in configuring a bridge for the Host's eth0, it will not help you much, because Azure (and AWS, GCE, ...) drops all sent packets with an unknown MAC address and/or IP address (and every VM in OpenStack has a new MAC address and IP address). The only way to make it work is to register each such MAC/IP in Azure (which is tedious work), or to set up some kind of VPN access to the OpenStack Host.

Bonus: Deploying Juju to OpenStack AIO

Here we will show how to deploy Juju and its sample applications (called Charms) to OpenStack AIO.

What is Juju? It is a universal container deployer - it supports public clouds as well as selected private clouds and bare metal (using MAAS).

WARNING! This is a work in progress.

We will basically follow

First, connect to your OpenStack AIO HOST and install juju using snap:

sudo snap install juju --classic

Here is my version:

$ snap list juju

Name  Version  Rev    Tracking       Publisher   Notes
juju  2.9.18   17625  latest/stable  canonical✓  classic

Now you need to find your Public API endpoint, using this command:

$ ip a s dev eth0 | fgrep 'inet '

    inet 10.101.0.8/24 brd 10.101.0.255 scope global eth0

So my public API endpoint should be this one:

# install jq to have fancy output of json
sudo apt-get install jq
curl -fsS --cacert /etc/ssl/certs/haproxy.cert https://10.101.0.8:5000/v3 | jq

Should produce something like:

{
  "version": {
    "id": "v3.14",
    "status": "stable",
    "updated": "2020-04-07T00:00:00Z",
    "links": [
      {
        "rel": "self",
        "href": "https://10.101.0.8:5000/v3/"
      }
    ],
    "media-types": [
      {
        "base": "application/json",
        "type": "application/vnd.openstack.identity-v3+json"
      }
    ]
  }
}

Now you need to extract the OpenStack admin credentials and other settings. The easiest way is to cat or copy this file as root:

sudo bash
# you must already be root when invoking this command, otherwise the '*' expansion cannot take place
cat /var/lib/lxc/aio1_utility_container-*/rootfs/root/openrc

Here is example snippet:

export OS_AUTH_URL=http://172.29.236.101:5000/v3
export OS_USERNAME=admin
export OS_PASSWORD=SOME_HARD_TO_GUESS_PASSWORD
export OS_REGION_NAME=RegionOne
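The values that Juju asks for later can be pulled out of that file without copying them by hand. This is a minimal sketch that assumes every setting is a plain `export NAME=value` line, as in the snippet above; the helper name `openrc_value` is hypothetical.

```shell
#!/bin/bash
# Print the value of one "export NAME=value" line from an openrc-style file.
# Surrounding single or double quotes around the value are removed.
openrc_value() {
    local name=$1 file=$2
    sed -n "s/^export ${name}=//p" "$file" \
        | tail -n 1 \
        | sed -e "s/^[\"']//" -e "s/[\"']\$//"
}

# Usage as root on the HOST, after copying the file to ./openrc (not executed here):
# openrc_value OS_USERNAME ./openrc
# openrc_value OS_REGION_NAME ./openrc
```

Only the literal text after the `=` sign is returned; lines that compute their value from other variables would need to be sourced instead.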

Then create Juju cloud:

juju add-cloud --local
Select cloud type: openstack
Enter a name for your openstack cloud: os-aio
### replace 10.101.0.8 with your eth0 IP address:
Enter the API endpoint url for the cloud []: https://10.101.0.8:5000
### Notice this CA path - specific for OpenStack AIO!!!
Enter a path to the CA certificate for your cloud if one is required to access it.
     (optional) [none]: /etc/ssl/certs/haproxy.cert
Select one or more auth types separated by commas: userpass
Enter region name: RegionOne
Enter the API endpoint url for the region [use cloud api url]:
Enter another region? (y/N):

Now we have to create Juju credentials for our cloud os-aio:

juju add-credential os-aio

Enter credential name: os-aio-creds
Select region [any region, credential is not region specific]: RegionOne
Using auth-type "userpass".
Enter username: admin
Enter password: XXXXXXXX
Enter tenant-name (optional): admin
Enter tenant-id (optional):
Enter version (optional):
Enter domain-name (optional):
Enter project-domain-name (optional):
Enter user-domain-name (optional):

Now the tricky part - try to bootstrap:

juju bootstrap os-aio

ERROR cannot determine available auth versions
   auth options fetching failed
   caused by: request available auth options:
   failed executing the request https://10.101.0.8:5000/
   caused by: Get "https://10.101.0.8:5000/":
   x509: cannot validate certificate for 10.101.0.8 because it doesn't contain any IP SANs

This is because Go is very strict here (it ignores the CN part of self-signed certs and accepts SAN only - Google enjoys breaking things). Please see the Troubleshooting section at the end of this document for the solution.

Then try again:

juju bootstrap os-aio

Creating Juju controller "os-aio-regionone" on os-aio/RegionOne
Looking for packaged Juju agent version 2.9.18 for amd64
Located Juju agent version 2.9.18-ubuntu-amd64 at
   https://streams.canonical.com/juju/tools/agent/2.9.18/juju-2.9.18-ubuntu-amd64.tgz
Launching controller instance(s) on os-aio/RegionOne...
ERROR failed to bootstrap model: cannot start bootstrap instance:
   no metadata for "focal" images in RegionOne with arches [amd64]

It seems that solution is there:

Troubleshooting

Golang (Juju) refuses HAProxy certificate

When you try juju bootstrap you will likely encounter a fatal error saying that the HAProxy certificate does not contain a SAN for its IP address.

To fix it you have to apply the following patch:

--- /etc/ansible/roles/haproxy_server/tasks/haproxy_ssl_key_create.yml.orig     2021-11-18 16:07:52.062451169 +0000
+++ /etc/ansible/roles/haproxy_server/tasks/haproxy_ssl_key_create.yml  2021-11-18 16:09:09.255140413 +0000
@@ -30,6 +30,7 @@
     openssl req -new -nodes -sha256 -x509 -subj
     "{{ haproxy_ssl_self_signed_subject }}"
     -days 3650
+    -addext "subjectAltName = IP.1:{{ external_lb_vip_address }}"
     -keyout {{ haproxy_ssl_key }}
     -out {{ haproxy_ssl_cert }}
     -extensions v3_ca

And run as root on HOST:

cd /opt/openstack-ansible/playbooks
openstack-ansible -e "haproxy_ssl_self_signed_regen=true" haproxy-install.yml

Ansible should report changed=4 or something similar. To verify that HAProxy cert contains SAN try:

openssl x509 -in /etc/ssl/certs/haproxy.cert -text | \
fgrep -A 1 'X509v3 Subject Alternative Name'

            X509v3 Subject Alternative Name:
                IP Address:10.101.0.8

The original certificate has the name only in the Subject, which is not enough for Golang.
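The same verification can be turned into a yes/no check, for example before retrying the bootstrap. The sketch below parses the text dump of a certificate from stdin; the helper name `has_ip_san` is hypothetical, and the IP address in the usage line is the eth0 address used on this page.

```shell
#!/bin/bash
# Read "openssl x509 -noout -text" output on stdin and succeed when the
# Subject Alternative Name extension lists the given IP address.
has_ip_san() {
    local ip=$1
    grep -A 1 -F 'X509v3 Subject Alternative Name' \
        | tr ',' '\n' \
        | sed 's/^[[:space:]]*//' \
        | grep -q -x -F "IP Address:$ip"
}

# Usage on the HOST (not executed here):
# openssl x509 -in /etc/ssl/certs/haproxy.cert -noout -text | has_ip_san 10.101.0.8
```

The exact-line match (`-x`) matters: without it, a certificate for 10.101.0.80 would also pass a check for 10.101.0.8.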

More information available on:

LXC network fails with error

You will find this error in /var/log/syslog:

Nov 21 15:37:37 aio1 dnsmasq[2906]: failed to create listening socket
   for 10.255.255.1: Cannot assign requested address
Nov 21 15:37:37 aio1 dnsmasq[2906]: FAILED to start up

The cause was surprising:

  • a broken link /etc/resolv.conf

It helped to restart this systemd service:

sudo systemctl restart systemd-resolved
# check that /etc/resolv.conf links to an existing file
cat /etc/resolv.conf

And then, to be safe, I rebuilt the whole AIO network with:

sudo bash
/usr/local/bin/lxc-system-manage system-force-rebuild
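A dangling /etc/resolv.conf link, the root cause described above, is easy to detect before it breaks dnsmasq. This is a small sketch; the helper name `is_broken_link` is hypothetical.

```shell
#!/bin/bash
# Succeed when the path is a symbolic link whose target does not exist.
is_broken_link() {
    [ -L "$1" ] && [ ! -e "$1" ]
}

# Usage on the HOST (not executed here):
# if is_broken_link /etc/resolv.conf; then
#     sudo systemctl restart systemd-resolved
# fi
```

The test `-e` follows symbolic links, so it fails for a link that points to a missing file even though the link itself exists.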

Tips

For reboot there are special instructions: