Proxmox in Azure - hpaluch/hpaluch.github.io GitHub Wiki

Installing Proxmox VE under Azure

Theoretically it is easy:

Here is an example script for "Azure Shell". Note that it requires:

  • an existing VNet and Subnet - put the Subnet ID into the subnet variable
  • your own SSH key pair - put the public key path into the ssh_key_path variable
  • an e-mail address for shutdown notifications - put it into the shutdown_email variable

Here are the contents of ./create_vm_debian12_proxmox.sh:

#!/bin/bash

set -ue -o pipefail
# Your SubNet ID
subnet=/subscriptions/xxxxx/resourceGroups/xxxxx/providers/Microsoft.Network/virtualNetworks/xxxx/subnets/FrontEnd 
ssh_key_path=`pwd`/my_ssh_key.pub
shutdown_email='user@example.com'  # placeholder - replace with your own address

rg=ProxmoxRG
loc=germanywestcentral
vm=pve-az
IP=$vm-ip
opts="-o table"
# URN from command:
# az vm image list --all -l germanywestcentral -f debian-12 -p debian -s 12-gen2 -o table
image=Debian:debian-12:12-gen2:latest

set -x
az group create -l $loc -n $rg $opts
az network public-ip create -g $rg -l $loc --name $IP --sku Basic $opts
# NOTE: Only E*v3 and D*v3 sizes support nested virtualization
az vm create -g $rg -l $loc \
    --image $image  \
    --nsg-rule NONE \
    --subnet $subnet \
    --public-ip-address "$IP" \
    --storage-sku Standard_LRS \
    --security-type Standard \
    --size Standard_D4s_v3 \
    --os-disk-size-gb 32 \
    --ssh-key-values $ssh_key_path \
    --admin-username azureuser \
    -n $vm $opts
az vm auto-shutdown -g $rg -n $vm --time 2200 --email "$shutdown_email" $opts
set +x
cat <<EOF
You may access this VM in 2 ways:
1. using Azure VPN Gateway 
2. Using Public IP - in such case you need to add appropriate
   SSH allow in rule to NSG rules of this created VM
EOF
exit 0
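
Because the script depends on a correctly pasted Subnet ID, a small pre-flight check can catch a typo before any az command runs. The sketch below is not part of the original script: the function name is_subnet_id is hypothetical, and the pattern only checks the general shape of the ID, not that the subnet really exists.

```shell
# Hypothetical helper: succeeds when the argument has the general shape of an
# Azure Subnet resource ID (it does NOT query Azure).
is_subnet_id() {
    case "$1" in
        /subscriptions/?*/resourceGroups/?*/providers/Microsoft.Network/virtualNetworks/?*/subnets/?*)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Example usage inside the script, before "az group create":
# is_subnet_id "$subnet" || { echo "ERROR: \$subnet does not look like a Subnet ID" >&2; exit 1; }
```

Failing early this way avoids creating the resource group and public IP only to have az vm create reject the subnet afterwards.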

Obstacles to overcome

Error: support for 'kvm_intel' disabled by bios

NOTE: This problem is already fixed in the ./create_vm_debian12_proxmox.sh script above.

When you boot your Debian 12 VM in Azure you may find that there is no /dev/kvm device, which is always a bad sign. Attempting to modprobe kvm_intel results in this kernel message (dmesg output):

kvm: support for 'kvm_intel' disabled by bios

Solution: you have to specify --security-type Standard when you create the VM. Unfortunately, the default is TrustedLaunch, which cannot be switched back to Standard later(!)

When you boot a VM created with "Security type" set to "Standard", you should see the /dev/kvm device:

$ ls -l /dev/kvm

crw-rw---- 1 root kvm 10, 232 Mar 25 15:43 /dev/kvm
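
The same check can be scripted. This is a minimal sketch under stated assumptions: the helper names has_virt_flag and kvm_status are hypothetical, and the printed hints are only likely causes, not a definitive diagnosis.

```shell
# Hypothetical helper: succeeds when the given cpuinfo file lists the Intel VT-x
# (vmx) or AMD-V (svm) CPU flag. Defaults to /proc/cpuinfo.
has_virt_flag() {
    grep -Eqw 'vmx|svm' "${1:-/proc/cpuinfo}"
}

# Hypothetical helper: prints a one-line verdict about KVM availability.
kvm_status() {
    if [ -c /dev/kvm ]; then
        echo "OK: /dev/kvm is present"
    elif has_virt_flag; then
        echo "CPU flag present but no /dev/kvm - try: sudo modprobe kvm_intel"
    else
        echo "No vmx/svm CPU flag - check the VM size and --security-type"
    fi
}

# Example usage:
# kvm_status
```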

Example Setup

As usual start with updates/upgrades:

sudo apt-get update
sudo apt-get dist-upgrade

Optional: disable bloated timers and auto-upgrades:

systemctl list-timers
sudo systemctl mask dpkg-db-backup.timer apt-daily.timer \
      man-db.timer apt-daily-upgrade.timer e2scrub_all.timer fstrim.timer
# stop unattended-upgrades before purging it - otherwise its binaries are removed
# while the service is still running...
sudo systemctl disable --now unattended-upgrades
sudo apt-get purge unattended-upgrades

WARNING! Never disable systemd-tmpfiles-clean.timer! It is an essential hook that maintains many sub-directories in the ramdisk under the /run directory!

Now reboot using:

sudo reboot

Optional: install and configure useful software:

sudo apt-get install tmux curl wget mc lynx vim
# select vim.basic as default editor:
sudo update-alternatives --config editor

Now we have to follow the Proxmox installation guide and make the hostname resolve to the VM's IPv4 address. In my case:

$ hostname

pve-az

$ hostname -i

...IPv6 address... 10.101.0.4

So I added this line to /etc/hosts:

10.101.0.4 pve-az.example.com pve-az

Then verify that ping to both the short and the long hostname responds:

ping -c 2 `hostname`
ping -c 2 `hostname -f`
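
Besides ping, the /etc/hosts mapping can be checked directly. The helper below is a hypothetical sketch (the name hosts_ip_for is not from any tool); it only reads a hosts-format file and prints the first address that lists the given name.

```shell
# Hypothetical helper: prints the first IP address in a hosts-format file that
# lists the given name. Arguments: name, optional file (default /etc/hosts).
hosts_ip_for() {
    awk -v name="$1" '
        /^[[:space:]]*#/ { next }    # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "${2:-/etc/hosts}"
}

# Example usage: both commands should print the real address (10.101.0.4 here),
# not a loopback address such as 127.0.1.1:
# hosts_ip_for "$(hostname)"
# hosts_ip_for "$(hostname -f)"
```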

Now we again have to follow https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_12_Bookworm and add the Proxmox VE repo:

sudo wget https://enterprise.proxmox.com/debian/proxmox-release-bookworm.gpg \
     -O /etc/apt/trusted.gpg.d/proxmox-release-bookworm.gpg
echo "deb [arch=amd64] http://download.proxmox.com/debian/pve bookworm pve-no-subscription" | \
     sudo tee /etc/apt/sources.list.d/pve-install-repo.list

WARNING! The approach above is deprecated (the GPG key should be referenced from the .list file using the signed-by option).
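
A sketch of the signed-by variant follows. It was not tested as part of this setup: the helper name pve_repo_line is hypothetical and the keyring location /etc/apt/keyrings/ is an assumption (any root-owned path works as long as the .list file points at it).

```shell
# Hypothetical helper: prints an APT source line for the pve-no-subscription
# repository that is tied to one specific keyring file via signed-by.
# Argument: keyring path (the default below is an assumption).
pve_repo_line() {
    echo "deb [arch=amd64 signed-by=${1:-/etc/apt/keyrings/proxmox-release-bookworm.gpg}] http://download.proxmox.com/debian/pve bookworm pve-no-subscription"
}

# Example usage (commented out: needs root and network access):
# sudo wget https://enterprise.proxmox.com/debian/proxmox-release-bookworm.gpg \
#      -O /etc/apt/keyrings/proxmox-release-bookworm.gpg
# pve_repo_line | sudo tee /etc/apt/sources.list.d/pve-install-repo.list
```

With signed-by the key is trusted only for this one repository instead of for every APT source on the system.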

Now Update and Upgrade:

sudo apt-get update
sudo apt-get full-upgrade

WARNING! The command below is potentially risky:

sudo apt-get install proxmox-default-kernel

When asked about the GRUB config, keep the original. It contains an important hook for the serial console; without it you will have no chance to see boot messages in Azure in case of problems.

Reboot to new Proxmox kernel:

sudo reboot

After the reboot, ensure that the active kernel has the -pve suffix, for example:

$ uname -r

6.5.13-3-pve
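
For scripted setups, the suffix check can be wrapped in a tiny helper. This is a minimal sketch; the function name is_pve_kernel is hypothetical.

```shell
# Hypothetical helper: succeeds when the kernel release string ends with "-pve".
# Without an argument it checks the running kernel (uname -r).
is_pve_kernel() {
    case "${1:-$(uname -r)}" in
        *-pve) return 0 ;;
        *)     return 1 ;;
    esac
}

# Example usage:
# is_pve_kernel || echo "WARNING: not running a Proxmox kernel: $(uname -r)" >&2
```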

WARNING!

Do not continue - the commands below seem to break GRUB...

Now we have to continue with the original guide and install all Proxmox VE packages:

  • important: add grub-efi-amd64 !!!
    sudo apt-get install proxmox-ve postfix open-iscsi chrony grub-efi-amd64
  • in the Postfix dialog, select "Local only" delivery. Azure blocks outgoing SMTP (port 25) on most subscriptions, so you have little chance to send e-mail directly...
  • also confirm "System e-mail name" - it should already be correct.

Hmm, problems ahead:

Errors were encountered while processing:
 ifupdown2
 pve-manager
 proxmox-ve

Let's try them one by one:

$ sudo dpkg --configure ifupdown2

Setting up ifupdown2 (3.2.0-1+pmx8) ...

network config changes have been detected for ifupdown2 compatibility.
Saved in /etc/network/interfaces.new for hot-apply or next reboot.

Reloading network config on first install
error: Another instance of this program is already running.

Recommended: take a snapshot of the OS disk first:

  • stop the VM from the Azure Portal (if you just shut down the VM from inside, Azure will still bill it as running!)
  • wait until the VM is in the state Stopped (deallocated)
  • in the Portal go to Disks -> click on the OS Disk
  • click on "Create Snapshot"
  • I filled in the name "pve-before-bridge"
  • use all defaults with this exception:
    • Network access: "Disable private and public access"
Only when you have a backup, try this:

  • boot the VM
  • Oops, GRUB ended up in its command line

How to restore?

  • in the Azure Portal go to "Serial console"
  • you will see only the "(grub)" prompt
  • to load the menu, type this command:
    configfile (hd0,gpt1)/boot/grub/grub.cfg
  • the system should boot normally :-)
  • then I needed to reinstall GRUB:
    sudo grub-install --target=x86_64-efi /dev/sda
    

Now dangerous stuff:

sudo apt-get purge cloud-init\*
sudo apt-get autoremove --purge

Note the contents of the file /etc/network/interfaces.d/50-cloud-init:

# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet dhcp
    metric 100

This basically saves our network configuration.

Notice current network configuration:

  1. network interfaces:
    $ ip -br l
    
    lo               UNKNOWN        00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP> 
    eth0             UP             00:22:48:e6:3c:91 <BROADCAST,MULTICAST,UP,LOWER_UP> 
    
  2. IPv4 addresses
    $ ip -br -4 a
    
    lo               UNKNOWN        127.0.0.1/8 
    eth0             UP             10.101.0.4/24 metric 100
    
  3. routes - the first line (default route) is the important one:
    $ ip -br r
    
    default via 10.101.0.1 dev eth0 proto dhcp src 10.101.0.4 metric 100 
    10.101.0.0/24 dev eth0 proto kernel scope link src 10.101.0.4 metric 100 
    10.101.0.1 dev eth0 proto dhcp scope link src 10.101.0.4 metric 100 
    168.63.129.16 via 10.101.0.1 dev eth0 proto dhcp src 10.101.0.4 metric 100 
    169.254.169.254 via 10.101.0.1 dev eth0 proto dhcp src 10.101.0.4 metric 100 
    
  4. DNS:
    $ grep -F nameserver /etc/resolv.conf
    
    nameserver 168.63.129.16
    

Now we have to do the following in a single step (without a reboot in between). Use a static network configuration in /etc/network/interfaces:

auto lo
iface lo inet loopback

auto eth0
#real IP address
iface eth0 inet static
    address  10.101.0.4/24 
    gateway  10.101.0.1
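
To avoid typos, the static stanza can be derived from the values that DHCP currently provides. This is a minimal sketch: the helper names default_gateway and static_stanza are hypothetical, and the output is only printed so that it can be reviewed before any file is edited.

```shell
# Hypothetical helper: reads "ip -4 route show default" output on stdin and
# prints the gateway address of the first default route.
default_gateway() {
    awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Hypothetical helper: prints a static ifupdown stanza.
# Arguments: interface, address in CIDR notation, gateway.
static_stanza() {
    printf 'auto %s\niface %s inet static\n    address  %s\n    gateway  %s\n' \
        "$1" "$1" "$2" "$3"
}

# Example usage (prints the stanza only):
# gw=$(ip -4 route show default | default_gateway)
# addr=$(ip -4 -br addr show dev eth0 | awk '{ print $3 }')
# static_stanza eth0 "$addr" "$gw"
```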

Ensure static DNS configuration:

  • back up the symlinked /etc/resolv.conf as a regular file with: sudo cp -L /etc/resolv.conf /root
  • disable systemd-resolved with: sudo systemctl mask --now systemd-resolved
  • remove the symlink and restore the backup as a regular file:
    sudo rm /etc/resolv.conf 
    sudo cp /root/resolv.conf /etc/
    

Double-check the contents of /etc/network/interfaces and /etc/resolv.conf, then reboot with: sudo reboot

After the reboot we will follow the "Masquerading (NAT) with iptables" section of https://pve.proxmox.com/wiki/Network_Configuration#_default_configuration_using_a_bridge

  • verify that there is no address conflict (that the network 10.10.10.0/24 is unused)
  • simply append to /etc/network/interfaces:
auto vmbr0
#private sub network
iface vmbr0 inet static
    address  10.10.10.1/24
    bridge-ports none
    bridge-stp off
    bridge-fd 0

    post-up   echo 1 > /proc/sys/net/ipv4/ip_forward
    post-up   iptables -t nat -A POSTROUTING -s '10.10.10.0/24' -o eth0 -j MASQUERADE
    post-down iptables -t nat -D POSTROUTING -s '10.10.10.0/24' -o eth0 -j MASQUERADE
  • important - in the iptables commands, -o eth0 must reference the interface that has the "public" (routable) IP address - in our case eth0. However, on a "regular" Proxmox VE installation it is vmbr0 (!).

  • if you use the firewall, you also have to append two additional lines (post-up and post-down) to the NAT interface:

    post-up   iptables -t raw -I PREROUTING -i fwbr+ -j CT --zone 1
    post-down iptables -t raw -D PREROUTING -i fwbr+ -j CT --zone 1
    
  • double-check the contents of /etc/network/interfaces and note that the iptables commands reference the main network interface (eth0 in Azure)

  • after a reboot, verify that there is a new vmbr0 interface with the gateway address 10.10.10.1:

    $ ip -br -4 a
    
    lo               UNKNOWN        127.0.0.1/8 
    eth0             UP             10.101.0.4/24 metric 100 
    vmbr0            UNKNOWN        10.10.10.1/24 
    
  • and also verify masquerading rule in iptables:

    $ sudo iptables -L -n -t nat
    
    ... other chains are empty ...
    
    Chain POSTROUTING (policy ACCEPT)
    target     prot opt source               destination         
    MASQUERADE  0    --  10.10.10.0/24        0.0.0.0/0   
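
The same verification can be done non-interactively. The sketch below assumes the rule-spec format printed by "iptables -t nat -S POSTROUTING"; the helper name has_masquerade_rule is hypothetical.

```shell
# Hypothetical helper: reads "iptables -t nat -S POSTROUTING" output on stdin and
# succeeds when it contains a MASQUERADE rule for the given source network.
has_masquerade_rule() {
    awk -v net="$1" '
        $1 == "-A" && $2 == "POSTROUTING" {
            src = 0; masq = 0
            for (i = 3; i < NF; i++) {
                if ($i == "-s" && $(i + 1) == net) src = 1
                if ($i == "-j" && $(i + 1) == "MASQUERADE") masq = 1
            }
            if (src && masq) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

# Example usage:
# sudo iptables -t nat -S POSTROUTING | has_masquerade_rule 10.10.10.0/24 && echo "NAT rule OK"
# cat /proc/sys/net/ipv4/ip_forward    # should print 1
```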
    

To access the Proxmox Web UI, these ports should be open:

  • open them in Azure Portal -> Proxmox VM -> Network Settings -> Create port rule -> Inbound rule
  • tcp/8006 (main web UI) - use https://IP_OF_PROXMOX:8006
  • tcp/3128 (Spice console)
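
The inbound rules can probably be created from "Azure Shell" as well. The sketch below only prints the az commands (dry run) for review. The helper name nsg_allow_cmd, the rule names and the priorities are hypothetical; the NSG name pve-azNSG assumes the default name chosen by az vm create (check with: az network nsg list -g ProxmoxRG -o table), and 203.0.113.10 is a placeholder for your own public IP address.

```shell
# Hypothetical helper: prints (dry run) an az command that would add an inbound
# allow rule. Arguments: resource group, NSG name, rule name, priority,
# TCP port, source address prefix.
nsg_allow_cmd() {
    echo az network nsg rule create -g "$1" --nsg-name "$2" -n "$3" \
        --priority "$4" --direction Inbound --access Allow --protocol Tcp \
        --destination-port-ranges "$5" --source-address-prefixes "$6"
}

# Example usage: review the printed commands, then run them by hand.
# nsg_allow_cmd ProxmoxRG pve-azNSG allow-pve-web   1010 8006 203.0.113.10
# nsg_allow_cmd ProxmoxRG pve-azNSG allow-pve-spice 1020 3128 203.0.113.10
```

Restricting the source prefix to a single address keeps the Web UI from being exposed to the whole Internet.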

Then access your Proxmox VE using https://IP_OF_PROXMOX:8006

  • you first need to set root's password using:
    sudo passwd root
    
  • now you are nearly done - download some ISO image to /var/lib/vz/template/iso/, for example:
    cd /var/lib/vz/template/iso/
    curl -fLO https://ftp.linux.cz/pub/linux/almalinux/9/isos/x86_64/AlmaLinux-9-latest-x86_64-minimal.iso
    

Now you can create and run your first VM!

  • but: there is NO DHCP server on Proxmox (unless you install and configure one)
  • you have to assign a static IP in the VM, in my case:
    • IP: 10.10.10.99 (any address except .1, and below .100 - we plan to use the upper addresses for DHCP)
    • Mask: 255.255.255.0
    • Gateway: 10.10.10.1
    • DNS: same IP as in Proxmox /etc/resolv.conf

Notes:

  • because the VM is in a private NAT network, you can access it only from the Proxmox host (no access from outside)
  • remember to back up your Proxmox VM with an Azure Snapshot - preferably before each network configuration change

Adding DHCP+DNS Server to NAT network

To save ourselves a lot of work (and to log DNS and DHCP requests) we can use dnsmasq to provide DNS and DHCP for the NAT network.

  • WARNING! A system-wide dnsmasq may clash with the Proxmox SDN feature. Do not use the configuration below if you also use Proxmox SDN!
  • install dnsmasq:
    sudo apt-get install dnsmasq
    
  • create new configuration file /etc/dnsmasq.d/nat.conf with contents:
    listen-address=10.10.10.1
    interface=vmbr0
    log-queries
    log-dhcp
    dhcp-range=10.10.10.100,10.10.10.200,12h
    # set gateway in DHCP response;
    dhcp-option=option:router,10.10.10.1
    # set dnsmasq as DNS server:
    dhcp-option=option:dns-server,10.10.10.1
    
    # DNS: Do NOT return IPv6  addresses (AAAA)
    filter-AAAA
    
    # register static IP for specific MAC address
    #dhcp-host=11:22:33:44:55:66,192.168.0.60
    # add custom DNS entry 
    #address=/double-click.net/127.0.0.1
    
  • if you disabled resolvconf, you may also have to uncomment this line in /etc/default/dnsmasq:
    IGNORE_RESOLVCONF=yes
    
  • restart dnsmasq with: sudo systemctl restart dnsmasq
  • to see the log output, use: journalctl -u dnsmasq
  • now try DHCPv4 configuration on any VM. On RHEL and its clones you can use nmtui (NetworkManager Text User Interface) to change the IPv4 configuration from "Manual" to "Automatic"
  • verify that the assigned IP is in the expected range (10.10.10.100 to 10.10.10.200), that the gateway is 10.10.10.1 (with the ip r command) and that DNS is correct (for example by inspecting /etc/resolv.conf)
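
The range check from the last step can be scripted inside a guest or on the host. This is a minimal sketch with hypothetical helper names (ip_to_int, ip_in_range); it does plain integer arithmetic on dotted IPv4 addresses and does not validate its input.

```shell
# Hypothetical helper: converts a dotted IPv4 address to an integer.
ip_to_int() {
    local old_ifs="$IFS"
    IFS=.
    set -- $1            # split the address into its four octets
    IFS="$old_ifs"
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: succeeds when address $1 lies within the range [$2, $3].
ip_in_range() {
    local ip lo hi
    ip=$(ip_to_int "$1"); lo=$(ip_to_int "$2"); hi=$(ip_to_int "$3")
    [ "$ip" -ge "$lo" ] && [ "$ip" -le "$hi" ]
}

# Example usage with the dhcp-range from nat.conf above:
# ip_in_range 10.10.10.150 10.10.10.100 10.10.10.200 && echo "lease is in the DHCP range"
```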