Ideas virt_net - ansible/community GitHub Wiki
Everything started with issues with the virt_net modules:
- ansible-collections/community.libvirt#38
- ansible-collections/community.libvirt#46
- ansible-collections/community.libvirt#47
This page collects ideas for discussion to find the right fix or improvement.
From the Ansible use case "configuration management":
"Ansible features a state-driven resource model that describes the desired state of computer systems and services, not the paths to get them to this state. No matter what state a system is in, Ansible understands how to transform it to the desired state (and also supports a "dry run" mode to preview needed changes). This allows reliable and repeatable IT infrastructure configuration, avoiding the potential failures from scripting and script-based solutions that describe explicit and often irreversible actions rather than the end goal."
Good example from https://hvops.com/articles/ansible-vs-shell-scripts (with slight wording improvements):
---
- hosts: all
  tasks:
    - name: Ensure the PGP key is installed
      apt_key: >
        state=present
        id=AC40B2F7
        url="http://keyserver.ubuntu.com/pks/lookup?op=get&fingerprint=on&search=0x561F9B9CAC40B2F7"
    - name: Ensure https support for apt is installed
      apt: >
        state=present
        pkg=apt-transport-https
    - name: Ensure the passenger apt repository is configured
      apt_repository: >
        state=present
        repo='deb https://oss-binaries.phusionpassenger.com/apt/passenger raring main'
    - name: Ensure nginx is installed
      apt: >
        state=present
        pkg=nginx-full
    - name: Ensure passenger is installed
      apt: >
        state=present
        pkg=passenger
        update_cache=yes
    - name: Ensure the nginx configuration is correct
      copy: >
        src=/app/config/nginx.conf
        dest=/etc/nginx/nginx.conf
    - name: Ensure nginx is running
      service: >
        name=nginx
        state=started
Some critical / skeptical words: https://regebro.wordpress.com/2014/09/17/a-script-is-not-configuration
This section focuses on the user context when using virt_net. The purpose is to understand the user's workflow. It is insufficient to just look at libvirt network features.
An Ansible developer wants to run a virtual machine as a staging environment for her Ansible configuration. This could also be a network of several virtual machines. I mainly assume that the virtual machine runs on the local host in user space.
Basic steps:
- Boot up a fresh virtual machine from a fresh image
- Bootstrap Ansible playbook
- Test everything
- Clean up in the end
As part of the first step, we must ensure the virtual staging network is set up as needed.
---
- name: Ensure the test environment is set up correctly
  hosts: localhost
  tasks:
    - name: Ensure the default network is defined correctly and running
      community.libvirt.virt_net:
        state: present
        xml: '{{ lookup("template", "network_default.xml") }}'
The task does not repeat parameters that are already part of the XML template. In particular, I left out the name parameter in this example to see how it feels. The combination of name and xml has issues in the current implementation (see parameter name).
However, the default network already exists. The user does not need to specify an XML definition if she is happy with libvirt's default definition. In this case, she needs the name parameter.
---
- name: Ensure the test environment is set up correctly
  hosts: localhost
  tasks:
    - name: Ensure the default network is running
      community.libvirt.virt_net:
        name: default
        state: present
This network can be non-persistent, but persistence would work in any case.
---
- name: Ensure the test environment is set up correctly
  hosts: localhost
  tasks:
    - name: Ensure the network *development* is defined correctly and running
      community.libvirt.virt_net:
        state: present
        xml: '{{ lookup("template", "network_development.xml") }}'
After running the tests, the developer could clean up the development environment.
---
- name: Ensure a cleaned up development environment
  hosts: localhost
  tasks:
    - name: Ensure the network *development* is removed
      community.libvirt.virt_net:
        state: absent
        name: development
Because the name parameter is present in some tasks and absent in others, it is somewhat difficult to match the corresponding definitions when there are several network definitions.
Testing systems in a continuous integration environment is basically the next step after the previous use case. The CI system might select the right test machine, bootstrap the virtual machine, run the test cases and clean up everything in the end. Ansible can help to create the virtual machine as well as to deploy the current software in it. Note: I think libvirt might be good for small setups. For bigger setups, there are the usual suspects like OKD, OpenStack etc.
The Ansible playbook is very similar to the previous use case: non-persistent setup (everything managed by Ansible), but no local host.
---
- name: Ensure the test environment is set up correctly
  hosts: "{{ staging_host }}"
  tasks:
    - name: Ensure the network *development* is defined correctly and running
      community.libvirt.virt_net:
        state: present
        xml: '{{ lookup("template", "network_development.xml") }}'
Run a service XY in a virtual machine. Again this is for small environments. The host could be selected by the infrastructure file or by a simple management component. In this case, we would configure the autostart option.
---
- name: Ensure the service XY is running
  hosts: all
  tasks:
    - name: Ensure the network *storage* is defined correctly and running
      community.libvirt.virt_net:
        state: present
        autostart: yes
        xml: '{{ lookup("template", "network_storage.xml") }}'
Similar modules are
- OpenStack subnet module
- VMware guest module
- Kubernetes module
- Docker network module, Docker container module and Docker compose module
The current implementation allows conflicting network names to be specified in the referenced XML file and in the playbook. The implementation does not handle this case actively and the documentation does not mention the issue. The behaviour is undefined and leads to effects like the one described in ansible-collections/community.libvirt#47.
| Module | Behaviour |
| --- | --- |
| Docker compose module | |
| Kubernetes module | An inline definition or referenced definition overwrites the top-level parameter name. |
| OpenStack subnet module | Everything is defined inline. No parameter to reference an external source. No conflicting name. |
I consider the network name helpful in the playbook for clarity. For this reason, I would see this parameter as required and a definition in the referenced file as optional. The module should set or overwrite the name parameter after reading the referenced definition file.
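One possible reading of this rule, as a minimal sketch: the name given in the playbook is treated as authoritative and is written into the definition before the definition would be handed to libvirt. The helper apply_network_name is hypothetical and not part of virt_net. The sketch assumes the referenced file is a libvirt network XML document with a network root element and an optional name child.

```python
import xml.etree.ElementTree as ET


def apply_network_name(xml_definition: str, name: str) -> str:
    """Return the network XML with its <name> element set to `name`.

    Hypothetical helper, not part of virt_net. Assumption: the playbook
    parameter wins over whatever name the referenced file contains.
    """
    root = ET.fromstring(xml_definition)
    if root.tag != "network":
        raise ValueError("expected a <network> root element")

    name_element = root.find("name")
    if name_element is None:
        # The file carries no name: add one as the first child.
        name_element = ET.Element("name")
        root.insert(0, name_element)
    # Set the missing name or overwrite a conflicting one.
    name_element.text = name
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    template = "<network><name>default</name><bridge name='virbr1'/></network>"
    print(apply_network_name(template, "development"))
```

With this approach a conflict can no longer lead to undefined behaviour, at the price of silently ignoring the name in the file. Raising an error on a mismatch would be an equally valid design.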
TODO ...
| Module | Behaviour |
| --- | --- |
| Docker compose module | |
| Kubernetes module | |
| OpenStack subnet module | |
| Current virt_net | |
Following the OpenStack subnet module, a network can be present or absent. This proposal considers Ansible as the main configuration source. There is no need for a separate libvirt database. If the autostart option is chosen, the network must be defined in libvirt. These considerations result in the following simple proposal.
- state: absent: The network is not visible in libvirt. If this is not the case, it must be destroyed. In the end, this network is not running and is not defined anymore.
- state: present: The network is visible and active. If this is not the case, it must be defined and started.
As you can see, the proposal disregards the intermediate state of a network being defined but not active. This is an intermediate state we must see in the facts database, but not a state of practical use in a production setup with Ansible. The network is always defined in the Ansible database. The proposal is simple to use and understand, is in alignment with the OpenStack modules, and avoids some pitfalls of the current virt_net implementation.
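The two target states can be sketched as a small planning function, assuming the module knows whether the network is currently defined and whether it is active. The function name plan_actions is hypothetical. The action names are borrowed from the current virt_net commands and only stand for the corresponding libvirt operations; this is not the virt_net implementation.

```python
from typing import List


def plan_actions(desired: str, defined: bool, active: bool) -> List[str]:
    """Return the steps needed to reach `desired` from the observed state.

    Hypothetical sketch of the proposal. `defined` and `active` describe
    what libvirt currently reports for the network.
    """
    actions: List[str] = []
    if desired == "present":
        # A network that is neither defined nor running must be defined first.
        if not defined and not active:
            actions.append("define")
        if not active:
            actions.append("start")
    elif desired == "absent":
        # Stop a running network, then remove its definition.
        if active:
            actions.append("destroy")
        if defined:
            actions.append("undefine")
    else:
        raise ValueError(f"unsupported state: {desired}")
    return actions


if __name__ == "__main__":
    print(plan_actions("present", defined=False, active=False))
    print(plan_actions("absent", defined=True, active=True))
```

An empty list means the system is already in the desired state, so a second run of the same task would report no change.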
All other libvirt states can be relevant for development purposes to test something. For this, we have more appropriate tools / scripting languages like Python or shell.
We can distinguish four machines to bootstrap a virtual machine with Ansible.
- Machine that executes the playbook
- Machine on which the libvirt client runs
- Machine the libvirt client connects to
- Instantiated and booted virtual machine that needs to be set up via Ansible
TODO ...
TODO ...
Do we need the commands info, facts, get_xml, status and list_nets all together?
From VMware module: "Note that this play disables the gather_facts parameter, since you don’t want to collect facts about localhost."
TODO ...
The current virt_net module has some commands that are not idempotent. They describe actions, not target states of the system.
- define
- create
- start
- stop
- destroy
- undefine
- modify
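The difference between an action and a target state can be illustrated with a toy model. ToyNetwork and ensure_active are hypothetical names; the model only mimics how an action can fail when it is applied twice and does not reproduce the actual behaviour of libvirt or virt_net.

```python
class ToyNetwork:
    """Hypothetical stand-in for a network; not a libvirt object."""

    def __init__(self) -> None:
        self.active = False

    def start(self) -> None:
        # An action: only valid when coming from the inactive state.
        if self.active:
            raise RuntimeError("network is already active")
        self.active = True


def ensure_active(network: ToyNetwork) -> bool:
    """Target state: start only if needed and report whether anything changed."""
    if network.active:
        return False
    network.start()
    return True


if __name__ == "__main__":
    net = ToyNetwork()
    print(ensure_active(net))  # first run changes the system
    print(ensure_active(net))  # second run is a no-op
```

Calling start twice raises an error in this model, while ensure_active can be repeated safely. This is the property a playbook relies on when it is run more than once.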
Related modules do not use such commands either.
| Module | Behaviour |
| --- | --- |
| Docker compose module | No direct commands like these. |
| Kubernetes module | No such commands. |
| OpenStack subnet module | No such commands. |
Such commands contradict the paradigm proposed in the section general paradigm, too.
I propose to deprecate these commands. The user can directly use Python or shell scripting. These tools are made for scripting.