OVS - hpaluch/hpaluch.github.io GitHub Wiki
OVS
OVS is Open vSwitch, an open-source software switch.
Homepage: https://docs.openvswitch.org/en/latest/intro/what-is-ovs/
You can use OVS at several levels:
- as a simple learning switch (like Linux Bridge)
- as a locally managed switch (configuration stored in a database called OVS-DB) with various useful features:
  - VLAN support
  - bonding (grouping several LAN adapters together for higher speed and/or redundancy)
  - QoS
  - various tunnels: Geneve, GRE, VXLAN, ...
- or even more: OVS managed centrally using controller software and the OpenFlow protocol. This is used for advanced solutions like OVN or Faucet.
My first OVS tutorial
Here is my first tutorial on how to use OVS. We will create 3 logical switches (emulating isolated networks) with a Layer 2 Geneve tunnel over dc-link:
    +---------+     +---------+
    | dc-west |     | dc-east |
    +---------+     +---------+
          \           /
           \         /
          +---------+
          | dc-link |
          +---------+
Where:
- dc-west - "Data Center West" - logical OVS switch running 2 VMs (actually using network namespaces to emulate them):
  - vm1-west - 192.168.100.1
  - vm2-west - 192.168.100.2
- dc-east - "Data Center East" - logical OVS switch running 2 VMs (actually using network namespaces to emulate them):
  - vm3-east - 192.168.100.3
  - vm4-east - 192.168.100.4
- NOTE: We will use a Layer 2 (Ethernet MAC addresses) tunnel to connect these two sites, so they use the same IP network (192.168.100.0/24)! From the VMs' perspective all 4 VMs are on the same IP network; they are not aware that traffic is actually tunneled to the other DC.
- dc-link - logical switch emulating the Internet link between those 2 "data centers". It will use Geneve or another tunnel to connect the dc-east and dc-west networks (with the same IP network!). dc-link will have 2 IPs:
  - link1-west - 10.200.200.1
  - link2-east - 10.200.200.2

  These will be the tunnel endpoints.
Requirements for tutorial:
- a single Debian 11 host (a VM is fine too). I will call it deb11-ovs
- all traffic will use only local (internal) OVS switches, so it should be safe to try even on a remote machine. However, you must ensure that my IP ranges and device names do not collide with your setup.
Installing OVS:
- all commands are to be run as root (Debian 10+ no longer includes sudo in the default installation; it is now an add-on package)
- ensure that your system is up to date:

      apt-get update
      apt-get dist-upgrade
      # reboot if system components were updated

- now install the OVS packages:

      apt-get install openvswitch-switch tcpdump
      # if you like ifconfig, route and friends also install:
      apt-get install net-tools

Now create all 3 bridges (or actually "OVS Switches"):
- create script 10_setup_switches.sh with contents:

      #!/bin/bash
      set -euo pipefail
      # Create OVS switches - each switch simulates one "datacenter" network
      for i in dc-west dc-east dc-link
      do
          set -x
          ovs-vsctl add-br $i
          set +x
      done
      echo "Listing switches: "
      ovs-vsctl list-br
      echo "OK: dumping OVS configuration"
      ovs-vsctl show
      exit 0

- grant executable permissions using chmod +x 10_setup_switches.sh and run it using ./10_setup_switches.sh
- it should produce output like:

      + ovs-vsctl add-br dc-west
      + set +x
      + ovs-vsctl add-br dc-east
      + set +x
      + ovs-vsctl add-br dc-link
      + set +x
      Listing switches: 
      dc-east
      dc-link
      dc-west
      OK: dumping OVS configuration
      56ae4467-60fe-4819-a83e-714bfa59a74b
          Bridge dc-west
              Port dc-west
                  Interface dc-west
                      type: internal
          Bridge dc-link
              Port dc-link
                  Interface dc-link
                      type: internal
          Bridge dc-east
              Port dc-east
                  Interface dc-east
                      type: internal
          ovs_version: "2.15.0"

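Because the script above runs with set -e, a second run stops at the first add-br, since that bridge already exists. A minimal sketch of a re-runnable variant follows. It relies on the --may-exist option of ovs-vsctl; the helper name create_switches and the OVS_VSCTL override variable are my own assumptions, not part of the original script.

```shell
#!/bin/bash
# Sketch only: create_switches is a hypothetical helper, not from the tutorial.
# OVS_VSCTL can be overridden (for example OVS_VSCTL=echo) to preview commands.
OVS_VSCTL="${OVS_VSCTL:-ovs-vsctl}"

create_switches() {
    local br
    for br in "$@"; do
        # --may-exist: do not fail when a bridge with this name already exists
        "$OVS_VSCTL" --may-exist add-br "$br"
    done
}

# Example (as root): create_switches dc-west dc-east dc-link
```

Setting OVS_VSCTL=echo before calling create_switches prints the commands instead of running them, which is handy on a remote machine.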
Now we will prepare 2 devices for tunneling called tep0 and tep1 (Tunnel Endpoint):
- create script 20-setup-tunnel-dev.sh with contents:

      #!/bin/bash
      set -euo pipefail
      # logical switch used for tunnel
      dc=dc-link
      # it is common to use "TEPx" device name for "Tunnel Endpoints"
      # these devices will be used to tunnel traffic from our "dc-west" to "dc-east"
      for eth in tep0 tep1
      do
          set -x
          ovs-vsctl list-ports $dc | fgrep -wo "$eth" || ovs-vsctl add-port $dc $eth
          ovs-vsctl set interface $eth type=internal
          set +x
      done
      exit 0

- mark it executable and run it. NOTE: on the first invocation it threw an error, but it was OK on the second invocation.
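A plausible explanation for the first-run error (my assumption, the text does not confirm it) is that add-port is committed before the interface type is set, so OVS briefly looks for an existing system device named tep0. A minimal sketch that sets the type in the same ovs-vsctl transaction, using the `--` command separator, could look like this. The helper name add_internal_port and the OVS_VSCTL variable are hypothetical.

```shell
#!/bin/bash
# Sketch only: add_internal_port is a hypothetical helper, not from the tutorial.
# OVS_VSCTL can be overridden (for example OVS_VSCTL=echo) to preview commands.
OVS_VSCTL="${OVS_VSCTL:-ovs-vsctl}"

add_internal_port() {
    local bridge="$1" port="$2"
    # One transaction: add the port and mark its interface as internal,
    # so the port never exists without type=internal.
    "$OVS_VSCTL" --may-exist add-port "$bridge" "$port" \
        -- set interface "$port" type=internal
}

# Example (as root):
#   add_internal_port dc-link tep0
#   add_internal_port dc-link tep1
```

The same helper would also fit the port creation step in the VM script later in this tutorial.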
Before reboot we should prepare a static IP configuration for those tunnel devices:
- append to /etc/network/interfaces:

      # OVS tutorial - tunnel endpoints for switch dc-link
      auto tep0
      iface tep0 inet static
          address 10.200.200.1/24

      auto tep1
      iface tep1 inet static
          address 10.200.200.2/24

- try to enable those devices manually using:

      /sbin/ifup tep0
      /sbin/ifup tep1

- verify that they are properly configured using:

      ip -br -4 a | egrep -w '^tep[01]'
      tep0             UNKNOWN        10.200.200.1/24
      tep1             UNKNOWN        10.200.200.2/24

- NOTE: it is OK that the state is UNKNOWN - this is normal for an internal device (including lo for loopback)
Now reboot the system using init 6 and verify with the same commands that the bridges and tunnel endpoints still exist:

    init 6
    # after reboot try:
    ip -br -4 a
    lo               UNKNOWN        127.0.0.1/8         ### Loopback
    eth0             UP             192.168.100.173/24  ### physical network interface of deb11-ovs
    tep0             UNKNOWN        10.200.200.1/24     ### tunnel endpoint for dc-west
    tep1             UNKNOWN        10.200.200.2/24     ### tunnel endpoint for dc-east

Now we will set up 4 VMs (as namespaces):
- create script create_all_vms.sh with contents:

      #!/bin/bash
      set -eu

      create_vm()
      {
          [ $# -eq 2 ] || {
              echo "ERROR: Invalid number of arguments $# != 2" >&2
              exit 1
          }
          # arguments
          local region="$1" # east|west
          local num="$2"    # vm number 1..4
          # computed variables
          local dc="dc-$region"
          local ns="vm$num-$region"
          local vmip=192.168.100.$num
          local eth="vm${num}eth"

          set -x
          # Add Port (internal network device) to OVS Switch
          ovs-vsctl list-ports $dc | fgrep -wo "$eth" || ovs-vsctl add-port $dc $eth
          ovs-vsctl set interface $eth type=internal
          # create namespace if it does not exist yet
          ip netns | fgrep -wo "$ns" || ip netns add $ns
          # configure and/or replace loopback
          ip netns exec $ns ip a show dev lo
          ip netns exec $ns ip a replace 127.0.0.1/8 dev lo
          ip netns exec $ns ip link set lo up
          # first check if our LAN device already exists in $ns namespace
          # if not, move it to Namespace
          ip netns exec $ns ip link show $eth || ip link set dev $eth netns $ns
          # configure IP Address and netmask
          ip netns exec $ns ip addr replace $vmip/24 dev $eth
          ip netns exec $ns ip link set $eth up
          # dump current values
          ip netns exec $ns ip -br -4 l
          ip netns exec $ns ip -br -4 a
          set +x
      }

      create_vm west 1
      create_vm west 2
      create_vm east 3
      create_vm east 4
      exit 0

- mark it executable and run it
- to see the interface configuration of all namespaces create script dump_all_vms.sh with contents:

      #!/bin/bash
      set -eu
      for ns in `ip netns | awk '{print $1}' | sort`
      do
          echo "Dumping Namespace '$ns':"
          ip netns exec $ns ip -br -4 l | sed 's/^/ /'
          ip netns exec $ns ip -br -4 a | sed 's/^/ /'
          ip netns exec $ns ip r | sed 's/^/ /'
          ip netns exec $ns ip n | sed 's/^/ /'
      done
      exit 0

- mark it executable and run it. Here is the expected output:

      Dumping Namespace 'vm1-west':
       lo               UNKNOWN        00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
       vm1eth           UNKNOWN        ea:1e:e8:88:a8:fe <BROADCAST,MULTICAST,UP,LOWER_UP>
       lo               UNKNOWN        127.0.0.1/8
       vm1eth           UNKNOWN        192.168.100.1/24
      Dumping Namespace 'vm2-west':
       lo               UNKNOWN        00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
       vm2eth           UNKNOWN        2a:db:dd:b4:3a:83 <BROADCAST,MULTICAST,UP,LOWER_UP>
       lo               UNKNOWN        127.0.0.1/8
       vm2eth           UNKNOWN        192.168.100.2/24
      Dumping Namespace 'vm3-east':
       lo               UNKNOWN        00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
       vm3eth           UNKNOWN        9e:4f:72:b2:33:33 <BROADCAST,MULTICAST,UP,LOWER_UP>
       lo               UNKNOWN        127.0.0.1/8
       vm3eth           UNKNOWN        192.168.100.3/24
      Dumping Namespace 'vm4-east':
       lo               UNKNOWN        00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
       vm4eth           UNKNOWN        4e:72:d9:e9:19:19 <BROADCAST,MULTICAST,UP,LOWER_UP>
       lo               UNKNOWN        127.0.0.1/8
       vm4eth           UNKNOWN        192.168.100.4/24

- please note that it is pretty normal that the link status is UNKNOWN for internal interfaces, including loopback.
Now we should test with ping inside the namespaces that:
- only VMs in the same region are reachable. So these pairs should be reachable:
  - vm1-west <-> vm2-west
  - vm3-east <-> vm4-east
- however, if you try to ping from east to west (or back), the target should be unreachable.
Example of pings that should work:
# OK: ping from vm1-west to vm2-west
ip netns exec vm1-west ping -c 2 192.168.100.2
But this will not work (yet):
# Should fail: ping from vm1-west to vm3-east:
ip netns exec vm1-west ping -c 2 192.168.100.3
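To check every pair at once instead of typing individual pings, a small loop can help. This is only a sketch: the function names probe and ping_matrix and the VMS variable are hypothetical, and the namespace names and addresses are the ones used in this tutorial.

```shell
#!/bin/bash
# Sketch only: probe, ping_matrix and VMS are hypothetical names.
# name=ip pairs of the four "VMs" (network namespaces) from this tutorial
VMS="vm1-west=192.168.100.1 vm2-west=192.168.100.2 vm3-east=192.168.100.3 vm4-east=192.168.100.4"

# probe <namespace> <ip>: one ping with a 1 second timeout, output discarded
probe() {
    ip netns exec "$1" ping -c 1 -W 1 "$2" >/dev/null 2>&1
}

# Print one line per ordered pair of VMs with the ping result
ping_matrix() {
    local src dst
    for src in $VMS; do
        for dst in $VMS; do
            [ "$src" = "$dst" ] && continue
            if probe "${src%%=*}" "${dst#*=}"; then
                echo "${src%%=*} -> ${dst%%=*}: OK"
            else
                echo "${src%%=*} -> ${dst%%=*}: unreachable"
            fi
        done
    done
}

# Example (as root): ping_matrix
```

Before the tunnel exists, only the west-to-west and east-to-east lines should report OK; once the tunnel is configured, all lines should.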
Setting up Tunnel
- Now the moment of truth!
- we will set up a Geneve tunnel
- create script 30-setup-tunnel.sh with contents:

      #!/bin/bash
      set -euo pipefail
      for p in gre0 gre1
      do
          ovs-vsctl del-port $p || true
      done
      ovs-vsctl add-port dc-west gre0 -- set interface gre0 type=geneve \
          options:remote_ip=10.200.200.2 options:local_ip=10.200.200.1
      ovs-vsctl add-port dc-east gre1 -- set interface gre1 type=geneve \
          options:remote_ip=10.200.200.1 options:local_ip=10.200.200.2
      exit 0

- NOTE: we must use local_ip for the tunnel endpoints, otherwise the kernel will use loopback and packets will bypass OVS, so the tunnel will not work. When you are using OVS on real PCs this is usually not a problem, because such traffic cannot be routed via loopback.
- NOTE: I originally planned to use a GRE tunnel but later switched to Geneve, thus the devices are still named gre0 and gre1. Otherwise it should work well.
- make it executable and run it.
- finally ping between dc-east and dc-west should work, for example:

      # ping from vm1-west to vm3-east
      ip netns exec vm1-west ping -c 2 192.168.100.3
      # ping from vm1-west to vm4-east
      ip netns exec vm1-west ping -c 2 192.168.100.4

- you can also monitor traffic (surprisingly still listening on loopback):

      tcpdump -e -n -p -i lo
      tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
      listening on lo, link-type EN10MB (Ethernet), snapshot length 262144 bytes
      # request from tep0 to tep1
      16:44:29.376860 00:00:00:00:00:00 > 00:00:00:00:00:00, \
        ethertype IPv4 (0x0800), length 148: 10.200.200.1.36185 > 10.200.200.2.6081: \
        Geneve, Flags [none], vni 0x0, proto TEB (0x6558): ea:1e:e8:88:a8:fe > 4e:72:d9:e9:19:19, \
        ethertype IPv4 (0x0800), length 98: 192.168.100.1 > 192.168.100.4: \
        ICMP echo request, id 29820, seq 1, length 64
      # response from tep1 to tep0
      16:44:29.377164 00:00:00:00:00:00 > 00:00:00:00:00:00, \
        ethertype IPv4 (0x0800), length 148: 10.200.200.2.36185 > 10.200.200.1.6081: \
        Geneve, Flags [none], vni 0x0, proto TEB (0x6558): 4e:72:d9:e9:19:19 > ea:1e:e8:88:a8:fe, \
        ethertype IPv4 (0x0800), length 98: 192.168.100.4 > 192.168.100.1: \
        ICMP echo reply, id 29820, seq 1, length 64

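The capture above shows a 98-byte inner frame carried in a 148-byte outer frame, so this tunnel adds 50 bytes per packet. That matches the usual header sizes for Geneve over IPv4 without options: 14 (outer Ethernet) + 20 (IPv4) + 8 (UDP) + 8 (Geneve base header). The sketch below does this arithmetic. The function names are hypothetical, and the 1500-byte underlay MTU in the example is an assumption about a typical Ethernet link, not something measured in this tutorial.

```shell
#!/bin/bash
# Sketch only: header sizes for Geneve over IPv4 with no Geneve options.
OUTER_ETH=14     # outer Ethernet header
OUTER_IPV4=20    # outer IPv4 header without IP options
OUTER_UDP=8      # UDP header (Geneve uses UDP port 6081)
GENEVE_BASE=8    # fixed Geneve header
INNER_ETH=14     # inner Ethernet header carried inside the tunnel

# Bytes added on the wire around each inner Ethernet frame
geneve_overhead() {
    echo $((OUTER_ETH + OUTER_IPV4 + OUTER_UDP + GENEVE_BASE))
}

# geneve_inner_mtu <underlay IP MTU>: largest inner IP MTU that still fits
geneve_inner_mtu() {
    local underlay_mtu="$1"
    echo $((underlay_mtu - OUTER_IPV4 - OUTER_UDP - GENEVE_BASE - INNER_ETH))
}

# Example: geneve_overhead        prints 50
#          geneve_inner_mtu 1500  prints 1450
```

On a real link between two hosts this suggests either lowering the MTU inside the VMs or raising it on the underlay.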
TODO: MTU issues
Please see really nice tutorial on MTU issues and how to solve them:
More tips:
- to see traffic statistics for the ports of an OVS switch you can use the OpenFlow command (ovs-ofctl); for example, to dump statistics from switch dc-link use:

      ovs-ofctl dump-ports dc-link
      OFPST_PORT reply (xid=0x2): 3 ports
        port tep1: rx pkts=12, bytes=748, drop=0, errs=0, frame=0, over=0, crc=0
                 tx pkts=13, bytes=1006, drop=0, errs=0, coll=0
        port tep0: rx pkts=13, bytes=824, drop=0, errs=0, frame=0, over=0, crc=0
                 tx pkts=13, bytes=1006, drop=0, errs=0, coll=0
        port LOCAL: rx pkts=0, bytes=0, drop=25, errs=0, frame=0, over=0, crc=0
                 tx pkts=0, bytes=0, drop=0, errs=0, coll=0

- to see the OVS DB content you can just less /var/lib/openvswitch/conf.db (it is basically a log of metadata + one-line JSON changes)
- you can also try a more civilized form:

      ovs-vsctl list Open_vSwitch

What to do after reboot:
- verify that tep0 and tep1 are configured after reboot:

      ip -br -4 a

- if they are missing (typically it happens after the 2nd boot) you have to reactivate them using:

      /sbin/ifup tep0
      /sbin/ifup tep1
      ip -br -4 a

- you also have to re-create all "VMs" (network namespace configurations) using:

      ./create_all_vms.sh

- now the tunnels should work again, for example:

      ip netns exec vm1-west ping -c 2 192.168.100.4

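The post-reboot checks can be scripted. The following is a minimal sketch under the assumptions of this tutorial (devices tep0 and tep1 with the addresses from /etc/network/interfaces); the function names has_addr and restore_teps are hypothetical.

```shell
#!/bin/bash
# Sketch only: has_addr and restore_teps are hypothetical helpers.

# has_addr <dev> <addr/prefix>: reads `ip -br -4 a` output on stdin and
# succeeds only if the device is listed with exactly that address.
has_addr() {
    awk -v dev="$1" -v addr="$2" '
        $1 == dev { for (i = 3; i <= NF; i++) if ($i == addr) found = 1 }
        END { exit !found }'
}

# Bring up any tunnel endpoint that lost its address after reboot
restore_teps() {
    local pair dev addr
    for pair in tep0=10.200.200.1/24 tep1=10.200.200.2/24; do
        dev="${pair%%=*}"
        addr="${pair#*=}"
        ip -br -4 a | has_addr "$dev" "$addr" || /sbin/ifup "$dev"
    done
}

# Example (as root): restore_teps && ./create_all_vms.sh
```

The namespaces still have to be re-created afterwards with ./create_all_vms.sh, as described above.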
So this is the end of my first OVS tutorial. It barely scratches the surface, and it definitely takes a lot of time to learn OVS...
Resources used for this tutorial (and many others I forgot):
- https://albertomolina.wordpress.com/2022/12/04/openvswitch-geneve-tunnel/
- https://blog.scottlowe.org/2013/09/04/introducing-linux-network-namespaces/
- https://blog.scottlowe.org/2013/09/09/namespaces-vlans-open-vswitch-and-gre-tunnels/
- https://blog.scottlowe.org/2013/05/07/using-gre-tunnels-with-open-vswitch/
NOTE: Many guides on the Internet are actually based on those from Scott Lowe, so you should definitely start with Scott's blog and compare it with those guides.