Testbed Setup
This document describes the steps to set up the testbed and deploy a topology.
Prepare testbed server
- Install Ubuntu 16.04 or 17.04 amd64 server.
- Set up the management port configuration using this sample
/etc/network/interfaces:
root@server-1:~# cat /etc/network/interfaces
# The management network interface
auto ma0
iface ma0 inet manual

# Server, VM and PTF management interface
auto br1
iface br1 inet static
    bridge_ports ma0
    bridge_stp off
    bridge_maxwait 0
    bridge_fd 0
    address 10.250.0.245
    netmask 255.255.255.0
    network 10.250.0.0
    broadcast 10.250.0.255
    gateway 10.250.0.1
    dns-nameservers 10.250.0.1 10.250.0.2
# dns-* options are implemented by the resolvconf package, if installed
- Install Python 2.7 (required by Ansible).
- Add Docker's official GPG key (a consolidated install sketch follows this list):
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
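These package installs condense to something like the following sketch (the docker-ce repository line assumes an Ubuntu release that Docker's apt repository supports):
sudo apt-get update
sudo apt-get install -y python2.7 python-pip    # Python 2.7 for ansible
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
sudo apt-get install -y docker-ce               # docker engine for building/running the test dockers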
Set up docker registry for PTF docker
The PTF docker is used to send and receive packets to test the data plane.
- Build PTF docker
git clone --recursive https://github.com/Azure/sonic-buildimage.git
make configure PLATFORM=generic
make target/docker-ptf.gz
- Set up a docker registry and upload docker-ptf to it, as sketched below.
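If your lab does not already have a registry, a minimal local-registry sketch looks like this (localhost:5000 is a placeholder; a shared lab registry would use a real host name and TLS):
docker load -i target/docker-ptf.gz
docker run -d -p 5000:5000 --restart=always --name registry registry:2   # throwaway local registry
docker tag docker-ptf localhost:5000/docker-ptf
docker push localhost:5000/docker-ptf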
Build and run sonic-mgmt docker
The ansible playbooks in the sonic-mgmt repo require Ansible and various dependencies to be set up. We have built a sonic-mgmt docker that installs all of these dependencies; you can build the docker and run the ansible playbooks inside it.
- Build sonic-mgmt docker
git clone --recursive https://github.com/Azure/sonic-buildimage.git
make configure PLATFORM=generic
make target/docker-sonic-mgmt.gz
Pre-built sonic-mgmt can also be downloaded from here.
- Run sonic-mgmt docker
docker load -i target/docker-sonic-mgmt.gz
docker run -it docker-sonic-mgmt bash
cd ~/sonic-mgmt
From this point on, all steps run inside the sonic-mgmt docker.
Prepare testbed physical connections
All physical connections are recorded in a YAML file, ansible/testbed_inv/physical_connection.yaml. This file is the central place for recording all physical connections within the whole testbed.
physical_connections:
  sonic_duts:
    sonic_dut_1:
      eth1:
        peer_device: leaf_fanout_1
        peer_port: eth23
        port_phy_speed:
      eth2:
    sonic_dut_2:
      eth1:
  root_fanout:
    eth1:
      peer_device: server_1
      peer_port: p4p1
      port_phy_speed: 40
    eth2:
      peer_device: leaf_fanout_1
      peer_port: Eth64/1
      port_phy_speed: 100
- Root fanout
  - All root fanout ports are vlan trunk ports.
  - Connect all server interfaces used for testing to the root fanout switch.
  - Record each connection in the ansible/testbed_inv/physical_connection.yaml file mentioned above.
- Leaf fanout
  - Each leaf fanout switch has one uplink port connected to the root fanout; the rest of its ports connect to SONiC DUTs.
  - Connect the uplink interface to a root fanout switch vlan trunk port.
  - Connect SONiC DUT interfaces to the leaf fanout switch as vlan access ports.
  - Record each connection in the ansible/testbed_inv/physical_connection.yaml file mentioned above.
Prepare lab basic service servers
These are the basic lab service servers needed for the SONiC testbed environment to work properly.
- NTP server
  You should have a lab NTP server for testing NTP. If you don't have one, consider using public NTP servers for time sync: 0.pool.ntp.org, 1.pool.ntp.org, 2.pool.ntp.org.
- DHCP server (optional)
  The SONiC tests do not require a DHCP server in the testbed. However, if you want the DUT or other devices in your lab to obtain IP addresses on the management interface at boot, consider running a DHCP server. When a SONiC device boots from ONIE, you can configure your DHCP server to automatically assign the management IP address and load the correct SONiC image. In our lab, we use dnsmasq as the DHCP server and configure an ONIE boot install option for each SONiC DUT (see the sketch after this list).
- HTTP server
  SONiC ONIE boot uses an HTTP server to store all available images for installation, so correctly setting up and configuring an HTTP server is the first step to loading a SONiC image. For more onie-installer related configuration, please follow the ONIE documentation.
- Syslog server (optional)
  The SONiC tests do not require a syslog server, but having one that the SONiC DUTs can send logs to, for later tracing, is convenient for debugging.
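As an illustration of the DHCP and HTTP pieces above: ONIE can take its installer URL from the DHCP default-url option (option 114), so a hypothetical per-DUT dnsmasq entry looks like this (the MAC address, IP, and URL are placeholders):
# /etc/dnsmasq.d/sonic_dut_1.conf (hypothetical)
dhcp-host=00:11:22:33:44:55,sonic_dut_1,10.250.0.101            # pin the DUT's mgmt IP
dhcp-option=114,http://10.250.0.245/images/sonic-broadcom.bin   # ONIE default-url
The images directory itself can be served by any HTTP server; for a quick test something like this works (use nginx or apache for a permanent setup):
cd /var/www/images && sudo python -m SimpleHTTPServer 80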
Prepare testbed configurations
The latest sonic-mgmt repo is cloned into the sonic-mgmt docker under /var/[your-login-username]/sonic-mgmt. Once you are in the docker, you need to modify the testbed configuration files to reflect your lab setup. Alternatively, to avoid losing your environment-specific sonic-mgmt patches when the docker is destroyed, you can check the repo out on your test server and mount the checkout into your docker.
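A sketch of that mount-based workflow (the in-container path mirrors the /var/[your-login-username]/sonic-mgmt location above; adjust it to your login name):
git clone https://github.com/Azure/sonic-mgmt
docker run -it -v $PWD/sonic-mgmt:/var/[your-login-username]/sonic-mgmt docker-sonic-mgmt bash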
All testbed related inventory information is put under one inventory folder, ansible/testbed_inv/; all secrets are stored in ansible/group_vars/all/secrets.json.
The following inventory items are in the inventory folder:
- Static information inventory
  - servers
  - vms
  - sonic duts
  - fanout switches
  - physical_connections
  - testbed (create a testbed role? keeping it under inventory is more centralized): your testbed topology description files, currently in vars; when running testbed related tasks, load these files first
- Dynamic inventory
  - testbed actual interface mapping after port breakout
  - testbed topology connections
  - ptf docker inventory after topology connections
- Servers
  The inventory file servers is the central place for all servers that host PTF dockers and VM (EOS) information within the testbed.

  [servers:children]
  server_1
  server_2

  [server_1:children]
  vm_host_1
  vms_1

  [vm_host]
  vm_host_1
  vm_host_2

  [vm_host_1]
  STAR-ACS-SERV-01

  [vm_host_1:vars]
  ansible_host: 10.3.255.245
  mgmt_bridge: br1
  mgmt_prefixlen: 17
  mgmt_gw: 10.3.255.1
  vm_mgmt_gw: 172.16.128.1
  external_iface: p4p1

  [docker_registry]
  acs-devrepo.corp.microsoft.com
  acs-repo.corp.microsoft.com

  Check that ansible can reach this device with: ansible -m ping -i veos vm_host_1
- VMs
  We use EOS VMs as neighbor routers to create a testing topology for the SONiC DUT. EOS images are downloaded directly from the publicly available images on the Arista support site.
  Files starting with vms_* are inventory files that record all EOS VM information:

  [eos:children]
  vms_1
  vms_2

  [eos:vars]
  vars=ansible/vars/creds

  [vms_1]
  VM0100 ansible_host=172.16.200.32
  VM0101 ansible_host=172.16.200.33
  VM0102 ansible_host=172.16.200.34
  VM0103 ansible_host=172.16.200.35

  - Download the vEOS image from Arista.
  - Copy these image files to ~/veos-vm/images on your testbed server: Aboot-veos-serial-8.0.0.iso, vEOS-lab-4.15.9M.vmdk
  - Update the VM IP addresses in the ansible/veos inventory file. These IP addresses should be in the management subnet defined above.
  - Update the VM credentials in ansible/group_vars/eos/creds.yml.
- Fanout switches

  [fanout:children]
  fanout_eos
  fanout_sonic

  [fanout_eos:children]
  Arista64_40 hwsku:
  Arista32_100 hwsku:
  Arista64_100 hwsku:

  [fanout_sonic]
  str-7060cx-09 ansible_host=10.3.255.30 hwsku=
  str-7260cx3-01 ansible_host= hwsku=

  [Arista64_40]
  str-7260-01 ansible_host=10.3.255.76
  str-7260-02 ansible_host=10.3.255.105

  [fanout:vars]
  vars=ansible/group_vars/secrets.json
- SONiC Device Under Test
  Dependency: we assume all hwskus referenced here are defined in the correct format in sonic-buildimage. If not, consider putting a temporary port_config.ini file in sonic-mgmt (see the sketch below).

  [sonic_devices:children]
  sonic_dell
  sonic_arista
  sonic_nexus
  sonic_celestica
  sonic_eval

  [sonic_dell]
  sonic_s6000_on_1 ansible_host=10.3.255.20 hwsku=force10-s6000

  [sonic_arista]
  sonic_arista_sku1 ansible_host=10.3.253.21 hwsku=arista-7060sku-1
  sonic_arista_sku2 ansible_host=10.3.253.21 hwsku=arista-7060sku-2 base_device=sonic_arista_sku1

  [sonic_arista_sku1:vars]
  port_ini: ansible/vars/sonic/???

  Question: should we allow a static or temp port_ini.json file here to override the default? Only when we cannot find a port_ini file here, should we try the default location?
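If you do need a temporary port_config.ini, it follows the sonic-buildimage column format. A minimal two-port sketch (lane numbers and aliases are device specific; these values mirror the Force10-S6000 file):
# name         lanes          alias
Ethernet0      29,30,31,32    fortyGigE0/0
Ethernet4      25,26,27,28    fortyGigE0/4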
- Testlab environmental data (was in group_vars/lab/lab.yml)
  The SONiC switch integrates these configurations into minigraph.xml when generating the configuration file based on the testbed selection, so the SONiC DUT knows where these testing resources are. If we later skip minigraph.xml and use config_db.json only, then these entries need to be in config_db.json instead.
  IMPORTANT: the DHCP servers here are fake DHCP servers in your testbed; please don't put your real DHCP servers here.

  ntp_server:
  syslog_servers:
  dns_servers:
  snmp_location:
  forced_mgmt_routes:
  tacacs_servers:
  dhcp_servers: ['192.0.0.1', ...]
  ...
Secret management
Put all secrets in one file under ansible/group_vars/all/secrets.json.
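If you encrypt this file, ansible-vault can create and maintain it with the same password file that testbed-cli.sh takes (a sketch; password.txt is explained in the note further below):
ansible-vault encrypt --vault-password-file password.txt ansible/group_vars/all/secrets.json
ansible-vault edit --vault-password-file password.txt ansible/group_vars/all/secrets.json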
Bring up all VMs for an individual server
./testbed-cli.sh start-vms server_1 password.txt
- Please note: "password.txt" here is the ansible vault password file name/path. Ansible allows users to encrypt password files with ansible vault. By default, this shell script requires a password file. If you are not using ansible vault, just create an empty file and pass its name on the command line. The file name and location are created and maintained by the user.
Check that all VMs are up and running: ansible -m ping -i veos server_1
Define testbed actual connections with correct HwSku (dynamic and can change)
The following [sonic_dut_in_topology] and [vlans] groups define the SONiC DUTs that are going to be deployed with testbed topologies.
[sonic_dut_in_topology]
sonic_dut_1
sonic_dut_2
[vlans:children]
vlan32
vlan64
vlan128
[vlan32] (total 59)
100: str-s6000-01
132: str-s6000-02
164: str-s6000-03
...
[vlan64] (total 15)
2000: str-s6100-01
2064: str-s6100-02
2128: str-n92304-03
...
[vlan128] (total 10)
3000: str-7260cx3-1
3128: str-7260cx3-2
Based on sonic_dut_in_topology and the vlan assignment, create the actual connections file:
ansible/files/testbed_connection.yml
Questions to answer: should interface_name be counted on the fanout side? How is an update triggered? Does any change in this file trigger a Jenkins task? Phase 1: manually run an update script, like create_graph before? Check testbed_connection.yaml in to the server.
leaf_fanout:
  leaf_fanout_1:
    eth1:
      peer_device: sonic_dut_1
      peer_port: eth23
      port_phy_speed:
      vlan_mode:
      vlan_id:
    eth2:
      ...
    eth64:
      peer_device: root_fanout
      peer_port: eth7
      port_phy_speed:
      vlan_mode:
      vlan_id:
  leaf_fanout_2:
    eth1: ...
root_fanout:
  eth1:
    peer_device: server_1
    peer_port: p4p1
    port_phy_speed: 40
    vlan_mode: trunk
    vlan_id:
  eth2:
    peer_device: leaf_fanout_1
    peer_port: Eth64/1
    port_phy_speed: 100
    vlan_mode: trunk
    vlan_id: 100-132,247-256
Deploy fanout switch and Vlan
- Initially, deploy all fanout switches.
- When adding a new DUT or changing connections, deploy the fanout switches whose configuration changed on demand (see the sketch below).
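The sonic-mgmt repo ships a fanout deployment playbook (fanout.yml under ansible/); assuming it applies to this inventory layout, which you should verify against your checkout, an on-demand deploy of a single leaf fanout would look roughly like:
# hypothetical invocation; the inventory path and options depend on your setup
ansible-playbook -i ansible/testbed_inv fanout.yml -l leaf_fanout_1 --vault-password-file password.txt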
Define testbed VM and topology connection
- All information is currently in testbed.csv; it needs to be saved on the server side so it can be shared.
- Have a remote server that stores the currently deployed topology and can display the currently connected topology.
Deploy topology
- Can query the currently deployed topology
- Can query whether the VMs of a specific topology are connected
- Can add or remove a topology
- Can connect or disconnect a VM
- Can destroy an individual VM and recreate it (if a topology is deployed and the VM is in connected status, reconnect the VM)
To achieve this, keep a server-side file or database table:
testbed:
  topology_1:
    dut: sonic_dut_1
    ptf_docker: 10.3.255.12/24
    topology_type: t0
    vm_base: vm0100
    vms: [vm0100, vm0101, vm0102, vm0103]
    topology_deployed: True | False
    vms_connected: True | False
  topology_2:
    dut: sonic_dut_2
    ptf_docker: 10.3.255.13/24
    topology_type: t1
    vm_base: vm0200
    vms: [vm0200, vm0201, vm0202, vm0203, ... vm0231]
    topology_deployed: True | False
    vms_connected: True | False
- Update testbed.csv with your data. At least update the PTF mgmt interface settings.
- To deploy the PTF topology, run: ./testbed-cli.sh add-topo ptf1-m ~/.password
- To remove the PTF topology, run: ./testbed-cli.sh remove-topo ptf1-m ~/.password
- To deploy the T1 topology, run: ./testbed-cli.sh add-topo vms-t1 ~/.password
- The last step in testbed-cli tries to re-deploy the Vlan range on the root fanout switch to match the Vlan range specified in that topology, by changing the 'allowed' Vlan list on an Arista switch port. If you have another type of switch, it may or may not work; please review it and change accordingly if required. If you comment out the last step, you can manually swap Vlan ranges on the root fanout to make the testbed topology work (see the sketch below).
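If you swap the Vlan range by hand on an Arista root fanout, it comes down to editing the allowed-Vlan list on the relevant trunk port. A sketch using the server-facing port and the Vlan range from the connection example above (both are placeholders for your testbed):
configure
interface Ethernet1
   switchport mode trunk
   switchport trunk allowed vlan 100-132,247-256
end
write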