
Testbed Setup

This document describes the steps to set up the testbed and deploy a topology.

Prepare testbed server

  • Install Ubuntu 16.04 or 17.04 amd64 server.
  • Set up the management port configuration using the sample /etc/network/interfaces below.
root@server-1:~# cat /etc/network/interfaces
# The management network interface
auto ma0
iface ma0 inet manual

# Server, VM and PTF management interface
auto br1
iface br1 inet static
    bridge_ports ma0
    bridge_stp off
    bridge_maxwait 0
    bridge_fd 0
    address 10.250.0.245
    netmask 255.255.255.0
    network 10.250.0.0
    broadcast 10.250.0.255
    gateway 10.250.0.1
    dns-nameservers 10.250.0.1 10.250.0.2
    # dns-* options are implemented by the resolvconf package, if installed
  • Install Python 2.7 (required by Ansible).
  • Add Docker's official GPG key
   $ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
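
    The GPG key alone does not install Docker. A minimal sketch of the remaining steps, assuming the standard Docker CE apt repository for Ubuntu (adjust the release codename to your Ubuntu version):

   $ sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
   $ sudo apt-get update
   $ sudo apt-get install -y docker-ce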

Set up docker registry for PTF docker

The PTF docker is used to send and receive packets to test the data plane.

  • Build PTF docker
git clone --recursive https://github.com/Azure/sonic-buildimage.git
make configure PLATFORM=generic
make target/docker-ptf.gz
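
To make the PTF docker available to the testbed servers, load the built image and push it to your docker registry. A minimal sketch, assuming a registry reachable at registry.example.com:5000 (a placeholder; substitute the registry you list under [docker_registry] later in this document):

docker load -i target/docker-ptf.gz
docker tag docker-ptf registry.example.com:5000/docker-ptf
docker push registry.example.com:5000/docker-ptf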

Build and run sonic-mgmt docker

The Ansible playbooks in the sonic-mgmt repo require Ansible and various dependencies to be set up. We have built a sonic-mgmt docker that installs all of these dependencies; you can build the docker image and run the Ansible playbooks inside the container.

  • Build sonic-mgmt docker
git clone --recursive https://github.com/Azure/sonic-buildimage.git
make configure PLATFORM=generic
make target/docker-sonic-mgmt.gz

A pre-built sonic-mgmt docker image can also be downloaded from here.

  • Run sonic-mgmt docker
docker load -i target/docker-sonic-mgmt.gz
docker run -it docker-sonic-mgmt bash
cd ~/sonic-mgmt

From this point on, all steps are run inside the sonic-mgmt docker.

Prepare testbed physical connections

All physical connections are recorded in a YAML file under ansible/testbed_inv/physical_connection.yaml. This file is the central place for recording all physical connections within the whole testbed.

physical_connections:
  sonic_duts:
    sonic_dut_1:
      eth1:
        peer_device: leaf_fanout_1
        peer_port: eth23
        port_phy_speed:
      eth2:

    sonic_dut_2:
      eth1:

  rootfanout:
    eth1:
      peer_device: server_1
      peer_port: p4p1
      port_phy_speed: 40
    eth2:
      peer_device: leaf_fanout_1
      peer_port: Eth64/1
      port_phy_speed: 100
  • Root fanout

    • All root fanout ports are VLAN trunk ports.
    • Connect all server interfaces used for testing to the root fanout switch.
    • Record each connection in the ansible/testbed_inv/physical_connection.yaml file mentioned above.
  • Leaf fanout

    • Each leaf fanout has one uplink port connected to the root fanout; the rest of its ports connect to SONiC DUTs.
    • Connect the uplink interface to a VLAN trunk port on the root fanout switch.
    • Connect SONiC DUT interfaces to the leaf fanout switch as VLAN access ports.
    • Record each connection in the ansible/testbed_inv/physical_connection.yaml file mentioned above.

Prepare basic lab service servers

These are the basic network services a lab needs for the SONiC testbed environment to work properly.

  • NTP Server

    You should have a lab NTP server for testing NTP.

    If you don't have one, you may consider using public NTP servers for time sync, for example:

    0.pool.ntp.org
    1.pool.ntp.org
    2.pool.ntp.org
    
  • DHCP Server (optional)

    SONiC tests do not require a DHCP server to be present in the testbed. However, if you want the DUT or other devices in your lab to obtain IP addresses on the management interface at boot, you may consider using a DHCP server.

    When a SONiC device is installed through ONIE boot, you may consider configuring your DHCP server to automatically assign the management IP address and load the correct SONiC image on boot.

    In our lab, we use dnsmasq as the DHCP server and configure an ONIE boot install option for each SONiC DUT; see the dnsmasq sketch after this list.

  • HTTP Server

    SONiC ONIE boot uses an HTTP server to host all available images for installation, so correctly setting up and configuring an HTTP server is the first step to loading a SONiC image. A minimal serving sketch follows this list.

    For more onie-installer related configuration, please follow the ONIE documentation.

  • Syslog Server (optional)

    SONiC tests do not require a syslog server, but having one that SONiC DUTs can send their logs to, so that the logs can be traced back later, is convenient for debugging.
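
For the DHCP and ONIE install case above, a minimal dnsmasq sketch, assuming ONIE's default-URL DHCP option (114) is used to point the installer at your HTTP server; the MAC address, hostname, IP address and image URL are placeholders:

# /etc/dnsmasq.d/sonic-testbed.conf (placeholder values)
# Pin a management IP to a SONiC DUT by its MAC address
dhcp-host=00:11:22:33:44:55,sonic_dut_1,10.250.0.30
# Point ONIE at the SONiC installer image via the default-URL option (114)
dhcp-option=114,"http://10.250.0.245/images/sonic-broadcom.bin"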
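
For the HTTP server, any web server that can serve the image files works. A quick test-only sketch, assuming the images live under /var/www/html/images on the testbed server (a production lab would normally use nginx or apache instead):

cd /var/www/html/images
sudo python -m SimpleHTTPServer 80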

Prepare testbed configurations

The latest sonic-mgmt repo is cloned into the sonic-mgmt docker under /var/[your-login-username]/sonic-mgmt. Once you are in the docker, you need to modify the testbed configuration files to reflect your lab setup. Alternatively, to avoid losing the sonic-mgmt changes you make for your own environment when the docker is destroyed, you can check out the repo on your test server and mount it into the docker, as shown below.
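
A minimal sketch of running the container with the repo mounted from the host, assuming you have cloned sonic-mgmt to /data/sonic-mgmt on the testbed server (the host path is a placeholder):

docker load -i target/docker-sonic-mgmt.gz
docker run -it -v /data/sonic-mgmt:/var/[your-login-username]/sonic-mgmt docker-sonic-mgmt bash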

All testbed related inventory information is put under the inventory folder ansible/testbed_inv/; all secrets are stored in ansible/group_vars/all/secrets.json.

The following inventory items are in the inventory folder:

  • Static information inventory
    • servers
    • vms
    • sonic duts
    • fanout switches
    • physical_connections
    • testbed (create a testbed role? keeping it under inventory is more centralized): your testbed topology description files, currently under vars; when running testbed related tasks, load these files first
  • Dynamic inventory
    • testbed actual interface mapping after port breakout
    • testbed topology connections
    • ptf docker inventory after topology connections

  • Servers

    The servers inventory file is the central place for information on all servers within the testbed that host PTF dockers and VMs (EOS).

    [servers:children]
       server_1
       server_2
    
    [server_1:children]
       vm_host_1
       vms_1
    
    [vm_host]
       vm_host_1
       vm_host_2
    
    [vm_host_1]
      STAR-ACS-SERV-01
    
     [vm_host_1:vars]
      ansible_host: 10.3.255.245
      mgmt_bridge: br1
      mgmt_prefixlen: 17 
      mgmt_gw: 10.3.255.1
      vm_mgmt_gw: 172.16.128.1
      external_iface: p4p1
    
    [docker_registry]
     acs-devrepo.corp.microsoft.com
     acs-repo.corp.microsoft.com
    
    • Check that Ansible can reach this device with the command ansible -m ping -i veos vm_host_1.
  • VMs

    We use EOS VMs as neighbor routers to create test topologies for the SONiC DUT. EOS images can be downloaded directly from the publicly available images on the Arista support site.

    Files starting with vms_* are inventory files that record information on all EOS VMs.

    [eos:children]
     vms_1
     vms_2
    
     [eos:vars]
      vars=ansible/vars/creds
    
    [vms_1]
     VM0100 ansible_host=172.16.200.32
     VM0101 ansible_host=172.16.200.33
     VM0102 ansible_host=172.16.200.34
     VM0103 ansible_host=172.16.200.35
    
    • Download the vEOS image from Arista.
    • Copy the image files below to ~/veos-vm/images on your testbed server.
      • Aboot-veos-serial-8.0.0.iso
      • vEOS-lab-4.15.9M.vmdk
    • Update the VM IP addresses in the ansible/veos inventory file. These IP addresses should be in the management subnet defined above.
    • Update VM credentials in ansible/group_vars/eos/creds.yml.
  • Fanout switches

    [fanout:children]
      fanout_eos
      fanout_sonic
     
    [fanout_eos:children]
      Arista64_40  hwsku:
      Arista32_100  hwsku:
      Arista64_100  hwsku: 
    
    [fanout_sonic]
      str-7060cx-09 ansible_host=10.3.255.30 hwsku=
      str-7260cx3-01 ansible_host= hwsku=
    
    [Arista64_40] 
      str-7260-01       ansible_host=10.3.255.76
      str-7260-02       ansible_host=10.3.255.105
    
    [fanout:vars]
       vars=ansible/group_vars/secrets.json
     
    
  • SONiC Device Under Test

    Dependency: it is assumed that all hwskus referenced here are defined in the correct format in sonic-buildimage. If not, consider putting a temporary port_config.ini file in sonic-mgmt.

    [sonic_devices:children]
      sonic_dell
      sonic_arista
      sonic_nexus
      sonic_celestica
      sonic_eval
    
    [sonic_dell]  
      sonic_s6000_on_1 ansible_host=10.3.255.20 hwsku=force10-s6000 
      
    [sonic_arista]
      sonic_arista_sku1 ansible_host=10.3.253.21 hwsku=arista-7060sku-1 
      sonic_arista_sku2 ansible_host=10.3.253.21 hwsku=arista-7060sku-2 base_device=sonic_arista_sku1
    
    [sonic_arista_sku1:vars]
      port_ini: ansible/vars/sonic/???
    

    Question: should we allow a static or temporary port_ini.json file here to override the default, and only fall back to the default location when no port_ini file is found here?

  • Testlab environmental data (was in group_vars/lab/lab.yml)

    These settings are integrated into minigraph.xml when the configuration file is generated based on the testbed selection, so the SONiC DUT knows where these testing resources are.

    If we later skip minigraph.xml and use config_db.json only, then these settings need to be in config_db.json instead.

    IMPORTANT: the DHCP servers here are fake DHCP servers used by the testbed; please don't put your real DHCP servers here.

    ntp_server:
    syslog_servers:
    dns_servers:
    snmp_location:
    forced_mgmt_routes:
    tacacs_servers:
    dhcp_servers: ['192.0.0.1', …...] 
    ...
    

Secret management

Put all secrets in one file under ansible/group_vars/all/secrets.
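
A minimal sketch of what that file could contain; the key names below are hypothetical placeholders, not the exact variables the playbooks expect, so map them to the variables your playbooks actually read:

{
  "ansible_ssh_user": "admin",
  "ansible_ssh_pass": "YourPassword",
  "eos_login_user": "admin",
  "eos_login_pass": "YourPassword"
}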

Bring up all VMs for an individual server

./testbed-cli.sh start-vms server_1 password.txt
  • Please note: here "password.txt" is the Ansible vault password file name/path. Ansible allows users to encrypt password files with Ansible vault. By default, this shell script requires a password file. If you are not using Ansible vault, just create an empty file and pass its name on the command line; the file name and location are created and maintained by the user. See the sketch below.
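
For example, a sketch of both options, assuming the secrets file from the previous section is ansible/group_vars/all/secrets.json:

# Option 1: not using ansible vault, just pass an empty file
touch password.txt

# Option 2: using ansible vault, encrypt the secrets file with the same password file
echo 'my-vault-password' > password.txt
ansible-vault encrypt ansible/group_vars/all/secrets.json --vault-password-file password.txt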

Check that all VMs are up and running: ansible -m ping -i veos server_1

Define testbed actual connections with the correct HwSku (dynamic; can change)

The following [sonic_dut_in_topology] and [vlans] sections define the SONiC DUTs that are going to be deployed with testbed topologies.

[sonic_dut_in_topology]
  sonic_dut_1
  sonic_dut_2

[vlans:children]
  vlan32
  vlan64
  vlan128

[vlan32] (total 59)
  100: str-s6000-01
  132: str-s6000-02
  164: str-s6000-03
  …
[vlan64] (total 15)
 2000: str-s6100-01
 2064: str-s6100-02
 2128: str-n92304-03
 ...
[vlan128] (total 10)
 3000: str-7260csx-1
 3128: str-7260cx3-2

Based on sonic_dut_in_topology and the VLAN assignment, create the actual connections file:

ansible/files/testbed_connection.yml

Questions needing answers: do we count interface_name on the fanout side? How are updates triggered: does any change in this file trigger a Jenkins task? For phase 1, do we manually run a file update the way create_graph was run before? Check testbed_connection.yml in to the server.

leaf_fanout_1:
  eth1:
    peer_device: sonic_dut_1
    peer_port: eth23
    port_phy_speed:
    vlan_mode:
    vlan_id:
  eth2:
    ...
  eth64:
    peer_device: rootfanout
    peer_port: eth7
    port_phy_speed:
    vlan_mode:
    vlan_id:

leaf_fanout_2:
  eth1: ...

rootfanout:
  eth1:
    peer_device: server_1
    peer_port: p4p1
    port_phy_speed: 40
    vlan_mode: trunk
    vlan_id:
  eth2:
    peer_device: leaf_fanout_1
    peer_port: Eth64/1
    port_phy_speed: 100
    vlan_mode: trunk
    vlan_id: 100-132,247-256

Deploy fanout switches and VLANs

  1. Initially deploy all fanout switches.
  2. When adding a new DUT or changing connections, deploy on demand only the fanout switches that need changes.

Define testbed VM and topology connection

  1. All information is currently in testbed.csv; it needs to be saved on the server side so it can be shared.
  2. Have a remote server that saves the currently deployed topology and can display the currently connected topology.

Deploy topology

  1. Can query what the currently deployed topology is.
  2. Can query whether the VMs of a specific topology are connected.
  3. Can add or remove a topology.
  4. Can connect or disconnect a VM.
  5. Can destroy an individual VM and recreate it (if a topology is deployed and the VM is in connected status, reconnect the VM).

To achieve this:

A server-side file or database table:

testbed:
  topology_1:
     dut: sonic_dut_1
     ptf_docker: 10.3.255.12/24
     topology_type: t0
     vm_base: vm0100
     vms: [vm0100, vm0101, vm0102, vm0103]
     topology_deployed: True | False
     vms_connected: True | False
  topology_2:
     dut: sonic_dut_2
     ptf_docker: 10.3.255.13/24
     topology_type: t1
     vm_base: vm0200
     vms: [vm0200, vm0201, vm0202, vm0203, ... vm0231]
     topology_deployed: True | False
     vms_connected: True | False
  • Update testbed.csv with your data. At the very least, update the PTF management interface settings.
  • To deploy PTF topology run: ./testbed-cli.sh add-topo ptf1-m ~/.password
  • To remove PTF topology run: ./testbed-cli.sh remove-topo ptf1-m ~/.password
  • To deploy T1 topology run: ./testbed-cli.sh add-topo vms-t1 ~/.password
  • The last step in testbed-cli tries to re-deploy the VLAN range on the root fanout switch to match the VLAN range specified in that topology. It changes the 'allowed' VLANs on an Arista switch port; if you have another type of switch, it may or may not work, so please review it and change it accordingly if required. If you comment out the last step, you can manually swap the VLAN ranges on the root fanout to make the testbed topology work.
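
If you do comment out that last step, the manual change on an Arista root fanout looks roughly like the sketch below; the interface and VLAN range are placeholders, so use the port facing your server and the range defined for your topology:

configure
interface Ethernet2
   switchport mode trunk
   switchport trunk allowed vlan 100-132,247-256
end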