Testbed Configuration Planning and Preparation

Testbed Configuration Preparation

This figure gives a brief illustration of what a SONiC testbed looks like.

What you will need

  • Test server
  • Fanout switches
  • SONiC DUTs
  • Fiber cables
  • Test lab information

Prepare testbed server

  • Install Ubuntu 16.04 amd64 on the server.
  • Set up the management port configuration using the following sample /etc/network/interfaces.
  • Make sure the Ubuntu server is reachable on the network and accepts remote SSH login.
root@server-1:~# cat /etc/network/interfaces
# The management network interface
auto ma0
iface ma0 inet manual

# Server, VM and PTF management interface
auto br1
iface br1 inet static
    bridge_ports ma0
    bridge_stp off
    bridge_maxwait 0
    bridge_fd 0
    address 10.250.3.10
    netmask 255.255.255.0
    network 10.250.3.0
    broadcast 10.250.3.255
    gateway 10.250.3.1
    dns-nameservers 10.250.0.201 10.250.0.202
    # dns-* options are implemented by the resolvconf package, if installed

Connect Testbed Physical Connections

Connect all SONiC DUTs, test servers, and fanout switches together following the sample SONiC testbed figure above.

  • Root fanout

    • The root fanout switch is a SONiC OS powered switch
    • All root fanout ports are VLAN trunk ports
    • Connect all server test interfaces to the root fanout switch
    • Write down each connection in the ansible/testbed_inv/physical_connection.yaml file described in the next section
  • Leaf fanout

    • Leaf fanout switches are also SONiC OS powered switches
    • Each leaf fanout switch has one uplink port connected to the root fanout, and the rest of its ports connect to SONiC DUTs
    • Connect the uplink interface to the root fanout switch
    • Connect the SONiC DUT interfaces to the leaf fanout switch
    • Write down each connection in the ansible/testbed_inv/physical_connection.yaml file described in the next section

Prepare Testbed Physical Connections file

All physical connections are recorded in a YAML file, ansible/testbed_inv/physical_connection.yaml. This file is the central place for recording all physical connections within the whole testbed. After you connect your DUTs and test servers to the fanout switches, record every connection in this file.

The top-level keys of this YAML file are the SONiC DUTs and the root fanout.

physical_connections:
  sonic_duts:
    sonic_dut_1:
      0:
         peer_device: leaffanout_1
         peer_port_index: 0
         port_phy_speed:
      1:
         peer_device: leaffanout_1
         peer_port_index: 1
         port_phy_speed:
         ...
      16:
         peer_device: leaffanout_2
         peer_port_index: 0
         port_phy_speed:
         ...
      31:
         peer_device: leaffanout_2
         peer_port_index: 15
         port_phy_speed:

    sonic_dut_2:
      0: ...

  rootfanout:
    1:
       peer_device: server_1
       peer_port_index: 1
       port_phy_speed: 40
    2:
       peer_device: fanout_leaf_1
       peer_port_index: 2
       port_phy_speed: 100

Another thought?

If your testbed is very large, this file becomes huge, and YAML does not provide a straightforward include mechanism. Splitting the connections into multiple files may be easier to maintain, for example:

   physical_connection.yaml
   includes: [ DUT1.yml, DUT2.yml ]
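
A minimal merge sketch, assuming a top-level file with an includes list and per-DUT YAML files as above; the helper and file names are illustrative only, not existing tooling:

    # merge_connections.py -- hypothetical helper, not part of sonic-mgmt.
    # Merges the per-DUT YAML files listed under "includes" into one
    # physical_connections dictionary.
    import yaml  # pip install pyyaml


    def load_with_includes(top_file):
        with open(top_file) as f:
            top = yaml.safe_load(f) or {}
        merged = {}
        for name in top.get("includes", []):
            with open(name) as f:
                # each included file is expected to hold one or more DUT entries
                merged.update(yaml.safe_load(f) or {})
        return {"physical_connections": merged}


    if __name__ == "__main__":
        print(yaml.safe_dump(load_with_includes("physical_connection.yaml"),
                             default_flow_style=False))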

Or, for ease of manually maintaining the SONiC testbed physical connections, you may record the connections in a .csv file and run a small Python program to generate the connection file from it (a minimal conversion sketch follows the sample below).

  physical_connection.csv
    dut_1,0,fanout_leaf_1,0,40000
    dut_1,1,fanout_leaf_1,1,40000
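
A minimal conversion sketch, assuming the CSV columns are device, port index, peer device, peer port index, and speed as in the sample above; the script name and output layout are assumptions for illustration, not an existing tool:

    # csv_to_connections.py -- hypothetical converter, not part of sonic-mgmt.
    # Turns the flat CSV above into the nested physical_connections structure.
    import csv
    import sys

    import yaml  # pip install pyyaml


    def csv_to_connections(csv_path):
        duts = {}
        with open(csv_path) as f:
            for row in csv.reader(f):
                if not row or row[0].startswith("#"):
                    continue  # skip blank lines and comments
                device, port, peer_device, peer_port, speed = (c.strip() for c in row)
                duts.setdefault(device, {})[int(port)] = {
                    "peer_device": peer_device,
                    "peer_port_index": int(peer_port),
                    "port_phy_speed": int(speed),
                }
        return {"physical_connections": {"sonic_duts": duts}}


    if __name__ == "__main__":
        print(yaml.safe_dump(csv_to_connections(sys.argv[1]), default_flow_style=False))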

Open question: when using SONiC OS fanouts, do we need a dedicated SONiC fanout hardware SKU to support arbitrary port breakout? And based on how the ports are broken out, do we create the corresponding port_config.ini & Broadcom sai.profile?

Prepare Lab Basic Service Servers

A few basic network lab services are required for the SONiC testbed to work properly. You may use one server for all of these services, or different servers for different services.

  • NTP Server

    You need a lab NTP server to test the NTP protocol for SONiC time sync. If you don't have a lab NTP server, you may consider using public NTP servers for time sync:

    0.pool.ntp.org
    1.pool.ntp.org
    2.pool.ntp.org
    
  • DHCP Server

    In our lab, we use dnsmasq as the lab DHCP server and configure an ONIE boot install option for each SONiC DUT. See the dnsmasq configuration example.

    Following is our lab sample:

    There are 2 major parts to making your DHCP server work for the SONiC testbed:

    • dnsmasq.conf : configures the download URL location for each ONIE tag
    • dnsmasq.d/lab : configures your lab DHCP address assignments and the boot set/tag options

    Sample configuration: dnsmasq.conf

    #
    # Lab
    #
    dhcp-option=tag:msn2700,114,"http://10.250.0.201/installer/sonic/generic/public/sonic-generic.bin"
    dhcp-option=tag:S6000,114,"http://10.250.0.201/installer/sonic/broadcom/public/sonic-broadcom.bin"
    dhcp-option=tag:labngs,225,"http://10.250.0.201/installer/sonic/minigraph/{{hostname}}.xml"
    dhcp-option=tag:labngs,226,"http://10.250.0.201/installer/sonic/acl.json"
    

    Sample configuration : dnsmasq.d/lab

    dhcp-range=set:10.250.3.0,10.250.3.100,10.250.3.196,24h
    # S6000
    dhcp-host=90:b1:1c:f4:9d:47,10.250.3.150,sonic_dut_1,24h,set:10.250.3.0,set:s6000,set:labngs
    

    The DHCP server assigns a pre-defined IP address to the SONiC DUT based on the switch MAC address in the DHCP request. Then, based on the tag sets the host belongs to, it hands out the ONIE image download link and the initial configuration (minigraph.xml) download link.

    For more onie-installer related configuration, please read the ONIE documentation.

    If the SONiC DUTs' management network is on the same network segment as your lab DHCP server, nothing more is needed. However, if your SONiC DUT management network is on a different subnet from your lab DHCP server, you need to add your DHCP server to the IP helper (DHCP relay) configuration of your testbed management network.

    Sample configuration

     "VLAN": {
          "Vlansonic": {
              "dhcp_servers": [
                  "10.250.0.200"
              ], 
              "vlanid": "2500"
          }
     }, 
    
  • HTTP Server

    SONiC ONIE boot uses an HTTP URL to install the SONiC image, and we store all images available for installation on an HTTP server. Correctly setting up and configuring an HTTP server is therefore the first step to loading a SONiC image over the network.

    We use the default Ubuntu apache2 to host our HTTP service; here is a more detailed configuration guide.

    Sample configuration from our server in sites-enabled/default.conf:

    ServerAdmin webmaster@localhost
    DocumentRoot /data/www
      <Directory />
                  Options FollowSymLinks
                  AllowOverride None
      </Directory>
      <Directory /data/www/>
                  Options Indexes FollowSymLinks MultiViews
                  AllowOverride None
                  Require all granted
      </Directory>
    
  • Syslog Server (optional)

    We use rsyslog, configured to serve as our lab syslog server.

    The SONiC tests do not require a syslog server, but having a syslog server that the SONiC DUTs can send logs to, so the logs can be traced back later, is convenient for debugging.

Prepare Lab Network IP space

The next step is to prepare your lab inventory files. There are many pieces within your lab testbed, and this instruction is IPv4 based. Before you can put the inventory files together, you need to make sure all devices in the testbed are reachable by routing: assign every device a correct IP address and have all access credentials ready for the inventory files and the secrets file.

Depending on how your lab is managed, you could have a regular lab management network, a SONiC testbed management network, and a SONiC test network. You may keep them in one flat lab management network or in separate network segments. The principle is that every component (VMs, PTF dockers, SONiC DUTs, test servers, supporting servers) must be reachable from every other component in the whole testbed environment.

Prepare For Testbed Inventory

SONiC uses Ansible to manage and deploy the testbed and run test cases. All testbed related inventory information is placed under the Ansible inventory folder ansible/testbed_inv/, and all secrets are stored in ansible/group_vars/all/secrets.json.

You need to prepare all these inventory files and the related testbed information files before you can deploy the testbed and run tests.

The following inventory items are in the inventory folder:

  • Static information inventory
    • servers
    • vms
    • SONiC duts
    • fanout switches
    • physical_connections
    • testbed topology (create a testbed role? keeping it under inventory is more centralized): the testbed topology description files, currently in ansible/vars/topo_*.yml; when running testbed related tasks, load these files first
  • Dynamic inventory
    • testbed actual interface mapping after port breakout
    • testbed topology connections
    • ptf docker inventory after topology connections
  • Servers

    The inventory file servers is the central place for information about all servers that host PTF dockers and VMs (EOS) within the testbed.

    [servers:children]
       server_1
       server_2
       ptfs
    
    [server_1:children]
       vm_host_1
       vms_1
    
    [vm_host]
       vm_host_1
       vm_host_2
    
    [vm_host_1]
      STAR-ACS-SERV-01
    
    [vm_host_1:vars]
      ansible_host: 10.3.255.245
      mgmt_bridge: br1
      mgmt_prefixlen: 17 
      mgmt_gw: 10.3.255.1
      vm_mgmt_gw: 172.16.128.1
      external_iface: p4p1
    
    [ptfs]
      start_addr: 10.250.3.200
      total_available: 50
    
    • Check that Ansible can reach this device with the command ansible -m ping -i veos vm_host_1. Here:
    • external_iface: server trunk port name (connected to the fanout switch)
    • mgmt_gw: ip of gateway for VM mgmt interfaces
    • mgmt_prefixlen: prefixlen for management interfaces
  • VMs

    We use EOS VMs as neighbor routers to create a testing topology around the SONiC DUT. The EOS images are downloaded directly from the publicly available images on the Arista support site.

    Files starting with vms_* are inventory files that record all EOS VM information.

    [eos:children]
     vms_1
     vms_2
    
    [vms_1]
     VM0100 ansible_host=172.16.200.32
     VM0101 ansible_host=172.16.200.33
     VM0102 ansible_host=172.16.200.34
     VM0103 ansible_host=172.16.200.35
    
    • Download the vEOS image from Arista.
    • Copy the image files below to ~/veos-vm/images on your testbed server.
      • Aboot-veos-serial-8.0.0.iso
      • vEOS-lab-4.15.9M.vmdk
    • Update the VM IP addresses in the ansible/testbed_inv/vms inventory file. These IP addresses should be in the management subnet defined above.
  • fanout switches

    [fanout:children]
      fanout_eos
      fanout_sonic

    [fanout_eos:children]
      Arista64_40      # hwsku:
      Arista32_100     # hwsku:
      Arista64_100     # hwsku:

    [fanout_sonic]
      str-7060cx-09 ansible_host=10.3.255.30 hwsku=
      str-7260cx3-01 ansible_host= hwsku=

    [Arista64_40]
      str-7260-01 ansible_host=10.3.255.76
      str-7260-02 ansible_host=10.3.255.105

    [fanout:vars]
      vars=ansible/group_vars/secrets.json

    Update the fanout inventory file ansible/testbed_inv/fanouts:

    [fanout]
      str-7060cx-09 ansible_host=10.3.255.30 hwsku=
      str-7260cx3-01 ansible_host=10.3.255.31 hwsku=
    
  • SONiC Device Under Test

    Dependency: it is assumed that all hwskus referenced here are defined in the correct format in sonic-buildimage. If not, consider putting a temporary port_config.ini file in sonic_mgmt.

    [sonic_devices:children]
      sonic_dell
      sonic_arista
      sonic_celestica
      sonic_eval
    
    [sonic_dell]  
      sonic_s6000_on_1 ansible_host=10.3.255.20 hwsku=force10-s6000 
      
    [sonic_arista]
      sonic_arista_sku1 ansible_host=10.3.253.21 hwsku=arista-7060sku-1 
      sonic_arista_sku2 ansible_host=10.3.253.21 hwsku=arista-7060sku-2 base_device=sonic_arista_sku1
      # base_device is the sonic_dut name in the physical connection file. There is only one DUT name in
      # the connection file, but a sonic_dut can change name due to port breakout, so ... ?
     
    
    [sonic_arista_sku1:vars]
      port_ini: ansible/vars/sonic/???
    

    Question: should we allow a static or temporary port_ini.json file here to override the default, and only fall back to the default location when no port_ini file is found here?

  • Testlab environmental data (was in group_vars/lab/lab.yml)

    The SONiC switch integrates these settings into minigraph.xml when the configuration file is generated for the selected testbed, so the SONiC DUT knows where these testing resources are.

    If we later skip minigraph.xml and use config_db.json only, then these settings need to be in config_db.json.

    IMPORTANT: For now, the DHCP servers here are fake DHCP servers in your testbed; please don't put your real DHCP servers here. Once we revise the test cases to use test data, we will put the real DHCP servers here.

    [services]
      ntp_server:
      syslog_servers:
      dns_servers:
      snmp_location:
      forced_mgmt_routes:
      tacacs_servers:
      dhcp_servers: ['192.0.0.1', …...] 
    
    [docker_registry]
      acs-devrepo.corp.microsoft.com
      acs-repo.corp.microsoft.com
      ...
    

secret management

Put all secrets for SONiC testbed in one file under ansible/group_vars/all/secrets.

We use Ansible as the main deployment and testing tool, and each Ansible role may have its own secrets, usually defined in the role's secret file. For a first-time Ansible user faced with this many testbed roles and groups, it is confusing and hard to find the correct location of each secret definition. With centralized management it is much easier to find and manage all secrets in one place, and each role gets its secrets from the "secret_group_vars" section (or another specified section) of this file.

The secrets file is a JSON file containing all the secrets the testbed is going to use.

{
  "switch_login": {
      "Arista": {
          "user": "admin",
          "passwd": ["password", "123456"],
          "enable": ['', null]
      },
      "Force10": {
          "user": "admin",
          "passwd": ["password"],
          "enable": ["password"] 
      }
  },
  "secret_group_vars": {
      "eos": {
          "ansible_user": "root"
          "ansible_password": "123456"
       },
      "fanout: {
          "ansible_ssh_user": "admin"
          "ansible_ssh_password": "password"
       },
      ......
  }
}
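
As a rough illustration of how a role or test could pull its credentials from this single file (the helper below is an assumption for illustration, not existing sonic-mgmt code):

    # load_secrets.py -- hypothetical sketch.
    # Looks up one group's credentials in the centralized secrets file.
    import json

    SECRETS_FILE = "ansible/group_vars/all/secrets.json"  # path referenced on this page


    def secrets_for(section, group):
        """e.g. secrets_for("secret_group_vars", "eos") -> eos login credentials."""
        with open(SECRETS_FILE) as f:
            return json.load(f)[section][group]


    if __name__ == "__main__":
        creds = secrets_for("secret_group_vars", "eos")
        print(creds["ansible_user"])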

Define Testbed pre-defined Vlan and VM sets

When we build the virtual testbed, there is a default assumption that the VLAN IDs and the VMs connected to one SONiC DUT are contiguous. How many VLAN IDs each DUT uses and how many VMs each topology uses are derived from the number of DUT interfaces and the topology definitions. Based on this, we group VLANs and VMs into blocks so a DUT can easily allocate them when building a virtual topology.

This file defines:

  • vlans is the set of VLAN IDs available to use within the testbed; vlan32 means each block uses 32 VLAN IDs.
  • vms_topo is the set of VM blocks available to build virtual topologies; the VMs are grouped by topology type.
  • topologies defines which topologies are available
 [vlans]
   Vlan32: [100,132,164,...,1956]   --- starting vlanid(total 59) 
   Vlan64: [2000, ... , 2896]  --- (total 14)
   Vlan128: [2960, ....,3856]  --- (total 8)

 [vms_topo]
   vms_topo_4: [VM0100,VM0104,VM0108]  --- starting VM
   vms_topo_8: [VM0116,VM0124]
   vms_topo_24: [VM0132,VM0156]
   vms_topo_32: [VM0200,VM0232]
   
 topologies: t0, t1, t1-lag, t0-64, ......
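
As a rough sanity check of the contiguous-block layout (the numbers come from the Vlan32 row above; the small helper is only an illustration, not part of the tooling):

    # vlan_blocks.py -- hypothetical illustration of the contiguous vlan-id blocks.
    def vlan_block_starts(first_vlan, block_size, block_count):
        """Return the starting vlan-id of each contiguous block."""
        return [first_vlan + i * block_size for i in range(block_count)]


    if __name__ == "__main__":
        starts = vlan_block_starts(100, 32, 59)             # the Vlan32 row above
        print(starts[0], starts[1], starts[2], starts[-1])  # 100 132 164 1956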