iperf3 - shawfdong/hyades GitHub Wiki

iperf3 is a tool for active measurements of the maximum achievable bandwidth on IP networks. We've used iperf3 to measure the network bandwidth between pairs of hosts, under various conditions. Here we document our methods and results.


Test hosts

We are mostly interested in the performance of 10GbE adapters. In this test, we used 4 hosts in the Hyades cluster, each of which has a dual-port 10GbE adapter. All 10GbE ports on those hosts are connected to a Dell 8132 10GbE switch, using SFP+ Direct Attach Copper cables. Unless otherwise noted, all measurements were performed on the second port of each adapter; the IP addresses of those ports all belong to the private subnet 10.7.0.0/16. Here is a summary of the 4 test hosts:

Host      OS           IP/mask       Interface  Hardware
hyades    RHEL 6.4     10.7.8.1/16   em2        Broadcom NetXtreme II BCM57800 10 Gigabit Ethernet Adapter
ambrosia  FreeBSD 9.2  10.7.7.1/16   ix1        Intel Ethernet Converged Network Adapter X520-DA2
aesyle    RHEL 6.4     10.7.7.2/16   em2        Broadcom NetXtreme II BCM57800 10 Gigabit Ethernet Adapter
eudora    RHEL 6.4     10.7.7.3/16   em2        Broadcom NetXtreme II BCM57800 10 Gigabit Ethernet Adapter

Installing iperf3

On RHEL and its derivatives, the RPM package for iperf3 is provided by the EPEL repository. Once the EPEL repository is enabled, one can install iperf3 with:

yum install iperf3
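
If the EPEL repository is not yet enabled, it can usually be added by installing the epel-release package (on CentOS it ships in the stock repositories; on RHEL proper the RPM is fetched from the Fedora project mirrors):

yum install epel-release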

On FreeBSD 9.2, one can use pkg to install iperf3:

pkg install iperf3

TCP

To measure TCP bandwidth, we ran the following on the server:[1]

iperf3 -s -V

and the following on the client:

iperf3 -c 10.7.x.x -i 1 -t 100 -V
where 10.7.x.x is the server's IP address.
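
For completeness, iperf3 also offers a few options that are useful for this kind of testing, although none of them appear in the runs below: -R reverses the direction of the test (the server sends and the client receives), -P <n> runs n parallel streams, and -w <size> sets the socket buffer size explicitly. A reversed, 4-stream variant of the client command would look like:

iperf3 -c 10.7.x.x -i 1 -t 100 -V -R -P 4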

Linux server & client, before tuning, MTU = 1500

Using the default settings, without any tuning of the kernel parameters, the Linux hosts delivered a respectable TCP bandwidth of 7.24 Gbits/sec between the 10GbE adapters:

[ ID]  Interval            Transfer     Bandwidth         Retr
[  4]  0.00-100.00  sec    84.3 GBytes  7.24 Gbits/sec    0             sender
[  4]  0.00-100.00  sec    84.3 GBytes  7.24 Gbits/sec                  receiver
CPU Utilization: local/sender 21.7% (0.2%u/21.6%s), remote/receiver 10.9% (0.0%u/10.9%s)
where 84.3 GBytes = 84.3 × 8 × 1.024³ Gbits ≈ 724 Gbits (the reported GBytes are binary gigabytes), so Bandwidth = 724 Gbits / 100 sec = 7.24 Gbits/sec.

Note: By comparison, the TCP bandwidth between the GbE adapters is close to wire speed:

[ ID]  Interval            Transfer      Bandwidth       Retr
[  4]  0.00-100.00  sec    10.9 GBytes   934 Mbits/sec   13             sender
[  4]  0.00-100.00  sec    10.9 GBytes   932 Mbits/sec                  receiver
CPU Utilization: local/sender 3.8% (0.1%u/3.8%s), remote/receiver 21.4% (0.4%u/21.0%s)

Linux server & client, after tuning, MTU = 1500

After tuning the kernel parameters for TCP/IP as described in Linux Network Tuning, we again measured the TCP bandwidth between Linux hosts.
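
The tuning itself is documented on the Linux Network Tuning page; as a rough sketch, it amounts to raising the socket buffer limits and the TCP autotuning ranges in /etc/sysctl.conf along the following lines (these are common 10GbE starting points, not necessarily the exact values we used), then loading them with sysctl -p:

net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
net.ipv4.tcp_rmem = 4096 87380 33554432
net.ipv4.tcp_wmem = 4096 65536 33554432
net.core.netdev_max_backlog = 30000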

Using aesyle as server and eudora as client, we got:

[ ID]  Interval           Transfer    Bandwidth       Retr
[  4]  0.00-100.00 sec    109 GBytes  9.35 Gbits/sec  1907            sender
[  4]  0.00-100.00 sec    109 GBytes  9.35 Gbits/sec                  receiver
CPU Utilization: local/sender 53.7% (0.4%u/53.4%s), remote/receiver 58.7% (0.5%u/58.2%s)

By comparison, using eudora as server and aesyle as client, we got a lower throughput:

[ ID]  Interval           Transfer    Bandwidth         Retr
[  4]  0.00-100.00 sec    100 GBytes  8.60 Gbits/sec    0             sender
[  4]  0.00-100.00 sec    100 GBytes  8.60 Gbits/sec                  receiver
CPU Utilization: local/sender 62.2% (0.7%u/61.5%s), remote/receiver 95.7% (0.5%u/95.2%s)
Most likely, the lower performance was because eudora was busy during the measurement, as indicated by the high CPU usage.

Linux server & client, after tuning, MTU = 9000

Next we enabled jumbo frames on the Linux hosts and on the Dell 8132 10GbE switch. The TCP bandwidth between the Linux hosts is now approaching the wire speed:

[ ID]  Interval            Transfer      Bandwidth         Retr
[  4]  0.00-100.00  sec    115.2 GBytes  9.90 Gbits/sec    0             sender
[  4]  0.00-100.00  sec    115.2 GBytes  9.90 Gbits/sec                  receiver
CPU Utilization: local/sender 52.2% (0.4%u/51.8%s), remote/receiver 48.5% (0.7%u/47.8%s)
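
For reference, enabling jumbo frames on a Linux host is simply a matter of raising the interface MTU (em2 on our hosts), provided the switch ports allow jumbo frames as well:

ip link set dev em2 mtu 9000

To make the setting persistent across reboots on RHEL 6, add MTU=9000 to /etc/sysconfig/network-scripts/ifcfg-em2.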

Note on VLAN vs. Direct Connection:

Neither of the following had any noticeable impact on TCP bandwidth:

  • Assigning the switch ports to a VLAN;
  • Connecting Linux hosts (hyades & eudora) directly with an SFP+ Direct Attach Copper cable, rather than going through the switch.

Linux server & FreeBSD client, after tuning, MTU = 9000

On the FreeBSD host (ambrosia), we tuned the kernel parameters for TCP/IP as described in FreeBSD Network Tuning and enabled jumbo frames. Using aesyle (Linux) as server and ambrosia (FreeBSD) as client, we got:

[ ID]  Interval            Transfer      Bandwidth
[  4]  0.00-100.00  sec    109.7 GBytes  9.42 Gbits/sec                  sender
[  4]  0.00-100.00  sec    109.7 GBytes  9.42 Gbits/sec                  receiver
CPU Utilization: local/sender 0.0% (0.2%u/58.1%s), remote/receiver 38.7% (0.8%u/38.0%s)
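
As on the Linux hosts, the details are on the FreeBSD Network Tuning page; as a sketch, the FreeBSD tuning boils down to raising the socket buffer limits and the interface MTU along the following lines (illustrative values, not necessarily the exact ones we used):

ifconfig ix1 mtu 9000
sysctl kern.ipc.maxsockbuf=16777216
sysctl net.inet.tcp.sendbuf_max=16777216
sysctl net.inet.tcp.recvbuf_max=16777216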

Note on TCP Segmentation Offload (TSO):

Due to a bug in the FreeBSD device driver for the Intel 10GbE adapter, we may have to turn TSO off:

# ifconfig ix1 -tso
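
To make the change persistent across reboots, the flag can be folded into the interface configuration in /etc/rc.conf, e.g. (this mirrors ambrosia's second port; adjust the address and options as needed):

ifconfig_ix1="inet 10.7.7.1 netmask 255.255.0.0 mtu 9000 -tso"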

But we saw no noticeable difference in TCP bandwidth:

[ ID]  Interval             Transfer      Bandwidth
[  4]   0.00-100.00  sec    109.5 GBytes  9.40 Gbits/sec                  sender
[  4]   0.00-100.00  sec    109.5 GBytes  9.40 Gbits/sec                  receiver
CPU Utilization: local/sender 0.0% (0.3%u/59.7%s), remote/receiver 24.0% (0.5%u/23.5%s)

FreeBSD server & Linux client, after tuning, MTU = 9000

Using ambrosia (FreeBSD) as server and aesyle (Linux) as client, we got a slightly higher throughput:

[ ID]  Interval              Transfer      Bandwidth         Retr
[  4]  0.00-100.00  sec    113.0 GBytes  9.70 Gbits/sec    0             sender
[  4]  0.00-100.00  sec    113.0 GBytes  9.70 Gbits/sec                  receiver
CPU Utilization: local/sender 63.4% (0.6%u/62.8%s), remote/receiver 0.0% (3.0%u/24.2%s)

Once again, we saw no noticeable difference in TCP bandwidth when TSO was turned off:

[ ID]  Interval            Transfer      Bandwidth         Retr
[  4]  0.00-100.00  sec    111.3 GBytes  9.57 Gbits/sec    0             sender
[  4]  0.00-100.00  sec    111.3 GBytes  9.56 Gbits/sec                  receiver
CPU Utilization: local/sender 62.3% (0.6%u/61.7%s), remote/receiver 0.0% (4.8%u/36.4%s)

UDP

To measure UDP bandwidth, we ran the following on the server (aesyle):[1]

iperf3 -s -V

and the following on the client (eudora):

iperf3 -c 10.7.7.2 -u -i 1 -t 10 -b 10G

Linux server & client, before tuning, MTU = 1500

Using the default settings, without any tuning of the kernel parameters, the UDP performance between the Linux hosts was rather poor:

[ ID]  Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  4]  0.00-10.00  sec    5.45 GBytes  4.68 Gbits/sec  0.036 ms  436833/714057 (61%)  
Note the high datagram loss rate!
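
Loss on this scale usually points to the receiving host dropping datagrams because its UDP socket buffer fills up faster than iperf3 can drain it. The receive-side knobs are essentially the socket buffer limits already sketched above plus the network stack's backlog; roughly (illustrative values; see Linux Network Tuning and the UDP Tuning reference):

sysctl -w net.core.rmem_max=67108864
sysctl -w net.core.rmem_default=262144
sysctl -w net.core.netdev_max_backlog=30000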

Linux server & client, after tuning, MTU = 9000

After tuning the kernel parameters for TCP/IP as described in Linux Network Tuning and enabling jumbo frames, the UDP bandwidth between the Linux hosts approaches the wire speed:

[ ID]  Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  4]  0.00-10.00  sec    11.1 GBytes  9.53 Gbits/sec  0.007 ms  38733/1453766 (2.7%)  
The datagram loss rate is much lower, but still nonzero!

Conclusions

  1. On modern OSes, including Linux & FreeBSD, the default settings are already good enough to reach wire speed on GbE NICs, but not on 10GbE NICs.
  2. To approach the wire speed of 10GbE NICs on Linux & FreeBSD, one must tune the kernel parameters for TCP/IP.
  3. Tuning has a more significant impact on UDP performance than on TCP.[2]
  4. Enabling jumbo frames (MTU = 9000) does not appear to noticeably increase network bandwidth or decrease CPU usage.
  5. Turning off TSO on FreeBSD does not appear to noticeably decrease network bandwidth or increase CPU usage.

References

  1. iperf / iperf3
  2. UDP Tuning