uperf manual
Unified Performance Tool, or uperf for short, is a network performance measurement tool that supports the execution of workload profiles.
Micro-benchmarks rarely measure real-world performance. This is especially true in networking, where applications can use multiple protocols, use different types of communication, interleave CPU processing with communication, and so on. Popular micro-benchmarks like iperf and netperf are simplistic: they support one protocol at a time and fixed-size messages, and offer no way to interleave CPU processing with communication. Thus, we need a tool that can model real-world performance.
Uperf (Unified performance tool for networking) solves this problem by allowing the user to model a real-world application using a high-level description (called a profile) and running it over the network. It supports multiple protocols, varying message sizes, a 1xN communication model, collection of CPU counter statistics, and much more.
The following list is a short overview of the features supported by uperf:
- Support modeling of workloads using profiles
- Support for TCP/UDP/SSL/SCTP/VSOCK protocols.
- 1-N hosts
- Support for CPU counters and lots of other detailed statistics
- Ability to choose whether to use processes or threads
- Runs on Linux, BSD, Solaris and Windows
uperf is open-source software released under the GNU General Public License v2. You can download it from http://uperf.org. Binaries are available for Solaris and Linux.
uperf can be run as either a master (active) or a slave (passive). When run as the master, it needs the -m flag with a profile describing the test application.
Uperf Version 1.0.7
Usage: uperf [-m profile] [-hvV] [-ngtTfkpaeE:X:i:P:RS:]
uperf [-s] [-hvV]
-m <profile> Run uperf with this profile
-s Slave
-S <protocol> Protocol type for the control Socket [def: tcp]
-n No statistics
-T Print Thread statistics
-t Print Transaction averages
-f Print Flowop averages
-g Print Group statistics
-k Collect kstat statistics
-p Collect CPU utilization for flowops [-f assumed]
-e Collect default CPU counters for flowops [-f assumed]
-E <ev1,ev2> Collect CPU counters for flowops [-f assumed]
-a Collect all statistics
-X <file> Collect response times
-i <interval> Collect throughput every <interval>
-P <port> Set the master port (defaults to 20000)
-R Emit raw (not transformed), time-stamped (ms) statistics
-v Verbose
-V Version
-h Print usage
More information at http://www.uperf.org
The uperf source distribution has sample profiles in the workloads directory. You can tweak them to suit your needs or write your own profile. Many of the profiles pick up values (like the remote host or the protocol) from the environment. These variables begin with a $ sign in the profile. You can either set them in the environment (e.g. export h=192.168.1.4) or hard-code them in the profile.
The sample profiles included with uperf are described below.
This profile represents request-response traffic. One thread on the master reads and writes 90 bytes of data from and to the slave. The remote end (slave) address is specified via the $h environment variable; $proto specifies the protocol to be used.
In this profile, multiple threads simulate one-way traffic (8K message size) between two hosts (like the iperf networking tool) for 30 seconds. $h specifies the remote host, $proto specifies the protocol, and $nthr specifies the number of threads.
In this profile, multiple threads repeatedly connect to and disconnect from the remote host. You can use this to measure connection setup performance. $nthr specifies the number of threads, and $iter determines the number of connects and disconnects each thread will do.
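A sketch of such a connection-setup profile, using the variables described above (the protocol is illustrative, and the shipped sample may differ in details):

```xml
<?xml version="1.0"?>
<profile name="connect">
  <group nthreads="$nthr">
    <!-- each iteration opens and immediately closes a connection -->
    <transaction iterations="$iter">
      <flowop type="connect" options="remotehost=$h protocol=tcp"/>
      <flowop type="disconnect"/>
    </transaction>
  </group>
</profile>
```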
This profile demonstrates an application in which each thread opens one connection to each of two hosts, then reads 200 bytes from the first connection and writes them to the second.
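A sketch of this two-connection workload, using the named-connection (conn) option described in the flowop section of this manual; $h1 and $h2 are assumed variable names for the two hosts, and the protocol and duration are illustrative:

```xml
<?xml version="1.0"?>
<profile name="two-hosts">
  <group nthreads="1">
    <transaction iterations="1">
      <flowop type="connect" options="conn=1 remotehost=$h1 protocol=tcp"/>
      <flowop type="connect" options="conn=2 remotehost=$h2 protocol=tcp"/>
    </transaction>
    <!-- read 200 bytes from the first connection, write them to the second -->
    <transaction duration="30s">
      <flowop type="read" options="conn=1 size=200"/>
      <flowop type="write" options="conn=2 size=200"/>
    </transaction>
    <transaction iterations="1">
      <flowop type="disconnect" options="conn=1"/>
      <flowop type="disconnect" options="conn=2"/>
    </transaction>
  </group>
</profile>
```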
A profile is an XML description of the workload. For example, a request-response workload (like netperf) could be described as "each thread sends 100 bytes and receives 100 bytes using UDP". For a more complex application, we may need to specify the number of connections, the number of threads, whether the threads all do the same kind of operation, what protocols to use, whether the traffic is bursty, and so on. uperf defines a language to specify all of this information in a machine-understandable format (XML) called a profile. uperf then parses the profile and runs whatever it specifies. The user has to specify the profile for the master only; uperf automatically transforms the profile for the slaves and uses it.
The profile needs to be a valid XML file. Variables that begin with a '$' are picked up from the ENVIRONMENT.
Below is a profile for the request-response micro benchmark.
<?xml version="1.0"?>
<profile name="netperf">
<group nthreads="1">
<transaction iterations="1">
<flowop type="accept" options="remotehost=$h protocol=$proto
wndsz=50k tcp_nodelay"/>
</transaction>
<transaction duration="30s">
<flowop type="write" options="size=90"/>
<flowop type="read" options="size=90"/>
</transaction>
<transaction iterations="1">
<flowop type="disconnect" />
</transaction>
</group>
</profile>
Every profile begins with a standard XML header and a profile element that
contains the name of the profile. This name, however, is not used by uperf. The
major parts of a profile are
- Group
- Transaction
- Flowop
Let's look at each of these in detail.
A profile can have multiple groups. A group is a collection of threads or processes that execute transactions contained in that group.
A transaction is a unit of work. All threads or processes start
executing transactions at the same time. Transactions have either an
iteration count or a duration associated with them; by default, a
transaction executes its contents once. If you use <transaction iterations="1000">, uperf will execute the contents of the
transaction 1000 times. If <transaction duration="30s"> is specified,
the contents of the transaction are executed for 30 seconds.
The contents of a transaction are called flowops. These basic operations (building blocks) are used to define a workload. The currently supported flowops are
- connect
- accept
- disconnect
- read
- write
- recv
- sendto
- sendfile
- sendfilev
- nop
- think
Every flowop has a set of options. In the XML file, these are space-separated. The supported options are listed below.
- count: The number of times this flowop will be executed.
- duration: The amount of time this flowop will execute. Example: duration=100ms. This option is deprecated; specify the duration in the transaction instead.
- rate: This option causes uperf to execute this flowop at the specified rate for the given iterations or duration.
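For example, the count option can be used to issue several writes per pass through a transaction; the sizes below are illustrative, not values from a shipped profile:

```xml
<transaction duration="30s">
  <!-- each pass through the transaction performs ten 8k writes, then one read -->
  <flowop type="write" options="count=10 size=8k"/>
  <flowop type="read" options="size=8k"/>
</transaction>
```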
The connect flowop specifies that a connection needs to be opened. The options parameter specifies more details regarding the connection. The following keys are supported:
- remotehost: The remote host to connect to or accept a connection from.
- protocol: The protocol used to connect to the remote host. Valid values are tcp, udp, ssl, sctp, and vsock.
- tcp_nodelay: Controls whether TCP_NODELAY is set or not.
- wndsz: Size of the socket send and receive buffers. This parameter is used to set the SO_SNDBUF and SO_RCVBUF flags using setsockopt().
- engine: SSL engine.
- port: You can specify a port to connect (or accept) on. This is useful in cases where only specific ports can accept connections. Note that this only controls the initial connect or accept; the actual data transfer happens on a different port, as accept creates a new socket to handle the connection.
- size: Amount of data that is either read or written. uperf supports the exchange of fixed-size messages, asymmetrical-size messages, and random-size messages. For fixed-size messages, the master and all slaves use a fixed size for receives and transmits. For asymmetrical-size messages, the slaves use the size specified by the rsize parameter, while the master still uses the size parameter. For a random-size message, a uniformly distributed value between the user-specified minimum and maximum is used by the transmitting side, and the receiving side uses the maximum as the message size. Example: size=64k or size=rand(4k,8k).
- rsize: See the description of asymmetrical messages above.
- canfail: Indicates that a failure of this flowop will not cause uperf to abort. This is especially useful with UDP, where a packet drop does not constitute a fatal error. You can also use this to test a SYN flood attack (threads connect() in a loop, ignoring errors).
- non_blocking: Use non-blocking IO. The socket/file descriptor is set to non-blocking mode.
- poll_timeout: With this option, the thread will first poll for the specified duration before trying to carry out the operation. A poll timeout is returned as an error back to uperf.
- conn: Every open connection is assigned a connection name. Currently, the name can be any valid integer; however, uperf could take a string in the future. conn identifies the connection to use with this flowop. This connection name is thread private.
The sendfile flowop uses the sendfile(3EXT) function call to transfer
a single file. The sendfilev flowop transfers a set of files using the
sendfilev(3EXT) interface. Multiple files are randomly picked from all
transferable files (see dir below) and transferred to the slave.
- dir: This parameter identifies the directory from which the files will be transferred. The directory is searched recursively to generate a list of all readable files. Example: dir=/www.
- nfiles: This parameter identifies the number of files that will be transferred with each call to sendfilev(3EXT); it is used as the third argument to that function. nfiles is assumed to be 1 for the sendfile flowop. Example: nfiles=10.
- size: This parameter identifies the chunk size for the transfer. Instead of sending the whole file, uperf will send size-sized chunks one at a time. This is used only if nfiles=1.
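Putting these options together, a sketch of a sendfilev transaction might look like the following; the duration is illustrative, and dir=/www is the directory from the example above:

```xml
<transaction duration="30s">
  <!-- transfer 10 randomly chosen files from /www per sendfilev(3EXT) call -->
  <flowop type="sendfilev" options="dir=/www nfiles=10"/>
</transaction>
```

To send one file at a time in fixed-size chunks instead, use nfiles=1 together with size, e.g. options="dir=/www nfiles=1 size=8k".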
uperf collects a wide variety of statistics. By default, uperf prints the throughput every second while the test is running, and then prints the total throughput. It also prints network statistics, calculated independently from system statistics, to verify the throughput reported by uperf, as well as statistics from all the hosts involved in the test to validate the output.
Some of the statistics collected by uperf are listed below
- Throughput
- Latency
- Group Statistics
- Per-Thread statistics
- Transaction Statistics
- Flowops Statistics
- Netstat Statistics
- Per-second Throughput
bash$ ./framework/uperf -m netperf.xml -a -e -p
Starting 4 threads running profile:netperf ... 0.01 seconds
Txn0 0B/1.01 (s) = 0b/s 3txn/s 254.89ms/txn
Txn1 195.31MB/30.30 (s) = 54.07Mb/s 13201txn/s 2.30ms/txn
Txn2 0B/0.00 (s) = 0b/s
--------------------------------------------------------------------------------
netperf 195.31MB/32.31(s) = 50.70Mb/s (CPU 21.42s)
Section: Group details
--------------------------------------------------------------------------------
Elapsed(s) CPU(s) DataTx Throughput
Group0 32.31 21.40 195.31M 50.70M
Group 0 Thread details
--------------------------------------------------------------------------------
Thread Elapsed(s) CPU(s) DataTx Throughput
0 32.31 5.30 48.83M 12.68M
1 32.31 5.31 48.83M 12.68M
2 32.31 5.44 48.83M 12.68M
3 32.31 5.36 48.83M 12.68M
Group 0 Txn details
--------------------------------------------------------------------------------
Txn Avg(ms) CPU(ms) Min(ms) Max(ms)
0 5.45 0.51 5.37 5.68
1 0.29 0.00 0.23 408.63
2 0.32 0.16 0.07 0.81
Group 0 Flowop details (ms/Flowop)
--------------------------------------------------------------------------------
Flowop Avg(ms) CPU(ms) Min(ms) Max(ms)
Connect 5.41 0.49 5.31 5.66
Write 0.02 0.00 0.01 0.53
Read 0.25 0.00 0.05 408.59
Disconnect 0.30 0.14 0.06 0.79
Netstat statistics for this run
--------------------------------------------------------------------------------
Nic opkts/s ipkts/s obits/s ibits/s
ce0 12380 12391 30.68M 30.70M
ce1 0 0 0 84.67
--------------------------------------------------------------------------------
Waiting to exchange stats with slave[s]...
Error Statistics
--------------------------------------------------------------------------------
Slave Total(s) DataTx Throughput Operations Error %
192.9.96.101 32.25 195.31MB 50.80Mbps 800008 0.00
Master 32.31 195.31MB 50.70Mbps 800008 0.00
--------------------------------------------------------------------------------
Difference(%) 0.20% 0.00% -0.20% 0.00% 0.00%
Q: Where can I submit bugs/feedback? A: Please open a GitHub issue.
Q: How do I specify which interface to use? A: uperf just specifies the host to connect to; it is up to the OS to determine which interface to use. You can change the default interface to that host by changing the routing tables.
Q: Does the use of -a affect performance?
A: Since -a collects all kinds of statistical information, there is a
measurable impact when the flowop is lightweight (for example, UDP
transmit of small packets).
Q: Does uperf support socket autotuning on Linux?
A: If a window size (wndsz) is specified in the profile, uperf calls
setsockopt(), which disables autotuning on Linux, so you can't test
autotuning. If no window size is specified, setsockopt() won't be
called by uperf, thus enabling autotuning on Linux.
Q: Why do you even have a -n option?
A: uperf uses a global variable to count the number of bytes transferred. This is updated using atomic instructions (the atomic_add_64 function). However, if you have thousands of threads, there is a very high likelihood that many threads update this value simultaneously, causing higher CPU utilization. The -n option helps in this case.
Q: Why do we have an option to do sendfilev with chunks?
A: We identified an issue where chunked sendfilev calls were faster than transferring the whole file in one go. This option helps debug such issues.
Q: How can I use "named" connections?
A: uperf supports named connections. To specify a name, specify the
conn=X option in a connect or accept flowop. For example,
<flowop type="connect" options="conn=2 remotehost=$h protocol=tcp>
If a name is not specified, the connection is an anonymous connection. For any flowop, if a connection is not specified, it uses the first anonymous connection.
uperf was developed by the Performance Availability Engineering group at Sun Microsystems. It was originally developed by Neelakanth Nadgir and Nitin Rammanavar. Jing Zhang added support for the uperf harness, Joy (Chaoyue Xiong) added SSL support, and Eric He ported it to Windows. Charles Suresh, Alan Chiu, and Jan-Lung Sung have contributed to its design and development.