Periodic performance measurements
Adam Manzanares provided helpful feedback on this post.
I have been using fio for over a decade but still regularly find features I am not familiar with. One of these is fio's summary statistics for periodic bandwidth and IOPS measurements. These are reported via the two "bw" and "iops" lines highlighted in the output below, directly beneath the completion latency percentile panel.
$ fio --name=test --ioengine=null --filesize=1G --time_based --runtime=3s
test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=null, iodepth=1
fio-3.34-46-g0771
Starting 1 process
Jobs: 1 (f=1): [R(1)][-.-%][r=13.0GiB/s][r=3399k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=8498: Mon May 1 20:24:28 2023
read: IOPS=3360k, BW=12.8GiB/s (13.8GB/s)(38.4GiB/3000msec)
clat (nsec): min=37, max=62397, avg=46.13, stdev=146.11
lat (nsec): min=73, max=62490, avg=87.04, stdev=197.77
clat percentiles (nsec):
| 1.00th=[ 40], 5.00th=[ 42], 10.00th=[ 42], 20.00th=[ 43],
| 30.00th=[ 43], 40.00th=[ 44], 50.00th=[ 44], 60.00th=[ 45],
| 70.00th=[ 45], 80.00th=[ 45], 90.00th=[ 48], 95.00th=[ 51],
| 99.00th=[ 57], 99.50th=[ 78], 99.90th=[ 110], 99.95th=[ 113],
| 99.99th=[ 362]
+------------------------------------------------------------------------------------------+
| bw ( MiB/s): min=12354, max=13351, per=100.00%, avg=13130.32, stdev=435.00, samples=5 |
| iops : min=3162738, max=3418112, avg=3361363.60, stdev=111358.31, samples=5 |
+------------------------------------------------------------------------------------------+
lat (nsec) : 50=93.49%, 100=6.26%, 250=0.23%, 500=0.01%, 750=0.01%
lat (nsec) : 1000=0.01%
lat (usec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
lat (usec) : 100=0.01%
cpu : usr=99.93%, sys=0.00%, ctx=3, majf=0, minf=8
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=10079211,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=12.8GiB/s (13.8GB/s), 12.8GiB/s-12.8GiB/s (13.8GB/s-13.8GB/s), io=38.4GiB (41.3GB), run=3000-3000msec
Fio continuously monitors bytes transferred and I/O operations completed. By default fio will measure bandwidth and IOPS in each half-second interval and then report results based on these measurements.
The reported values are mostly straightforward. For bandwidth, the collected measurements had a minimum of 12,354 MiB/s and a maximum of 13,351 MiB/s. Among the jobs in this job's reporting group, this job's measured bandwidth accounted for 100 percent of the group's bandwidth (since the group included only one job). Bandwidth measurements had an average of 13,130 MiB/s and a standard deviation of 435 MiB/s. In total five bandwidth measurements were collected.
The statistics reported for IOPS are similar, although the group percentage is omitted. The five IOPS measurements had a minimum of 3,162,738, a maximum of 3,418,112, a mean of 3,361,364, and a standard deviation of 111,358. Note that IOPS and bandwidth values are collected independently and should not be expected to match each other exactly.
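One way to see where the sample count comes from: with the default half-second interval, the number of samples roughly tracks the run time divided by 500 ms, less the final interval that is usually missed (discussed in the notes at the end of this post). The 3-second run above produced five samples; a longer run of the same job should produce proportionally more. A sketch (not from the original post):

```
# Sketch: the same null-ioengine job, but run for 10 seconds instead of 3.
# With the default 500 ms measurement interval this should yield roughly
# 19-20 bandwidth samples and 19-20 IOPS samples instead of the 5 seen above.
$ fio --name=test --ioengine=null --filesize=1G --time_based --runtime=10s
```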
Two options control the size of the intervals over which bandwidth and IOPS are calculated. Here are the descriptions from the documentation:
| Name | Description |
|---|---|
| `bwavgtime=int` | Average the calculated bandwidth over the given time. Value is specified in milliseconds. If the job also does bandwidth logging through `write_bw_log`, then the minimum of this option and `log_avg_msec` will be used. Default: 500ms. |
| `iopsavgtime=int` | Average the calculated IOPS over the given time. Value is specified in milliseconds. If the job also does IOPS logging through `write_iops_log`, then the minimum of this option and `log_avg_msec` will be used. Default: 500ms. |
`bwavgtime` controls the size of the intervals over which bandwidth measurements are collected; by default bandwidth is calculated over half-second intervals. `iopsavgtime` does the same for IOPS measurements and has the same 500 ms default. Note that fio will ignore any units specified with these values and always parse them as milliseconds (i.e., `bwavgtime=1000s` will have fio collect bandwidth measurements every second, not every 1000 seconds).
Also worth noting is the relationship between these two options and `log_avg_msec`. As the documentation states, the actual measurement interval will be the minimum of `{bw|iops}avgtime` and the value specified for `log_avg_msec`. The reason is that the code used for these periodic measurements is also used for `write_{bw|iops}_log` when `log_avg_msec` is set. The value set for `{bw|iops}avgtime` can likewise increase the frequency with which these logs are written when it is smaller than `log_avg_msec`.
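To see these options in action, here is a sketch (not from the original post) that coarsens both measurement intervals to one second and also records the periodic samples to log files; per the descriptions above, the summary statistics and the logs then share the same effective interval:

```
# Sketch: 1-second periodic measurements plus per-interval logs.
# With log_avg_msec=1000 as well, the effective interval is
# min(bwavgtime, log_avg_msec) = min(iopsavgtime, log_avg_msec) = 1000 ms,
# so a 10-second run should yield roughly 9-10 samples per statistic.
$ fio --name=test --ioengine=null --filesize=1G --time_based --runtime=10s \
      --bwavgtime=1000 --iopsavgtime=1000 \
      --write_bw_log=test --write_iops_log=test --log_avg_msec=1000
# The averaged per-interval values land in the test_bw.*.log and
# test_iops.*.log files, while the "bw" and "iops" lines in the normal
# output summarize min/max/avg/stdev over those same intervals.
```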
The most useful aspect of these reported values is that they provide information on the variability of performance levels during the job run. The averages of these measurements are not especially useful, as they are just estimates of the overall bandwidth and IOPS for the full job, which are already reported exactly elsewhere in the output. However, nowhere else do we see how performance varied during the job's run time. Was bandwidth essentially constant during the job? Or did IOPS vary wildly over the course of the run?
Here is an example of what these measurements can tell us about variability. I ran two sequential read workloads on a hard drive. Run 1 was carried out when the device was otherwise idle. Run 2 was carried out while a separate random read job was running on the same device. The table below lists the summary statistics for the periodic bandwidth and IOPS measurements.
| Statistic | Run 1: No background job | Run 2: Background random read job |
|---|---|---|
| Bandwidth (KiB/s) | | |
| standard deviation | 2,068 | 3,212 |
| range | 10,176 | 10,024 |
| average | 112,742 | 6,738 |
| minimum | 110,408 | 3,680 |
| maximum | 120,584 | 13,704 |
| N | 19 | 19 |
| IOPS | | |
| standard deviation | 517 | 803 |
| range | 2,544 | 2,506 |
| average | 28,185 | 1,684 |
| minimum | 27,602 | 920 |
| maximum | 30,146 | 3,426 |
| N | 19 | 19 |
For both bandwidth (3,212 KiB/s vs 2,068 KiB/s) and IOPS (803 vs 517), the standard deviation is higher for Run 2, which had a background random read job, than for Run 1, where the device was otherwise idle. When there is no background activity, a sequential read job should produce a smooth performance curve, but with a random read workload in the background it is not surprising to see higher variability.
The other measures of variability also tell a similar story. The ranges (difference between minimum and maximum) for bandwidth (10,024 vs 10,176) and IOPS (2,506 vs 2,544) are similar between the two trials but this direct comparison does not take into account the actual performance levels. The bandwidth range for Run 1 is less than 10% of the run's overall average bandwidth, but the bandwidth range for Run 2 is nearly 1.5 times the run's overall average bandwidth. Thus, as a proportion of the respective performance levels, the bandwidth range for Run 2 dwarfs the bandwidth range for Run 1. The same is true for the IOPS ranges. In terms of absolute value the IOPS ranges are similar but relative to the performance levels of the respective runs, the Run 2 IOPS range is much larger than the Run 1 IOPS range.
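To put numbers on that relative comparison, here is the arithmetic behind it (not from the original post; values taken from the table above):

```
# Range as a fraction of each run's average, bandwidth first
$ echo "scale=3; 10176/112742" | bc   # Run 1: ~0.090, i.e. under 10%
$ echo "scale=3; 10024/6738"   | bc   # Run 2: ~1.488, i.e. nearly 1.5x
# Then IOPS
$ echo "scale=3; 2544/28185"   | bc   # Run 1: ~0.090
$ echo "scale=3; 2506/1684"    | bc   # Run 2: ~1.488
```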
The full output for the two fio invocations is below:
Full comparison output
# ./fio --name=test --filename=/dev/sda --runtime=10 --direct=1
test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.34
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=110MiB/s][r=28.3k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=528049: Tue May 2 14:09:51 2023
read: IOPS=28.2k, BW=110MiB/s (115MB/s)(1101MiB/10001msec)
clat (usec): min=25, max=11609, avg=34.81, stdev=22.28
lat (usec): min=25, max=11610, avg=34.90, stdev=22.29
clat percentiles (usec):
| 1.00th=[ 27], 5.00th=[ 31], 10.00th=[ 31], 20.00th=[ 31],
| 30.00th=[ 32], 40.00th=[ 33], 50.00th=[ 35], 60.00th=[ 38],
| 70.00th=[ 38], 80.00th=[ 39], 90.00th=[ 40], 95.00th=[ 40],
| 99.00th=[ 44], 99.50th=[ 47], 99.90th=[ 64], 99.95th=[ 113],
| 99.99th=[ 135]
bw ( KiB/s): min=110408, max=120584, per=100.00%, avg=112741.89, stdev=2068.25, samples=19
iops : min=27602, max=30146, avg=28185.47, stdev=517.06, samples=19
lat (usec) : 50=99.74%, 100=0.20%, 250=0.06%, 500=0.01%
lat (msec) : 20=0.01%
cpu : usr=2.76%, sys=16.56%, ctx=281754, majf=0, minf=11
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=281748,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=110MiB/s (115MB/s), 110MiB/s-110MiB/s (115MB/s-115MB/s), io=1101MiB (1154MB), run=10001-10001msec
Disk stats (read/write):
sda: ios=278433/0, merge=0/0, ticks=8554/0, in_queue=8554, util=99.09%
# ./fio --name=test --startdelay=3s --filename=/dev/sda --runtime=10 --direct=1 --name=bg --filename=/dev/sda --rw=randread --direct=1 --runtime=14s
test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
bg: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.34
Starting 2 processes
Jobs: 1 (f=1): [_(1),r(1)][58.3%][r=2202KiB/s][r=550 IOPS][eta 00m:10s]
test: (groupid=0, jobs=1): err= 0: pid=528042: Tue May 2 14:09:33 2023
read: IOPS=1801, BW=7204KiB/s (7377kB/s)(70.4MiB/10001msec)
clat (usec): min=104, max=87255, avg=554.04, stdev=3622.48
lat (usec): min=104, max=87255, avg=554.16, stdev=3622.48
clat percentiles (usec):
| 1.00th=[ 147], 5.00th=[ 153], 10.00th=[ 206], 20.00th=[ 215],
| 30.00th=[ 219], 40.00th=[ 225], 50.00th=[ 227], 60.00th=[ 229],
| 70.00th=[ 229], 80.00th=[ 231], 90.00th=[ 241], 95.00th=[ 269],
| 99.00th=[ 1369], 99.50th=[28181], 99.90th=[56361], 99.95th=[69731],
| 99.99th=[82314]
bw ( KiB/s): min= 3680, max=13704, per=100.00%, avg=6737.68, stdev=3211.98, samples=19
iops : min= 920, max= 3426, avg=1684.42, stdev=803.00, samples=19
lat (usec) : 250=91.42%, 500=4.55%, 750=0.89%, 1000=0.08%
lat (msec) : 2=2.11%, 10=0.03%, 20=0.23%, 50=0.52%, 100=0.16%
cpu : usr=0.55%, sys=1.20%, ctx=18016, majf=0, minf=12
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=18012,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
bg: (groupid=0, jobs=1): err= 0: pid=528041: Tue May 2 14:09:33 2023
read: IOPS=66, BW=267KiB/s (273kB/s)(3740KiB/14006msec)
clat (usec): min=2736, max=70544, avg=14940.17, stdev=7633.70
lat (usec): min=2736, max=70544, avg=14940.65, stdev=7633.71
clat percentiles (usec):
| 1.00th=[ 4490], 5.00th=[ 6194], 10.00th=[ 7570], 20.00th=[ 9372],
| 30.00th=[10683], 40.00th=[11863], 50.00th=[13042], 60.00th=[14353],
| 70.00th=[15926], 80.00th=[19268], 90.00th=[26870], 95.00th=[30540],
| 99.00th=[36963], 99.50th=[41681], 99.90th=[70779], 99.95th=[70779],
| 99.99th=[70779]
bw ( KiB/s): min= 208, max= 360, per=4.88%, avg=264.89, stdev=45.27, samples=27
iops : min= 52, max= 90, avg=66.22, stdev=11.32, samples=27
lat (msec) : 4=0.43%, 10=24.49%, 20=56.79%, 50=18.07%, 100=0.21%
cpu : usr=0.28%, sys=0.12%, ctx=940, majf=0, minf=11
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=935,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=5411KiB/s (5541kB/s), 267KiB/s-7204KiB/s (273kB/s-7377kB/s), io=74.0MiB (77.6MB), run=10001-14006msec
Disk stats (read/write):
sda: ios=18933/0, merge=0/0, ticks=23640/0, in_queue=23639, util=98.70%
- I am not the only one unfamiliar with these measurements, as this blog post was inspired by a recent inquiry on the fio mailing list asking about the multiple bandwidth and IOPS figures fio reports.
- What if I'm not interested in these periodic measurements and want to disable this feature? Fio does not have an option to explicitly disable this feature. However, it is possible to reduce the frequency of these measurements by specifying large values for `bwavgtime` and `iopsavgtime`. On a system with 4-byte integers, `bwavgtime=4294967295` will essentially disable periodic bandwidth measurement (see the sketch after this list).
- The final measurement is typically missed. The first example above ran for three seconds and measurements were collected every half second, yet only five measurements were collected. This is because fio terminated the job and disabled periodic measurement before the final measurement could be collected.
- Part of the reason why this feature is not well known is that the discussion of these output lines in fio's documentation is quite terse. We should improve this. The discussion for `log_avg_msec` should also mention that the logging frequency can be affected by `bwavgtime` and `iopsavgtime`.
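For completeness, here is a sketch of the workaround mentioned above for effectively disabling the periodic measurements, assuming the system accepts 4294967295 (UINT32_MAX) for these options:

```
# Sketch: push both measurement intervals out to UINT32_MAX milliseconds
# (roughly 49.7 days) so that no periodic sample is taken during the run.
$ fio --name=test --ioengine=null --filesize=1G --time_based --runtime=3s \
      --bwavgtime=4294967295 --iopsavgtime=4294967295
# With no samples collected, the "bw" and "iops" summary lines should have
# nothing to report.
```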