Performance - CESNET/UltraGrid GitHub Wiki

UltraGrid Performance

This page contains the results of UltraGrid performance tests.

End-to-End Latency

The following tables compare the performance of individual cards. The results are measured as end-to-end frame delay, i.e. the number of frames sent before the receiver outputs the original frame.
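To relate a frame delay to wall-clock time, divide the frame count by the frame rate. The following is a minimal sketch and not part of UltraGrid: the function name `frames_to_ms` is hypothetical, and it assumes the delay is counted in full frames at the nominal NTSC rate of 30000/1001 (about 29.97) frames per second.

```python
def frames_to_ms(frames: float, fps: float = 30000 / 1001) -> float:
    """Convert a delay expressed in frames to milliseconds.

    The default fps is the exact NTSC rate behind the rounded "29.97".
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frames * 1000.0 / fps


# Example: a delay of 3.75 frames at 29.97 fps
delay_ms = frames_to_ms(3.75)
print(f"{delay_ms:.1f} ms")  # prints "125.1 ms"
```

Under these assumptions, each frame of delay at 29.97 fps corresponds to roughly 33.4 ms.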

Linux

Setup:

  • Testing machine: hd2
  • Reference machine: hd4 with BlackMagic Decklink HD Extreme
  • For the test, we used 1080i HD video at 29.97 fps.

| Card | Send | Receive |
|---|---|---|
| DVS Centaurus II | 5 | 3 |
| BlackMagic DeckLink HD Extreme | 3.75 | 3.75 |
| BlackMagic DeckLink 4K Extreme | 3.5 | 3 |
| BlackMagic Decklink Quad | 4.5 | 4 |
| BlackMagic Decklink Intensity PRO | 4.5 | 4 |
| BlackMagic Decklink Intensity | 4.5 | 4.5 |
| Deltacast 3G | 4.5 | 3 |
| OpenGL | - | 2.5 |

macOS

Setup:

  • Testing machine: hd7
  • Reference machine: hd4 with BlackMagic Decklink HD Extreme
  • For the test, we used 1080i HD video at 29.97 fps.

| Card | Send | Receive |
|---|---|---|
| AJA Kona 3G | 4 | 3.5 |
| DeckLink HD Pro (Quicktime) | 4.5 | 5.5 |
| DeckLink HD Pro (native API) | 4.5 | 4 |
| OpenGL (with VSync) | - | 2.25 |
| OpenGL (without VSync) | - | 1.75 |

Windows

Setup:

  • Testing machine: hd7
  • Reference machine: hd4 with BlackMagic Decklink HD Extreme
  • For the test, we used 1080i HD video at 29.97 fps.

| Card | Send | Receive |
|---|---|---|
| BlackMagic DeckLink HD Extreme | 4.5 | 3.5 |
| BlackMagic Decklink Quad | 4.5 | 4 |
| BlackMagic Decklink Intensity PRO | 4.5 | 3.5 |
| BlackMagic Decklink Intensity | 4.5 | 4 |
| Deltacast 3G | 4 | 3 |

Compression

Performance

The table below shows the encoding performance of individual compression modules. We used a video with an increasing frame rate to determine the highest frame rate at which playback remains smooth.

| Encoder (setting) | Resolution | FPS | Content | Hardware | Version |
|---|---|---|---|---|---|
| cineform | 2160p | 65 | NZ2500 | i7-4960X | 1.6 |
| gpujpeg (90) | 2160p | 157 | NZ2500 | 5960X+9 | 1.3 |
| libx264 | 2160p | 43 | NZ | i7-4960X | |
| libx265 | 2160p | 16 | NZ | AMD 3970X | 1.8.5 |
| libsvt_hevc | 2160p | 89 | NZ | AMD 3970X | 1.8.5 |
| libvpx | 2160p | 28 | NZ | i7-4960X | |
| mjpeg | 2160p | 65 | NZ | i7-4960X | |
| cineform - R10k | 2160p | 42 | NZ2500 | i7-4960X | 1.6 |
| cineform - R12L | 2160p | 25 | NZ2500 | i7-4960X | 1.6 |
| gpujpeg (90) | 2160p | 150 | NZ2500 | kypo | 1.3 |
| gpujpeg (90) | 2160p | 137 | NZ2500 | hd12 | 1.3 |
| gpujpeg (90) | 2160p | 201 | NZ2500 | 5960X+B | 1.3 |
| gpujpeg (90:8) | 2160p | 207 | NZ2500 | 5960X+B | 1.3 |
| gpujpeg (90:16) | 2160p | 200 | NZ2500 | 5960X+B | 1.3 |
| gpujpeg (90) | 2160p | 178+146 | NZ2500 | 5960X+B9 | 1.3 |
| gpujpeg (90) | 2160p | 92 | NZ2500 | bunny | 1.3 |
| gpujpeg (90) | 2160p | 96+96 | NZ2500 | bunnyX2 | 1.3 |
| gpujpeg (90) | 2160p | 130 | NZ2500 | hdd1 | 1.3 |
| libsvt_hevc | 2160p | 44 | NZ | i9-9820X | 1.8.5 |
| libx264 | 2160p | 32 | NZ | i7-980X | |
| libx264 | 1080p | 4 | NZ | VIA Nano | 1.5d |
| mjpeg | 1080p | 10 | NZ | VIA Nano | 1.5d |

Legend:

  • kypo
    i7-4770S, NV GTX 980, 4x8G DDR3@1600 (kypowall0)
  • hd12
    i7-4960X, 32 GB 1866 MHz DDR3, NV GTX 960
  • 5960X+9
    i7-5960X, DDR4@2166, NV GTX 960
  • 5960X+B
    i7-5960X, DDR4@2166, GeForce GTX Titan Black
  • 5960X+B9
    i7-5960X, DDR4@2166, GeForce GTX Titan Black + GTX 960
  • bunny
    2x Xeon E5-2660 v2, [email protected], GeForce GTX Titan
  • bunnyX2
    2x Xeon E5-2660 v2, [email protected], 2 x GeForce GTX Titan
  • hdd1
    i7-4930K, 780Ti, 2x4GB DDR3@1333
  • VIA Nano
    VIA Nano U2250, 1 GB RAM
  • 1.3
    v1.3-140-g08dba83
  • 1.5d
    1.5 (rev 3fa1a0d7)
  • NZ
    New Zealand UYVY
  • NZ2500
    New Zealand frame 2500 UYVY

Latency

We also measured the latency added by the compression modules. The tests used 1080p video at 30 fps, compressed on hd7 (running Ubuntu) and decompressed on hd2.

| Module | End-to-end latency (difference from uncompressed) |
|---|---|
| uncompressed | 3.75 |
| cuda_dxt | 3.75 (+0) |
| RTDXT:DXT1 | 6 (+2) |
| RTDXT:DXT5 | 5.5 (+1.75) |
| JPEG:90:0 | 4 (+0.25) |
| JPEG:97:0 | 4 (+0.25) |
| H.264 | 5 (+1.25) |

Bandwidth

The table below shows the measured bandwidth, including overhead, with 9000 B Ethernet frames. The uncompressed signal was 8-bit YUV 4:2:2.

| Module | 1080i@30 | 2k@30 | 4k (4096 × 2160)@25fps |
|---|---|---|---|
| uncompressed | 980 Mbps | 1504 Mbps | 3489 Mbps |
| DXT1 | 245 Mbps | 376 Mbps | 870 Mbps |
| DXT5 YCoCg | 489 Mbps | 752 Mbps | |
| JPEG:90 | 80 Mbps | 85 Mbps | 160 Mbps |
| H.264 | 22 Mbps | 22 Mbps | 60 Mbps |

  • cineform:quality=4 (default): 580 Mbps
  • cineform:quality=1: 300 Mbps
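
As a rough cross-check of the uncompressed figures, the raw payload rate of 8-bit YUV 4:2:2 video can be estimated from the frame size and frame rate, since that format averages 16 bits per pixel. The sketch below is illustrative only: the function name `raw_bitrate_bps` is hypothetical, the 1920 × 1080 frame size for the 1080i column is an assumption, and the result excludes the packet overhead that the measured figures include. The measured figures may also use a unit convention that is not stated here, so the two sets of numbers are not expected to match exactly.

```python
def raw_bitrate_bps(width: int, height: int, fps: float,
                    bits_per_pixel: int = 16) -> float:
    """Raw video payload rate in bits per second (no network overhead).

    8-bit YUV 4:2:2 averages 16 bits per pixel: 8 bits of luma per pixel
    plus two 8-bit chroma samples shared by each pair of pixels.
    """
    return width * height * bits_per_pixel * fps


# 4096 x 2160 is stated in the bandwidth figures;
# 1920 x 1080 for the 1080i column is an assumption.
formats = {
    "1080i@30": (1920, 1080, 30),
    "4k@25": (4096, 2160, 25),
}
for label, (w, h, fps) in formats.items():
    mbit = raw_bitrate_bps(w, h, fps) / 1e6
    print(f"{label}: {mbit:.0f} Mbit/s raw payload")
```

The same function gives an approximate compression ratio for any module when its measured rate is divided into the raw rate of the matching format.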