Audio Settings - CESNET/UltraGrid GitHub Wiki

Audio Devices

Currently you can choose from the following capture/playback devices:

  • portaudio - use system sound via PortAudio (multiplatform)
  • jack - use JACK Audio Connection Kit
  • alsa - use Advanced Linux Sound Architecture (ALSA)
  • coreaudio - use Core Audio (macOS)
  • wasapi - use WASAPI (Windows)

Three further devices capture or play back audio through the device currently used for video. Therefore, a video device option (-t <device> for capture, -d <device> for playback) always needs to be present; it indicates which HW device to take the sound from or send it to.

  • embedded - grab/play audio embedded in SDI
  • analog - grab/play analog audio connected to capture card
  • AESEBU - grab/play AES/EBU (digital) audio connected to capture card

Sending

  • send video and analog audio (XLR-connected) from Decklink

uv -t decklink:0:6:UYVY -s analog <destIP>

  • send the system’s default audio, without any video stream

uv -s portaudio 192.0.43.20 # sender

  • the same, but with ALSA, CoreAudio or JACK

uv -s alsa 192.0.43.20 # use default ALSA device
uv -s coreaudio 192.0.43.20 # use default CoreAudio device
uv -s jack:system:capture 192.0.43.20 # send sound captured with JACK

  • send selected audio source (CoreAudio)

uv -t dvs:37:UYVY -s coreaudio:1

To get a list of available audio capture devices, issue:

uv -s help

You can specify the number of captured (and sent) audio channels:

uv -s alsa --audio-capture-format channels=2 # send stereo

Receiving

  • receive sound and embed it into SDI (DeckLink and DELTACAST)

uv -d decklink -r embedded
uv -d deltacast -r embedded

  • play back video and play audio through AES/EBU (note, however, that DeckLink cards always play audio through all available outputs: analog, AES/EBU and embedded in SDI):

uv -d decklink -r AESEBU

  • receive only audio with PortAudio (Windows/macOS/Linux)

uv -r portaudio # use default system sound device

  • the same with CoreAudio/ALSA/JACK

uv -r coreaudio # use default CoreAudio device (macOS)
uv -r alsa # use ALSA (Linux)
uv -r jack:system:playback # play back received audio with JACK Audio Connection Kit

  • receive video with a DELTACAST card and play audio through the ALSA PulseAudio plugin

uv -d deltacast -r alsa:pulse

  • you can also use DeckLink as an audio playback driver. The difference compared to “-r embedded” is that here DeckLink is used solely for audio playback (useful when you want to use DeckLink’s XLR outputs). Please note that in this setup you cannot use the same DeckLink device for video playback. If you want that, use the “-d decklink -r embedded” option.

uv -r decklink:0

To get a list of available playback devices, issue uv -r help.

Other options used when playing received audio are:

  • --audio-channel-map - remaps audio channels, mixes or splits them, etc. For more info issue ‘uv --audio-channel-map help’
  • --audio-scale - scales the audio signal, either by a constant value or adaptively; please see ‘uv --audio-scale help’ for details

Codecs

UltraGrid sends uncompressed PCM by default. If this is not desired, another codec can be specified with the --audio-codec parameter:

uv -s testcard --audio-codec Opus ug.example.com

You can get a list of available codecs with uv --audio-codec help. This parameter also has additional options, such as changing the sample rate of the audio.
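
To judge whether a codec is worth enabling, the payload bitrate of uncompressed PCM can be estimated from the stream parameters. The following is a minimal sketch of that arithmetic; the 48 kHz sample rate and 16-bit sample size are illustrative assumptions rather than documented UltraGrid defaults, pcm_bitrate_bps is a hypothetical helper name, and packet headers are ignored.

```python
def pcm_bitrate_bps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Payload bitrate of uncompressed PCM in bits per second (headers ignored)."""
    return sample_rate_hz * bits_per_sample * channels


# Assumed example values: 48 kHz sample rate, 16-bit samples.
stereo = pcm_bitrate_bps(48_000, 16, 2)
eight_ch = pcm_bitrate_bps(48_000, 16, 8)
print(f"stereo:     {stereo / 1e6:.3f} Mbit/s")    # stereo:     1.536 Mbit/s
print(f"8 channels: {eight_ch / 1e6:.3f} Mbit/s")  # 8 channels: 6.144 Mbit/s
```

The rate grows linearly with the channel count, so multi-channel setups on constrained links are where a compressed codec is most likely to matter.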

The recommended codec for most use cases is Opus. It is preferred over the FFmpeg native AAC encoder because it provides better quality (see 1,2). FDK AAC is somewhat better than the native encoder, but it requires a custom FFmpeg build with FDK AAC enabled. In any case, Opus also outperforms the best AAC encoder, the one from Apple (3,4), while fdk_aac scores worse (1).

Other examples

  • send and receive only audio, using PortAudio sound device no. 3 for both input and output (you can list available devices with the -r/-s help parameters):

uv -r portaudio:3 -s portaudio:3 <dest_addr>

Using echo-cancelling microphones (Chat 150)

The echo-cancelling microphone must be used for both sending and receiving, so it must be passed to UltraGrid via both the -s and -r parameters.

uv -s alsa:plughw:CARD=C150 -r alsa:plughw:CARD=C150 <dest_addr>

This should be sufficient for most setups. Echo-cancelling microphones are always single-channel, so a plughw device is specified here, which enables all SW conversions (e.g. from 2 channels to 1).

Alternatively, you may want to use one channel directly, by specifying this explicitly:

uv -s alsa:hw:CARD=C150 -r alsa:hw:CARD=C150 --audio-capture-format channels=1 --audio-channel-map 0:0 --audio-scale none <dest_addr>

--audio-capture-format channels=1 specifies that we want to capture and send only one channel (the default behavior).

--audio-channel-map instructs the receiver to play only one channel (received channel 0 is mapped to played channel 0). Other channels are dropped. This covers the situation when more than one channel is received; otherwise it can be omitted.

--audio-scale none specifies that the received audio channel should not be scaled. Automatic scaling is enabled when the --audio-channel-map option is used. You should remove --audio-scale none if you know you will receive 2 channels and want to mix them together, which would be --audio-channel-map 0:0,1:0.

Multi-channel setup

To capture more than one channel, use the option:

--audio-capture-format channels=N, where N is the requested number of input channels

By default, the receiver plays all captured channels. If this is not the desired behavior, you can select the channels you want to play with the following option:

--audio-channel-map

The syntax is as follows:

source1_chan:dest1_chan[,source2_chan:dest2_chan...]

e.g.:

--audio-channel-map 4:0,5:1

This plays only channels 4 and 5 (indexed from zero) as channels 0 and 1. All other received channels are dropped.
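
The effect of a channel map can be modelled in a few lines of Python. This is only an illustrative sketch of the source:dest syntax described above, not UltraGrid's implementation: parse_channel_map and apply_channel_map are hypothetical names, one frame is represented as a list with one integer sample per received channel, and sources mapped to the same destination are assumed to be mixed by plain summation (no scaling is modelled).

```python
def parse_channel_map(spec: str) -> list[tuple[int, int]]:
    """Parse 'src:dst[,src:dst...]' into (source, destination) index pairs."""
    pairs = []
    for item in spec.split(","):
        src, dst = item.split(":")
        pairs.append((int(src), int(dst)))
    return pairs


def apply_channel_map(frame: list[int], spec: str) -> list[int]:
    """Remap one frame (one sample per received channel) to the output layout.

    Sources mapped to the same destination are summed (assumed mixing rule);
    received channels that are not listed in the map are dropped.
    """
    pairs = parse_channel_map(spec)
    out = [0] * (max(dst for _, dst in pairs) + 1)
    for src, dst in pairs:
        if src < len(frame):  # skip mappings for channels that were not received
            out[dst] += frame[src]
    return out


frame = [10, 20, 30, 40, 50, 60]  # six received channels, one sample each
print(apply_channel_map(frame, "4:0,5:1"))  # [50, 60]: channels 4 and 5 become 0 and 1
print(apply_channel_map(frame, "0:0,1:0"))  # [30]: channels 0 and 1 mixed into channel 0
```

The sketch covers only the basic source:dest form shown above; any further syntax accepted by the option is listed by uv --audio-channel-map help.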

JACK transport

You can use JACK as a source on the sender and as a sink on the receiver. This is especially useful in conjunction with JackTrip.

Note: To use JACK as a normal audio driver (output through JACK on the receiver, input on the sender side), please refer to the audio section and use JACK in the same way as other audio drivers such as CoreAudio or ALSA.

Usage:

  • sender (DVS, JACK source) - these commands take sound from SDI/PortAudio and send it via the JACK server and the specified port

    uv -t dvs:37:UYVY -s embedded -j po=system:playback
    uv -t dvs:37:UYVY -s portaudio -j po      # if the port name is omitted, UG chooses from what is offered by the JACK server
    
  • receiver (QuickTime, JACK sink) - this command takes sound captured via JACK (e.g. netjack) and passes it to the QuickTime driver

    uv -d quicktime:65561 -r embedded -j pi[=system:capture]
    

Echo cancellation

Important: Echo cancellation is currently experimental in UltraGrid, therefore it may not work correctly for you!

Requirements:

  • Both playback and recording must be done using the same audio device. Using separate sound cards is possible only if they are using the same external clock source.
  • Any sound effects or distortions that would introduce nonlinearity must not be present (clipping, dynamic compression, etc.).
  • Packet loss must be minimal to nonexistent. Any audio drop due to packet loss will cause the canceller to lose convergence and it will take a couple of seconds to start cancelling echo again.

Basic usage:

uv --echo-cancellation ...

This turns on acoustic echo cancellation. It requires exactly one channel on both sides. To set the number of egress audio channels, you can use the --audio-capture-format channels=1 option.

If you are receiving more than one channel, you can mix them with --audio-channel-map. A typical use is receiving stereo audio and sending one channel of echo-cancelled audio:

uv --echo-cancellation --audio-capture-format channels=1 --audio-channel-map 0:0,1:0 -r alsa -s alsa

To troubleshoot possible echo cancellation issues, you can use the parameter --param echo-cancel-dump-audio. This causes UltraGrid to write a file named "echo_cancel_dump.wav" containing 3 audio channels in this order: microphone input (near end), speaker output (far end) and echo-cancelled microphone. There should be no stuttering in any of the channels, and sounds from the far end should appear slightly sooner in the far-end channel than in the near-end channel.
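
To inspect the three channels separately, the dump can be de-interleaved with Python's standard wave module. The sketch below is a hedged illustration: it assumes the file holds 16-bit PCM samples (the sample format is not stated here, so check it first), split_channels and write_demo are hypothetical helpers, and a tiny synthetic 3-channel file stands in for a real echo_cancel_dump.wav so the code runs without UltraGrid.

```python
import os
import struct
import tempfile
import wave


def split_channels(path: str) -> list[list[int]]:
    """Return one list of samples per channel from a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch handles only 16-bit PCM")
        n_channels = wav.getnchannels()
        raw = wav.readframes(wav.getnframes())
    samples = struct.unpack(f"<{len(raw) // 2}h", raw)
    # Samples are interleaved: ch0, ch1, ch2, ch0, ch1, ch2, ...
    return [list(samples[ch::n_channels]) for ch in range(n_channels)]


def write_demo(path: str) -> None:
    """Write a tiny synthetic 3-channel file standing in for the real dump."""
    frames = [(100, 200, 50), (-100, -200, -50)]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(3)
        wav.setsampwidth(2)
        wav.setframerate(48000)
        wav.writeframes(b"".join(struct.pack("<3h", *f) for f in frames))


demo_path = os.path.join(tempfile.gettempdir(), "demo_dump.wav")
write_demo(demo_path)
# Channel order as described above: near end, far end, echo-cancelled microphone.
near_end, far_end, cancelled = split_channels(demo_path)
print(near_end, far_end, cancelled)  # [100, -100] [200, -200] [50, -50]
```

With a real dump, pass the path of echo_cancel_dump.wav to split_channels and compare where the same sound appears in the near-end and far-end lists.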

You can use --param echo-cancel-delay=<samples> to try to adjust the near-end delay. The parameter --param echo-cancel-filter-length=<samples> adjusts the number of samples the filter takes into account. Larger values are more computationally intensive and increase the time it takes to converge. Ideally, the value should be one third of the length of the room's impulse response (assuming nearly perfect near- and far-end alignment).
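
Since the filter length is given in samples, the one-third rule has to be converted from time to samples. A minimal sketch of that conversion follows; the 48 kHz sample rate and the 300 ms impulse response are purely illustrative assumptions, and filter_length_samples is a hypothetical helper name.

```python
def filter_length_samples(impulse_response_ms: float, sample_rate_hz: int = 48_000) -> int:
    """One third of the room impulse response length, expressed in samples."""
    # ms * (samples per second) / 1000 gives samples; dividing by 3 applies the rule.
    return round(impulse_response_ms * sample_rate_hz / 3000)


# Assumed example: a 300 ms impulse response captured at 48 kHz.
length = filter_length_samples(300)
print(f"--param echo-cancel-filter-length={length}")  # --param echo-cancel-filter-length=4800
```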

Echo cancellation through PulseAudio

Echo cancellation also works through PulseAudio on Linux. To enable it in PulseAudio, use:

pacmd load-module module-echo-cancel

You can check whether the module is loaded by running pacmd and entering list-modules. Echo cancelling can be enabled by default by editing /etc/pulse/default.pa and adding:

### Enable Echo/Noise-Cancelation
load-module module-echo-cancel

Once the echo cancellation is enabled you can run UltraGrid as:

./uv -r alsa:pulse -s alsa:pulse ...

and select the appropriate echo cancelled input and output devices for recording and playback using pavucontrol.

Loopback

To capture loopback audio (the sound the computer itself is playing), the approach depends on the operating system:

  1. Windows

    This is the easiest setup, because WASAPI allows capturing computer audio directly: uv -s wasapi:loopback

  2. Linux – you can use ALSA loopback device as described here for FFmpeg

  3. macOS - a third-party application is required for this task; the recommended and tested one is BlackHole

Notes

ALSA

If you experience problems, e.g. with a wrong sample format or channel count, try using a plughw ALSA device, which has all SW conversions enabled.
