IO guide - Quefumas/gensound GitHub Wiki
There are three main I/O operations, which are further described in the sections below:
- Signal.play() - playback to speakers.
- Signal.export(filename) - export to file.
- WAV(filename) - load a Signal from an audio file (Wave and AIFF are always supported; for other formats, see below).
The supported file formats for exporting and loading, as well as the actual device used for playback, depend on which libraries and software are available to the user. To see which options are supported, skip to the end of the page.
Gensound uses lazy evaluation, meaning that Signal objects are not actually computed until required for playback or file export. This means that mixing, concatenating and applying Transforms is very fast, but calling play and export could take time, as they recompute the entire Signal tree. If you wish to use both on the same Signal object, it is advisable to call Signal.realise(sample_rate) and save the returned Audio object, which supports both operations with the same interface.
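The cost of evaluating twice can be pictured with a toy model. The sketch below is not Gensound code: LazyTone and its evaluations counter are hypothetical names, used only to illustrate why realising once and reusing the result is cheaper than evaluating separately for playback and for export.

```python
import math


class LazyTone:
    """Hypothetical stand-in for a lazily evaluated signal (not part of Gensound)."""

    def __init__(self, frequency, duration_s):
        self.frequency = frequency
        self.duration_s = duration_s
        self.evaluations = 0  # counts how often the samples were computed

    def realise(self, sample_rate):
        # Nothing is computed until this point; every call redoes the work.
        self.evaluations += 1
        n = round(self.duration_s * sample_rate)
        return [math.sin(2 * math.pi * self.frequency * i / sample_rate)
                for i in range(n)]


tone = LazyTone(440, 0.01)

# Evaluating separately for two consumers computes the samples twice.
for_playback = tone.realise(44100)
for_export = tone.realise(44100)

# Realising once and keeping the result lets both consumers share it.
cached = tone.realise(44100)
consumers = [cached, cached]

print(tone.evaluations)  # two separate evaluations plus one shared one
print(len(cached))       # number of samples in 10 ms at 44.1 kHz
```

In Gensound itself, the equivalent of the shared result would be the Audio object returned by Signal.realise(sample_rate).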
For playback, use the method Signal.play(). It has the following interface:
- sample_rate - defaults to 44100. Valid values are 8000, 11025, 16000, 22050, 24000, 32000, 44100, 48000, 96000. Note that a Signal object doesn't know or care what sample rate it is supposed to be played at, even when it was loaded directly from a Wave file, so you will have to be explicit about sample rates other than 44.1 kHz.
- byte_width - defaults to 2 (16-bit). Supported values are 1, 2, 4 (TODO: to be confirmed; there was something about floating points).
- max_amplitude - defaults to None. This argument is used to adjust the output volume. Its behaviour, which was updated in version 0.4, is as follows:
  - If max_amplitude is a positive number, Gensound will shrink/stretch the amplitudes so that they peak (in absolute value) at exactly max_amplitude. Numbers greater than 1 are allowed, but will result in clipping. Note that in this case, any gain or amplitude modification applied to the master Signal will have no effect.
  - If max_amplitude == 0, Gensound will not perform any kind of amplitude adjustment.
  - If max_amplitude is None (the default), Gensound will ensure the amplitudes fit within the range [-1,1], so as to avoid clipping. If there was no clipping in the first place, the audio will remain the same.
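The three max_amplitude cases can be sketched in plain Python. This is an illustrative re-implementation of the behaviour described above, not Gensound's actual code; the function name adjust_amplitude is hypothetical, and rejecting negative values is an assumption, since the text does not say how they are handled.

```python
def adjust_amplitude(samples, max_amplitude=None):
    """Illustrative model of the documented max_amplitude cases (not Gensound code)."""
    peak = max((abs(s) for s in samples), default=0.0)

    if max_amplitude is None:
        # Default: only scale down, and only if the signal would otherwise clip.
        if peak <= 1:
            return list(samples)
        target = 1.0
    elif max_amplitude == 0:
        # Zero: leave the amplitudes untouched.
        return list(samples)
    elif max_amplitude > 0:
        # Positive number: rescale so the peak lands exactly on max_amplitude.
        target = max_amplitude
    else:
        # Assumption: negative values are treated as invalid in this sketch.
        raise ValueError("max_amplitude must be None, 0 or a positive number")

    if peak == 0:
        return list(samples)  # silence cannot be rescaled
    scale = target / peak
    return [s * scale for s in samples]


loud = [0.5, -2.0]
print(adjust_amplitude(loud))        # scaled down to fit within [-1, 1]
print(adjust_amplitude(loud, 0))     # returned unchanged
print(adjust_amplitude(loud, 0.5))   # peak rescaled to 0.5
```

Because the positive case always rescales to the requested peak, any gain applied to the master Signal beforehand is cancelled out, which matches the note above.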
To export to a file, use Signal.export(filename). After the filename argument, the user may also specify the three arguments available for play, as above.
As of 0.4, Gensound supports exporting to WAV/AIFF/AIFC files.
Gensound signals are theoretical constructs that have no knowledge of the sample rate and byte width they are intended to have. Depending on whether we specify durations as milliseconds or number of samples, we may get very different results when exporting.
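The difference between the two ways of specifying a duration comes down to simple arithmetic. The helper names below are hypothetical and only illustrate the conversion; they are not part of Gensound.

```python
def ms_to_samples(duration_ms, sample_rate):
    """Number of samples needed to cover duration_ms at the given sample rate."""
    return round(duration_ms * sample_rate / 1000)


def samples_to_ms(num_samples, sample_rate):
    """How long num_samples last, in milliseconds, at the given sample rate."""
    return num_samples * 1000 / sample_rate


for rate in (44100, 48000):
    # A duration given in milliseconds keeps its length; the sample count changes.
    # A duration given in samples keeps its count; the length in time changes.
    print(rate, ms_to_samples(1000, rate), samples_to_ms(44100, rate))
```

A duration of 44100 samples lasts one second at 44.1 kHz but only 918.75 ms at 48 kHz, whereas a duration of 1000 ms lasts one second at either rate.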
Sometimes we wish to evaluate the Signal without passing it to playback or saving as a file. One such case is to pass the data to another module. To obtain the amplitudes directly, use the following code:
someSignal.realise(sample_rate).audio # this is an instance of numpy.ndarray
This retrieves a numpy.ndarray with dtype=numpy.float64 and shape (num_channels, num_samples).
Another use case is when we want to obtain a properly encoded byte stream of the audio. This is similar to the previous case, but with two crucial differences:
- The samples are no longer 64-bit floats, and are instead converted to a specified encoding. Currently supported are unsigned 8-bit integers (when byte_width is 1), signed 16-bit integers (byte_width 2), and 32-bit integers (byte_width 4), with partial and upcoming support for other encodings. The default should be good enough for most cases; if it isn't, then you probably already know enough to find a solution.
- The samples are interleaved in memory. This means that the samples will be arranged chronologically, with simultaneous samples from the various channels grouped together (as opposed to writing all samples from the entire first channel, then from the second, and so on).
Example:
bytestream = someSignal.to_bytes() # available from 0.4
The to_bytes function may also receive the same arguments as play().
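To make the two differences concrete, here is a minimal standard-library sketch of interleaving and 16-bit encoding. It is not Gensound's implementation: the function name is hypothetical, and the scaling factor of 32767, the clipping step and the little-endian byte order are assumptions made for the illustration.

```python
import struct


def to_interleaved_int16(channels):
    """Interleave equal-length channels and encode them as signed 16-bit integers."""
    out = bytearray()
    # zip(*channels) yields one tuple per instant, holding one sample per channel.
    for frame in zip(*channels):
        for sample in frame:
            clipped = max(-1.0, min(1.0, sample))  # assumption: clip to [-1, 1]
            out += struct.pack("<h", round(clipped * 32767))  # little-endian int16
    return bytes(out)


left = [0.0, 1.0]
right = [-1.0, 0.0]
stream = to_interleaved_int16([left, right])

print(len(stream))                   # 2 channels * 2 samples * 2 bytes each
print(struct.unpack("<4h", stream))  # order is left[0], right[0], left[1], right[1]
```

Writing the channels one after another instead would give the order left[0], left[1], right[0], right[1], which is the layout the text contrasts with.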
Use the WAV class to load audio from file.
For a WAV or Raw Signal, use the resample method, which recomputes the associated audio stream so that it can be played at the desired sample rate without affecting pitch.
from gensound import WAV, test_wav
WAV(test_wav).play(sample_rate=32000) # will sound lower in pitch, since test_wav is in 44.1 kHz
WAV(test_wav).resample(32000).play(sample_rate=32000) # playback will now be at proper pitch
Or in order to save the file using the new sample rate:
WAV(test_wav).resample(32000).export("new_file.wav", sample_rate=32000)
Since WAV/Raw data is cached, calling resample once will affect all other objects containing the same WAV file. We may choose to implement an option to keep both the original and the resampled copies, but if that bothers you, you're most likely looking for the Stretch transform instead. Either way, one workaround is to load several unique copies of the original wave file, and another is to use Raw(WAV(filename).mixdown(original_sample_rate)), which effectively makes a unique copy in memory.
resample is a method of Raw, inherited by WAV. Note that it is not a Transform, for good reasons. If you thought about using it to stretch the audio while altering the pitch, you are probably looking for the Stretch transform, which works in much the same way.
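The pitch change heard when audio is played at the wrong sample rate, and the amount of data resampling has to produce, can both be estimated with a little arithmetic. The helper names below are hypothetical and say nothing about how Gensound's resample is implemented.

```python
import math


def playback_pitch_shift_semitones(original_rate, playback_rate):
    """Pitch shift, in semitones, when audio is played back at a different rate."""
    return 12 * math.log2(playback_rate / original_rate)


def resampled_length(num_samples, original_rate, new_rate):
    """Sample count needed to keep the same duration at the new rate."""
    return round(num_samples * new_rate / original_rate)


# Playing 44.1 kHz material at 32 kHz without resampling lowers the pitch.
print(playback_pitch_shift_semitones(44100, 32000))

# One second of 44.1 kHz audio needs this many samples once resampled to 32 kHz.
print(resampled_length(44100, 44100, 32000))
```

The shift comes out at roughly five and a half semitones down, which is why the unresampled playback in the example above sounds noticeably lower.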
Running IO.status() (with IO available at gensound.io) will print to console the current system's support for playback and file read/write.
Supplying the argument True will also print out alternative options which are available for each operation.
For playback, Gensound knows how to interact with the following alternatives (the first three are third-party packages that must be installed separately):
- pygame
- simpleaudio
- playsound
- winsound (0.5.1) - always available on Windows systems; relies on export to a temporary file, but does not open an external player (as os does).
- os - always available; involves exporting to a temporary file and playing it with the system's default player.
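One way to picture how a program can discover which of these alternatives is usable is to check whether each module can be imported. The sketch below uses only the standard library; it is not Gensound's detection logic, the function name is hypothetical, and treating the order of the list as a priority order is an assumption.

```python
import importlib.util


def first_available(candidates):
    """Return the first module name that can be imported, or None if none can."""
    for name in candidates:
        # find_spec returns None for a top-level module that is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


# Assumed priority order, mirroring the order of the list above.
preferred = ["pygame", "simpleaudio", "playsound", "winsound", "os"]
print(first_available(preferred))
```

Since os is part of the standard library, this check always finds at least one candidate, which is consistent with os being described as always available.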
Gensound will try to pass any additional keyword arguments from play() to the appropriate function on the hardware end.
For instance, when using pygame, it will accept the additional keyword argument loops, as specified in the PyGame documentation. In addition, it will also return a pygame.mixer.Sound object, which can be stopped.
When several options are available for some I/O operation (as can be seen with IO.status(True)), it is possible to tell Gensound to use a particular one, rather than letting it make up its own mind. Use IO.set_io(action, interface_name, format=None):
- action should be one of play, load or export, and indicates which operation should use the specified interface.
- interface_name is the name of the interface (e.g. pygame, playsound) to be used for this operation, exactly as it appears in the output of IO.status(True).
- format (optional) is relevant only when action is load or export, and indicates for which file formats this interface should be used. This defaults to *, meaning all formats.
For both reading and writing, Gensound relies on Python's built-in support for *.wave, *.wav and *.aiff, *.aifc, *.aif files.
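For readers unfamiliar with that built-in support, the following is a minimal round trip through the standard-library wave module, writing to memory instead of disk. It only illustrates the kind of facility Gensound builds on; the helper names are hypothetical and the code is not taken from Gensound.

```python
import io
import struct
import wave


def write_wav(samples, sample_rate=44100):
    """Encode mono signed 16-bit samples as the bytes of a WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)           # mono
        wav_file.setsampwidth(2)           # 2 bytes per sample (16-bit)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buffer.getvalue()


def read_wav(data):
    """Decode WAV bytes written by write_wav into (sample_rate, samples)."""
    with wave.open(io.BytesIO(data), "rb") as wav_file:
        n = wav_file.getnframes()
        raw = wav_file.readframes(n)
        return wav_file.getframerate(), list(struct.unpack(f"<{n}h", raw))


rate, samples = read_wav(write_wav([0, 1000, -1000], sample_rate=32000))
print(rate, samples)
```

The sample rate and byte width are stored in the file header, which is why they have to be supplied at export time even though a Signal itself carries neither.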
In addition, Gensound automatically detects whether FFMPEG is installed (make sure it appears in the PATH environment variable), and if so, read/write capabilities are available for pretty much any format supported by FFMPEG.
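To check for yourself whether FFMPEG is visible on the PATH, a standard-library lookup is enough. This is a sketch under the assumption that the executable is named ffmpeg; how Gensound performs its own detection is not described here, and the function name is hypothetical.

```python
import shutil


def ffmpeg_path():
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")  # assumption: the executable is called "ffmpeg"


path = ffmpeg_path()
print("FFMPEG found at " + path if path else "FFMPEG not found on PATH")
```

If the lookup returns None even though FFMPEG is installed, the directory containing the executable most likely needs to be added to PATH.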