ContributingSvxLinkServerLanguageTranslation
The SvxLink Server application uses sound clips to play back announcements. These are small sound clips that range from part of a word to full help messages with multiple sentences. The default language used in SvxLink is US English, so using that setup as a template is a good start. The end result after completing the steps below should be an archive containing sound clips and possibly some language-specific TCL script adaptations. This is called a "language pack". More on the TCL script adaptations below.
Start by setting up a build directory for the sounds and cd into it.
mkdir sounds
cd sounds
Prepare the build environment by getting hold of the build scripts that are used to build the final sound clips. If you already have a checked out source tree the best way may be to softlink to the script files which can be found under 'src/svxlink/scripts/'. Another way is to get them directly from Git.
wget https://raw.githubusercontent.com/sm0svx/svxlink/master/src/svxlink/scripts/play_sound.sh
wget https://raw.githubusercontent.com/sm0svx/svxlink/master/src/svxlink/scripts/filter_sounds.sh
Make sure that the 'sox' utility is installed on the system since that is used to process the sound clips.
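One way to confirm that sox is available before continuing is sketched below. The helper name have_tool is made up for this example; it relies only on the standard "command -v" shell builtin.

```shell
# have_tool is a hypothetical helper: succeeds if the named program is on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

if have_tool sox; then
  echo "sox found: $(command -v sox)"
else
  echo "sox is not installed; install it with your package manager" >&2
fi
```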
Now decide what the directory for this translation should be called. The directory should be named "orig-ll_LL-name", where "ll" is replaced with the ISO language code, "LL" with the ISO country code (e.g. sv_SE for Swedish as spoken in Sweden) and "name" with the voice name. The word "orig" should not be replaced, since it just signifies that these are the original, unprocessed sound clip files. The default voice is called en_US-heather since it’s US English and the voice is called Heather. To get a template to start from, we now use the en_US-heather directory. This is most easily obtained using Subversion to get the latest version.
svn export https://github.com/sm0svx/svxlink-sounds-en_US-heather/trunk orig-ll_LL-name
The newly created orig-ll_LL-name directory should contain a base structure, configuration files and text files. Each text file holds the text used to generate the corresponding sound clip. Either keep the text files as a reference or remove them. If kept, they should be edited to reflect the language that the language pack is being created for.
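As an illustration of the naming rule, the directory name can be assembled from its parts as sketched below. The voice name "elin" and the variable names are placeholders chosen for this example, not names used by SvxLink.

```shell
# Example values only: Swedish as spoken in Sweden, with a made-up voice name.
LANG_CODE="sv_SE"    # ll_LL: ISO language code, underscore, ISO country code
VOICE_NAME="elin"    # hypothetical voice name, replace with the real one

# "orig" is kept literally; it marks the unprocessed originals.
ORIG_DIR="orig-${LANG_CODE}-${VOICE_NAME}"
echo "$ORIG_DIR"     # prints orig-sv_SE-elin
```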
Now comes the hard part, which is generating all the sound clips. The preferred way to generate sound clips is to use the online service at Acapela Box. The service is not free but generates good quality sound clips in a lot of languages. Using a publicly available service is good since other people may want to generate additional sound clips for a specific installation, and then it’s easy to get the same voice. There are lots of other commercial text-to-speech applications, but the license on the generated sound clips may make it illegal to release the clips publicly. There are also a number of free text-to-speech applications, but I have so far not found one that is good enough. Of course it’s also possible to use your own voice to record the sound clips, but then only you can generate new sound clips with the same voice.
The sampling rate for the sound clips should be at least 16 kHz. If possible, use an even higher sampling rate, like 48 kHz, to get the highest quality in the originals. A script will be used later to convert the sampling rate to the one SvxLink is using.
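The sampling rate requirement can be checked mechanically. The sketch below assumes the soxi tool that ships with sox, whose -r option prints a file's sampling rate in Hz. The helper names rate_ok and check_rates are made up for this example.

```shell
# rate_ok (hypothetical helper): succeeds if the rate in Hz is at least 16 kHz.
rate_ok() {
  [ "$1" -ge 16000 ]
}

# check_rates (hypothetical helper): report wav files below a directory whose
# sampling rate is too low. Requires soxi from the sox package.
check_rates() {
  find "$1" -name '*.wav' | while IFS= read -r f; do
    rate=$(soxi -r "$f") || continue
    rate_ok "$rate" || echo "too low ($rate Hz): $f"
  done
}
```

Running "check_rates orig-ll_LL-name" would then list any original clips that were recorded at less than 16 kHz.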
So what sound clips should be generated then? In each subdirectory under orig-ll_LL-name you will find a configuration file called 'subdir.cfg'. In this file you can find a list of file names for all sound clips that need to be generated for this subdirectory. The file contains three configuration variables. The two that are of interest right now are MAXIMIZE_SOUNDS and TRIM_SOUNDS. For each listed sound clip name, a corresponding wav file named clipname.wav should be created (e.g. help.wav, press_0_for_help.wav etc). Another way of finding out which clips to generate is to use the text files mentioned above. Simply generate a sound clip with the same name as the text file. For example, if there is a text file called help.txt, a file named help.wav should be created. The content of the text file tells you what the sound clip should say. Note that the text in some text files has been misspelled on purpose to force the text-to-speech system to generate the correct pronunciation.
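The text-file approach lends itself to a simple progress check. The sketch below lists every text file in a subdirectory that does not yet have a matching wav file. The helper name missing_clips is made up for this example, and it assumes the text files were kept.

```shell
# missing_clips (hypothetical helper): for every NAME.txt in the given
# directory, print NAME if NAME.wav does not exist yet.
missing_clips() {
  for txt in "$1"/*.txt; do
    [ -e "$txt" ] || continue        # the directory has no text files
    name=$(basename "$txt" .txt)
    [ -f "$1/$name.wav" ] || echo "$name"
  done
}
```

For example, "missing_clips orig-ll_LL-name/Core" would print the clips that still need to be generated for the Core subdirectory.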
A good tip is to start small and try the whole process of generating the final sound clips before going ahead and creating all clips. Start with the clips in the Core subdirectory, the first one being "online". That clip should contain the translation of "SvxLink online".
The script 'filter_sounds.sh' is used to filter the sound clips to enhance them further. For example, the clips are maximized in level and silence is trimmed at the beginning and end so that they play tighter together. The script uses some configuration files. Directly under the orig-ll_LL-name directory there is a configuration file called 'filter_sounds.cfg'. At this moment we’re only interested in the SUBDIRS configuration variable. Comment out the original lines and set SUBDIRS="Core". This will make the script only process the Core directory. If we don’t do this, an error message will be printed for each missing file and there will be a lot of them. Now we can try to run the script. Your working directory should be the one where the script is located. This should be the "sounds" directory, that is, the directory above the orig-ll_LL-name directory, if these instructions have been followed correctly. Now run the following.
./filter_sounds.sh orig-ll_LL-name ll_LL-name-16k
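For reference, the SUBDIRS change in filter_sounds.cfg made before running the script might look like the fragment below. This is a sketch that assumes the file uses shell-style variable assignments. The commented-out line is a placeholder for whatever list your copy contains.

```shell
# filter_sounds.cfg (fragment)
# Original value commented out; the exact list in your copy may differ.
#SUBDIRS="<original list of subdirectories>"

# Temporarily process only the Core subdirectory.
SUBDIRS="Core"
```

Restore the original list once clips exist for the other subdirectories.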
This will process the files under the orig-ll_LL-name directory and put them in the ll_LL-name-16k directory. The latter directory will contain the end product, the files that SvxLink can use. A warning will be printed for each missing file. The default sampling rate for the generated sound clips is 16 kHz, hence the directory name suffix for the target directory. The filter_sounds.sh script can also generate 8 kHz sound clips if that is needed. Add the "-r 8000" command line switch to the command to achieve that. Remember to rename the target directory so that its suffix matches the sampling rate.
A tar archive will also be created with the contents in the target directory. It will be named sounds-ll_LL-name-16k.tar.bz2.
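To keep the target directory name in step with the sampling rate, the name can be derived from the rate, as in the sketch below. The helper target_dir and the example values are made up. The position of the -r switch in the commented invocation is an assumption, so check the script's usage text before relying on it.

```shell
# target_dir (hypothetical helper): build the target directory name from the
# language code, the voice name and the sampling rate in Hz.
target_dir() {
  echo "$1-$2-$(( $3 / 1000 ))k"
}

target_dir sv_SE elin 16000   # prints sv_SE-elin-16k
target_dir sv_SE elin 8000    # prints sv_SE-elin-8k

# Assumed invocation for 8 kHz clips (switch placed before the directories):
# ./filter_sounds.sh -r 8000 orig-sv_SE-elin "$(target_dir sv_SE elin 8000)"
```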
To try your language pack in SvxLink, unpack the generated tar archive in the sounds directory configured in SvxLink. See the InstallationInstructions for how to do that. Set the DEFAULT_LANG configuration variable in svxlink.conf to match the language code for the generated language pack. If you only generated the 'online.wav' clip, you can trigger a manual identification (*) to make it use that clip. SvxLink prints the names of any missing sound clips.
If everything seems to work, start creating the rest of the sound clips. The most important subdirectories are "Core" and "Default". Then there is one subdirectory for each SvxLink module. If you want to tweak the quality of the sound in some way, do that in the filter_sounds.cfg file. Read the comments above the configuration variables to understand what they do. To choose which sound clips should be created, edit the subdir.cfg file in each subdirectory. There may be clips that are unnecessary for a certain language, and to suppress the error printout they should be removed from the subdir.cfg file.
Not all languages have the same word ordering as English. In that case the TCL scripts, which decide in which order sound clips are played, have to be modified for correct word ordering. Frequent candidates for differences are announcements of time and numbers. Time and numbers are handled in the file locale.tcl, which can be found under the /usr/share/svxlink/events.d directory on a standard SvxLink installation. Create an events.d directory in the orig-ll_LL-name directory and copy the locale.tcl file there. Then modify it for correct word ordering. If you don’t know how to do this, request help on the svxlink-devel mailing list. The events.d directory will be copied to the target directory by the filter_sounds.sh script.
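The copy step can be sketched as a small shell helper. The name copy_locale is made up for this example; the paths passed to it are the ones mentioned in the text.

```shell
# copy_locale (hypothetical helper): copy locale.tcl from an installed
# events.d directory into an events.d directory inside the language pack.
copy_locale() {
  src_events_dir=$1
  orig_dir=$2
  mkdir -p "$orig_dir/events.d" &&
    cp "$src_events_dir/locale.tcl" "$orig_dir/events.d/"
}
```

On a standard installation this would be called as "copy_locale /usr/share/svxlink/events.d orig-ll_LL-name", after which the copied locale.tcl can be edited for the new language.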