Forced Alignment - langdoc/FRechdoc GitHub Wiki

There are tons of tools out there for this task, and this is one of the nicest lists of the available choices. Things are always a bit different when we work with smaller languages, but I think that when the goal is a fairly rough alignment (just matching the sentences to the audio, nothing more) we can get creative with the choices. A word- or phoneme-level result may well need a very good language model, but for sentence-level matching even a weaker one may give good enough results. I have tested WebMAUS through the emuR R package, and aeneas. Here I go through some of the segmentations I have tested with aeneas, but emuR is also absolutely worth checking.
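As a rough illustration of the sentence-level case, here is a minimal sketch of running aeneas from Python on one audio file and a plain-text file with one sentence per line. The helper names `build_config` and `align_sentences` are made up for this page, and the language code `rus` is only an assumption (swap in whatever aeneas and your TTS engine support). This is not necessarily the exact setup used for the tests described here.

```python
def build_config(language="rus", text_type="plain", out_format="json"):
    """Build an aeneas task configuration string (hypothetical helper)."""
    return "|".join([
        "task_language=" + language,         # language code, assumed here
        "is_text_type=" + text_type,         # plain = one fragment per line
        "os_task_file_format=" + out_format, # format of the output sync map
    ])


def align_sentences(audio_path, text_path, out_path, language="rus"):
    """Align each line of text_path to audio_path and write the sync map."""
    # Imported lazily so that build_config works without aeneas installed.
    from aeneas.executetask import ExecuteTask
    from aeneas.task import Task

    task = Task(config_string=build_config(language))
    # aeneas expects absolute paths for these three attributes.
    task.audio_file_path_absolute = audio_path
    task.text_file_path_absolute = text_path
    task.sync_map_file_path_absolute = out_path

    ExecuteTask(task).execute()   # compute the alignment
    task.output_sync_map_file()   # write it to out_path
    return out_path
```

The output sync map then contains one fragment per input line with its begin and end time, which is exactly the rough, sentence-level result discussed above.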

It may well be that a really good segmentation/forced alignment requires running several different tools in parallel and merging their results at the end. I'm assuming that segmentation and forced alignment go together, as forced alignment is not very useful if the segments are bad to begin with.
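One simple way to merge results from several tools could look like the sketch below. It assumes every tool has aligned the same list of sentences and returns one `(start, end)` pair in seconds per sentence; taking the median of each boundary is just one possible merging strategy, chosen here because a single badly wrong tool does not drag the result along. The function name `merge_alignments` and the example numbers are invented for illustration.

```python
from statistics import median


def merge_alignments(alignments):
    """Merge per-sentence (start, end) times from several tools.

    alignments: one list per tool, each holding a (start, end) tuple
    for every sentence, in the same sentence order.
    """
    n = len(alignments[0])
    if any(len(a) != n for a in alignments):
        raise ValueError("all tools must align the same number of sentences")

    merged = []
    for i in range(n):
        # Median of each boundary across tools, so one outlier is ignored.
        start = median(a[i][0] for a in alignments)
        end = median(a[i][1] for a in alignments)
        merged.append((start, end))
    return merged


# Made-up boundaries for two sentences from three hypothetical tools.
tool_a = [(0.0, 2.0), (2.0, 5.0)]
tool_b = [(0.2, 2.4), (2.4, 5.2)]
tool_c = [(0.1, 2.1), (2.1, 4.9)]
print(merge_alignments([tool_a, tool_b, tool_c]))
# → [(0.1, 2.1), (2.1, 5.0)]
```

With only two tools the median is simply the mean of the two boundaries, so at least three tools are needed before an outlier can actually be voted out.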

Unorganized notes

http://espeak.sourceforge.net/data/
mv ru_dict-48 /usr/local/Cellar/espeak/1.48.04_1/share/espeak-data/ru_dict