Automated Pipeline for Training Dataset Creation from Unlabeled Audios for Automatic Speech Recognition

Book Chapter

Oleh Romanovskyi, Ievgen Iosifov, Olena Iosifova, Volodymyr Sokolov, Feodosiy Kipchuk, Igor Sukaylo

Abstract

In this paper, we present a software pipeline that automates the creation of speech recognition training datasets from the desired unlabeled audio, targeting low-resource languages and domain-specific areas. As speech recognition becomes a commodity, more teams build domain-specific models as well as models for local languages. At the same time, the lack of training datasets for low- to middle-resource languages significantly reduces the ability to exploit the latest achievements and frameworks in speech recognition and keeps a wide range of software engineers from working on speech recognition problems. This problem is even more critical for domain-specific datasets. The pipeline was tested by building Ukrainian speech recognition, and the test confirmed that the design is adaptable to different data source formats and can be extended to integrate with existing frameworks.

https://link.springer.com/chapter/10.1007/978-3-030-80472-5_3 | 10.1007/978-3-030-80472-5_3

Keywords

ASR; Asynchronous graphs; Automatic Speech Recognition; Dataset creation pipeline; Natural language processing; NLP

SciVal Topics

Surface Defect; Piezoelectricity; Image Processing


Publisher


2021 International Conference on Computer Science, Engineering and Education Applications (ICCSEEA)

23–24 January 2021, Kyiv, Ukraine

First Online: 21 July 2021


Indices

  • INSPEC: 23013480

  • KUBG: 36974


Cite

CEUR-WS

O. Romanovskyi, et al., Automated Pipeline for Training Dataset Creation from Unlabeled Audios for Automatic Speech Recognition, Advances in Computer Science for Engineering and Education IV, vol. 83 (2021) 25–36. doi:10.1007/978-3-030-80472-5_3.
