Automated Speech2txt - selmling/Analytics-and-Data-Exploration GitHub Wiki

This library of scripts takes a WAV file and turns it into a raw speech transcript (.txt) file. The generated .txt files then need manual cleaning to correct mistakes made by the Google Speech-to-Text API.
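
This page does not show which client the scripts call, so the following is only a minimal sketch of the WAV-to-transcript step. It assumes the official `google-cloud-speech` Python package, configured Google Cloud credentials, and a short, mono, 16-bit PCM WAV file. The function names `transcribe_wav` and `write_transcript` are hypothetical and are not taken from the script library.

```python
from pathlib import Path


def transcribe_wav(wav_path, language_code="en-US"):
    """Send one short WAV file to Google Speech-to-Text and return transcript segments.

    Assumption: google-cloud-speech is installed and credentials are set up.
    The import is done here so the helper below works without the client.
    """
    from google.cloud import speech

    client = speech.SpeechClient()
    audio = speech.RecognitionAudio(content=Path(wav_path).read_bytes())
    config = speech.RecognitionConfig(
        # Assumption: the WAV file holds 16-bit linear PCM, single channel.
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        language_code=language_code,
    )
    # Synchronous recognition only suits short clips (roughly a minute or less).
    response = client.recognize(config=config, audio=audio)
    # Keep the top-ranked alternative for each recognized segment.
    return [r.alternatives[0].transcript for r in response.results if r.alternatives]


def write_transcript(segments, txt_path):
    """Write one transcript segment per line to a raw .txt file."""
    text = "\n".join(segment.strip() for segment in segments)
    Path(txt_path).write_text(text + "\n", encoding="utf-8")
    return txt_path
```

A call might look like `write_transcript(transcribe_wav("session01.wav"), "session01.txt")`, where the file names are placeholders. Longer recordings would need the client's `long_running_recognize` method instead of `recognize`.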

S2T Manual

S2T Instructions

Interestingly, Google speech-to-text was trained on adult-directed speech, so it has trouble with infant-directed speech, which is organized in quite distinct patterns.

Future directions for this script library:

  • discern speech bouts from noise bouts

  • delete empty cells from the transcript

  • find an algorithm that was trained on child-directed / infant-directed speech
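
As a rough illustration of the "delete empty cells" item above, here is a minimal sketch. It assumes the transcript has already been loaded as a list of cell values, one per utterance, and the function name `drop_empty_cells` is hypothetical. The actual transcript layout used by the scripts may differ.

```python
def drop_empty_cells(cells):
    """Return only the cells that contain text, with surrounding whitespace removed."""
    cleaned = []
    for cell in cells:
        if cell is None:
            continue  # a missing value counts as an empty cell
        text = str(cell).strip()
        if text:  # skip cells that were empty or whitespace-only
            cleaned.append(text)
    return cleaned


raw = ["look at the ball", "", "   ", None, "where did it go\n"]
print(drop_empty_cells(raw))  # ['look at the ball', 'where did it go']
```

The same filter could be applied line by line when reading a generated .txt file.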