Automatic subtitle generation (PocketSphinx) - lmmx/devnotes GitHub Wiki

  • Install swig (apt-get install swig is the easiest way)
  • Download Sphinxbase (currently v.5; check GitHub, formerly SourceForge, for the latest)
    • mv sphinxbase-XXXX sphinxbase; cd sphinxbase
    • ./configure
    • make
  • Now that its dependency is satisfied, get PocketSphinx from GitHub
    • run ./autogen.sh
    • ./configure
    • make clean all
      • this step can stop with a linker error such as:

        ../../src/libpocketsphinx/.libs/libpocketsphinx.so: undefined reference to `cmd_ln_print_values_r'

    • make check
    • sudo make install
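
After the install step, a quick sanity check is to confirm the expected executables are on PATH. The following is a minimal sketch rather than part of the original notes: the helper name missing_tools is hypothetical, and it assumes the install provides a pocketsphinx_continuous binary (true of the 5prealpha-era releases as far as I know, but worth verifying for your version).

```python
import shutil


def missing_tools(names):
    """Return the names from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names: `swig` from the swig package and
    # `pocketsphinx_continuous` from the PocketSphinx install step.
    missing = missing_tools(["swig", "pocketsphinx_continuous"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found on PATH")
```

An empty result only shows the binaries are discoverable; it does not prove the shared libraries link correctly.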


Accurate subtitles (transcript provided in a text file) to .srt

  • Install maven2 and then build sphinx4
  • Make a test video clip, transcribe it manually, and convert its audio to the format Sphinx expects (16 kHz, mono, 16-bit PCM WAV):

ffmpeg -i Short_clip.mp4 -acodec pcm_s16le -ac 1 -ar 16000 Short_clip.wav
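
To double-check that the conversion produced what the flags above ask for (16 kHz, one channel, 16-bit samples), the WAV header can be inspected with Python's standard wave module. This is a small sketch; the function name check_wav and its defaults are illustrative, not taken from Sphinx.

```python
import wave


def check_wav(path, rate=16000, channels=1, sample_width=2):
    """Return a list of mismatches between the WAV header and the expected format."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected {rate}")
        if wav.getnchannels() != channels:
            problems.append(f"{wav.getnchannels()} channel(s), expected {channels}")
        if wav.getsampwidth() != sample_width:
            # Sample width is reported in bytes: 2 bytes = 16-bit PCM.
            problems.append(f"{wav.getsampwidth() * 8}-bit samples, expected {sample_width * 8}-bit")
    return problems
```

Calling check_wav("Short_clip.wav") should return an empty list if the ffmpeg command ran as written.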

  • Use the long audio alignment feature being developed in Sphinx

  • Running videogrep --input Short_clip.mp4 --transcribe produces an automatic transcription in Short_clip.mp4.transcription.txt

  • Manually transcribe to Short_clip_manual_transcription.txt

java -cp sphinx4-samples/target/sphinx4-samples-1.0-SNAPSHOT-jar-with-dependencies.jar edu.cmu.sphinx.demo.aligner.AlignerDemo Short_clip.wav Short_clip_manual_transcription.txt en-us-generic cmudict-5prealpha.dict cmudict-5prealpha.fst.ser
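
The notes stop at the aligner, so the last step (turning word timings into an .srt file) is sketched here under assumptions. The input is assumed to be a list of (word, start_ms, end_ms) tuples; parsing the AlignerDemo output into that shape is not shown, since its exact format is not given above. The function names and the max_words grouping rule are hypothetical choices for illustration.

```python
def srt_timestamp(ms):
    """Format a time in milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(int(ms), 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def words_to_srt(words, max_words=7):
    """Group (word, start_ms, end_ms) tuples into numbered SRT cues.

    Each cue spans from the start of its first word to the end of its
    last word; `max_words` is an arbitrary illustrative cue length.
    """
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start, end = chunk[0][1], chunk[-1][2]
        text = " ".join(word for word, _, _ in chunk)
        index = len(cues) + 1
        cues.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    # Cues are separated by a blank line, as the SRT format expects.
    return "\n".join(cues)
```

Writing the returned string to Short_clip.srt would give a file most players accept; a real pipeline would probably split cues on pauses or sentence boundaries rather than on a fixed word count.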
