Using FAVE extract - JoFrhwld/FAVE GitHub Wiki

Currently, FAVE-extract is set up so that it must be run from the main directory of the FAVE-extract package. To run FAVE-extract, three arguments are required:

  • the WAV file containing the speech data,
  • the TextGrid file containing the alignments,
  • and the name of an output file for the extracted formants.

So, in the directory FAVE-extract, type:

python bin/extractFormants.py filename.wav filename.TextGrid outputFile

Adjusting Configuration Options

There are many configuration parameters that can alter the behavior of extractFormants. You can override their values by creating a config file; any parameter you leave out keeps its internal default. You can call the config file anything, but conventionally it is called config.txt. To pass the config options to extractFormants.py, add +config.txt to the call. Any option you set in the config file can also be passed directly as a flag in the call.

python bin/extractFormants.py +config.txt filename.wav filename.TextGrid outputFile
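
The +config.txt convention behaves like the fromfile_prefix_chars feature of Python's argparse module, where each line of the named file is read as one command-line argument. The sketch below uses a toy parser to illustrate the idea. It is an assumption that extractFormants.py is built this way, and the parser here declares only three of the documented flags; parse_with_config is a hypothetical helper.

```python
import argparse
import os
import tempfile


def build_parser():
    # Toy parser, NOT FAVE's real one: it only knows three documented flags.
    parser = argparse.ArgumentParser(fromfile_prefix_chars="+")
    parser.add_argument("--outputFormat", default="txt")
    parser.add_argument("--nSmoothing", type=int, default=12)
    parser.add_argument("--remeasurement", action="store_true")
    parser.add_argument("wav")
    parser.add_argument("textgrid")
    parser.add_argument("output")
    return parser


def parse_with_config(config_text, positional_args):
    """Write config_text to a temporary config.txt and parse it via '+path'."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "config.txt")
        with open(path, "w") as handle:
            handle.write(config_text)
        # argparse expands '+path' into one argument per line of the file.
        return build_parser().parse_args(["+" + path] + list(positional_args))


config = "--outputFormat\ntxt\n--nSmoothing\n8\n--remeasure\n"
opts = parse_with_config(config, ["filename.wav", "filename.TextGrid", "outputFile"])
print(opts.nSmoothing, opts.remeasurement, opts.wav)
```

In this toy parser the shortened flag --remeasure is accepted because argparse matches unambiguous prefixes of long options, here --remeasurement.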

The syntax of the config file is as follows:

--flag
value

That is, you should include the specific flag on one line, and any value you want to pass to it on the following line. The config file that currently comes with FAVE reads like so:

--outputFormat
txt
--speechSoftware
Praat
--formantPredictionMethod
mahalanobis
--measurementPointMethod
faav
--nSmoothing
12
--remeasure
--vowelSystem
phila
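
To make the flag-then-value layout concrete, here is a minimal sketch that reads a config file in this syntax into a dictionary. The function parse_config is hypothetical and not part of FAVE; it treats a flag with no value line after it (such as --remeasure) as a True/False switch and keeps all other values as strings.

```python
def parse_config(text):
    """Map each --flag to the value on the next line, or to True if it has none."""
    options = {}
    current = None  # flag that is still waiting for a possible value line
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("--"):
            current = line[2:]
            options[current] = True  # stays True unless a value line follows
        elif current is not None:
            options[current] = line
            current = None
        else:
            raise ValueError("value without a preceding flag: " + line)
    return options


shipped = """--outputFormat
txt
--speechSoftware
Praat
--formantPredictionMethod
mahalanobis
--measurementPointMethod
faav
--nSmoothing
12
--remeasure
--vowelSystem
phila
"""
print(parse_config(shipped))
```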

Taking the config file item by item:

--outputFormat
txt

Write the extracted formant data to a tab-delimited text file.


--speechSoftware
Praat

Use Praat to do the LPC analyses.


--formantPredictionMethod
mahalanobis

Use the Mahalanobis distance method to set the LPC parameters. (Recommended. This is a large part of FAVE's "special sauce".)
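
For readers unfamiliar with the measure, the sketch below computes a squared Mahalanobis distance in two dimensions (say F1 and F2) and picks the candidate measurement closest to an expected mean. This is only an illustration of the distance itself, assuming candidates are compared against per-vowel means and covariances; the function names and all numbers are made up, and FAVE's actual procedure may use more dimensions and other details.

```python
def mahalanobis_sq_2d(point, mean, cov):
    """Squared Mahalanobis distance of a 2-D point from mean under covariance cov."""
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def pick_candidate(candidates, mean, cov):
    """Return the candidate with the smallest distance to the expected mean."""
    return min(candidates, key=lambda cand: mahalanobis_sq_2d(cand, mean, cov))


# Hypothetical (F1, F2) candidates in Hz, e.g. from different LPC settings.
candidates = [(500.0, 1500.0), (700.0, 1200.0), (520.0, 1800.0)]
mean = (510.0, 1750.0)                   # made-up expected values
cov = ((2500.0, 0.0), (0.0, 40000.0))    # made-up covariance matrix
print(pick_candidate(candidates, mean, cov))
```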


--nSmoothing
12

Smooth the formant tracks based on a linear smooth of the 12 adjacent samples.
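
What exactly "linear smooth" means is not spelled out here. As a rough illustration of smoothing a track over neighboring samples, the sketch below substitutes a plain unweighted moving average that takes up to n samples on each side of every point. The function name is hypothetical, and FAVE's own window shape and edge handling may well differ.

```python
def moving_average(track, n):
    """Average each sample with up to n neighbors on either side."""
    smoothed = []
    for i in range(len(track)):
        lo = max(0, i - n)                 # window is truncated at the edges
        hi = min(len(track), i + n + 1)
        window = track[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# A made-up formant track in Hz with one outlying sample.
track = [500.0, 510.0, 900.0, 505.0, 495.0]
print(moving_average(track, 1))
```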


--remeasure

Re-estimate formant values based on the speaker's own distribution of data (recommended).


--vowelSystem
phila

Recode the CMU transcription assuming a Philadelphia sound system.

Here is a list of all of the configuration parameters that can be set by the user in the config file, along with their possible values and the default value that is set internally. Any parameters without values provided in the value column are True/False flags. If they are present, they trigger the behavior in the description column. If they are absent, they don't.

| Parameter | Value: default (other) | Description |
|---|---|---|
| --candidates | | Return all candidate measurements in output |
| --case | upper (lower) | Return word transcriptions in specified case. |
| --covariances | covs.txt | covariances, required for mahalanobis method |
| --formantPredictionMethod | mahalanobis (default) | Formant prediction method |
| --maxFormant | 5000 | changed if using mahalanobis method |
| --means | means.txt | mean values, required for mahalanobis method |
| --measurementPointMethod | faav (fourth, third, mid, lennig, anae, maxint) | Method for determining measurement point |
| --minVowelDuration | 0.05 | Minimum duration in seconds, below which vowels won't be analyzed. |
| --multipleFiles | | Interpret positional arguments as files of listed .wav, .txt and output files. |
| --nFormants | 5 | Specify the order of the LPC analysis to be conducted |
| --noOutputHeader | | Don't include output header in text output. |
| --nSmoothing | 12 | Specifies the number of samples to be used for the smoothing of the formant tracks. |
| --onlyMeasureStressed | | Only measure stressed vowels |
| --outputFormat | txt (text, plotnik, Plotnik, plt, both) | Output format. Tab delimited file, plotnik file, or both. |
| --preEmphasis | 50 | The cut-off value in Hz for the application of a 6 dB/octave low-pass filter. |
| --phoneset | cmu_phoneset.txt | |
| --pickle | | save vowel measurement information as a picklefile |
| --remeasurement | | A second pass is performed on the data, using the speaker's own system as the base of comparison for the Mahalanobis distance |
| --removeStopWords | | Don't measure vowels in stop words. |
| --speechSoftware | praat (praat, Praat, esps, ESPS) | The speech software program to be used for LPC analysis. |
| --speaker | ... | *.speaker file, if used |
| --stopWords | ... | Words to be excluded from measurement |
| --stopWordsFile | ... | file containing words to exclude from analysis |
| --tracks | | Write full formant tracks. |
| --vowelSystem | NorthAmerican (phila, Phila, PHILA, simplifiedARPABET) | If set to Phila, a number of vowels will be reclassified to reflect the phonemic distinctions of the Philadelphia vowel system. |
| --verbose | | verbose output. useful for debugging |
| --windowSize | 0.025 | In sec, the size of the Gaussian window to be used for LPC analysis. |
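
Several of these parameters act as filters on which vowel tokens get measured. The sketch below shows one way --minVowelDuration, --onlyMeasureStressed and --removeStopWords could combine. It is not FAVE's code: should_measure is a hypothetical function, vowel labels are assumed to be CMU-style with a trailing stress digit, and only primary stress (1) is treated as stressed, which is a guess. The stop-word set is copied from the default list given in this section.

```python
DEFAULT_STOP_WORDS = frozenset([
    "AND", "BUT", "FOR", "HE", "HE'S", "HUH", "I", "I'LL",
    "I'M", "IS", "IT", "IT'S", "ITS", "MY", "OF", "OH", "SHE", "SHE'S",
    "THAT", "THE", "THEM", "THEN", "THERE", "THEY", "THIS", "UH", "UM", "UP",
    "WAS", "WE", "WERE", "WHAT", "YOU",
])


def should_measure(word, vowel_label, duration,
                   min_vowel_duration=0.05,
                   only_measure_stressed=False,
                   remove_stop_words=False,
                   stop_words=DEFAULT_STOP_WORDS):
    """Decide whether one vowel token passes the three filters."""
    # Vowels shorter than the minimum duration (in seconds) are skipped.
    if duration < min_vowel_duration:
        return False
    # Assumption: a trailing "1" on the label marks primary stress.
    if only_measure_stressed and not vowel_label.endswith("1"):
        return False
    # Compare in upper case, matching how the default list is written.
    if remove_stop_words and word.upper() in stop_words:
        return False
    return True


print(should_measure("CAT", "AE1", 0.12))                          # long, stressed
print(should_measure("the", "AH0", 0.08, remove_stop_words=True))  # stop word
```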

Default stopwords are:

"AND", "BUT", "FOR", "HE", 
"HE'S", "HUH", "I", "I'LL", 
"I'M", "IS", "IT", "IT'S", "ITS", 
"MY", "OF", "OH", "SHE", "SHE'S", 
"THAT", "THE", "THEM", "THEN", "THERE", 
"THEY", "THIS", "UH", "UM", "UP", 
"WAS", "WE", "WERE", "WHAT", "YOU"