Using FAVE extract - JoFrhwld/FAVE GitHub Wiki
Currently, FAVE-extract is set up so that it must be run from the main directory of the FAVE-extract package. To run FAVE-extract, three arguments are required:
- the WAV file containing the speech data,
- the TextGrid file containing the alignments,
- and the name of an output file for the extracted formants.
So, in the directory FAVE-extract, type:
python bin/extractFormants.py filename.wav filename.TextGrid outputFile
Adjusting Configuration Options
There are many configuration parameters that can alter the behavior of extractFormants.
The user can modify their values by creating a config file; otherwise, default values will be set internally.
You can call the config file anything, but conventionally it is called config.txt.
To pass the config options to extractFormants.py, add +config.txt to the call.
Any option you set in the config file can also be passed as a flag in the call.
python bin/extractFormants.py +config.txt filename.wav filename.TextGrid outputFile
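The two calls shown so far differ only in the optional +config.txt argument, so a wrapper script can assemble either one from the same pieces. The following is a minimal sketch, not part of FAVE: the names build_extract_command and run_extract are hypothetical, and it assumes that python on your PATH is the interpreter FAVE-extract expects.

```python
import subprocess
from pathlib import Path


def build_extract_command(wav, textgrid, output, config=None):
    """Return the argument list for one extractFormants.py run.

    `config` is the name of an optional config file; it is passed
    with a leading "+", as in the call shown above.
    """
    cmd = ["python", "bin/extractFormants.py"]
    if config is not None:
        cmd.append("+" + str(config))
    cmd += [str(wav), str(textgrid), str(output)]
    return cmd


def run_extract(fave_extract_dir, wav, textgrid, output, config=None):
    """Run the command with the FAVE-extract directory as working directory.

    Because the working directory changes, relative paths are resolved
    against `fave_extract_dir`; pass absolute paths if your files live
    elsewhere.
    """
    cmd = build_extract_command(wav, textgrid, output, config)
    return subprocess.run(cmd, cwd=Path(fave_extract_dir), check=True)
```

Setting cwd is what satisfies the requirement that FAVE-extract be run from its own main directory; check=True makes a failed run raise an exception instead of passing silently.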
The syntax of the config file is as follows:
--flag
value
That is, you should include the specific flag on one line, and any value you want to pass to it on the following line. The config file that currently comes with FAVE reads like so:
--outputFormat
txt
--speechSoftware
Praat
--formantPredictionMethod
mahalanobis
--measurementPointMethod
faav
--nSmoothing
12
--remeasure
--vowelSystem
phila
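Because the layout is just one flag or one value per line, a config file like the one above can be generated from a script. Here is a minimal sketch, assuming a hypothetical helper render_config (not part of FAVE) and assuming each flag takes at most one value. A value of True marks a flag that stands alone, and False or None leaves the flag out.

```python
def render_config(options):
    """Render flag names and values in the one-item-per-line config layout."""
    lines = []
    for flag, value in options.items():
        if value is False or value is None:
            continue  # an absent flag leaves its behavior switched off
        lines.append("--" + flag)
        if value is not True:
            lines.append(str(value))  # the value goes on the following line
    return "\n".join(lines) + "\n"


example = render_config({
    "outputFormat": "txt",
    "nSmoothing": 12,
    "remeasure": True,
    "vowelSystem": "phila",
})
```

Writing the result to disk is then a single call, for example `open("config.txt", "w").write(example)`.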
Taking the config file item by item:
--outputFormat
txt
Write the output of extractFormants as a tab-delimited file.
--speechSoftware
Praat
Use Praat to do the LPC analyses.
--formantPredictionMethod
mahalanobis
Use the Mahalanobis method to set the LPC parameters. (Recommended. This is a large part of FAVE's "special sauce".)
--nSmoothing
12
Smooth the formant tracks based on a linear smooth of the 12 adjacent samples.
--remeasure
Re-estimate formant values based on the speaker's own distribution of data (recommended).
--vowelSystem
phila
Recode the CMU transcription assuming a Philadelphia sound system.
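To give a feel for what a Mahalanobis comparison involves, here is a simplified two-dimensional sketch. It is not FAVE's implementation: the function names, the (F1, F2) candidates, and the mean and covariance values are all made up for illustration. It shows only the general idea that, among several candidate measurements for a vowel, the one closest to an expected distribution is preferred.

```python
def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance of a 2-D point from a distribution.

    `cov` is a 2x2 covariance matrix given as ((a, b), (c, d)); its
    inverse is computed in closed form as 1/det * ((d, -b), (-c, a)).
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def pick_candidate(candidates, mean, cov):
    """Return the candidate with the smallest Mahalanobis distance."""
    return min(candidates, key=lambda c: mahalanobis_sq(c, mean, cov))


# Hypothetical numbers, for illustration only.
reference_mean = (520.0, 1480.0)                 # expected (F1, F2) in Hz
reference_cov = ((2500.0, 0.0), (0.0, 10000.0))  # assumed covariance
candidates = [(700.0, 1200.0), (500.0, 1500.0)]  # two candidate measurements
best = pick_candidate(candidates, reference_mean, reference_cov)
```

With --remeasure, the same kind of comparison is repeated with the reference distribution taken from the speaker's own data.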
Here is a list of all of the configuration parameters that can be set by the user in the config file, along with their possible values and the default value that is set internally. Any parameters without values provided in the value column are True/False flags. If they are present, they trigger the behavior in the description column. If they are absent, they don't.
Parameter | Value: default (other) | Description |
---|---|---|
--candidates | | Return all candidate measurements in output |
--case | upper (lower) | Return word transcriptions in specified case. |
--covariances | covs.txt | covariances, required for mahalanobis method |
--formantPredictionMethod | mahalanobis (default) | Formant prediction method |
--maxFormant | 5000 | changed if using mahalanobis method |
--means | means.txt | mean values, required for mahalanobis method |
--measurementPointMethod | faav (fourth, third, mid, lennig, anae, maxint) | Method for determining measurement point |
--minVowelDuration | 0.05 | Minimum duration in seconds, below which vowels won't be analyzed. |
--multipleFiles | | Interpret positional arguments as files of listed .wav, .txt and output files. |
--nFormants | 5 | Specify the order of the LPC analysis to be conducted |
--noOutputHeader | | Don't include output header in text output. |
--nSmoothing | 12 | Specifies the number of samples to be used for the smoothing of the formant tracks. |
--onlyMeasureStressed | | Only measure stressed vowels |
--outputFormat | txt (text, plotnik, Plotnik, plt, both) | Output format. Tab delimited file, plotnik file, or both. |
--preEmphasis | 50 | The cut-off value in Hz for the application of a 6 dB/octave low-pass filter. |
--phoneset | cmu_phoneset.txt | |
--pickle | | save vowel measurement information as a picklefile |
--remeasurement | | A second pass is performed on the data, using the speaker's own system as the base of comparison for the Mahalanobis distance |
--removeStopWords | | Don't measure vowels in stop words. |
--speechSoftware | praat (praat, Praat, esps, ESPS) | The speech software program to be used for LPC analysis. |
--speaker | ... | *.speaker file, if used |
--stopWords | ... | Words to be excluded from measurement |
--stopWordsFile | ... | file containing words to exclude from analysis |
--tracks | | Write full formant tracks. |
--vowelSystem | NorthAmerican (phila, Phila, PHILA, simplifiedARPABET) | If set to Phila, a number of vowels will be reclassified to reflect the phonemic distinctions of the Philadelphia vowel system. |
--verbose | | verbose output. useful for debugging |
--windowSize | 0.025 | In sec, the size of the Gaussian window to be used for LPC analysis. |
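Several parameters in the table accept only a fixed set of values, so it can help to check a set of options before starting a long run. The sketch below copies the allowed values from the table; the names CHOICES and check_options are hypothetical, and the check is an illustration rather than the validation FAVE itself performs.

```python
# Allowed values, copied from the parameter table above.
CHOICES = {
    "case": {"upper", "lower"},
    "formantPredictionMethod": {"mahalanobis", "default"},
    "measurementPointMethod": {
        "faav", "fourth", "third", "mid", "lennig", "anae", "maxint",
    },
    "outputFormat": {"txt", "text", "plotnik", "Plotnik", "plt", "both"},
    "speechSoftware": {"praat", "Praat", "esps", "ESPS"},
    "vowelSystem": {
        "NorthAmerican", "phila", "Phila", "PHILA", "simplifiedARPABET",
    },
}


def check_options(options):
    """Return a list of problems; an empty list means nothing was flagged.

    Parameters that are not in CHOICES (numeric values, file names,
    True/False flags) are not checked.
    """
    problems = []
    for flag, value in options.items():
        allowed = CHOICES.get(flag)
        if allowed is not None and value not in allowed:
            problems.append(
                f"--{flag}: {value!r} is not one of {sorted(allowed)}"
            )
    return problems
```

For example, `check_options({"outputFormat": "csv"})` reports one problem, while the values in the shipped config file pass without complaint.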
Default stopwords are:
"AND", "BUT", "FOR", "HE",
"HE'S", "HUH", "I", "I'LL",
"I'M", "IS", "IT", "IT'S", "ITS",
"MY", "OF", "OH", "SHE", "SHE'S",
"THAT", "THE", "THEM", "THEN", "THERE",
"THEY", "THIS", "UH", "UM", "UP",
"WAS", "WE", "WERE", "WHAT", "YOU"