Speech Refactor Use Case Notes
Here are some quick notes about how the new speech framework implemented in #7599 can be used to implement some of the trickier use cases. They are by no means complete.
Profile switching use cases
Anything that needs to change voice, synth, etc. can be implemented using `ConfigProfileTriggerCommand` in speech sequences. Rather than having specific, restrictive settings (e.g. set synth for specific language, set rate for math, etc.), profile triggers can be used instead. This allows for a lot more flexibility. For example, rather than only being able to set the rate for math, you could also change the voice if you wish; anything you can change in a profile is possible. We might want to provide wizards or similar to help set up common use cases, though.
Relevant use cases include:
- Switch to specific synths for specific languages #279
- Changing speeds for different languages #4738
- Voice aliases #4433
- Introduce a special Math speech rate #7274
The general idea is:
- Create an appropriate profile trigger; e.g. a `LanguageProfileTrigger` which is used whenever a `LangChangeCommand` is encountered.
- Use `speech.ConfigProfileTriggerCommand` to enter/exit this trigger in a speech sequence when appropriate. For example, we'd probably add commands for `LanguageProfileTrigger` in `speech.speak`. For math, we might output commands for `MathProfileTrigger` in `speech._speakTextInfo_addMath`. (See the sketch after this list.)
- Provide GUI for users to configure this trigger in the New Profile/Config Profile Triggers dialogs. This is pretty easy for something like math; it can be done similarly to say all. However, we might need some kind of UI to add additional triggers for things with a lot of possible options, such as languages.
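As a rough illustration of the first two steps, here's a minimal sketch. `ConfigProfileTriggerCommand`, `LangChangeCommand` and `config.ProfileTrigger` exist today; `LanguageProfileTrigger` and its `spec` naming are assumptions of this sketch, not a finished design.

```python
import config
import speech

class LanguageProfileTrigger(config.ProfileTrigger):
    """Hypothetical trigger which is active while speech is in a given language."""

    def __init__(self, lang):
        self.lang = lang

    @property
    def spec(self):
        # The section name under which matching profiles would be stored;
        # the "lang." prefix is just an assumption for this sketch.
        return "lang.%s" % self.lang

# speech.speak would insert commands like these around a language change:
trigger = LanguageProfileTrigger("fr")
speech.speak([
    speech.ConfigProfileTriggerCommand(trigger, enter=True),
    speech.LangChangeCommand("fr"),
    u"Bonjour tout le monde",
    speech.ConfigProfileTriggerCommand(trigger, enter=False),
])
```

While the trigger is entered, whatever profile the user has associated with it applies, so voice, rate, synth, etc. can all change at once.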
NVDA Remote
I (@jcsteh) discussed this a little with @tspivey. He suggested an issue be filed against NVDA Remote covering this, but I haven't done that because I won't be able to follow this through. Here are some notes for when we're ready to tackle this.
We probably want NVDA Remote to still support older synths that don't implement the new API. Unfortunately, that means NVDA Remote will still need to use its existing patching/lastIndex polling code for those synths. You can test for those synths using `speech.shouldUseCompatCodeForIndexing`. This existing code won't work at all for newer synths which use the new framework.
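For example, the slave's setup code might branch like the following sketch. Whether `shouldUseCompatCodeForIndexing` takes the synth instance is an assumption here, and the two `setup*` helpers are placeholders for NVDA Remote code.

```python
import speech
import synthDriverHandler

def setupLegacyRelay():
    """Placeholder for NVDA Remote's existing patching/lastIndex polling code."""

def setupSequenceRelay():
    """Placeholder for new code relaying speech sequences from the speech manager."""

def setupSpeechRelay():
    synth = synthDriverHandler.getSynth()
    if speech.shouldUseCompatCodeForIndexing(synth):
        # Older synth which doesn't implement the new API.
        setupLegacyRelay()
    else:
        # Newer synth using the new framework.
        setupSequenceRelay()
```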
Regarding how to implement things for the new framework:
- For the most part, the slave should just relay speech sequences passed to `speech._SpeechManager.speak` to the master.
  - An extension point will need to be added for that once we figure out exactly where.
- Commands like `BeepCommand`, `EndUtteranceCommand` and `WaveFileCommand` can be serialized pretty easily. Custom callbacks can't be serialized, though.
  - This means NVDA Remote will get capital letter indication, since spelling is now one speech sequence which uses `BeepCommand`, `PitchCommand`, etc. See NVDARemote/NVDARemote#110.
- @leonardder noted that we'll need some way to differentiate standalone calls to `tones.beep` and `nvwave.playWaveFile` from calls made due to speech commands. Otherwise, they'll double up. This should be pretty trivial with an additional argument. (See the `tones.beep` sketch after this list.)
  - I guess we just shouldn't fire the extension point in the case of a speech command?
  - Related: https://github.com/nvaccess/nvda/pull/7594
- Say all is a bit trickier. We want to sync the cursor with the speech from the master, not the slave, since the master is the one primarily doing the controlling. (See the say all sketch after this list.)
  - I think this can be done by having the slave remove say all callbacks from the speech sequence and store them in a map with an identifier. We'll need a filter in core for this. Also, we'll need a way to distinguish say all callbacks; right now, they're just `CallbackCommand`s. We could probably create a simple `SayAllCallbackCommand` subclass.
  - The slave would pass this identifier to the master as part of the speech sequence.
  - The master would wrap this in a callback. When called, the callback would notify the slave that this identifier was reached.
  - The slave would then grab the original callback from the map and call it, thus syncing the cursor, pushing more speech, etc.
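Here's a minimal sketch of the `tones.beep` idea mentioned above. The `isSpeechCommand` argument and the `preBeep` extension point are assumptions; only the first four parameters match the real `tones.beep`.

```python
import extensionPoints

# Hypothetical extension point NVDA Remote could register a handler with.
preBeep = extensionPoints.Action()

def beep(hz, length, left=50, right=50, isSpeechCommand=False):
    if not isSpeechCommand:
        # Standalone call: notify listeners (e.g. NVDA Remote) so the beep
        # can be relayed to the master.
        preBeep.notify(hz=hz, length=length, left=left, right=right)
    # ... generate the tone as tones.beep does today ...
```

The speech manager would then pass `isSpeechCommand=True` when executing a `BeepCommand`, so the beep isn't relayed twice: once inside the serialized speech sequence and again as a standalone beep.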
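And a sketch of the slave side of the say all idea. `SayAllCallbackCommand`, the placeholder wire format and the function names are all assumptions; `CallbackCommand` is from the new framework.

```python
import itertools
from speech import CallbackCommand

class SayAllCallbackCommand(CallbackCommand):
    """Hypothetical subclass so say all callbacks can be distinguished."""

_pendingCallbacks = {}
_identGen = itertools.count()

def filterSequenceForMaster(sequence):
    """Replace say all callbacks with serializable identifiers before relaying.
    This would be registered with the filter we need to add in core."""
    filtered = []
    for item in sequence:
        if isinstance(item, SayAllCallbackCommand):
            ident = next(_identGen)
            _pendingCallbacks[ident] = item
            # Hypothetical serializable placeholder sent to the master.
            filtered.append({"type": "sayAllCallback", "ident": ident})
        else:
            filtered.append(item)
    return filtered

def onMasterReachedCallback(ident):
    """Called when the master notifies us that speech reached this identifier.
    Running the original command syncs the cursor, pushes more speech, etc."""
    _pendingCallbacks.pop(ident).run()
```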
Determine whether NVDA is speaking
As requested by: Add the ability to determine if NVDA is speaking in NVDA Controller.dll #5638
It should be pretty trivial to add a function to do this now. It can check `speech._manager._curPriQueue`. If it's `None`, there is no speech in progress.
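A minimal sketch of such a function. Note it pokes at private state (`_manager`, `_curPriQueue`), so a real implementation would presumably wrap this in a public API.

```python
import speech

def isSpeaking():
    """Report whether NVDA currently has speech in progress."""
    return speech._manager._curPriQueue is not None
```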