Qt Speech Module
This page collects notes on the development of a Qt Speech module.
It currently covers text-to-speech (TTS) only.
Speech recognition may be added later, but it appears to be considerably harder to implement at this point.
https://codereview.qt.io/#admin,project,qt/qtspeech,info
ssh://codereview.qt.io:29418/qt/qtspeech.git
Current State
There is a basic implementation for Mac, Windows, Linux, and Android.
Linux uses Speech Dispatcher.
Windows uses SAPI 5.
OS X uses the Cocoa NSSpeechSynthesizer API.
Todo
Decide between a plugin architecture (several backends selectable at run time) and a single fixed backend per platform.
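To make the plugin option more concrete, here is a minimal sketch in plain C++ of a backend interface with a run-time registry. All names (SpeechBackend, BackendRegistry, RecordingBackend) are hypothetical and do not come from the qtspeech sources; a real Qt implementation would more likely build on Qt's own plugin mechanism.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical backend interface; not taken from the qtspeech sources.
class SpeechBackend {
public:
    virtual ~SpeechBackend() = default;
    virtual std::string name() const = 0;
    virtual void say(const std::string &text) = 0;
};

// Plugin-style registry: several backends can coexist on one platform
// and are looked up by key at run time.
class BackendRegistry {
public:
    using Factory = std::function<std::unique_ptr<SpeechBackend>()>;

    void add(const std::string &key, Factory factory) {
        factories_[key] = std::move(factory);
    }

    // Returns nullptr when no backend is registered under the key.
    std::unique_ptr<SpeechBackend> create(const std::string &key) const {
        auto it = factories_.find(key);
        if (it == factories_.end())
            return nullptr;
        return it->second();
    }

    // Lists the registered keys in sorted order.
    std::vector<std::string> keys() const {
        std::vector<std::string> result;
        for (const auto &entry : factories_)
            result.push_back(entry.first);
        return result;
    }

private:
    std::map<std::string, Factory> factories_;
};

// Stand-in backend that records text instead of speaking it.
class RecordingBackend : public SpeechBackend {
public:
    std::string name() const override { return "recording"; }
    void say(const std::string &text) override { spoken.push_back(text); }
    std::vector<std::string> spoken;
};
```

With a single backend per platform the registry would disappear and the concrete class would be chosen at compile time. The interface itself could stay the same in both designs.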
To implement on each platform:
Collection of resources and links that should help in defining a cross-platform API:
API for language selection
QLocale seems like a good candidate for languages.
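The native APIs report languages in different shapes (two-letter codes, codes with a region suffix, differing separators and case), so a backend would probably need to normalize them before building a locale. The sketch below is plain C++ and only an assumption of how that step could look; LocaleName, parseLocaleName, and toLocaleString are hypothetical helpers. The resulting "language_REGION" string is the kind of name a QLocale can be constructed from.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper type: a language tag split into its two main parts.
struct LocaleName {
    std::string language; // lower-case, e.g. "en"
    std::string region;   // upper-case, e.g. "US"; empty when absent
};

// Accepts tags such as "en", "en_US", or "en-us".
LocaleName parseLocaleName(const std::string &tag) {
    LocaleName result;
    std::size_t i = 0;
    // Language part: everything up to the first '_' or '-'.
    for (; i < tag.size() && tag[i] != '_' && tag[i] != '-'; ++i)
        result.language += static_cast<char>(
            std::tolower(static_cast<unsigned char>(tag[i])));
    // Skip the separator, if there is one.
    if (i < tag.size())
        ++i;
    // Region part: the next segment only; any further subtags are ignored.
    for (; i < tag.size() && tag[i] != '_' && tag[i] != '-'; ++i)
        result.region += static_cast<char>(
            std::toupper(static_cast<unsigned char>(tag[i])));
    return result;
}

// Joins the parts in "language_REGION" form, or just "language".
std::string toLocaleString(const LocaleName &name) {
    if (name.region.empty())
        return name.language;
    return name.language + "_" + name.region;
}
```

Subregions and variants are dropped on purpose here. Whether they should map to voice properties instead is an open question.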
Voice selection
Summarize here what each native API offers. For example, SAPI 5 uses first names as voice identifiers.
Platform/API | Voice Properties | Link
Win SAPI 5 | gender, age, name, lang, vendor - names such as "Microsoft Anna" or "Mike" | http://msdn.microsoft.com/en-us/library/ms720151(v=vs.85).aspx#API_for_Text-To-Speech http://msdn.microsoft.com/en-us/library/ms723601(v=vs.85).aspx
Mac Carbon | similar to the Cocoa API | https://developer.apple.com/library/mac/documentation/Carbon/Reference/Speech_Synthesis_Manager/Reference/reference.html#//apple_ref/doc/uid/TP30000211
Mac Cocoa NSSpeechSynthesizer | id, name, age, gender, language, locale - can be enumerated | https://developer.apple.com/library/mac/documentation/Cocoa/Reference/ApplicationKit/Classes/NSSpeechSynthesizer_Class/Reference/Reference.html
Linux Speech Dispatcher | name, language (2-letter), variant, voice type enum with Male1..3, Female1..3 and childMale, childFemale | http://cvs.freebsoft.org/doc/speechd/speech-dispatcher.html
espeak | lang (two letter [_region-subregion]), age(?), gender, name (= long language name) | espeak --voices
festival | name, gender, maybe more | http://www.cstr.ed.ac.uk/projects/festival/
Android | has a concept of engines; setLanguage is locale based; isLanguageAvailable() exists, but there is no language listing | http://developer.android.com/reference/android/speech/tts/TextToSpeech.html
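The properties in the table overlap mostly in name, language, gender, and age, which suggests a small common voice description. The following is only a sketch under that assumption: Voice, Gender, Age, DispatcherVoiceType, and fromDispatcher are hypothetical names, and the enum merely mirrors the Speech Dispatcher voice types listed above rather than using the real header.

```cpp
#include <string>

// Hypothetical common voice description; field names are illustrative only.
enum class Gender { Unknown, Male, Female };
enum class Age { Unknown, Child, Adult };

struct Voice {
    std::string name;
    std::string locale; // e.g. "en" or "en_US"
    Gender gender = Gender::Unknown;
    Age age = Age::Unknown;
};

// Illustrative mirror of the Speech Dispatcher voice types from the table.
enum class DispatcherVoiceType {
    Male1, Male2, Male3,
    Female1, Female2, Female3,
    ChildMale, ChildFemale
};

// Maps a Speech Dispatcher style voice onto the common description.
Voice fromDispatcher(const std::string &name, const std::string &language,
                     DispatcherVoiceType type) {
    Voice voice;
    voice.name = name;
    voice.locale = language;
    switch (type) {
    case DispatcherVoiceType::Male1:
    case DispatcherVoiceType::Male2:
    case DispatcherVoiceType::Male3:
        voice.gender = Gender::Male;
        voice.age = Age::Adult;
        break;
    case DispatcherVoiceType::Female1:
    case DispatcherVoiceType::Female2:
    case DispatcherVoiceType::Female3:
        voice.gender = Gender::Female;
        voice.age = Age::Adult;
        break;
    case DispatcherVoiceType::ChildMale:
        voice.gender = Gender::Male;
        voice.age = Age::Child;
        break;
    case DispatcherVoiceType::ChildFemale:
        voice.gender = Gender::Female;
        voice.age = Age::Child;
        break;
    }
    return voice;
}
```

The mapping loses the distinction between Male1, Male2, and Male3. A backend would likely keep the native identifier alongside the common fields so that the exact voice can still be selected.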
CSS and/or XML in strings to be spoken