Qt Speech Module
This page contains notes about the development of a Qt Speech module.
It currently covers text-to-speech (TTS) only.
Speech recognition may be introduced later, but it appears to be considerably harder to support at this point.
https://codereview.qt-project.org/#/admin/projects/qt/qtspeech
ssh://codereview.qt.io:29418/qt/qtspeech.git
Current State
There is a basic implementation for Mac, Windows, Linux and Android.
Linux uses speech-dispatcher.
Windows uses SAPI 5.
OS X uses the Cocoa NSSpeechSynthesizer API.
Todo
Decide between a plugin architecture and a single built-in backend per platform.
To implement on each platform:
The following collection of resources and links should help in defining a cross-platform API:
API for language selection
QLocale seems like a good candidate for languages.
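One reason a locale type fits is that the backends report languages in different shapes: speech-dispatcher uses two-letter codes, and espeak uses a two-letter code with an optional region, as noted under Voice selection. The sketch below normalizes such tags into a `language_REGION` name, which is the general shape that QLocale's string constructor accepts. It deliberately avoids Qt so that it has no dependencies. `LanguageTag`, `parseLanguageTag` and `localeName` are hypothetical names introduced for illustration, and the set of separators handled is an assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper type; not part of Qt or of any backend API.
struct LanguageTag {
    std::string language; // lower-case, e.g. "en"
    std::string region;   // upper-case, e.g. "GB"; empty when absent
};

// Splits tags such as "en", "en_GB", "en-gb" or "en_GB-scotland".
// Anything after the region (a sub-region or variant) is dropped.
LanguageTag parseLanguageTag(const std::string &tag)
{
    const char *separators = "_-"; // assumption: only these two occur
    LanguageTag result;

    const std::size_t first = tag.find_first_of(separators);
    result.language = tag.substr(0, first);

    if (first != std::string::npos) {
        const std::size_t second = tag.find_first_of(separators, first + 1);
        const std::size_t length = (second == std::string::npos)
                                       ? std::string::npos
                                       : second - first - 1;
        result.region = tag.substr(first + 1, length);
    }

    std::transform(result.language.begin(), result.language.end(),
                   result.language.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    std::transform(result.region.begin(), result.region.end(),
                   result.region.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return result;
}

// Builds a "language" or "language_REGION" name from the parsed tag.
std::string localeName(const LanguageTag &tag)
{
    return tag.region.empty() ? tag.language : tag.language + "_" + tag.region;
}
```

A backend could pass the result of `localeName(parseLanguageTag(nativeTag))` to a locale object and expose only that locale in the public API, keeping the native tag format an implementation detail.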
Voice selection
This section summarizes what each native API offers. For example, SAPI 5 uses first names as voice identifiers.
Platform/API | Voice Properties | Link
Win SAPI 5 | gender, age, name, lang, vendor; names such as "Microsoft Anna" or "Mike" | http://msdn.microsoft.com/en-us/library/ms720151(v=vs.85).aspx#API_for_Text-To-Speech http://msdn.microsoft.com/en-us/library/ms723601(v=vs.85).aspx
Mac Carbon | similar to Cocoa API | https://developer.apple.com/library/mac/documentation/Carbon/Reference/Speech_Synthesis_Manager/Reference/reference.html#//apple_ref/doc/uid/TP30000211
Mac Cocoa NSSpeechSynthesizer | id, name, age, gender, language, locale; can be enumerated | https://developer.apple.com/library/mac/documentation/Cocoa/Reference/ApplicationKit/Classes/NSSpeechSynthesizer_Class/Reference/Reference.html
Linux speech-dispatcher | name, language (2-letter), variant, voice type enum with Male1..3, Female1..3 and childMale, childFemale | http://cvs.freebsoft.org/doc/speechd/speech-dispatcher.html
espeak | lang (two-letter [_region-subregion]), age(?), gender, name (= long language name) | espeak --voices
festival | name, gender, maybe more | http://www.cstr.ed.ac.uk/projects/festival/
Android | has a concept of engines; setLanguage() is locale based; isLanguageAvailable() exists, but there is no language listing | http://developer.android.com/reference/android/speech/tts/TextToSpeech.html
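To make the comparison above more concrete, here is a minimal sketch of what a common voice description could look like, together with a mapping from the speech-dispatcher voice types listed in the table. All names here (`Voice`, `Gender`, `Age`, `SpdVoiceType`, `voiceFromSpdType`) are hypothetical and are not taken from Qt or from speech-dispatcher's headers. Treating Male1..3 and Female1..3 as adult voices is an assumption, since the table only names the enum values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class Gender { Unknown, Male, Female };
enum class Age { Unknown, Child, Adult };

// Hypothetical common description: the properties most backends expose.
struct Voice {
    std::string name;
    std::string language; // e.g. a two-letter code such as "en"
    Gender gender = Gender::Unknown;
    Age age = Age::Unknown;
};

// Stand-in for the speech-dispatcher voice type enum named in the table;
// the real enum in speech-dispatcher's headers may be spelled differently.
enum class SpdVoiceType {
    Male1, Male2, Male3,
    Female1, Female2, Female3,
    ChildMale, ChildFemale
};

// Maps a voice type to the common description. The voice type carries no
// human-readable name, so the enum label is used as the name.
Voice voiceFromSpdType(SpdVoiceType type, const std::string &language)
{
    static const char *const names[] = {
        "Male1", "Male2", "Male3",
        "Female1", "Female2", "Female3",
        "ChildMale", "ChildFemale"
    };
    const std::size_t index = static_cast<std::size_t>(type);
    assert(index < sizeof(names) / sizeof(names[0]));

    Voice voice;
    voice.name = names[index];
    voice.language = language;

    switch (type) {
    case SpdVoiceType::Male1:
    case SpdVoiceType::Male2:
    case SpdVoiceType::Male3:
        voice.gender = Gender::Male;
        voice.age = Age::Adult; // assumption: numbered voices are adults
        break;
    case SpdVoiceType::Female1:
    case SpdVoiceType::Female2:
    case SpdVoiceType::Female3:
        voice.gender = Gender::Female;
        voice.age = Age::Adult; // assumption: numbered voices are adults
        break;
    case SpdVoiceType::ChildMale:
        voice.gender = Gender::Male;
        voice.age = Age::Child;
        break;
    case SpdVoiceType::ChildFemale:
        voice.gender = Gender::Female;
        voice.age = Age::Child;
        break;
    }
    return voice;
}
```

Backends that expose richer data, such as the vendor in SAPI 5 or the identifier in NSSpeechSynthesizer, would need either extra optional fields or a backend-specific extension; this sketch only covers the properties that appear for most entries in the table.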
CSS and/or XML in strings to be spoken