QtSpeech

From Qt Wiki

Qt Speech Module

This page contains notes about the development of a Qt Speech module. It currently covers only text to speech (TTS). Speech recognition may be added later, but it appears to be considerably harder to implement at this point.

Code review project: https://codereview.qt-project.org/#/admin/projects/qt/qtspeech
Git repository (SSH): ssh://codereview.qt.io:29418/qt/qtspeech.git

Current State

A basic implementation exists for Mac, Windows, Linux, and Android. The Linux backend uses speech-dispatcher, the Windows backend uses SAPI 5, and the OS X backend uses the Cocoa NSSpeechSynthesizer API.

Todo

Decide whether to use plugins or to have only one backend per platform.
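
To make the two options concrete, here is a minimal sketch in plain C++ (standard library only, no Qt). All names in it (`TtsBackend`, `RecordingBackend`, `defaultBackendName`, `backendRegistry`, `createBackend`) are hypothetical and are not taken from the qtspeech repository. The backend name strings are placeholders as well, in particular "android", since this page does not name the Android API used.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical backend interface; not the actual qtspeech API.
class TtsBackend {
public:
    virtual ~TtsBackend() = default;
    virtual std::string name() const = 0;
    virtual void say(const std::string &text) = 0;
};

// Stand-in backend that only records the text it was asked to speak.
class RecordingBackend : public TtsBackend {
public:
    explicit RecordingBackend(std::string name) : m_name(std::move(name)) {}
    std::string name() const override { return m_name; }
    void say(const std::string &text) override { spoken.push_back(text); }
    std::vector<std::string> spoken;
private:
    std::string m_name;
};

// Option A: exactly one backend per platform, fixed at compile time.
// The returned strings are illustrative labels only.
std::string defaultBackendName()
{
#if defined(_WIN32)
    return "sapi5";
#elif defined(__ANDROID__)
    return "android";
#elif defined(__APPLE__)
    return "nsspeechsynthesizer";
#else
    return "speech-dispatcher";
#endif
}

// Option B: backends are registered by name at run time, as plugins would be.
using BackendFactory = std::function<std::unique_ptr<TtsBackend>()>;

std::map<std::string, BackendFactory> &backendRegistry()
{
    static std::map<std::string, BackendFactory> registry;
    return registry;
}

// Returns nullptr when no backend is registered under the given name.
std::unique_ptr<TtsBackend> createBackend(const std::string &name)
{
    const auto it = backendRegistry().find(name);
    if (it == backendRegistry().end())
        return nullptr;
    return it->second();
}
```

With option A the choice is invisible to the application. With option B an application could pick, for example, espeak or festival directly on Linux, at the cost of a lookup that can fail.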

To implement on each platform:

Collection of resources and links that should help define a cross-platform API:

API for language selection

QLocale seems like a good candidate for languages.
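
The native APIs report languages in slightly different spellings (two-letter codes, codes with a region, different separators), so some normalization would be needed before comparing them. In Qt, a QLocale can be constructed from a name of the form language_COUNTRY. The sketch below uses only the standard library to show the normalization step itself. `LocaleTag`, `parseLocaleTag` and `toLocaleName` are hypothetical names, and the accepted input forms are an assumption based on the examples on this page.

```cpp
#include <cctype>
#include <string>

// Hypothetical normalized form of a language tag.
struct LocaleTag {
    std::string language; // lower case, e.g. "en"
    std::string region;   // upper case, e.g. "GB"; empty when not given
};

// Splits tags such as "en", "en_GB", "en-gb" or "en_GB-scotland".
// Anything after the region (a subregion or variant) is ignored.
LocaleTag parseLocaleTag(const std::string &tag)
{
    LocaleTag result;
    const std::string::size_type sep = tag.find_first_of("_-");
    result.language = tag.substr(0, sep);
    if (sep != std::string::npos) {
        const std::string::size_type end = tag.find_first_of("_-", sep + 1);
        const std::string::size_type count =
            (end == std::string::npos) ? std::string::npos : end - sep - 1;
        result.region = tag.substr(sep + 1, count);
    }
    for (char &c : result.language)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    for (char &c : result.region)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return result;
}

// Builds a "language_REGION" name, or just "language" without a region.
std::string toLocaleName(const LocaleTag &tag)
{
    return tag.region.empty() ? tag.language : tag.language + "_" + tag.region;
}
```

For example, `toLocaleName(parseLocaleTag("en-gb"))` yields `"en_GB"`, which is the kind of name a QLocale-based API could accept.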

Voice selection

Summarize here which native API offers what. For example, SAPI 5 uses first names as voice identifiers.

| Platform/API | Voice properties | Link |
| --- | --- | --- |
| Win SAPI 5 | gender, age, name, lang, vendor; names such as "Microsoft Anna" or "Mike" | http://msdn.microsoft.com/en-us/library/ms720151(v=vs.85).aspx#API_for_Text-To-Speech http://msdn.microsoft.com/en-us/library/ms723601(v=vs.85).aspx |
| Mac Carbon | similar to the Cocoa API | https://developer.apple.com/library/mac/documentation/Carbon/Reference/Speech_Synthesis_Manager/Reference/reference.html#//apple_ref/doc/uid/TP30000211 |
| Mac Cocoa NSSpeechSynthesizer | id, name, age, gender, language, locale; can be enumerated | https://developer.apple.com/library/mac/documentation/Cocoa/Reference/ApplicationKit/Classes/NSSpeechSynthesizer_Class/Reference/Reference.html |
| Linux speech-dispatcher | name, language (2-letter), variant, voice type enum with Male1..3, Female1..3 and childMale, childFemale | http://cvs.freebsoft.org/doc/speechd/speech-dispatcher.html |
| espeak | lang (two letter [_region-subregion]), age(?), gender, name (= long language name) | run: espeak --voices |
| festival | name, gender, maybe more | http://www.cstr.ed.ac.uk/projects/festival/ |
| Android | has a concept of engines; setLanguage is locale based; isLanguageAvailable() exists, but there is no language listing | http://developer.android.com/reference/android/speech/tts/TextToSpeech.html |
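
The properties most of these APIs share are a name, a language or locale, a gender, and (less consistently) an age. As a rough sketch of what a common voice descriptor could look like, the following plain C++ maps the speech-dispatcher voice types listed above onto such a descriptor and filters a voice list by language and gender. Every type and function name here (`VoiceInfo`, `Gender`, `Age`, `DispatcherVoiceType`, `fromDispatcherVoice`, `selectVoices`) is hypothetical. The enum only mirrors the table and is not the real speech-dispatcher header.

```cpp
#include <string>
#include <vector>

enum class Gender { Unknown, Male, Female };
enum class Age { Unknown, Child, Adult };

// Hypothetical common denominator of the native voice descriptions.
struct VoiceInfo {
    std::string name;
    std::string locale; // "en" or "en_GB" style name
    Gender gender = Gender::Unknown;
    Age age = Age::Unknown;
};

// Mirrors the voice types from the table; not the real libspeechd enum.
enum class DispatcherVoiceType {
    Male1, Male2, Male3, Female1, Female2, Female3, ChildMale, ChildFemale
};

VoiceInfo fromDispatcherVoice(const std::string &name,
                              const std::string &language,
                              DispatcherVoiceType type)
{
    VoiceInfo v;
    v.name = name;
    v.locale = language;
    switch (type) {
    case DispatcherVoiceType::Male1:
    case DispatcherVoiceType::Male2:
    case DispatcherVoiceType::Male3:
        v.gender = Gender::Male;   v.age = Age::Adult; break;
    case DispatcherVoiceType::Female1:
    case DispatcherVoiceType::Female2:
    case DispatcherVoiceType::Female3:
        v.gender = Gender::Female; v.age = Age::Adult; break;
    case DispatcherVoiceType::ChildMale:
        v.gender = Gender::Male;   v.age = Age::Child; break;
    case DispatcherVoiceType::ChildFemale:
        v.gender = Gender::Female; v.age = Age::Child; break;
    }
    return v;
}

// True when the voice locale is the language itself or starts with "language_".
bool matchesLanguage(const VoiceInfo &v, const std::string &language)
{
    if (v.locale.compare(0, language.size(), language) != 0)
        return false;
    return v.locale.size() == language.size() || v.locale[language.size()] == '_';
}

// Returns the voices for a language; Gender::Unknown accepts any gender.
std::vector<VoiceInfo> selectVoices(const std::vector<VoiceInfo> &all,
                                    const std::string &language, Gender gender)
{
    std::vector<VoiceInfo> result;
    for (const VoiceInfo &v : all) {
        if (matchesLanguage(v, language)
            && (gender == Gender::Unknown || v.gender == gender))
            result.push_back(v);
    }
    return result;
}
```

Android does not fit this shape directly, since it offers no language listing. A backend there would have to probe candidate locales with isLanguageAvailable() instead of enumerating voices.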

CSS and/or XML in strings to be spoken