= Qt Speech Module =
This page contains notes about the development of a Qt speech module.
Currently it covers only TTS (text to speech).
Speech recognition may be added later, but it appears to be considerably harder to implement at this point.
https://codereview.qt-project.org/#/admin/projects/qt/qtspeech
ssh://codereview.qt.io:29418/qt/qtspeech.git
== Current State ==
There is a basic implementation for OS X, Windows, Linux and Android.
Linux uses speech-dispatcher.
Windows uses SAPI 5.
OS X uses the Cocoa NSSpeechSynthesizer API.
== Todo ==
Decide between a plugin architecture and a single backend per platform.
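To make the plugin option more concrete, here is a minimal sketch of what a plugin-style registry could look like. It is written in plain standard C++ so it stands alone; every name in it (SpeechBackend, BackendRegistry, RecordingBackend) is hypothetical and not taken from the qtspeech repository. A single-backend design would simply hard-wire one implementation per platform instead of looking it up by key.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical backend interface; these names are illustrative only.
class SpeechBackend {
public:
    virtual ~SpeechBackend() = default;
    virtual std::string name() const = 0;
    virtual void say(const std::string &text) = 0;
};

// Plugin-style registry: several backends can coexist on one platform
// and are selected by key at run time.
class BackendRegistry {
public:
    using Factory = std::function<std::unique_ptr<SpeechBackend>()>;

    void add(const std::string &key, Factory factory)
    {
        m_factories[key] = std::move(factory);
    }

    // Lists the keys of all registered backends in sorted order.
    std::vector<std::string> keys() const
    {
        std::vector<std::string> result;
        for (const auto &entry : m_factories)
            result.push_back(entry.first);
        return result;
    }

    // Returns a null pointer when no backend is registered under the key.
    std::unique_ptr<SpeechBackend> create(const std::string &key) const
    {
        auto it = m_factories.find(key);
        if (it == m_factories.end())
            return nullptr;
        return it->second();
    }

private:
    std::map<std::string, Factory> m_factories;
};

// Stand-in backend that records the text instead of speaking it.
class RecordingBackend : public SpeechBackend {
public:
    std::string name() const override { return "recording"; }
    void say(const std::string &text) override { spoken.push_back(text); }
    std::vector<std::string> spoken;
};
```

With this layout, Linux could register both a speech-dispatcher and an espeak backend, at the cost of an extra selection step for the application.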
To implement on each platform:
* iOS: the backend needs some thought about the oldest iOS version that should be supported, as noted here: https://codereview.qt-project.org/98704
Collection of resources and links that should help in defining a cross-platform API:
=== API for language selection ===
QLocale seems like a good candidate for languages.
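The native backends report languages as short tags in slightly different shapes (a bare two-letter code, or a code followed by a region and sub-region). The sketch below shows one way such tags could be normalized to the "language_COUNTRY" form that QLocale's string constructor accepts. The helper toLocaleName is hypothetical and uses only the standard library, so Qt is not needed to try it; dropping the sub-region is an assumption of this sketch, not a decision recorded on this page.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: converts a backend language tag such as "en", "en-us"
// or "pt_BR-x" into "language" or "language_COUNTRY".
std::string toLocaleName(const std::string &tag)
{
    std::string language;
    std::string country;
    std::size_t i = 0;

    // Language part: everything up to the first '_' or '-', lower-cased.
    while (i < tag.size() && tag[i] != '_' && tag[i] != '-')
        language += static_cast<char>(std::tolower(static_cast<unsigned char>(tag[i++])));

    if (i < tag.size())
        ++i; // skip the separator

    // Country part: up to the next separator, upper-cased.
    // Anything after it (for example a sub-region) is dropped.
    while (i < tag.size() && tag[i] != '_' && tag[i] != '-')
        country += static_cast<char>(std::toupper(static_cast<unsigned char>(tag[i++])));

    return country.empty() ? language : language + "_" + country;
}
```

In a Qt build the result could then be passed on as `QLocale(QString::fromStdString(toLocaleName(tag)))`.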
=== Voice selection ===
Summarize here which native API offers what. For example, SAPI 5 has first names as voice identifiers.
{|
|Platform/API
|Voice Properties
|Link
|-
|Win SAPI 5
|gender, age, name, lang, vendor - names such as "Microsoft Anna" or "Mike"
|http://msdn.microsoft.com/en-us/library/ms720151(v=vs.85).aspx#API_for_Text-To-Speech http://msdn.microsoft.com/en-us/library/ms723601(v=vs.85).aspx
|-
|Mac Carbon
|similar to Cocoa API
|https://developer.apple.com/library/mac/documentation/Carbon/Reference/Speech_Synthesis_Manager/Reference/reference.html#//apple_ref/doc/uid/TP30000211
|-
|Mac Cocoa NSSpeechSynthesizer
|id, name, age, gender, language, locale - can be enumerated
|https://developer.apple.com/library/mac/documentation/Cocoa/Reference/ApplicationKit/Classes/NSSpeechSynthesizer_Class/Reference/Reference.html
|-
|Linux SpeechDisp
|name, language (2-letter), variant, voice type enum with Male1..3, Female1..3 and childMale, childFemale
|http://cvs.freebsoft.org/doc/speechd/speech-dispatcher.html
|-
|espeak
|lang (two letter [_region-subregion]), age(?), gender, name(=long language name)
|espeak --voices
|-
|festival
|name, gender, maybe more
|http://www.cstr.ed.ac.uk/projects/festival/
|-
|Android
|has a concept of engines, setLanguage locale based, isLanguageAvailable() but no language listing
|http://developer.android.com/reference/android/speech/tts/TextToSpeech.html
|}
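The table suggests that name, language, gender and age are the properties most backends have in common. As a rough sketch, a shared voice description and a mapping from the speech-dispatcher voice types could look like the code below. The Voice struct, the DispatcherVoiceType enum and fromDispatcherType are hypothetical names that mirror the table, not the libspeechd header, and treating Male1..3 and Female1..3 as adult voices is an assumption.

```cpp
#include <string>

enum class Gender { Unknown, Male, Female };
enum class Age { Unknown, Child, Adult };

// Hypothetical common voice description for a cross-platform API.
struct Voice {
    std::string name;     // left empty when the backend reports no name
    std::string language; // for example a two-letter code
    Gender gender = Gender::Unknown;
    Age age = Age::Unknown;
};

// Mirrors the voice types listed in the table for speech-dispatcher.
enum class DispatcherVoiceType {
    Male1, Male2, Male3,
    Female1, Female2, Female3,
    ChildMale, ChildFemale
};

// Maps a dispatcher voice type onto the common gender and age fields.
Voice fromDispatcherType(DispatcherVoiceType type, const std::string &language)
{
    Voice voice;
    voice.language = language;
    switch (type) {
    case DispatcherVoiceType::Male1:
    case DispatcherVoiceType::Male2:
    case DispatcherVoiceType::Male3:
        voice.gender = Gender::Male;
        voice.age = Age::Adult; // assumption: numbered voices are adults
        break;
    case DispatcherVoiceType::Female1:
    case DispatcherVoiceType::Female2:
    case DispatcherVoiceType::Female3:
        voice.gender = Gender::Female;
        voice.age = Age::Adult; // assumption: numbered voices are adults
        break;
    case DispatcherVoiceType::ChildMale:
        voice.gender = Gender::Male;
        voice.age = Age::Child;
        break;
    case DispatcherVoiceType::ChildFemale:
        voice.gender = Gender::Female;
        voice.age = Age::Child;
        break;
    }
    return voice;
}
```

Note that the mapping loses the distinction between Male1, Male2 and Male3, so a real API would probably need a name or identifier in addition to gender and age.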
=== CSS and/or XML in strings to be spoken ===
Latest revision as of 20:46, 31 August 2015