QtSpeech: Difference between revisions
Revision as of 09:53, 24 February 2015
= Qt Speech Module =
This page contains notes about the development of a Qt Speech module.
Currently it covers TTS (text to speech) only.
Speech recognition may be added later, but it appears to be considerably harder to implement at this point.
https://codereview.qt.io/#admin,project,qt/qtspeech,info
ssh://codereview.qt.io:29418/qt/qtspeech.git
== Current State ==
There is a basic implementation on Mac/Win/Linux/Android.
Linux uses speech-dispatcher.
== Todo ==
Decide whether to use plugins or to have only one backend per platform.
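To make the plugin option more concrete, here is a minimal sketch in standard C++ of a registry that lets several backends coexist on one platform. All names (`TtsBackend`, `BackendRegistry`, `RecordingBackend`) are hypothetical and are not taken from the qtspeech sources; a real implementation would probably build on Qt's plugin loading instead.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical backend interface; NOT the actual qtspeech API.
class TtsBackend {
public:
    virtual ~TtsBackend() = default;
    virtual std::string name() const = 0;
    virtual void say(const std::string &text) = 0;
};

// Plugin-style registry: several backends may be registered on one platform
// (for example speech-dispatcher and espeak on Linux).
class BackendRegistry {
public:
    using Factory = std::function<std::unique_ptr<TtsBackend>()>;

    void registerBackend(const std::string &key, Factory factory) {
        m_factories[key] = std::move(factory);
    }

    // Keys of all registered backends, in alphabetical order.
    std::vector<std::string> availableBackends() const {
        std::vector<std::string> keys;
        for (const auto &entry : m_factories)
            keys.push_back(entry.first);
        return keys;
    }

    // Returns nullptr when no backend is registered under the given key.
    std::unique_ptr<TtsBackend> create(const std::string &key) const {
        const auto it = m_factories.find(key);
        if (it == m_factories.end())
            return nullptr;
        return it->second();
    }

private:
    std::map<std::string, Factory> m_factories;
};

// Stand-in backend for illustration: records text instead of speaking it.
class RecordingBackend : public TtsBackend {
public:
    explicit RecordingBackend(std::string name) : m_name(std::move(name)) {}
    std::string name() const override { return m_name; }
    void say(const std::string &text) override { spoken.push_back(text); }

    std::vector<std::string> spoken;

private:
    std::string m_name;
};
```

With only one backend per platform, the registry would collapse into a single compile-time choice. That is simpler, but it would rule out offering, say, both speech-dispatcher and espeak on Linux.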
To implement on each platform:
* OS X: Pitch (patch currently in review [https://codereview.qt.io/99691 here])
* OS X: availableVoices (patch currently in review [https://codereview.qt.io/106097 here])
* Windows: availableVoices/setVoice/voice
* Linux: add all outputModule voices to the availableVoices API
* Example widget: remove the voice type combobox or add a new API to implement it properly.
Collection of resources and links that should help define a cross-platform API:
=== API for language selection ===
QLocale seems like a good candidate for languages.
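As a rough illustration of locale-based selection, the sketch below matches a requested locale against the locales a backend reports, preferring an exact language and region match and falling back to language only. It uses plain standard C++ so that it stands alone; `LocaleId`, `parseLocale` and `bestLocaleMatch` are hypothetical helpers standing in for what QLocale would provide, not part of any existing API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for QLocale: a language code plus an optional region.
struct LocaleId {
    std::string language; // e.g. "en"
    std::string region;   // e.g. "US"; empty if the backend reports language only
};

// Parses names such as "en", "en_US" or "en-US".
LocaleId parseLocale(const std::string &name) {
    const std::size_t sep = name.find_first_of("_-");
    if (sep == std::string::npos)
        return LocaleId{name, ""};
    return LocaleId{name.substr(0, sep), name.substr(sep + 1)};
}

// Returns the index of the best match in 'available':
// an exact language+region match wins, otherwise the first entry with the
// same language is used. Returns -1 if the language is not available at all.
int bestLocaleMatch(const std::vector<std::string> &available,
                    const std::string &requested) {
    const LocaleId want = parseLocale(requested);
    int languageOnly = -1;
    for (std::size_t i = 0; i < available.size(); ++i) {
        const LocaleId have = parseLocale(available[i]);
        if (have.language != want.language)
            continue;
        if (have.region == want.region)
            return static_cast<int>(i);
        if (languageOnly < 0)
            languageOnly = static_cast<int>(i);
    }
    return languageOnly;
}
```

The fallback matters because the backends differ in granularity: some report only a two-letter language, others a language plus region.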
=== Voice selection ===
Summarize here which native API offers what. For example, SAPI 5 uses first names as voice identifiers.
{|
! Platform/API
! Voice Properties
! Link
|-
| Win SAPI 5
| gender, age, name, lang, vendor; names such as "Microsoft Anna" or "Mike"
| http://msdn.microsoft.com/en-us/library/ms720151(v=vs.85).aspx#API_for_Text-To-Speech http://msdn.microsoft.com/en-us/library/ms723601(v=vs.85).aspx
|-
| Mac Carbon
| similar to the Cocoa API
| https://developer.apple.com/library/mac/documentation/Carbon/Reference/Speech_Synthesis_Manager/Reference/reference.html#//apple_ref/doc/uid/TP30000211
|-
| Mac Cocoa NSSpeechSynthesizer
| id, name, age, gender, language, locale; can be enumerated
| https://developer.apple.com/library/mac/documentation/Cocoa/Reference/ApplicationKit/Classes/NSSpeechSynthesizer_Class/Reference/Reference.html
|-
| Linux SpeechDisp
| name, language (2-letter), variant, voice type enum with Male1..3, Female1..3 and childMale, childFemale
| http://cvs.freebsoft.org/doc/speechd/speech-dispatcher.html
|-
| espeak
| lang (two-letter [_region-subregion]), age(?), gender, name (= long language name)
| espeak --voices
|-
| festival
| name, gender, maybe more
| http://www.cstr.ed.ac.uk/projects/festival/
|-
| Android
| has a concept of engines, setLanguage is locale based, isLanguageAvailable() but no language listing
| http://developer.android.com/reference/android/speech/tts/TextToSpeech.html
|}
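One way to read the table is that name, language, gender and age are the properties most backends share. The following sketch, in plain standard C++, shows what a common voice description could look like and how the speech-dispatcher voice type enum might be folded into it. `Voice`, `Gender`, `Age`, `SpdVoiceType` and `voiceFromSpdType` are hypothetical names chosen for illustration; the enum only mirrors the voice types listed above and is not the one from the speech-dispatcher headers. Treating Male1..3 and Female1..3 as adult voices is an assumption.

```cpp
#include <string>

enum class Gender { Unknown, Male, Female };
enum class Age { Unknown, Child, Adult };

// Hypothetical enum mirroring the voice types listed in the table for
// speech-dispatcher; NOT the enum defined by the speech-dispatcher headers.
enum class SpdVoiceType {
    Male1, Male2, Male3,
    Female1, Female2, Female3,
    ChildMale, ChildFemale
};

// Hypothetical cross-platform voice description built from the properties
// that most rows of the table have in common.
struct Voice {
    std::string name;
    std::string locale; // e.g. "en" or "en_US", depending on the backend
    Gender gender = Gender::Unknown;
    Age age = Age::Unknown;
};

// Folds a speech-dispatcher style voice type into gender and age.
// Assumption: the numbered Male/Female types are adult voices.
Voice voiceFromSpdType(SpdVoiceType type, const std::string &name,
                       const std::string &language) {
    Voice voice;
    voice.name = name;
    voice.locale = language;
    switch (type) {
    case SpdVoiceType::Male1:
    case SpdVoiceType::Male2:
    case SpdVoiceType::Male3:
        voice.gender = Gender::Male;
        voice.age = Age::Adult;
        break;
    case SpdVoiceType::Female1:
    case SpdVoiceType::Female2:
    case SpdVoiceType::Female3:
        voice.gender = Gender::Female;
        voice.age = Age::Adult;
        break;
    case SpdVoiceType::ChildMale:
        voice.gender = Gender::Male;
        voice.age = Age::Child;
        break;
    case SpdVoiceType::ChildFemale:
        voice.gender = Gender::Female;
        voice.age = Age::Child;
        break;
    }
    return voice;
}
```

Backends that expose less, such as Android with no language listing, would leave the corresponding fields at `Unknown` or empty.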