QtSpeech: Difference between revisions

Latest revision as of 20:46, 31 August 2015

Qt Speech Module

This page contains notes about the development of a Qt Speech module. Currently it only covers text-to-speech (TTS). Speech recognition may be introduced later, but it appears to be considerably less trivial at this point.

https://codereview.qt-project.org/#/admin/projects/qt/qtspeech
ssh://codereview.qt.io:29418/qt/qtspeech.git

Current State

There is a basic implementation on Mac/Win/Linux/Android. Linux uses speech-dispatcher. Windows uses SAPI 5. OS X uses the Cocoa NSSpeechSynthesizer API.

Todo

Decide on either plugins or only having one backend per platform.

To implement on each platform:
* Windows: availableVoices/setVoice/voice https://codereview.qt-project.org/108169
* Linux: speech-dispatcher connection failure handling https://codereview.qt-project.org/108820
* Android: many backend methods are currently empty and still need to be implemented
* iOS: the backend needs a decision on the oldest iOS version to support, as noted here: https://codereview.qt-project.org/98704
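
For the open question of plugins versus one backend per platform, the plugin option could look roughly like the sketch below. It is a minimal illustration in standard C++, not the qtspeech code: `SpeechBackend`, `BackendRegistry` and `RecordingBackend` are hypothetical names, and the recording backend only stores text instead of calling a native API.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical backend interface; not the actual qtspeech API.
class SpeechBackend {
public:
    virtual ~SpeechBackend() = default;
    virtual void say(const std::string &text) = 0;
};

// Plugin-style registry: each backend registers a factory under a name,
// and the backend to use is chosen at run time.
class BackendRegistry {
public:
    using Factory = std::function<std::unique_ptr<SpeechBackend>()>;

    void registerBackend(const std::string &name, Factory factory)
    {
        factories_[name] = std::move(factory);
    }

    // Returns nullptr when no backend with that name is registered.
    std::unique_ptr<SpeechBackend> create(const std::string &name) const
    {
        const auto it = factories_.find(name);
        if (it == factories_.end())
            return nullptr;
        return it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};

// Stand-in backend that records the text instead of speaking it.
class RecordingBackend : public SpeechBackend {
public:
    explicit RecordingBackend(std::vector<std::string> *log) : log_(log) {}
    void say(const std::string &text) override { log_->push_back(text); }

private:
    std::vector<std::string> *log_;
};
```

With only one backend per platform, the registry would be unnecessary and the implementation could be selected at build time instead.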

Collection of resources and links that should help define a cross-platform API:

API for language selection

QLocale seems like a good candidate for languages.
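
The native APIs report languages in different shapes (two-letter codes, or codes with a region suffix), so a backend would need to normalize them before handing them to a locale type. The sketch below shows that normalization in standard C++ only. `LanguageTag` and `parseLanguageTag` are hypothetical names used as a stand-in for the language/country pair a QLocale would carry, and the accepted separators are an assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the language/country pair of a locale.
struct LanguageTag {
    std::string language; // lower-case, e.g. "en"
    std::string region;   // upper-case, e.g. "GB"; empty if not given
};

// Splits tags such as "en", "en_GB" or "en-us" at the first '_' or '-'.
// Anything after a second separator (e.g. a subregion) is ignored.
inline LanguageTag parseLanguageTag(const std::string &tag)
{
    const std::string separators = "_-";
    LanguageTag result;

    const std::size_t first = tag.find_first_of(separators);
    result.language = tag.substr(0, first);

    if (first != std::string::npos) {
        const std::size_t second = tag.find_first_of(separators, first + 1);
        const std::size_t length =
            (second == std::string::npos) ? std::string::npos : second - first - 1;
        result.region = tag.substr(first + 1, length);
    }

    // Normalize case so that "EN-us" and "en_US" compare equal.
    std::transform(result.language.begin(), result.language.end(),
                   result.language.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    std::transform(result.region.begin(), result.region.end(),
                   result.region.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return result;
}
```

A backend that only knows two-letter codes would simply leave `region` empty.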

Voice selection

Summarize here which native API offers what. For example, SAPI 5 uses first names as voice identifiers.

| Platform/API | Voice Properties | Link |
|---|---|---|
| Win SAPI 5 | gender, age, name, lang, vendor; names such as "Microsoft Anna" or "Mike" | http://msdn.microsoft.com/en-us/library/ms720151(v=vs.85).aspx#API_for_Text-To-Speech http://msdn.microsoft.com/en-us/library/ms723601(v=vs.85).aspx |
| Mac Carbon | similar to the Cocoa API | https://developer.apple.com/library/mac/documentation/Carbon/Reference/Speech_Synthesis_Manager/Reference/reference.html#//apple_ref/doc/uid/TP30000211 |
| Mac Cocoa NSSpeechSynthesizer | id, name, age, gender, language, locale; can be enumerated | https://developer.apple.com/library/mac/documentation/Cocoa/Reference/ApplicationKit/Classes/NSSpeechSynthesizer_Class/Reference/Reference.html |
| Linux speech-dispatcher | name, language (2-letter), variant, voice type enum with Male1..3, Female1..3 and childMale, childFemale | http://cvs.freebsoft.org/doc/speechd/speech-dispatcher.html |
| espeak | lang (two letter [_region-subregion]), age(?), gender, name (= long language name) | espeak --voices |
| festival | name, gender, maybe more | http://www.cstr.ed.ac.uk/projects/festival/ |
| Android | has a concept of engines; setLanguage is locale based; isLanguageAvailable() but no language listing | http://developer.android.com/reference/android/speech/tts/TextToSpeech.html |
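
Most of the APIs listed above expose a name, a language, a gender and some notion of age, so a cross-platform voice type could start from that common subset. The following is only a sketch under that assumption: `Voice`, `Gender`, `Age` and `findVoice` are hypothetical names, and properties such as vendor, variant or engine are deliberately left out.

```cpp
#include <string>
#include <vector>

enum class Gender { Unknown, Male, Female };
enum class Age { Unknown, Child, Adult };

// Hypothetical common subset of the per-platform voice properties.
struct Voice {
    std::string name;     // e.g. "Microsoft Anna" on SAPI 5
    std::string language; // language code as reported by the backend
    Gender gender;
    Age age;
};

// Returns the first voice with the requested language, or nullptr if none
// matches. Passing Gender::Unknown accepts a voice of any gender.
inline const Voice *findVoice(const std::vector<Voice> &voices,
                              const std::string &language,
                              Gender gender = Gender::Unknown)
{
    for (const Voice &voice : voices) {
        if (voice.language != language)
            continue;
        if (gender != Gender::Unknown && voice.gender != gender)
            continue;
        return &voice;
    }
    return nullptr;
}
```

Backends without voice enumeration (Android, per the table) would return an empty list, so callers have to handle the `nullptr` case.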

CSS and/or XML in strings to be spoken