QtSpeech: Difference between revisions

From Qt Wiki

Revision as of 10:14, 25 February 2015

Qt Speech Module

This page contains notes on the development of a Qt Speech module. It currently covers only text-to-speech (TTS). Speech recognition may be added later, but it appears to be considerably harder to implement at this point.

Code review project: https://codereview.qt.io/#admin,project,qt/qtspeech,info
Git repository: ssh://codereview.qt.io:29418/qt/qtspeech.git

Current State

There is a basic implementation on Mac/Win/Linux/Android. Linux uses speech-dispatcher.

Todo

Decide whether to use plugins or to have only one backend per platform.

To implement on each platform:

  • OS X: Pitch (patch currently in review: https://codereview.qt.io/99691)
  • OS X: availableVoices (patch currently in review: https://codereview.qt.io/106097)
  • Windows: availableVoices/setVoice/voice
  • Linux: add all outputModule voices to the availableVoices API
  • Example widget: remove the voice type combobox or add a new API to implement it properly.

Collection of resources and links that should help define a cross-platform API:

API for language selection

QLocale seems like a good candidate for languages.
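
Backends report languages in slightly different spellings, so some normalization would probably be needed before a value can be handed to a locale type such as QLocale. The following is a minimal sketch in plain C++ (no Qt), assuming tags of the form "en", "en_US" or "en-us". The `LanguageTag` struct and `parseLanguageTag` function are hypothetical names invented for this illustration; they are not part of Qt or of any speech backend.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper type for illustration only.
struct LanguageTag {
    std::string language; // lower-case language code, e.g. "en"
    std::string region;   // upper-case region code, e.g. "US"; empty if absent
};

// Splits a tag such as "en", "en_US" or "en-us" at the first '_' or '-'.
// Anything after a second separator (a sub-region or variant) is ignored.
inline LanguageTag parseLanguageTag(const std::string &tag)
{
    LanguageTag result;
    const std::string separators = "_-";

    const std::size_t first = tag.find_first_of(separators);
    result.language = tag.substr(0, first);

    if (first != std::string::npos) {
        const std::size_t second = tag.find_first_of(separators, first + 1);
        const std::size_t length = (second == std::string::npos)
                                       ? std::string::npos
                                       : second - first - 1;
        result.region = tag.substr(first + 1, length);
    }

    // Normalize case so that "EN-us" and "en_US" compare equal.
    for (char &c : result.language)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    for (char &c : result.region)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));

    return result;
}
```

With this sketch, `parseLanguageTag("en-us")` yields the language "en" and the region "US". Joining the two parts with an underscore gives a name in the "language_territory" form, which is the kind of string a QLocale can be constructed from.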

Voice selection

Summarize here which native API offers what. For example, SAPI 5 uses first names as voice identifiers.

{|
! Platform/API
! Voice Properties
! Link
|-
|Win SAPI 5
|gender, age, name, lang, vendor - names such as "Microsoft Anna" or "Mike"
|http://msdn.microsoft.com/en-us/library/ms720151(v=vs.85).aspx#API_for_Text-To-Speech http://msdn.microsoft.com/en-us/library/ms723601(v=vs.85).aspx
|-
|Mac Carbon
|similar to the Cocoa API
|https://developer.apple.com/library/mac/documentation/Carbon/Reference/Speech_Synthesis_Manager/Reference/reference.html#//apple_ref/doc/uid/TP30000211
|-
|Mac Cocoa NSSpeechSynthesizer
|id, name, age, gender, language, locale - can be enumerated
|https://developer.apple.com/library/mac/documentation/Cocoa/Reference/ApplicationKit/Classes/NSSpeechSynthesizer_Class/Reference/Reference.html
|-
|Linux speech-dispatcher
|name, language (2-letter), variant, voice type enum with Male1..3, Female1..3 and childMale, childFemale
|http://cvs.freebsoft.org/doc/speechd/speech-dispatcher.html
|-
|espeak
|lang (two letter [_region-subregion]), age(?), gender, name (= long language name)
|espeak --voices
|-
|festival
|name, gender, maybe more
|http://www.cstr.ed.ac.uk/projects/festival/
|-
|Android
|has a concept of engines; setLanguage is locale based; isLanguageAvailable() exists, but there is no language listing
|http://developer.android.com/reference/android/speech/tts/TextToSpeech.html
|}
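
Most of the backends listed above expose a name, a language and a gender, and several also expose an age. One possible common denominator can be sketched as follows, in plain C++ without Qt. The `Voice` struct, the `Gender` and `Age` enums and the `matchingVoices` function are hypothetical names made up for this illustration; they are not the Qt Speech API, and a real design would also have to account for engines (Android) and vendor or variant fields.

```cpp
#include <string>
#include <vector>

// Hypothetical types for illustration; not the Qt Speech API.
enum class Gender { Unknown, Male, Female };
enum class Age { Unknown, Child, Adult };

struct Voice {
    std::string name;     // backend voice identifier, e.g. a first name on SAPI 5
    std::string language; // language code as reported by the backend
    Gender gender = Gender::Unknown;
    Age age = Age::Unknown;
};

// Returns the voices that match the requested language and gender.
// An empty language or Gender::Unknown acts as a wildcard.
inline std::vector<Voice> matchingVoices(const std::vector<Voice> &voices,
                                         const std::string &language,
                                         Gender gender)
{
    std::vector<Voice> result;
    for (const Voice &voice : voices) {
        const bool languageOk = language.empty() || voice.language == language;
        const bool genderOk = gender == Gender::Unknown || voice.gender == gender;
        if (languageOk && genderOk)
            result.push_back(voice);
    }
    return result;
}
```

Treating unknown values as wildcards keeps the sketch usable with backends that report only part of this information, such as a backend that provides a name and a gender but no age.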

CSS and/or XML in strings to be spoken