MultilingualSpeechSynthesis

Differences between revisions 5 and 6
Revision 5 as of 2006-11-30 19:32:37
Size: 2496
Editor: host-81-191-165-41
Comment: OrcaEspeak spec, pckaging and boot issues
Revision 6 as of 2006-12-03 14:41:40
Size: 2976
Editor: host-81-191-165-41
Comment: clarified design and implementation

Please check the status of this specification in Launchpad before editing it. If it is Approved, contact the Assignee or another knowledgeable person before making changes.

Summary

Improve multilingual speech synthesis support by shipping voices for several languages on the Live CD and offering additional options after installation.

Rationale

We aim to provide native language support to our users where we can.

Use cases

  • Marie is a visually impaired French speaker who relies on a voice synthesizer. She uses Speakup on the command line and Orca in GNOME, and would like her documents and system messages to be read out in a French voice.
  • Lao is a blind Chinese user. There is currently no Free Chinese voice synthesizer for GNU/Linux, so he uses an English one.
  • Hans is a screen reader user living in Switzerland where there are three main languages. During installation he chose to use a German desktop, with a German synthesiser voice, but he also needs to be able to switch easily to French or Italian.

Scope

  • Write an eSpeak driver for gnome-speech
  • Promote eSpeak to Main

Design

  • Make eSpeak the default synthesizer on the Live CD. It supports more languages and has a much smaller footprint than Festival (though the voices are less natural).
  • Write an eSpeak driver for gnome-speech and make eSpeak the default driver.
  • Selecting drivers and voices: Orca already has a facility for selecting the synthesiser and voice to be used via gnome-speech.
  • Information on installing additional voices will be provided on http://access.ubuntu.com/speech
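
As a rough illustration of what switching voices amounts to at the synthesizer level, the sketch below builds command lines for the espeak program, one per language. It assumes the espeak command-line tool and its -v (voice) and -s (speed) options; the helper name espeak_command and the sample phrases are hypothetical, and this is not how Orca or gnome-speech select voices.

```python
import shlex


def espeak_command(text, voice="en", speed=None):
    """Build an argument list for the espeak command-line program.

    `voice` is passed to -v and `speed` (words per minute) to -s.
    The voice names used below are assumptions for illustration.
    """
    argv = ["espeak", "-v", voice]
    if speed is not None:
        argv += ["-s", str(speed)]
    argv.append(text)
    return argv


if __name__ == "__main__":
    # Print one command per language instead of running it, so the
    # sketch works even where espeak is not installed.
    samples = [("de", "Guten Tag"), ("fr", "Bonjour"), ("it", "Buongiorno")]
    for voice, text in samples:
        print(shlex.join(espeak_command(text, voice)))
```

The resulting list can be handed to subprocess.run on a system where espeak is installed.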

Optional

We may also provide support for various commercial synthesisers such as DECtalk, IBM-TTS and Mbrola. These would be available as gnome-speech drivers or, in the case of Mbrola, via Speech Dispatcher. Implementation of this depends on licensing discussions.

Implementation

  • This spec depends on the OrcaEspeak spec (Accessibility/Specs/OrcaEspeak)

  • Ship eSpeak with 10-15 voices on the Live CD, chosen from the list of most used languages (the ones most likely to have a language pack on the CD)
  • The Casper accessibility script checks the language variable set during boot and sets the corresponding eSpeak voice
  • Add support for DECtalk, IBM-TTS, Loquendo, etc to gnome-speech so that these voices can be purchased and installed by users without needing to recompile gnome-speech (pending licensing discussions)
  • Add the Mbrola engine and voices to multiverse or commercial repository (pending licensing discussions)
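
The language check described for the Casper accessibility script could look roughly like the sketch below. It is only an illustration in Python, not the actual Casper script: the function name voice_for_locale, the mapping table and the choice of English as the fallback are all assumptions, and the real list of shipped voices is still to be decided.

```python
import os

# Hypothetical table from language code to eSpeak voice name.
# The voices actually shipped on the Live CD may differ.
VOICE_BY_LANGUAGE = {
    "en": "en",
    "fr": "fr",
    "de": "de",
    "it": "it",
    "es": "es",
}
DEFAULT_VOICE = "en"  # assumed fallback when no matching voice exists


def voice_for_locale(locale_value):
    """Pick an eSpeak voice for a locale string such as 'fr_FR.UTF-8'."""
    if not locale_value:
        return DEFAULT_VOICE
    # Drop the encoding and modifier parts: 'fr_FR.UTF-8@euro' -> 'fr_FR'
    base = locale_value.split(".", 1)[0].split("@", 1)[0]
    # Keep only the language part: 'fr_FR' -> 'fr'
    language = base.split("_", 1)[0].lower()
    return VOICE_BY_LANGUAGE.get(language, DEFAULT_VOICE)


if __name__ == "__main__":
    # LANG stands in here for the language variable set during boot.
    print(voice_for_locale(os.environ.get("LANG", "")))
```

Falling back to a single default voice keeps speech available even when the boot language has no matching voice, which is the situation in the Lao use case.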

Unresolved issues

  • How do we make language selection accessible at the gfxboot menu?
  • Contact makers of speech engines (commercial and academic) to sort out licensing questions


CategorySpec

Accessibility/Specs/MultilingualSpeechSynthesis (last edited 2008-08-06 16:39:31 by localhost)