Commit 5ab4e7f3 authored by Philip Withnall

text-to-speech: Fix spacing in bullet point lists

Blank lines between list entries mean a ‘loose list’ in CommonMark.
While we want that for lists with long entries (i.e. each entry is
paragraph-length), we don’t want it for some of the shorter lists, as it
makes them appear too spread out.

(Semantically it is also incorrect for the lists which use bullets to
split up a sentence, as each of the clauses is not a paragraph in
itself.)
Signed-off-by: Philip Withnall <philip.withnall@collabora.co.uk>
Differential Revision: https://phabricator.apertis.org/D4066
parent 9d7b3dd9
@@ -20,10 +20,8 @@ started.
The major considerations with a TTS API are:
- Simple API for applications to use
- Swappable voices through the application bundling system and
application store
- Output priorities controlled by the same set of audio manager
policies which control other application audio output
@@ -77,10 +75,8 @@ played:
- The system could pause reading the original e-mail, read the
notification, then resume reading the original e-mail; or
- it could pause reading the original e-mail, read the notification,
then *not* resume reading the original e-mail; or
- it could continue reading the original e-mail at a lower volume, and
read the notification louder mixed over the top.
@@ -513,59 +509,41 @@ which are available already.
#### espeak
- Supports many languages (importantly, non-Latin languages)
- Sounds robotic
- Can be used with mbrola voices to make it more natural; not
supported very well by speech-dispatcher
([*http://espeak.sourceforge.net/mbrola.html*](http://espeak.sourceforge.net/mbrola.html))
- Already packaged for Ubuntu (as are mbrola voices)
- [*http://espeak.sourceforge.net/*](http://espeak.sourceforge.net/)
#### Festival
- Sounds less robotic than espeak, but still quite robotic (example
here:
[*http://tts.speech.cs.cmu.edu:8083/*](http://tts.speech.cs.cmu.edu:8083/))
- A bit slower
- Already packaged for Ubuntu
- Supports 3 languages (English, Spanish and
Welsh)
- [*http://www.cstr.ed.ac.uk/projects/festival/*](http://www.cstr.ed.ac.uk/projects/festival/)
#### pico
- License: Apache License v2
- By SVOX; used in Android
- Written in Java; C API available in picoapi.h
- Supports 37 languages (importantly, non-Latin languages)
- Sounds very good (example here:
[*https://svoxmobilevoices.wordpress.com/demos/*](https://svoxmobilevoices.wordpress.com/demos/))
- Not as well tested through
speech-dispatcher
- [*https://en.wikipedia.org/wiki/SVOX*](https://en.wikipedia.org/wiki/SVOX)
- Publicly available source;
[*https://android.googlesource.com/platform/external/svox/*](https://android.googlesource.com/platform/external/svox/)
- Already packaged for Debian and Ubuntu
- As this is a component of Android, we are not sure about the
openness of the development practices, and whether it’s possible to
get involved in them.
- It’s certainly possible to file bugs about the packaging with the
[Debian bug tracker][pico-tracker], but that won’t necessarily help for bugs in
the source itself.
@@ -573,18 +551,14 @@ which are available already.
#### acapela
- Non-FOSS
- Best quality
- [*http://www.acapela-group.com/*](http://www.acapela-group.com/)
#### Nuance
- Non-FOSS
- Has been used previously in
eCore
- *http://www.nuance.com/for-business/text-to-speech/vocalizer/index.htm\#demo*
## Approach
@@ -644,7 +618,6 @@ Additionally, speech-dispatcher has the disadvantages that it:
- does not enforce separation between clients, meaning they may
control each others' output; and
- provides a C API which is not GLib-based, so would be hard to
introspect and expose in other languages (such as JavaScript).
@@ -663,14 +636,10 @@ applications to essentially provide the functionality of turning a text
string into an audio stream. It would provide the following major APIs:
- Say a text string.
- Stop, pause and resume speech.
- Signal on starting, pausing, resuming and ending audio output, plus
on significant progress through output.
- Set the language for a request.
- ‘Sound icon’ API for associating audio files with specific strings.
The stop, pause and resume APIs would operate on specific requests,
@@ -741,15 +710,10 @@ sensible, unconfigurable, defaults for the other options.
Configuration options may include:
- Voice to use
- Whether to vocalise punctuation
- Voice type (male or female)
- Speech rate
- Pitch
- Volume
By storing the options in GSettings, it becomes possible to apply
@@ -763,9 +727,7 @@ modify them.
Configuration which is exposed to applications via the TTS API could be:
- Pitch
- Speech rate
- Volume
These options must be exposed purely as *modifiers* on the system-wide
@@ -773,9 +735,7 @@ values. These modifiers could be defined symbolically, for example as a
set of three volume modifiers:
- Emphasised (120% of system-wide volume)
- Normal (100% of system-wide volume)
- De-emphasised (80% of system-wide volume)
A non-symbolic numerical modifier might be introduced in future.
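
The symbolic volume modifiers listed above could be sketched roughly as follows. This is only an illustration, not part of the proposed API: the names `VolumeModifier` and `effective_volume` are hypothetical, and clamping the result to the range 0.0 to 1.0 is an assumption the document does not state.

```python
from enum import Enum


class VolumeModifier(Enum):
    """Hypothetical symbolic modifiers, as a percentage of the system-wide volume."""

    EMPHASISED = 120
    NORMAL = 100
    DE_EMPHASISED = 80


def effective_volume(system_volume: float, modifier: VolumeModifier) -> float:
    """Apply a modifier to the system-wide volume.

    The application never sets an absolute volume; it only scales the
    system-wide value. Clamping to [0.0, 1.0] is an assumption made here.
    """
    scaled = system_volume * modifier.value / 100
    return max(0.0, min(1.0, scaled))
```

Because an application can only choose one of the three symbols, a change to the system-wide volume would carry through to every request without the application needing to know the absolute value.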