Jul 24, 2010

Ubuntu Linux Text-to-Speech

Introduction

Festival text-to-speech doesn’t work out of the box on Ubuntu after the install. Here are the steps I took to fix it, plus some changes that make it useful in everyday work.

Getting Festival to Work

Festival is a free and extremely popular text-to-speech engine.

Here’s how to get it:

sudo apt-get install festival

Here’s how to test it:

echo "hello world"|festival --tts

You may see this error:

Linux: can't open /dev/dsp

If so, add the following lines to the .festivalrc file in your home directory (create it if it doesn’t exist):

(Parameter.set 'Audio_Command "aplay -q -c 1 -t raw -f s16 -r $SR $FILE")
(Parameter.set 'Audio_Method 'Audio_Command)
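If you’d rather script that edit, here is a minimal sketch. The function name add_audio_fix is my own invention, and it assumes that any existing Audio_Command line means the fix is already in place, so it only appends the two lines once.

```bash
#!/bin/bash
# Append the aplay settings to a Festival rc file, but only once.
# Usage: add_audio_fix [RC_FILE]   (defaults to ~/.festivalrc)
add_audio_fix() {
    local rc="${1:-$HOME/.festivalrc}"
    # Assumption: an existing Audio_Command line means the fix is already there.
    if [ -f "$rc" ] && grep -q "Audio_Command" "$rc"; then
        return 0
    fi
    # The quoted 'EOF' keeps $SR and $FILE literal so the shell
    # does not expand them while writing the file.
    cat >> "$rc" <<'EOF'
(Parameter.set 'Audio_Command "aplay -q -c 1 -t raw -f s16 -r $SR $FILE")
(Parameter.set 'Audio_Method 'Audio_Command)
EOF
}
```

Calling add_audio_fix with no argument targets ~/.festivalrc. Calling it a second time leaves the file unchanged.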

Getting it to read the clipboard

Now, if you want it to read info from your clipboard, install this:

sudo apt-get install xclip

And type this:

xclip -o|festival --tts
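One thing to be aware of: without a -selection option, xclip -o prints the X primary selection (whatever text is currently highlighted), not the Ctrl+C clipboard. The sketch below lets you pick which one gets read. The function name speak_selection is hypothetical. The xclip and festival flags are the same ones used above, plus xclip’s -selection option.

```bash
#!/bin/bash
# Read an X selection out loud.
# Usage: speak_selection [primary|secondary|clipboard]
speak_selection() {
    local sel="${1:-primary}"   # primary is also xclip's own default
    case "$sel" in
        primary|secondary|clipboard) ;;
        *) echo "unknown selection: $sel" >&2; return 2 ;;
    esac
    xclip -o -selection "$sel" | festival --tts
}
```

For example, speak_selection clipboard reads whatever was last copied with Ctrl+C, while plain speak_selection reads the highlighted text.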

Now, you can go a step further and create a shortcut key for reading text. Here’s a script that works well for that:

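#!/bin/bash

# This script reads the information from the clipboard out loud.

# Look for festival being run.
running=$(pgrep festival)

# Quote the variable: pgrep may return nothing or several PIDs.
if [ -z "$running" ]
then
    # read it
    xclip -o | festival --tts
else
    # kill it (aplay twice, in case festival had already queued more audio)
    killall festival; killall aplay; sleep .1; killall aplay
fi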

I call it talk.sh. Be sure to make it executable with chmod +x talk.sh.

Assigning a Shortcut

Now, to assign it to a shortcut key. I’m using Ubuntu, which uses GNOME. If you use something else, you’re on your own. Otherwise, click System->Keyboard Shortcuts, then add the path to the script and assign a shortcut.

I assigned it to the Windows-A keystroke. You can press it once to start reading and again to stop. Unfortunately, the script assumes you only have one instance of festival running, since the stop branch kills every festival and aplay process.

Adjusting the Playback Speed

If you want it to read faster, change the Audio_Command line in the .festivalrc file:

(Parameter.set 'Audio_Command "aplay -q -c 1 -t raw -f s16 -r $(($SR*140/100)) $FILE")

The 140/100 means 140% of the original speed, which seems about right to me for most texts.
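To see what the arithmetic does, here is a small sketch of the same calculation. The function name scaled_rate is made up for illustration, and the 16000 in the usage note is just an example value for $SR. Since this trick tells aplay the samples were recorded at a higher rate than they really were, the voice also comes out higher in pitch, not only faster.

```bash
#!/bin/bash
# Compute the sample rate handed to aplay for a given speed-up.
# Usage: scaled_rate SAMPLE_RATE PERCENT
scaled_rate() {
    local sr="$1" percent="$2"
    # Integer shell arithmetic, same form as $(($SR*140/100)) above.
    echo $(( sr * percent / 100 ))
}
```

With a 16000 Hz voice, scaled_rate 16000 140 prints 22400, so aplay plays the 16000 Hz samples as if they were 22400 Hz.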

Improving Voice (Ubuntu 12)

The default voices in Festival do not sound great. Try downloading cmu_us_slt_arctichts.tar.gz. It’s a gzipped tar archive. Here’s how to install it:

tar xvzf cmu_us_slt_arctichts.tar.gz
sudo mv cmu_us_slt_arctic_hts /usr/share/festival/voices/english/

Then update /usr/share/festival/voices.scm to have cmu_us_slt_arctic_hts at the top of the priority list (hint: look for the word “kal” in the file):
```
(defvar default-voice-priority-list
'(cmu_us_slt_arctic_hts
kal_diphone
cmu_us_bdl_arctic_hts
cmu_us_jmk_arctic_hts
...
```
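If you would rather not edit the file by hand, the change can be sketched with sed. This is only a sketch under a few assumptions: the list in your voices.scm starts with '(kal_diphone on a single line, you have GNU sed, and prepend_voice is a name I made up. It leaves a .bak copy next to the file.

```bash
#!/bin/bash
# Put a voice at the top of default-voice-priority-list.
# Usage: prepend_voice VOICES_SCM VOICE_NAME
prepend_voice() {
    local file="$1" voice="$2"
    # Nothing to do if the voice already heads the list.
    grep -q "^ *'($voice\$" "$file" && return 0
    # Assumes the list opens with '(kal_diphone; GNU sed turns \n into a newline.
    sed -i.bak "s/^\( *\)'(kal_diphone/\1'($voice\n\1 kal_diphone/" "$file"
}
```

Try it on a copy of voices.scm first. The real file under /usr/share/festival needs root to modify.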

Improving Voices (Ubuntu 10)

The above instructions may not work in Ubuntu 10. Instead, try the bash script below, which adds new voices. These are the best I could find anywhere:

# Setup: work in a fresh directory under $HOME
cd
dir=nitech_us
mkdir -p "$dir"
cd "$dir"

# Download the voices
for voice in awb bdl clb rms slt jmk
do
  wget "http://hts.sp.nitech.ac.jp/archives/2.0.1/festvox_nitech_us_${voice}_arctic_hts-2.0.1.tar.bz2"
done

# Unpack (tar reads only one archive per call, so loop over them)
for archive in *.tar.bz2
do
  tar xvjf "$archive"
done

# Install
sudo mkdir -p /usr/share/festival/voices/us
sudo mv lib/voices/us/* /usr/share/festival/voices/us/
sudo mv lib/hts.scm /usr/share/festival/hts.scm
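After the install it’s worth checking that the voice directories landed where Festival looks for them. Here is a small sketch that prints every voice directory name, assuming the voices/<language>/<voice> layout used in the commands above. The function name list_voices is hypothetical.

```bash
#!/bin/bash
# Print the name of each installed voice directory.
# Usage: list_voices [VOICES_ROOT]
list_voices() {
    local root="${1:-/usr/share/festival/voices}"
    local d
    # Voices sit two levels down: ROOT/<language>/<voice>/
    for d in "$root"/*/*/; do
        [ -d "$d" ] || continue   # skip the unexpanded pattern when nothing matches
        basename "$d"
    done
}
```

Running list_voices with no argument should show names like nitech_us_slt_arctic_hts if the move succeeded.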

Setting a Default Voice

The default voice in Festival is supposed to be configurable, but that didn’t seem to work for me. I had to change /usr/share/festival/voices.scm directly. Simply update the default-voice-priority-list. It should look something like this:

(defvar default-voice-priority-list
'(nitech_us_slt_arctic_hts
kal_diphone
cmu_us_bdl_arctic_hts
cmu_us_jmk_arctic_hts
cmu_us_slt_arctic_hts
cmu_us_awb_arctic_hts
; cstr_rpx_nina_multisyn ; restricted license (lexicon)
; cstr_rpx_jon_multisyn ; restricted license (lexicon)
; cstr_edi_awb_arctic_multisyn ; restricted license (lexicon)
; cstr_us_awb_arctic_multisyn
ked_diphone
don_diphone
rab_diphone
en1_mbrola
us1_mbrola
us2_mbrola
us3_mbrola
gsw_diphone ;; not publically distributed
el_diphone
)
"default-voice-priority-list
List of voice names. The first of them available becomes the default voice.")

Notice how I put nitech_us_slt_arctic_hts at the top. This is my favorite voice.
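
To audition a voice before making it the default, you can select it for a single run from the command line. This is a sketch: try_voice is a made-up name, and it relies on Festival’s batch mode (-b) and on each installed voice defining a voice_<name> function. The text must not contain double quotes, since it is pasted straight into a Scheme string.

```bash
#!/bin/bash
# Speak a test phrase with a named voice, leaving the default untouched.
# Usage: try_voice VOICE_NAME [TEXT]
try_voice() {
    local voice="$1"
    local text="${2:-This is a test of the selected voice.}"
    # -b evaluates the given Scheme expressions in batch mode, then exits.
    festival -b "(voice_${voice})" "(SayText \"${text}\")"
}
```

For example, try_voice nitech_us_slt_arctic_hts "hello world" speaks the phrase with my favorite voice.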