Weblog + Audio

As an experiment, I’ve begun to translate the posts on this weblog into audio files, using the Festival text-to-speech system.

Festival is a free, open project of The Centre for Speech Technology Research at the University of Edinburgh. The system makes it easy to translate text — like the words in this post — into speech — like that which you’ll hear if you click on the little speaker icon beside the post title, or the “listen” link at the bottom of the post.

The speech isn’t quite “human,” but I’ve found it clear enough to follow a post by ear. And because I’ve converted the audio files to MP3, it’s possible to dump them onto an audio player, or to do anything else one might do with an MP3 file. I don’t intend the audio files to replace the text, simply to offer new ways to manipulate it.
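For anyone who wants to try something similar, here is a minimal sketch of how a text-to-MP3 conversion could be scripted. It is not necessarily how the files on this weblog were produced. It assumes Festival's `text2wave` script and the separate `lame` MP3 encoder are both installed and on the PATH; the function names and file names are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_commands(text_file: str, mp3_file: str) -> list[list[str]]:
    """Return the two command lines: text -> WAV, then WAV -> MP3.

    Assumption: the intermediate WAV file sits next to the MP3,
    with the same base name.
    """
    wav_file = str(Path(mp3_file).with_suffix(".wav"))
    return [
        # Festival's text2wave reads plain text and writes a waveform file.
        ["text2wave", text_file, "-o", wav_file],
        # lame encodes the WAV file as MP3 (default encoder settings).
        ["lame", wav_file, mp3_file],
    ]


def post_to_mp3(text_file: str, mp3_file: str) -> None:
    """Run both steps in order, stopping at the first failure."""
    for command in build_commands(text_file, mp3_file):
        subprocess.run(command, check=True)
```

Calling `post_to_mp3("post.txt", "post.mp3")` would leave both `post.wav` and `post.mp3` behind; deleting the intermediate WAV file is left out to keep the sketch short.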

As an aid to experimentation, I’ve also added a new RSS feed that links directly to the audio versions of posts.
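One common way for a feed to point at audio files is the RSS 2.0 `<enclosure>` element, which carries the file's URL, size in bytes, and MIME type. The sketch below assumes that approach, which may differ from how the feed here is actually built; the function names and the example.com URLs are placeholders.

```python
import xml.etree.ElementTree as ET


def audio_item(title: str, post_url: str, mp3_url: str, mp3_bytes: int) -> ET.Element:
    """Build one <item> whose enclosure points at the MP3 version of a post."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = post_url
    # RSS 2.0 requires url, length (in bytes) and type on an enclosure.
    ET.SubElement(
        item,
        "enclosure",
        {"url": mp3_url, "length": str(mp3_bytes), "type": "audio/mpeg"},
    )
    return item


def build_feed(feed_title: str, feed_url: str, description: str, items: list) -> str:
    """Wrap the items in a minimal RSS 2.0 channel and return it as a string."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_url
    ET.SubElement(channel, "description").text = description
    channel.extend(items)
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values, not the real feed.
    item = audio_item(
        "Weblog + Audio",
        "http://example.com/weblog-audio",
        "http://example.com/audio/weblog-audio.mp3",
        123456,
    )
    print(build_feed("Audio posts", "http://example.com/", "Posts as MP3", [item]))
```

An aggregator that understands enclosures can then fetch each MP3 without the reader having to visit the post.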

I welcome comments.


Steven Garrity on October 4, 2003 - 04:52 Permalink

A few thoughts: This strikes me as something that would make more sense being done on the client side. Give us good (standards-compliant) code, and our browsers or aggregators can do this. This saves bandwidth and moves the processing to our overpowered client machines.

Also, try out the demo of AT&T’s Natural Voices text-to-speech system. It’s a generation ahead of anything else I’ve heard.

Pretty cool though.

Peter Rukavina on October 4, 2003 - 22:00 Permalink

It does make sense, at least from a bandwidth point of view, to leave this sort of thing to the client side. However, how many users, even Mac users, for whom speech has been second nature almost since the beginning, even know how to have text read to them?

Jevon on October 5, 2003 - 02:58 Permalink

I’ve used a cross-site POST to implement something similar for a client in the past (one who only wanted a patchwork solution):

<form method="post" name="demoForm" action="http://morrissey.naturalvoices…">
<input type="hidden" name="txt" value="This is the text to be read.">
<input type="hidden" name="voice" value="mike">
<input type="hidden" name="rate" value="8000">
<input type="submit" name="speakButton" value="Speak">
</form>

Jevon on October 5, 2003 - 02:59 Permalink

Err.. there was html in there….

nathan on October 5, 2003 - 15:30 Permalink

Too bad the robot voice sounds just about the same as the Mac or my Amiga did 12 years ago. The link Steve posted to the Natural Voices demo is certainly a generation ahead. With it I am able to listen normally, without having to concentrate on each word of the robot. I think it speaks at a faster, more natural rhythm too.

Peter Rukavina on October 5, 2003 - 16:14 Permalink

Yes, Natural Voices is better. Natural Voices also costs $295.

nathan on October 5, 2003 - 17:44 Permalink

Yes, I certainly understand why you’re using an Open Source solution. Natural Voices is interesting since it demonstrates that better text-to-speech is even possible. With all the computing advances of the past decade, the robot voice has remained as constant as its own monotone speech.

Chris Corrigan on October 7, 2003 - 17:29 Permalink

I always get an unnerving feeling when the library calls to tell me I have an overdue book. I can’t help but think that Stephen Hawking has fallen on hard times.