Integrate Text2Speech with SlimServer?



Paul Gordon
2004-03-02, 12:24
I don't know if this helps any, (tell me to shut up if it doesn't...)
but a few years ago I remember coming across a small utility that would
make an MP3 file directly from the TTS engine, thus any spoken text
could be made into an MP3 file with a single command line...

Sorry, but I don't remember what it was called, where I saw it, or if it
was free/commercial (helpful huh?), I just remember that I saw it, and
that it worked... that would be an ultra-simple way of getting TTS
through to the slim wouldn't it?

Paul G.



-----Original Message-----
From: dean blackketter [mailto:dean (AT) slimdevices (DOT) com]
Sent: 02 March 2004 16:36
To: Slim Devices Discussion
Subject: [slim] Integrate Text2Speech with SlimServer?

If we can get the WAV output, then we can stream it to the player with
only a little effort.

-dean

On Mar 2, 2004, at 2:11 AM, Oliver Cookson wrote:

> Hi,
>
> I thought it would be quite straightforward to do it within Windows,
> but we would need to merge the WAV output into the slim stream, which
> I don't think would be easy. Though I could well be mistaken.
>
> This would be a HUGE ACCESSIBILITY plus point for slim. Just imagine
> the whole market this could open up for the partially blind & blind,
> as well as being a cool gizmo for the more fortunate in this world.
>
> Cheers
>
>
>
> -----Original Message-----
> From: Bob Myers [mailto:rtm (AT) gol (DOT) com]
> Sent: 01 March 2004 17:54
> To: discuss (AT) lists (DOT) slimdevices.com
> Subject: [slim] Integrate Text2Speech with SlimServer?
>
> On Windows, using SAPI, which I know superficially, it's about three
> lines of code to create a WAV file of any bit of text. XP comes with
> all this stuff installed. Totally non-cross-platform, but could be
> cute enough to make it worthwhile. I'd write a plug-in but that's a
> little beyond my level. Actually, if the goal is to announce the
> current song using TTS, it's not clear to me that the current plug-in
> architecture supports that kind of thing. Does it?
>
> --
> Bob Myers
>
> Quoting dean blackketter:
>
>> Hi Oliver,
>>
>> It certainly is possible, although it would probably entail some
>> significant development effort.
>>
>> -dean
>>
>> On Mar 1, 2004, at 4:15 AM, Oliver Cookson wrote:
>>
>>> Hi,
>>>
>>> Do you think it would be possible to integrate a text2speech (like
>>> the windows version) with Slim Server? It would be great for reading
>>> ID3 tags and pronouncing song titles etc... I think it would be a
>>> cool gadget, and a VERY useful feature for the blind.
>>>
>>> Anybody feel this would be a worthy addition?
>
>

kdf
2004-03-02, 12:50
There also seems to be a CPAN module to deal with "Festival", which is
apparently some sort of text to speech server:
http://www.cstr.ed.ac.uk/projects/festival/
http://search.cpan.org/~rcaley/speech_pm_1.0/Speech/Festival/Synthesiser.pm

-kdf



David N. Blank-Edelman
2004-03-02, 15:08
Howdy-

So I actually went down this path on a lark about three years ago when
thinking about how to improve an MP3 player I use (the PJB100, one of the
first and perhaps still one of the better hard-drive based units). Here's a
copy of the message I posted to a user list for the device, it might give
folks one possible path for exploration. I'm not posting it to the
developer list because it isn't fully formed enough to be useful to
someone actually developing an API for this sort of thing. It's just a
proof of concept that it can be done.

-- dNb

-----------

To: pjb100 (AT) yahoogroups (DOT) com
From: dnb (AT) ccs (DOT) neu.edu
Date: 14 Sep 2001 00:55:54 -0400
Subject: [PJB-100] I may be totally nuts (a little long)

Howdy-
Before I get too far into an implementation, I thought I'd mention
a random idea I had and the results of a few tests I've done. I
can't decide if this is potentially useful or just nuts.

First, let me say that I'm really happy with the design work, both
hardware and software, done by the Compaq folks for the pjb100 (and
continue to be very appreciative of people like Andrew who are still
improving on that basic design). It is a great unit.

I was musing about what the next generation unit might offer. The
strangest idea I had came to me as I was walking along and trying to
locate a specific disc on the unit while it was playing clipped on
to my belt. It occurred to me that this task would be easier if the
unit could speak the titles of the discs to me as I browsed through
them. This doesn't seem like too outrageous an idea given the
current state of artificial speech production and speech
production hardware. But given that there's no custom hardware in a
pjb100 to do this, and I'm not even worthy to look upon the wizard
that could program the DSP chip thus, it seemed like a total pipe
dream.

But is it? I then thought "well, you know, if I could somehow get an
mp3 of a disc's title being spoken on the unit, that might work."
Inspiration for this idea probably comes from the folks who early on
provided mp3s of blank space to use as spacers on the unit.

Now one way to produce these little spoken blurbs would be to sit
down at the computer and record snippets of someone's voice reading
each title. But that's no fun. We're in the 21st century, computers
can talk, let's make them do the work.

So, to begin this exploration I downloaded the free and quite
portable Festival Speech Synthesis System from
http://www.cstr.ed.ac.uk/projects/festival/. With one small change
and a little bit of compile time, it built cleanly under Cygwin
under win2000. I suspect a Linux build (or just using the
prepackaged distributions) would be equally easy.

Then I wrote a quick 10-line Perl script to process the output from
"pjb.exe ls" (pre-XML output) into something a little easier to
understand when read out loud (e.g. I organize my pjb100 with
"Last, First/Title" which is harder to understand than having the
name be read "First Last Title"). Toni will be happy to know that
the first version of this program was a pipe of grep, cut and sed
commands.
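
The original Perl script is not reproduced in this message, so the following is only a minimal Python sketch of the reordering it describes. It assumes each listing entry looks like "Last, First/Title"; the real "pjb.exe ls" output format is not shown here, and the function name speakable is made up for illustration.

```python
def speakable(entry: str) -> str:
    """Turn 'Last, First/Title' into 'First Last Title' for reading aloud.

    Assumption: the artist and title are separated by the first '/',
    and the artist may be written as 'Last, First'.
    """
    artist, slash, title = entry.partition("/")
    if not slash:
        # No title part at all: leave the entry as it is.
        return entry.strip()
    last, comma, first = artist.partition(",")
    if comma:
        # Swap 'Last, First' into 'First Last', skipping empty pieces.
        artist = " ".join(p for p in (first.strip(), last.strip()) if p)
    else:
        artist = artist.strip()
    return f"{artist} {title.strip()}"


if __name__ == "__main__":
    for line in ["Davis, Miles/Kind of Blue", "Radiohead/OK Computer"]:
        print(speakable(line))
```

Entries without a comma in the artist part (band names, for instance) pass through with only the slash removed.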

I fed this file to Festival's text2wave program. This produced a wav
file that I converted to an mp3 using the LAME encoder
(http://home.pi.be/~mk442837/). The resulting output turns out to be
pretty understandable, especially if you already have some sense of
what is on the machine.
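
The two-step conversion described above (text file to wav with text2wave, then wav to mp3 with LAME) could be scripted roughly as follows. This is a sketch, not the commands actually used for this test: it assumes text2wave and lame are both on the PATH, uses only their most basic invocations with default settings, and the function names are invented for illustration.

```python
import subprocess
from pathlib import Path


def build_commands(text_file, mp3_file):
    """Return the two command lines: text -> wav, then wav -> mp3.

    The intermediate wav file gets the same name as the mp3, with a
    .wav suffix (an arbitrary choice made for this sketch).
    """
    mp3_file = Path(mp3_file)
    wav_file = mp3_file.with_suffix(".wav")
    return [
        ["text2wave", str(text_file), "-o", str(wav_file)],
        ["lame", str(wav_file), str(mp3_file)],
    ]


def speak_to_mp3(text_file, mp3_file):
    """Run both steps, stopping with an error if either tool fails."""
    for cmd in build_commands(text_file, mp3_file):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only prints the commands, so it is safe to run without the tools.
    for cmd in build_commands("pjb100.txt", "pjb100.mp3"):
        print(" ".join(cmd))
```

Keeping the command construction separate from the part that runs the tools makes it easy to check what would be executed before Festival or LAME is installed.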

Just to prove to you that I haven't been pulling your leg with this
message, I'll leave http://www.otterbook.com/pjb100.mp3 available at
least for a little while. This is the spoken version of
http://www.otterbook.com/pjb100.txt, essentially the current
contents of my pretty-new pjb100.

Disclaimer #1: the program that generated the text file was whipped
out quickly, so you can see it isn't doing all of the reformatting
one might like. Disclaimer #2: the speech file comes from the
default settings/voice of Festival. I haven't done anything to tune
it.

Soooo...the question is, is this idea at all interesting or just
plain wacky? Should I finish the job and write something that
a) automatically generates little speech mp3s speaking the name of
every disc and then b) loads each file on the pjb100 at the
beginning of every disc?

Or should I just up the dosage?

Peace,
dNb