
Announce track



uli42
2014-05-30, 02:15
Hello,


I was wondering if there's a way to announce tracks via the speaker before actually playing the track. This would be handy for blind people, for children, or for devices where the display is too far away or even broken. So is there a plugin for that? Can you recommend software that easily creates MP3s of the track titles and integrates them into a playlist?

Thank you,

Uli42

epoch1970
2014-05-31, 10:05
I've had the same thought.
http://forums.slimdevices.com/showthread.php?98024-How-to-determine-the-language-of-a-song-or-album-name

The SBS server sends an event when a song starts, I think, so you could imagine having the machine say something from time to time.
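
For anyone who wants to try the event route, here is a minimal Python sketch. It rests on assumptions nobody in this thread has confirmed: that the server CLI listens on TCP port 9090, that it accepts a "subscribe playlist" command, and that a new track shows up as a percent-encoded line of the form "<playerid> playlist newsong <title> <index>". The function names are made up for the illustration.

```python
import socket
from urllib.parse import unquote


def parse_cli_line(line):
    """Split one CLI line into fields and undo the percent-encoding."""
    return [unquote(field) for field in line.strip().split(" ") if field]


def new_song_title(line):
    """Return (player_id, title) for a 'playlist newsong' line, else None."""
    fields = parse_cli_line(line)
    if len(fields) >= 4 and fields[1:3] == ["playlist", "newsong"]:
        return fields[0], fields[3]
    return None


def watch(host="localhost", port=9090):
    """Print every new song title reported by the server (assumed port 9090)."""
    with socket.create_connection((host, port)) as sock:
        # Assumed command: ask the server to forward playlist notifications only.
        sock.sendall(b"subscribe playlist\n")
        with sock.makefile("r", encoding="utf-8", newline="\n") as stream:
            for line in stream:
                hit = new_song_title(line)
                if hit:
                    print("Now starting:", hit[1])
```

Calling watch() blocks and prints titles as they arrive; replacing the print with a call to a TTS program would give the announcement.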

Using squeezelite, I was in fact simply checking what was playing every so often and saying that, just like my SB3, which switches between displaying song info and showing the analog VU meter every minute or so.
The script would lower the audio before speaking aloud, and bring the volume back right after. It didn't work too badly, given it was a simplistic Perl script querying the SBS server through the CLI.
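
The lower-then-restore idea can be sketched in Python as below. This is not the original Perl script, only an illustration: set_volume is a hypothetical callback that would send the actual volume command to the player, and the default step and pause are arbitrary choices.

```python
import time


def slide_steps(start, end, step=5):
    """Volume levels from start to end inclusive, moving by step each time."""
    if step <= 0:
        raise ValueError("step must be positive")
    direction = -1 if end < start else 1
    levels = list(range(start, end, direction * step))
    levels.append(end)  # always finish exactly on the target level
    return levels


def slide(set_volume, start, end, step=5, pause=0.2):
    """Walk the volume from start to end, calling set_volume at each level."""
    for level in slide_steps(start, end, step):
        set_volume(level)  # hypothetical callback that talks to the player
        time.sleep(pause)
```

Sliding down from 100 to 70, speaking, then sliding back up from 70 to 100 gives the ducking effect without pausing playback.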

However I abandoned the project because of issues with pronunciation.
Would you use an English or a Swedish speaker to speak this text: "Esbjörn Svensson Trio - Live in Stockholm 19.Jun.99 - 3. Announcement by Esbjörn" ?
Believe it or not, I tried switching languages, for artist/title/album or even for individual words... But what I got was a patchwork of different voices, not of different accents with the same voice... The result was worse than sticking to one single voice, which gives mostly good results, marred by excruciatingly poor ones, unfortunately.

uli42
2014-06-01, 12:53
> The script would lower the audio before speaking aloud, and bring the volume back right after. It wasn't working too bad, given it was made with a simplistic perl script querying the SBS server with the CLI.

Thanks for the answer. I am not too experienced with writing plugins, so could you please explain how you implemented this? Is there some code I could (re)use?

For now I have generated wav files using pico2wave and put them as the first files in the folders, which is OK but requires some work if I want to do it for a bigger collection. So I'd like to automate that.
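
One possible way to automate this is sketched below in Python. It makes several assumptions: each folder's name is the text to speak, the announcement file is called 00_announce.wav so that it sorts first, and pico2wave is on the PATH with its -l (language) and -w (output file) options. The file name and function names are invented for the example.

```python
import subprocess
from pathlib import Path


def announcement_path(folder):
    """Where the announcement goes; the '00_' prefix is meant to sort it first."""
    return Path(folder) / "00_announce.wav"


def pico2wave_command(text, wav_path, lang="de-DE"):
    """Build the pico2wave call that writes the spoken text to wav_path."""
    return ["pico2wave", "-l", lang, "-w", str(wav_path), text]


def announce_folders(root, lang="de-DE"):
    """Create one announcement per sub-folder of root, skipping existing ones."""
    created = []
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        wav = announcement_path(folder)
        if wav.exists():
            continue
        # Assumption: the folder name is the title that should be spoken.
        subprocess.run(pico2wave_command(folder.name, wav, lang), check=True)
        created.append(wav)
    return created
```

Running announce_folders on the top directory of the audio book collection would then fill in only the folders that have no announcement yet, so it can be repeated after adding new books.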



> However I abandoned the project because of issues with pronunciation.
> Would you use an English or a Swedish speaker to speak this text: "Esbjörn Svensson Trio - Live in Stockholm 19.Jun.99 - 3. Announcement by Esbjörn" ?
> Believe it or not, I tried switching languages, for artist/title/album or even for words... But what I got was a patchwork of different voices, not of different accents with the same voice... The result was worse than sticking to one single voice. With mostly good results, marred with excruciatingly poor ones, unfortunately.


In my case that's not a problem, because I have audio books in one language only (German).

Uli42

epoch1970
2014-06-01, 16:24
Nobody wants to reuse my code, believe me ;)
Not even I, I'm afraid, because there is a good chance I've either removed or lost the files.
I'll try to go back and search the computer I was using at the time, but there is really nothing much to expect.

I'm not sure you need a plugin (I don't know how to write one). An external script may be sufficient. You can gather all the information you want from the CLI provided by the server.
The major task, probably, is setting up a TTS system that sounds OK. You're better off using a Mac than a Linux machine if you want it to speak. But last time, it seems I used svox to some effect. Prior to that I would have used a mix of espeak and Mbrola. And before that the recipe was different again.
The only constant over many years is that I never got anything sensible out of Festival, despite its glowing reviews.
Also, I wouldn't try using the Linux system extensions for the blind like Speakup or Orca. These programs render voice very fast, but they don't sound nice.

If you're using Linux you can start looking for the TTS chain that will sound right. Getting a title out of SBS is much easier.

I'll look at the machine and will post back here, probably in a few days, if I find something of interest.

Good luck.

epoch1970
2014-06-04, 15:13
So,
I found some code. The attached zip contains 3 files:
- json-ui2.pl: a simple test program that runs in a loop and queries SBS from the CLI (using JSON) for the status of one player, cooks a status string, sends it to the speech daemon, slides the volume down, then up, then goes to sleep and starts again. It's in Perl and your system may lack some required modules. You should find them in CPAN in any case. It uses a hardcoded value for the MAC address of the player you want to speak about, and expects to run on the same machine as the SBS server. If you know Perl you'll find your way around, but in any case I also attach a log; this, along with looking at the code (esp. the CLI queries), may be enough to get you started.
- cata_speak: a sysV init script that uses the Debian (?) start-stop-daemon facility to daemonize the speech program. It is not necessary, but without it the program runs in blocking mode.
- cata_speak.pl: the speech program intended to run as a daemon. You will have to configure the script to define the sound device to use (alsa), make sure directories for pipes and flags do exist, and binaries are present on your system. This program uses pico2wave to generate an audio file and aplay to send it to the sound device. Last time I checked, pico2wave wasn't able to pipe its output to aplay, but insisted on a file for output. It's slower but sounds nicer than my previous TTS chain (still commented in the file) which used "espeak | mbrola | aplay".
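
For readers who don't have the zip, the query-and-cook step might look roughly like this in Python. It is not a port of json-ui2.pl. It assumes the server's JSON-RPC endpoint is http://localhost:9000/jsonrpc.js, that a "status" request with the tags "al" returns a playlist_loop list carrying title, artist and album, and the helper names are made up.

```python
import json
from urllib.request import Request, urlopen


def status_request(player_mac):
    """JSON-RPC payload asking for the current track of one player."""
    return {
        "id": 1,
        "method": "slim.request",
        # Assumed form: status of the current item, with artist/album tags.
        "params": [player_mac, ["status", "-", 1, "tags:al"]],
    }


def query_status(player_mac, url="http://localhost:9000/jsonrpc.js"):
    """Send the request to the server (assumed URL) and return its result."""
    data = json.dumps(status_request(player_mac)).encode("utf-8")
    req = Request(url, data=data, headers={"Content-Type": "application/json"})
    with urlopen(req, timeout=5) as resp:
        return json.load(resp)["result"]


def status_sentence(result):
    """Cook a sentence to speak from a status result."""
    tracks = result.get("playlist_loop") or []
    if not tracks:
        return "Nothing is playing."
    track = tracks[0]
    return 'You\'re listening to: "{title}", from "{album}", by "{artist}".'.format(
        title=track.get("title", "an unknown title"),
        album=track.get("album", "an unknown album"),
        artist=track.get("artist", "an unknown artist"),
    )
```

Calling status_sentence(query_status(mac)) with the player's MAC address would give the string to hand to the speech program.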

Some comments. cata_speak.pl was initially designed to speak system status on a headless computer. So it had to sit and wait in the system, speak when activated (reading from a pipe) and then go quiet. I reused it with json-ui.pl: the program sends strings into the pipe and ends the sequence with a line that reads "SPEAK". When cata_speak.pl sees the "SPEAK" line, it sends everything received through the pipe so far to the audio device. This takes some time, so to let clients cope I use some flags which indicate that the speech program is already active. E.g. the json-ui.pl script only slides the volume down once pico2wave has finished generating the wav file, and waits until aplay has finished speaking before raising the volume again.
This is a bit complicated and certainly more fragile and less capable than I'd like it to be. An SBS plugin would certainly not work this way, but I don't know my way around plugins so I leave this part to you.
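
The "collect lines until SPEAK" protocol is simple enough to sketch in Python. This leaves out the flag files and is not a translation of cata_speak.pl. It assumes pico2wave accepts -l and -w, that aplay accepts -q and -D, and the default language and ALSA device names are placeholders.

```python
import os
import subprocess
import tempfile


def utterances(lines, marker="SPEAK"):
    """Group incoming lines into one text per marker line."""
    pending = []
    for raw in lines:
        line = raw.strip()
        if line == marker:
            if pending:
                yield " ".join(pending)
            pending = []
        elif line:
            pending.append(line)


def speak(text, lang="en-US", device="default"):
    """Render text to a temporary wav file with pico2wave, then play it."""
    fd, wav = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    try:
        # pico2wave writes to a file, so generation and playback are two steps.
        subprocess.run(["pico2wave", "-l", lang, "-w", wav, text], check=True)
        subprocess.run(["aplay", "-q", "-D", device, wav], check=True)
    finally:
        os.unlink(wav)
```

A daemon would open the named pipe for reading and call speak on each item that utterances yields; lines that arrive without a closing SPEAK line are never spoken.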

Hope this helps, and good luck.

Here is a trace. Note how long it takes to speak some sentences: 20 secs is definitely possible (and wav file generation by pico2wave comes on top of that). This is why I wanted to use mixing and volume sliding instead of pausing the player.

max:/home/max# ./json-ui2.pl
[./json-ui2.pl] Starting. Status updates every 120 secs or so. Press Ctrl-c to break.
[1401916318] Sending [You're listening to: "Love Will Tear Us Apart", from "The Complete BBC Recordings", by "Joy Division". Next up: "Novelty", from "Live at The Factory, Manchester, 13 July 1979 \(Unknown Pleasures Collector\'s Edition\)", same artist.]
[1401916321] Sliding vol: 100 -> 70 (step:-5): [100..95..90..85..80..75..70]
[1401916341] Sliding vol: 70 -> 100 (step:5): [70..75..80..85..90..95..100]
[1401916343] Sleeping 66 more seconds (28 secs randomly subtracted)
[1401916409] Sending [You're listening to: "Novelty", from "Live at The Factory, Manchester, 13 July 1979 \(Unknown Pleasures Collector\'s Edition\)", by "Joy Division". Next in about 3 minutes: "Mercy Seat", from "Ultra Vivid Scene", by "Ultra Vivid Scene".]
[1401916417] Sliding vol: 100 -> 70 (step:-5): [100..95..90..85..80..75..70]
[1401916436] Sliding vol: 70 -> 100 (step:5): [70..75..80..85..90..95..100]
[1401916437] Sleeping 30 more seconds (62 secs randomly subtracted)
[1401916467] Sending [You're listening to: "Novelty", from "Live at The Factory, Manchester, 13 July 1979 \(Unknown Pleasures Collector\'s Edition\)", by "Joy Division". Next in about 2 minutes: "Mercy Seat", from "Ultra Vivid Scene", by "Ultra Vivid Scene".]
[1401916475] Sliding vol: 100 -> 70 (step:-5): [100..95..90..85..80..75..70]
[1401916495] Sliding vol: 70 -> 100 (step:5): [70..75..80..85..90..95..100]
[1401916496] Sleeping 90 more seconds (1 secs randomly subtracted)
[1401916586] Sending [You're listening to: "Novelty", from "Live at The Factory, Manchester, 13 July 1979 \(Unknown Pleasures Collector\'s Edition\)", by "Joy Division". Next up: "Mercy Seat", from "Ultra Vivid Scene", by "Ultra Vivid Scene".]
[1401916594] Sliding vol: 100 -> 70 (step:-5): [100..95..90..85..80..75..70]
[1401916613] Sliding vol: 70 -> 100 (step:5): [70..75..80..85..90..95..100]
[1401916614] Sleeping 83 more seconds (9 secs randomly subtracted)
[1401916697] Sending [You're listening to: "Mercy Seat", from "Ultra Vivid Scene", by "Ultra Vivid Scene". Next in about 2 minutes: "Uncertain Smile", from "Soul Mining", by "The The".]
[1401916702] Sliding vol: 100 -> 70 (step:-5): [100..95..90..85..80..75..70]
[1401916713] Sliding vol: 70 -> 100 (step:5): [70..75..80..85..90..95..100]
[1401916715] Sleeping 69 more seconds (33 secs randomly subtracted)
^C
Exiting after 4 runs.