Voice Command for Squeezebox? Still Not Much Interest?

2010-06-16, 12:13
After trying to train my primary customer (spouse) on the Squeezebox with various interfaces (Presets on Radio, SB Controller, iPeng) I keep coming back to what I think would be a killer usability feature: Voice Command for the Squeezebox platform.

I saw an old thread that someone had played around with this at one point (2008) but haven't seen much about it recently. Anyone else think that this would be a worthwhile feature?

I believe the new iPhones have something similar... we've all used voice command to make phone calls on various platforms... Microsoft has the Sync software in Ford vehicles. Voice Command works without training lately, so it could be really easy for users.

It doesn't even have to be too complicated - I'm looking for commands like: "Playlist Shane Favorites Random", "Skip Track", "Mute All", "Sync All Players", "Play Artist Pink Floyd", "Play Album Dark Side of the Moon", etc.

For some simple and common commands, Voice Command would be much faster than navigating menus, no matter how well presented those menus are.

In threads like this, there's always someone who comes up with examples of how a certain keyword in a song would cause undesired behavior ("What if they say 'stop' in the song??"), but on any of these systems there is usually a button press or something similar that can easily distinguish a command from normal background noise or music.

I think Voice Command continues to get better and better - someday I would love to see some way to run Squeezebox this way...


2010-06-16, 12:36
I've tried various voice command systems in the past and found them unnerving, slow, ambiguous and hard to learn. They never do just what you need, and never fast enough.

Voice commanding IMHO is a solution looking for a problem.

It works for very limited applications ("call John Smith"), but even there often not very well ("John Smith mobile, work or private?", or "call Genevieve" answered with "Jennifer Miller or Jennifer Jones?"). It's unusable for more complex tasks.

And this is NOT a technical issue. The issue is that a normal sentence needs too much context information to decipher (which the system doesn't have), or you as the user have to learn the system's syntax, which can be very frustrating and is usually pretty limited.

2010-06-16, 12:59
I'm right behind you.

For years I've found it incredible that we have to use a stupidly limited keyboard to communicate from an amazingly complex human brain to an increasingly powerful computer. I've had my eyes on Dragon Dictate since before it was released for Windows in 1997. Unfortunately the promise of voice controlled computing power has still not been realised 13 years later. But the iPhone has re-enthused me. I frequently use the iPhone and Google voice control to make telephone calls, play music and search the web. And it works very well for me (Southern English accent).

In another thread I've made the point that I think single-use dedicated hardware controllers will soon become a thing of the past and that increasingly people will own their own personal hand-held touch-screen smart device (the iPhone leads the way) that they will use to control a whole host of things.

Maybe a realistic option for Logitech is to release an iPhone and Droid app that enables voice control on these devices. I'm sure it's possible... but at this moment in time it would probably be a bit bleeding edge for a company such as Logitech... and require many more resources than they have. However, it is pretty certain that voice control will eventually be the way we control our personal smart devices... and the apps installed on them.

Next session I'll expound on the merging of silicon, nano-technology and bio-technology and the amazing cyborg future this will almost inevitably bring the human race. (Personally I can't wait).


2010-06-16, 13:03
And this is NOT a technical issue. The issue is that a normal sentence needs too much context information to decipher (which the system doesn't have), or you as the user have to learn the system's syntax, which can be very frustrating and is usually pretty limited.
It's not insurmountable though. My iPhone's iPod device already plays tracks, artists etc reliably whenever I ask it to.


2010-06-16, 13:21

SB: Now Playing 'Silence' by John Cage

Me: shut up

SB: Now Playing 'Shut Up' by The Monks.

SB (on hearing the above): Now Playing 'Shut up! Shut up!' by The Residents

etc etc..

Voice command would be problematic in situations where the music is sufficiently loud to enjoy.

2010-06-16, 13:40
It's not going to be perfect but using the first word (or two) as a control word (e.g. play album) and the following words to define the object (track/album/artist) could work. Perhaps a slightly longer gap between control and object words might be required. I can't see that this would be too hard for most people to grasp. Don't talk fast.

e.g. Play track . Silence
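
A rough sketch of that control-word idea in Python, in case it helps. It assumes a speech recogniser has already turned the audio into text, and the phrase list plus the names `Command` and `parse_command` are made up for illustration; none of it is part of any Squeezebox software.

```python
from typing import NamedTuple, Optional

# Hypothetical vocabulary, taken from the example commands in this thread.
CONTROL_PHRASES = (
    "play album",
    "play artist",
    "play track",
    "playlist",
    "skip track",
    "mute all",
    "sync all players",
)


class Command(NamedTuple):
    control: str           # the control phrase, e.g. "play album"
    target: Optional[str]  # the object words, e.g. "dark side of the moon"


def parse_command(transcript: str) -> Optional[Command]:
    """Split recognised text into a control phrase and an optional object."""
    # Normalise case and treat a spoken pause (written here as ".") as a space.
    text = " ".join(transcript.lower().replace(".", " ").split())
    # Try longer phrases first so a short phrase never shadows a longer one.
    for phrase in sorted(CONTROL_PHRASES, key=len, reverse=True):
        if text == phrase:
            return Command(phrase, None)
        if text.startswith(phrase + " "):
            return Command(phrase, text[len(phrase) + 1:])
    return None  # not a command, so ignore it


print(parse_command("Play track . Silence"))
print(parse_command("shut up"))
```

Anything that doesn't start with a known control phrase comes back as `None`, so "shut up" on its own would simply be ignored rather than searched for.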

To alleviate problems with loud music requires a type of microphone sensitivity that isolates nearby sounds. It just can't be that hard. The Google and iPhone voice stuff is just about there now, and with Google currently putting shed loads of resources into voice control it's almost inevitable they'll get it to work well enough.


2010-06-16, 14:30
If you are using Homeseer as your Home Control system then setting up voice commands for Squeezebox should not be that hard. I use Homeseer voice control for lights and a few other things but have not gotten around to trying it with music yet, even though I do have my Squeezeboxes linked into Homeseer. You might be restricted to a limited universe of playlists and radio stations, but it is certainly doable.

2010-06-16, 14:33
The problem is not the technology.
But for how much of your music do you know the exact title? How often do you play a single track? How often do you know in advance exactly what you are going to play? For services: how often do you know exactly what is available?

In all other cases you will do menu navigation:
"My Music, Internet Radio, My Apps, Extras or Settings?"
"My Music"
"Albums, Artists, Genres or Playlists?"

Voice control will work as soon as your computer understands phrases like: "I want that other album of that band that played as a support act for Seeed last year". That is still some way off.

2010-06-17, 08:14
Pippen and Snarly - both good points but not insurmountable.

First of all, it would really be useful for me in the "common" situations in my household - usually simple commands. I would never use it to try and create a playlist or add tracks (which you're right, I don't always know the exact title of) but rather things like "Playlist Shane Favorites" or "Play Radio Station KEXP" or "Play Podcast Jim Rome" - sure, it would need a little setup like naming your favorites and playlists correctly, etc., but nothing crazy. This would NEVER replace the controller or other interface for more complex things, but would be very nice for common everyday things.
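
To show what I mean by "a little set up", here is a minimal sketch of matching whatever the recogniser heard against a short list of names you have configured yourself. It uses Python's standard `difflib`; the function name `match_name`, the `FAVORITES` list and the 0.6 cutoff are my own assumptions for illustration, not anything the Squeezebox server provides.

```python
import difflib
from typing import Optional, Sequence


def match_name(spoken: str, names: Sequence[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the configured name closest to the spoken text, or None."""
    # Compare in lower case, but hand back the name as it was configured.
    lowered = {name.lower(): name for name in names}
    hits = difflib.get_close_matches(spoken.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None


# Hypothetical favorites, named after the examples above.
FAVORITES = ["Shane Favorites", "KEXP", "Jim Rome"]

print(match_name("shane favourites", FAVORITES))
print(match_name("zzz", FAVORITES))
```

Because the universe of names is small and chosen by the user, a slightly misheard name can still land on the right favorite, and anything too far off matches nothing.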

As far as the microphone picking up unwanted things, this would be a non-issue because of isolating microphones (as Model mentioned) and having keywords or even a button press needed before speaking the command...
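
The keyword-or-button idea could look something like the sketch below. Again this assumes the speech has already been turned into text; the class name `CommandGate`, the keyword "squeezebox" and the five second window are placeholders I picked for illustration.

```python
import time
from typing import Optional


class CommandGate:
    """Pass a transcript on only if it was clearly addressed to the player."""

    def __init__(self, wake_word: str = "squeezebox", window_s: float = 5.0):
        self.wake_word = wake_word.lower()      # hypothetical keyword
        self.window_s = window_s                # how long a button press stays armed
        self._armed_until = float("-inf")       # not armed until a button press

    def button_pressed(self, now: Optional[float] = None) -> None:
        """Arm the gate for the next spoken command."""
        now = time.monotonic() if now is None else now
        self._armed_until = now + self.window_s

    def filter(self, transcript: str, now: Optional[float] = None) -> Optional[str]:
        """Return the command text, or None if the speech should be ignored."""
        now = time.monotonic() if now is None else now
        words = transcript.lower().split()
        if not words:
            return None
        if words[0] == self.wake_word:
            return " ".join(words[1:]) or None
        if now <= self._armed_until:
            self._armed_until = float("-inf")   # one command per button press
            return " ".join(words)
        return None                             # lyrics, chatter, background noise


gate = CommandGate()
print(gate.filter("shut up"))                # ignored: no keyword, no button
print(gate.filter("Squeezebox skip track"))  # passed on as "skip track"
```

With a gate like this in front of the command parser, a lyric that happens to contain "stop" never reaches it unless someone has just pressed the button or said the keyword first.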

I'll have to take a look at Homeseer and see how that would work... this hobby never ends, right?

2010-06-17, 08:21
This is a pretty cool hack:


Phil Leigh
2010-06-17, 10:50
I don't want to talk to my bloody hi-fi!

I spend all day talking to people.
When I get home and open that well-earned bottle of burgundy/chablis I do NOT want to engage the hi-fi in conversation.

"I'm sorry, Phil, that track is unsuitable for your current mood" :-)

Now - as a usability aid for folks with limited motor function or vision - absolutely! - very important.

Get that working first.

2010-06-17, 13:38
Here's a music server activated by voice:


2010-06-18, 14:26
When I get home and open that well-earned bottle of burgundy/chablis I do NOT want to engage the hi-fi in conversation.
It won't answer back. A conversation is a two way thang.

You use a keyboard a lot (I know), maybe you use more than two fingers... even if you do, it's a really awful way to communicate between your truly awesome brain (I know this too) and your PC (which is not quite as awesome, but getting better by the month).

It's taking a long time to get there but speech really is the way.

Example: speech to Google app on iPhone "Laughing Fish, Isfield, Tel". Then a touch to ring the pub to say that our walk is taking longer than we expected and we'll be 30 mins late for our reserved lunch table.

Typing whilst walking is hard.

I can do similar voice commands if I'm out by myself and want music (but I don't, as you can't beat the sound of the countryside).

Beam me up Scotty.