PDA

View Full Version : SB3 hanging after song in Random Mix Mode



0xdeadbeef
2005-12-11, 05:17
I received my SB3 yesterday and getting it ro tun etc. was no problem (had the server up and running some days before and used softsqueeze). So after several hours of playing in "random mix" mode, it kinda stopped playing. Indeed in the web interface it showed that it was still playing the first song of an album (MP3/VBR), but the song was over already and slimserver/sb seems to have missed this fact.

However, I must admit that I somehow failed to reproduce this problem. Well, it happened again the first time I retried, but after that, no matter what I tried, the transisiton to the 2nd song was ok.

Any clues?

RichardG
2005-12-12, 08:08
Oh no.. I am seeing similiar problems accross a wide range of platforms with 6.2.1 code base. Seems that after playing continuously for a number of hours or days, the timers get screwed up and SQB fails to recognize that the song has ended. It will stay in the "play" state for many, many hours after the last song has completed. When in this state, starting the next song via the web browser causes the song to play, finish and to never move to the following song. But the player mode still indicates "play". I tried some debug things.. here is a sample of timers while in the odd state:

2005-12-12 08:48:05.0401 settimer Normal: CODE(0x9738ed4), now: 1134395285.04007, time: 1134395286.03228
2005-12-12 08:48:05.0465 select_time: 0.132722616195679
2005-12-12 08:48:05.1820 firing high timer 0.00277447700500488 late.
2005-12-12 08:48:05.1851 settimer High: CODE(0x9fbde60), now: 1134395285.18502, time: 1134395285.32919
2005-12-12 08:48:05.1881 select_time: 0.141104698181152
2005-12-12 08:48:05.3320 firing high timer 0.00277447700500488 late.
2005-12-12 08:48:05.3350 settimer High: CODE(0x9fbde60), now: 1134395285.33495, time: 1134395288.92919
2005-12-12 08:48:05.3379 select_time: 0.627464056015015
2005-12-12 08:48:05.9680 firing timer 0.00260186195373535 late.
2005-12-12 08:48:05.9706 screenSaver idle display 43777.5302000046(mode:playlist)
2005-12-12 08:48:05.9739 settimer Normal: CODE(0x9a47ad4), now: 1134395285.97384, time: 1134395286.9704
2005-12-12 08:48:05.9770 select_time: 0.0553300380706787
2005-12-12 08:48:06.0350 firing timer 0.00267910957336426 late.
2005-12-12 08:48:06.0376 settimer Normal: CODE(0x9738ed4), now: 1134395286.03751, time: 1134395287.03228
2005-12-12 08:48:06.0440 select_time: 0.680253982543945
2005-12-12 08:48:06.7270 firing timer 0.00277996063232422 late.

Here is a sample of select:

2005-12-12 08:50:51.4068 select returns (0.0155661106109619):
2005-12-12 08:50:51.4092 errors: 0 of 3
2005-12-12 08:50:51.4120 reads: 0 of 6
2005-12-12 08:50:51.4146 writes: 0 of 0
2005-12-12 08:50:51.5288 select returns (0.110943078994751):
2005-12-12 08:50:51.5312 errors: 0 of 3
2005-12-12 08:50:51.5340 reads: 0 of 6
2005-12-12 08:50:51.5368 writes: 0 of 0
2005-12-12 08:50:51.6798 select returns (0.139132261276245):
2005-12-12 08:50:51.6822 errors: 0 of 3
2005-12-12 08:50:51.6849 reads: 0 of 6
2005-12-12 08:50:51.6876 writes: 0 of 0
2005-12-12 08:50:51.8298 select returns (0.138278245925903):
2005-12-12 08:50:51.8322 errors: 0 of 3
2005-12-12 08:50:51.8350 reads: 0 of 6
2005-12-12 08:50:51.8378 writes: 0 of 0
2005-12-12 08:50:51.9798 select returns (0.138145446777344):
2005-12-12 08:50:51.9822 errors: 0 of 3
2005-12-12 08:50:51.9849 reads: 0 of 6
2005-12-12 08:50:51.9876 writes: 0 of 0
2005-12-12 08:50:52.0328 select returns (0.0413010120391846):
2005-12-12 08:50:52.0352 errors: 0 of 3
2005-12-12 08:50:52.0380 reads: 0 of 6
2005-12-12 08:50:52.0408 writes: 0 of 0
2005-12-12 08:50:52.0773 select returns (0.0820064544677734):
2005-12-12 08:50:52.0798 errors: 0 of 3
2005-12-12 08:50:52.0825 reads: 1 of 6

All the other debug options seemed to report no information while in this state. I notice that it seems related to deletetracks and addtracks in Slim::Control::command.. could it be removemultipletracks? Wish I could get ahold of the log before this happens or during it, but the system can run for days until the problem happens.

Richard

RichardG
2005-12-12, 16:53
After source and slimproto debugs.. there is no Decoder underrun beging generated. Everything else seems to occur the same as other tracks.. but a message like this one: "Decoder underrun while this mode: playout-play" never appears and the stream subsystem stops.

Anyone have a faster way to get to the bottom of this? Should I be tracing the select/slimproto, looking at timer issues, or checking the stream? Lot of code. Some direction from experienced folks would be appreciated.

Richard

RichardG
2005-12-13, 06:45
phew.. ok I think I got it. I am testing now.

The problem is that DHCP is getting a renewal of the IP address, and this is killing the connection between the SQB and the server. And it does not come back without user intervention.

Here's the analysis. At exactly 2005-12-12 16:54:48.9615, the SQB sends a final slimproto message that it should be sending every 2 seconds or so (a STAT message). But no further messages of any kind appear for several hours after that even though the slimserver says that it is playing. After rebooting, the DHCP is set to its initial value before the slimserver starts, and so all is good again. The variable nature of the stopping is explained by the timeout of DHCP.

I checked the messages file and there's a corresponding DHCP renewal at exactly Dec 12 16:54:45. This renewal takes about 3 seconds to complete, and results in the restarting of
the network interface, and slimserver seems to be stuck listening on an old socket handle for the next STAT from the SQB.. one that never appears. By following threads on ntpd problems and DHCP, I was able to come up with a limited fix for ntpd, but for the SQB, the only answer is to run with a static IP address. Or perhaps someone at slimdevices can implement something better?

Here's a source for much of the information I found:

http://lists.ntp.isc.org/pipermail/questions/2005-August/006348.html

Richard

0xdeadbeef
2005-12-13, 14:46
Hm, interesting...

It just happened again to me by the way.
Indeed the Squeezebox was paused in the first song of an album, but it was displaying the date/time during this period of inactivity, so I don't get why it should get a DHCP timeout.

When resuming playback, it showed "00:00" (minutes/seconds left) in the display while in the web interface (Fishbone) the progress bar started over and over again at position zero. Just skipping to the next song with the remote freed the Squeezebox from this deadloop. I wonder why this should trigger a DHCP renewal!?

My DHCP server is my DSL modem/router by the way which can't really be configured or monitored regarding DHCP.

RichardG
2005-12-14, 06:08
Sorry.. got kind of caught up in my own problems. I did not mean to imply that your problem was for sure a DHCP problem. You might be having a situation where DHCP is causing some issues for you, but there could be something else going on. In my case, DHCP renegotiation was resulting in a new IP address for the server. So the SQB was just lost.

I verified that DHCP negotiation *not* resulting in a new IP address has no impact on the stream state, and results in no errors.

You can check if you are suffering from a DHCP negotiation resulting in a new IP address by using your remote to check your slimserver IP address under Settings/Information/Server Information/Host IP Address. Compare this IP address with the IP address of your slimserver (for ex. using "ipconfig" in a command prompt under Windows). If the addresses are different, it is possible that the server has recieved a new IP address.

in my case, the servers response to this error condition could be improved, and I will enter an enhancement request for this.. well, after I download the latest slimserver and verify that it is still a problem.

Richard

RichardG
2005-12-14, 18:02
Well, I have another situation that is resulting in the slimserver / SQB ending up in a "playing" state with no actual music being produced, and being stuck on the same song until some user intervention. I was *unable* to catch the condition with debug flags enabled. A post mortem reveals:

2005-12-14 14:38:10.6498 Got Slimproto frame, op STAT, length 41, IO::Socket::INET=GLOB(0xb6d5cd8)
2005-12-14 14:38:10.6505 00:04:20:05:d2:7f Squeezebox stream status:
fullness: 1990 (0%)
bytes_received 26964399
2005-12-14 14:38:11.6497 Got Slimproto frame, op STAT, length 41, IO::Socket::INET=GLOB(0xb6d5cd8)
2005-12-14 14:38:11.6504 00:04:20:05:d2:7f Squeezebox stream status:
fullness: 1990 (0%)
bytes_received 26964399
2005-12-14 14:38:12.6498 Got Slimproto frame, op STAT, length 41, IO::Socket::INET=GLOB(0xb6d5cd8)
2005-12-14 14:38:12.6505 00:04:20:05:d2:7f Squeezebox stream status:
fullness: 1990 (0%)
bytes_received 26964399

Normally, the buffer fullness should be 0 (0%), and this would cause the SQB to generate a decoder underrun frame. But the SQB is reporting that the buffer still contains an orphaned fragment, and never causes the run of DecoderUnderun in streamproto. The slimserver never recieves the "d" type STAT frame from the SQB. I am rerunning with debug flags enabled to get more details on the http and stream status when the error occurs.

I guess the question is why would the SQB firmware assume there was some tiny amount of data in the buffer and never clear it out?

Richard

enduser
2005-12-14, 20:42
Well, I have another situation that is resulting in the slimserver / SQB ending up in a "playing" state with no actual music being produced, and being stuck on the same song until some user intervention. I was *unable* to catch the condition with debug flags enabled. A post mortem reveals:

2005-12-14 14:38:10.6498 Got Slimproto frame, op STAT, length 41, IO::Socket::INET=GLOB(0xb6d5cd8)
2005-12-14 14:38:10.6505 00:04:20:05:d2:7f Squeezebox stream status:
fullness: 1990 (0%)
bytes_received 26964399
2005-12-14 14:38:11.6497 Got Slimproto frame, op STAT, length 41, IO::Socket::INET=GLOB(0xb6d5cd8)
2005-12-14 14:38:11.6504 00:04:20:05:d2:7f Squeezebox stream status:
fullness: 1990 (0%)
bytes_received 26964399
2005-12-14 14:38:12.6498 Got Slimproto frame, op STAT, length 41, IO::Socket::INET=GLOB(0xb6d5cd8)
2005-12-14 14:38:12.6505 00:04:20:05:d2:7f Squeezebox stream status:
fullness: 1990 (0%)
bytes_received 26964399

Normally, the buffer fullness should be 0 (0%), and this would cause the SQB to generate a decoder underrun frame. But the SQB is reporting that the buffer still contains an orphaned fragment, and never causes the run of DecoderUnderun in streamproto. The slimserver never recieves the "d" type STAT frame from the SQB. I am rerunning with debug flags enabled to get more details on the http and stream status when the error occurs.

I guess the question is why would the SQB firmware assume there was some tiny amount of data in the buffer and never clear it out?

RichardThis will cure your ills. Just takes a button push and a moment's wait:
http://www.provantage.com/buy-7LCRS043-remote-control-outlet-system-electrical-device-la-crosse-technology-rs-204u-cs-shopping.htm

0xdeadbeef
2005-12-15, 04:09
I don't see how this will eliminate user intervention !?

enduser
2005-12-15, 07:34
I don't see how this will eliminate user intervention !?It's just like Western medicine. It cures you by treating the symptoms.

The SB3 will often get into a dysfunctional state. Remote rebooting the SB3 is the only way to make it come back to working properly. I remote reboot once a day at least. Keeps the SB3's little brain from getting dazed and confused.

Maybe in 3-6 months these bugs will be fixed, but until then, remote rebooting the SB3 is quite handy.

Click. Wait. Play.

0xdeadbeef
2005-12-15, 08:22
Tbis thread is not about the SB freezing/crashing/hanging in a deadlock that can't be fixed with remote or web interface.
Indeed the "endless playing" state can be easily left by either remote or web interface. It's just that I would favor the SB not entering this state at all.

dean
2005-12-16, 11:04
Richard,

I wonder if there's a specific song that causes the player to get
into this state. Do you have way to reproduce the situation?


On Dec 14, 2005, at 5:02 PM, RichardG wrote:

>
> Well, I have another situation that is resulting in the
> slimserver / SQB
> ending up in a "playing" state with no actual music being produced,
> and
> being stuck on the same song until some user intervention. I was
> *unable* to catch the condition with debug flags enabled. A post
> mortem
> reveals:
>
> 2005-12-14 14:38:10.6498 Got Slimproto frame, op STAT, length 41,
> IO::Socket::INET=GLOB(0xb6d5cd8)
> 2005-12-14 14:38:10.6505 00:04:20:05:d2:7f Squeezebox stream status:
> fullness: 1990 (0%)
> bytes_received 26964399
> 2005-12-14 14:38:11.6497 Got Slimproto frame, op STAT, length 41,
> IO::Socket::INET=GLOB(0xb6d5cd8)
> 2005-12-14 14:38:11.6504 00:04:20:05:d2:7f Squeezebox stream status:
> fullness: 1990 (0%)
> bytes_received 26964399
> 2005-12-14 14:38:12.6498 Got Slimproto frame, op STAT, length 41,
> IO::Socket::INET=GLOB(0xb6d5cd8)
> 2005-12-14 14:38:12.6505 00:04:20:05:d2:7f Squeezebox stream status:
> fullness: 1990 (0%)
> bytes_received 26964399
>
> Normally, the buffer fullness should be 0 (0%), and this would cause
> the SQB to generate a decoder underrun frame. But the SQB is reporting
> that the buffer still contains an orphaned fragment, and never causes
> the run of DecoderUnderun in streamproto. The slimserver never
> recieves
> the "d" type STAT frame from the SQB. I am rerunning with debug flags
> enabled to get more details on the http and stream status when the
> error occurs.
>
> I guess the question is why would the SQB firmware assume there was
> some tiny amount of data in the buffer and never clear it out?
>
> Richard
>
>
> --
> RichardG
> ----------------------------------------------------------------------
> --
> RichardG's Profile: http://forums.slimdevices.com/member.php?
> userid=801
> View this thread: http://forums.slimdevices.com/showthread.php?t=19009
>
>

0xdeadbeef
2005-12-16, 11:21
It happens to me all the time. If I put the SB in pause mode for more than a few minutes, it will play "silence" with 00:00 minutes/seconds left forever until I skip the track.

I first thought this would be a DHCP problem since - for some reason - the Squeezebox gets new IPs assigned multiple times a day when set to DHCP mode.
Then again, I configured the SB to a static IP this morning and it already happened again today.
I'm puzzled...

RichardG
2005-12-16, 12:12
Richard,

I wonder if there's a specific song that causes the player to get
into this state. Do you have way to reproduce the situation?

<snip>



I have not been able to verify that it is a single file or some other kind of problem. I know which track was playing at the time, and I will play that track again to see if it causes the problem.

liv
2005-12-17, 07:12
Beef

I suspect you're on to something with using Pause over an extended period of time. I experienced my only true issue with the device about 2 weeks ago. The device stopped playing, the screen on the device was pixellated and I couldn't reconnect to the network. It appeared that my MAC was corrupted (as well as other settings) and required me to reset the device and re-enter all settings including the MAC address of the device. What's interesting is that I was extensively using the Pause control. For example, I would pause the Slim for hours as I left the house to run errands... I did this throughout the day as well as having the Random Mix playing for hours upon hours. Never had an issue...the only thing can isolate as a variable is the extended pause time.

Sorry for the long post but I wanted to relay concerns regarding the Pause control. Smoking gun ... maybe... hope I didn't create and confusion with the post.

Liv

0xdeadbeef
2005-12-19, 03:50
Hm, just happened again.
I paused inside a title. When I restarted after half an hour or so, the SB continued, but at the end of the song it did't skip to the next song, but played silence (time remaining 00:00 and so on).
I use a static IP now for the SB so this is definitely no DHCP issue.

I think I can reproduce this now, at least it worked the last few times:
The trick is to pause near the end of the song: 20 seconds before the end of the song should be enough. Then wait for the screensaver to kick in, then continue: the song will continue to its end and then silence is played.

I put a log here which clearly shows how the state is reached (2% to 0% and stays there). Maybe it helps...
http://home.arcor.de/0xdeadbeef/slim.log

RichardG
2005-12-23, 04:48
Almost ready to file a bug report here..

First off, playing the same song that caused the SQB to enter the "endless playing state" at various positions in the playlist had no impact. flac -t on the file showed 100% correct.

I tried the "pause" test as suggested in the previous post. I tested on windows XP SP2 with the source 6.2.1 package and a SQB 2. I played a track. I waited for 20 seconds remaining (19 to be exact), and pressed the pause button. I waited until the screensaver kicked in. I am using the "Now Playing" sceen saver while playing, and "None" when not playing.

After the screensaver kicked in, I waited another 60 seconds, and then pressed the pause button again. The SQB 2 I was testing with immediately lost its connection to the server and the screen went blank after a few moments of displaying the lost connection notice. I checked the server, and it said the player was playing. The web interface was fully operational.

I am now going to do some limited debug checking to get a better view of the environment when the SQB losses its connection, and to see if eliminating the screen saver solves the poblem.

I believe we have what we need to file a bug report. Thanks to the OP for his work on this.

RichardG
2005-12-23, 04:53
Mistake in this paragraph:

After the screensaver kicked in, I waited another 60 seconds, and then pressed the pause button again. The SQB 2 I was testing with immediately lost its connection to the server and the screen went blank after a few moments of displaying the lost connection notice. I checked the server, and it said the player was playing. The web interface was fully operational.

Should read:

After the screensaver kicked in, I waited another 60 seconds, and then pressed the pause button again. The SQB 2 played the song until its end (around 19 seconds), and then immediately lost its connection to the server. The SQB 2 screen went blank after a few moments of displaying the lost connection notice. I checked the server, and it said the player was playing. The web interface was fully operational. The system (slimserver/SQB) remained in that state indefinately.

0xdeadbeef
2005-12-23, 05:03
In the mean time I learned that the bug report already existed and was fixed for 6.5.x. Lately, the fix was merged into the 6.2.x branch.
Resume now works for me, there are just some glitches with the time remaining displayed.
See here:
http://forums.slimdevices.com/showthread.php?t=19322

RichardG
2005-12-23, 06:26
here's the bug:

http://bugs.slimdevices.com/show_bug.cgi?id=2745