Hi,
I've reached a bit of a dead end and I'm hoping someone here can point me in the right direction, please.
I'm running 3 Booms and one Duet on Ubuntu 10.04, recently upgraded to v12 (with no change to this error), using Logitech Media Server Version: 7.7.2 - r33893 (recently upgraded from the previous version, again with no change to the error). The only non-Logitech plugins I am running are: autodimdisplay 2.1.10, music information screen 4.4.7, powersave 7.4 and superdatetime screensaver 5.9.14. I've added these back in because their presence seems to make no difference, and I make use of all of them. The hardware is a Dell Core2 Duo with 4GB memory and a 500GB hard drive (25% free space).
Periodically, up to 3 or 4 times a day on some days, the server consumes most of the available CPU, the web interface becomes unresponsive, and the Booms report "can't connect". The logs at the time don't tell me much (see below). I can't find the event or circumstances that cause this to happen, but I have witnessed it in action: I've seen the CPU load shoot up while a Boom is playing internet radio (its most common use by far), after which the Boom drops out and usually reports "can't connect to server", although sometimes it just stops playing.
I have written a script that restarts the SB server when the web interface becomes unresponsive, which temporarily fixes the issue, but I need to find the root cause, as of course this upsets the "users" whose music goes off and requires manual intervention to recover it. The attached log is from that script, which grabs the tail of the log as it restarts the SB server.
Please note from the above that I have effectively re-installed the SB software as part of upgrading, but I also lost a hard disk recently and have rebuilt the entire system from the Ubuntu installation disk and home folder backups (hence the upgrade to Ubuntu 12; I thought I may as well, since I had to rebuild), and I have had this situation both BEFORE and AFTER the upgrades. If anything, the problem is possibly more frequent after the upgrade to Ubuntu 12, but that's very subjective, as it may equally depend on what the Booms are doing at the time (more likely, I'd say, since the problem occurs in Ubuntu 10 and 12 and across two different versions of SB server). It is possible that this error only happens when a Boom is in use; again a bit subjective, but I don't recall the script performing an SB restart when something wasn't playing (I know when it executes, as it emails me).
Aside from that, I'm at a loss to further track this one down.
While no genius, I can write and deploy scripts and, if needed, execute a debug script at the time of failure. I detect the failure by doing a wget of the SB player web page and deducing that the server is consuming vast amounts of CPU if the wget times out.
3 logs from time of failure attached.
Colin
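One possible shape for the debug script mentioned above is sketched here. It is only an illustration and not taken from the attached logs: the function name snapshot and the default output directory /tmp/sbdebug are made up for the example, and it assumes a Linux system with the procps version of ps. It records the load average and the processes using the most CPU, so the watchdog could call it just before restarting the server.

```shell
#!/bin/sh
# Save a snapshot of system load and the busiest processes to a
# timestamped file, then print the path of that file.
# The default directory is a hypothetical choice for this sketch.
snapshot() {
    outdir="${1:-/tmp/sbdebug}"
    mkdir -p "$outdir" || return 1
    out="$outdir/snapshot-$(date +%Y%m%d-%H%M%S).txt"
    {
        echo "=== snapshot taken $(date) ==="
        echo "--- load average ---"
        cat /proc/loadavg || echo "(unavailable)"
        echo "--- top CPU consumers ---"
        # Sorted by CPU usage, highest first; header plus 14 processes.
        ps -eo pid,ppid,pcpu,pmem,etime,args --sort=-pcpu | head -n 15
    } >"$out" 2>&1
    echo "$out"
}
```

Calling snapshot at the moment the wget check fails would leave one file per incident, which could then be compared with the tail of server.log from the same time.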
-
2012-06-22, 18:13 #1 Junior Member
- Join Date
- Jun 2011
- Posts
- 9
Logitechmediaserver consumes all available CPU, becomes unresponsive
-
2012-06-23, 00:20 #2 Junior Member
- Join Date
- Jun 2012
- Posts
- 1
Same issue here on Ubuntu 12.04. I'm unable to do any work on the PC when the SBT server is running.
The process monitor shows 100% CPU usage, which ideally should be 20% or less.
Configuration of my PC
Mobo Intel DG45 ID
Processor Core2duo
Ram-4GB
-
2012-07-17, 15:52 #3 Junior Member
- Join Date
- Jun 2011
- Posts
- 9
I'd still appreciate any help I can get - this issue is still happening several times daily.
I have noticed that after the issue has occurred I seem to have several squeezeboxserver_safe instances running alongside squeezeboxserver (my devices are clearly attaching to squeezeboxserver, NOT _safe, as the display is using music information screen).
I have also tried manually running sudo service logitechmediaserver start after making sure all instances are stopped, and I still see two instances, normal and safe mode, and my devices are still using the instance with one of the few plugins I use.
Any help, anyone, please?
-
2012-07-18, 01:30 #4
Do you have any idea if the issue happens when you are listening to music, when you are browsing or doing some other operation or when the system isn't doing anything special ?
Erland Isaksson (My homepage)
-
2012-07-18, 04:36 #5 Junior Member
- Join Date
- Jun 2011
- Posts
- 9
Thanks for replying - I have taken note of when it happens, looking for a pattern. It has happened in front of my eyes when a Boom was playing internet radio, but also at 2am when all devices were switched off (i.e. connected, but doing nothing). I'm at a loss to find a pattern; all I can see in the logs is that there are usually, but not always, entries for automatic scanning (now disabled anyway, to no effect) and attempts to reach a URL. But these are never anywhere near the time of the problem; in fact the logs are always empty around the time of the issue.
I wish I had more clues to offer.
-
2012-07-18, 07:59 #6
You might try seeing if there is a correlation between your freeze-up times and cron activity:
# sudo cat /var/log/syslog | grep 'CRON'
Some cron-fired activity, e.g. logrotate, can potentially restart sbs/lms. It's not supposed to happen, but it could if logrotate is misconfigured. Logrotate doesn't log itself, but you should see entries in syslog for cron jobs (e.g. cron.daily).
Anyway, this is another place to look.
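To make the cron correlation suggested above a little easier, the syslog output can be narrowed to a window around a known failure time. The following is a rough sketch only: the function name cron_between is invented for this example, it assumes the traditional syslog line layout (month, day, HH:MM:SS, host, ...), and it only handles windows that do not cross midnight.

```shell
#!/bin/sh
# Read syslog-format lines on stdin and print the CRON entries whose
# time of day lies between START and END (HH:MM, inclusive).
# Usage: cron_between START END < /var/log/syslog
cron_between() {
    awk -v s="$1" -v e="$2" '
        /CRON/ {
            t = substr($3, 1, 5)          # third field is HH:MM:SS
            if (t >= s && t <= e) print   # plain string comparison
        }'
}
```

For a freeze noticed at about 08:44, something like `sudo cat /var/log/syslog | cron_between 08:30 08:50` would show only the cron jobs that fired in the preceding and following minutes.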
-
2012-07-18, 08:25 #7
'squeezeboxserver_safe' is just a bash script that runs in the background and serves to restart the perl squeezeboxserver process if it dies. On ubuntu/debian systems, the '_safe' process is normal and you should expect to see it.
But you really shouldn't be seeing multiple instances of the safe script running. I.e., if the command:
# ps ax | grep -v grep | grep '_safe' | wc -l
..returns a number higher than '1', then I'd think that something is amiss with your system configuration.
If you think that might be the case, then install 'chkconfig':
# sudo apt-get update && sudo apt-get install chkconfig
Then let us see the output of:
Code:
# chkconfig --list | sort | grep "[1-5]:on"
-
2012-07-18, 19:04 #8 Junior Member
- Join Date
- Jun 2011
- Posts
- 9
Multiple instances of squeezeboxserver_safe happen at least (i.e. they may also happen at other times) after my script has detected a CPU hogging problem. The other morning (2am) it did this 3 or 4 times in a row (3 minutes apart). I'm now thinking it's possibly another red herring, as I think the restart command fails to complete within 3 minutes while SB is taking 100% CPU. So if another restart is issued before the last one has completed (and it may never complete, I guess), that might cause the multiple _safe instances, but not have anything to do with the CPU hogging.
For the record, the script (via sudo cron, */3) looks like this, and reliably detects when SB is unresponsive as experienced by end users (i.e. SB disconnects, the script runs and usually successfully restarts SB):
Code:
wget --timeout=5 --tries=1 --spider http://192.168.4.80:9000 1>/tmp/zzsbmon 2>/tmp/zzsbmon
sbstat=$(grep "200 OK" /tmp/zzsbmon | wc -w)
if [ $sbstat -eq 0 ]
then
    zz=$(tail -n 50 /var/log/squeezeboxserver/server.log)
    echo "Squeezebox (LtMediaServer) needed restarting, log was : \n \n $zz" | mailx -v -s "SB restarted $(date)" myemail@yahoo.com.au
    service logitechmediaserver restart
fi
Quote: Then let us see the output of:
Code:
# chkconfig --list | sort | grep "[1-5]:on"
which looks like this:
Code:
acpi-support             0:off 1:off 2:on  3:on  4:on  5:on  6:off
apache2                  0:off 1:off 2:on  3:on  4:on  5:on  6:off
dns-clean                0:off 1:on  2:on  3:on  4:on  5:on  6:off
grub-common              0:off 1:off 2:on  3:on  4:on  5:on  6:off
kerneloops               0:off 1:off 2:on  3:on  4:on  5:on  6:off
killprocs                0:off 1:on  2:off 3:off 4:off 5:off 6:off
logitechmediaserver      0:off 1:off 2:on  3:on  4:on  5:on  6:off
ondemand                 0:off 1:off 2:on  3:on  4:on  5:on  6:off
postfix                  0:off 1:off 2:on  3:on  4:on  5:on  6:off
pppd-dns                 0:off 1:on  2:on  3:on  4:on  5:on  6:off
pulseaudio               0:off 1:off 2:on  3:on  4:on  5:on  6:off
rc.local                 0:off 1:off 2:on  3:on  4:on  5:on  6:off
rsync                    0:off 1:off 2:on  3:on  4:on  5:on  6:off
saned                    0:off 1:off 2:on  3:on  4:on  5:on  6:off
speech-dispatcher        0:off 1:off 2:on  3:on  4:on  5:on  6:off
sudo                     0:off 1:off 2:on  3:on  4:on  5:on  6:off
vboxballoonctrl-service  0:off 1:off 2:on  3:on  4:on  5:on  6:off
vboxdrv                  0:off 1:off 2:on  3:on  4:on  5:on  6:off
vboxtoolinit             0:off 1:off 2:on  3:on  4:on  5:on  6:off
vboxweb-service          0:off 1:off 2:on  3:on  4:on  5:on  6:off
winbind                  0:off 1:off 2:on  3:on  4:on  5:on  6:off
Quote: You might try seeing if there is a correlation between your freeze-up times and cron activity:
On the suggestion for CRON, I took a look at various entries, but all I see is my SB monitoring job and the occasional other backup script of mine, nowhere near the time of failure. For example, the last time SB server CPU hogging was detected, it was on the 18th at 08:44, when syslog looked like this:
Code:
Jul 18 08:18:01 Gonzalez CRON[9297]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 08:20:01 Gonzalez CRON[9526]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 08:22:01 Gonzalez CRON[9751]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 08:24:01 Gonzalez CRON[9980]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 08:26:01 Gonzalez CRON[10204]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 08:28:01 Gonzalez CRON[10429]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 08:30:01 Gonzalez CRON[10658]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 08:32:01 Gonzalez CRON[10886]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 08:34:02 Gonzalez CRON[11114]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 08:36:01 Gonzalez CRON[11339]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 08:38:01 Gonzalez CRON[11577]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 08:40:01 Gonzalez CRON[11785]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 08:44:01 Gonzalez CRON[12315]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 08:48:01 Gonzalez CRON[12450]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 08:52:01 Gonzalez CRON[12467]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 08:56:01 Gonzalez CRON[12570]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 09:00:01 Gonzalez CRON[12770]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 09:04:10 Gonzalez CRON[12825]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 09:08:02 Gonzalez CRON[12989]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 09:12:01 Gonzalez CRON[13000]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 09:16:01 Gonzalez CRON[13016]: (root) CMD (/home/colic/scripts/sbmon)
Jul 18 09:17:01 Gonzalez CRON[13027]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Nothing suspicious that I can see, I'm afraid.
I'm wondering if I should turn on SB debug logging, although it's a shot in the dark because the output traffic is huge, and the problem may have nothing to do with any of it.
Thanks for all your help - I consider myself reasonably technically competent (but not in a refined way :-)), but this one has me stumped.
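The overlapping-restart theory described in this post (a new restart being issued before the previous one has completed) could be ruled in or out by making sure only one restart runs at a time. A minimal sketch follows, assuming the flock utility from util-linux is installed; the function name run_exclusive, the exit code 75 and the lock file path in the usage note are arbitrary choices for this example, not part of the original script.

```shell
#!/bin/sh
# Run a command only if no earlier invocation still holds the lock.
# Usage: run_exclusive LOCKFILE COMMAND [ARGS...]
# Returns 75 (an arbitrary "busy" code) when the lock is already taken.
run_exclusive() {
    lockfile="$1"
    shift
    (
        # Try to take an exclusive lock on fd 9 without waiting.
        flock -n 9 || exit 75
        # Close fd 9 for the command itself, so that a daemon started by
        # it does not inherit the descriptor and keep the lock forever.
        "$@" 9>&-
    ) 9>"$lockfile"
}
```

In the cron script, the restart line could then become `run_exclusive /tmp/sbmon.lock service logitechmediaserver restart`, so that a run which finds the previous restart still in progress simply skips its own.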
-
2012-07-19, 11:47 #9
OK, a couple of thoughts:
You might try modifying your watchdog script so that it just kills the perl instance. That will let squeezeboxserver_safe do its intended job, which is to automatically restart the perl script if it dies. Adapting your script, it might look like:
Code:
wget --timeout=5 --tries=1 --spider http://192.168.4.80:9000 1>/tmp/zzsbmon 2>/tmp/zzsbmon
sbstat=$(grep "200 OK" /tmp/zzsbmon | wc -w)
if [ $sbstat -eq 0 ]
then
    zz=$(tail -n 50 /var/log/squeezeboxserver/server.log)
    echo "Squeezebox (LtMediaServer) needed restarting, log was : \n \n $zz" | mailx -v -s "SB restarted $(date)" myemail@yahoo.com.au
    pkill -fn 'perl.*/squeeze|perl.*/slim|perl.*/logitech'
fi
Also, in terms of the services you've got configured to always run, you might have a look at this:
http://computerhowto.us/optimize-ubu...-services.html
This is what I've got running on my headless, sans-gui Ubuntu 12.04 x86_64 server:
Code:
# chkconfig --list | sort | grep "[2-5]:on"
apcupsd              0:off 1:on  2:on  3:on  4:on  5:on  6:off
cpufrequtils         0:off 1:off 2:on  3:on  4:on  5:on  6:off
grub-common          0:off 1:off 2:on  3:on  4:on  5:on  6:off
hddtemp              0:off 1:off 2:on  3:on  4:on  5:on  6:off
lighttpd             0:off 1:off 2:on  3:on  4:on  5:on  6:off
logitechmediaserver  0:off 1:off 2:on  3:on  4:on  5:on  6:off
loadcpufreq          0:off 1:off 2:on  3:on  4:on  5:on  6:off
minidlna             0:off 1:off 2:on  3:on  4:on  5:on  6:off
ntp                  0:off 1:off 2:on  3:on  4:on  5:on  6:off
ondemand             0:off 1:off 2:on  3:on  4:on  5:on  6:off
rc.local             0:off 1:off 2:on  3:on  4:on  5:on  6:off
rsync                0:off 1:off 2:on  3:on  4:on  5:on  6:off
smartmontools        0:off 1:off 2:on  3:on  4:on  5:on  6:off
sudo                 0:off 1:off 2:on  3:on  4:on  5:on  6:off
winbind              0:off 1:off 2:on  3:on  4:on  5:on  6:off
Finally, do you have a spare hard disk kicking around? Could you configure this box with a new OS drive with a fresh, clean install of Ubuntu 12.04 server? I've made and collected enough configuration helper scripts over the past couple of years that a chore like this will only take me about 90 minutes from initial boot off the install USB key to a functioning SBS/LMS server. I realize that this may be a heavier time investment for you, and so I don't ask this lightly. But how much longer do you want to keep pulling your hair out? From everything you describe, it really sounds as though you've already gone through most everything that I could think of. I mean, short of a reinstall, have you fscked all the file systems and checked the SMART data for all the drives?
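The watchdog scripts in this thread decide that the server is down by grepping wget's output for "200 OK", and they send stdout and stderr to the same file through two separate redirections, which can cause one stream to overwrite the other. A possible alternative, offered here only as a sketch, is to rely on wget's exit status, which GNU wget documents as 0 on success and non-zero on failure. The function name server_responds is made up for this example; the URL is the one used earlier in the thread.

```shell
#!/bin/sh
# Return 0 if the URL answers within 5 seconds, non-zero otherwise.
# Relies on wget exiting with status 0 only when the request succeeded.
server_responds() {
    wget --quiet --spider --timeout=5 --tries=1 "$1"
}

# Example wiring, left commented out so that sourcing this file has no
# side effects:
# if ! server_responds http://192.168.4.80:9000; then
#     pkill -fn 'perl.*/squeeze|perl.*/slim|perl.*/logitech'
# fi
```

This removes the temporary file and the dependence on the exact wording of wget's progress messages, at the cost of no longer capturing that output for the email.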
-
2012-07-20, 02:47 #10 Junior Member
- Join Date
- Jun 2011
- Posts
- 9
Quote: You might try modifying your watchdog script so that it just kills the perl instance.
OK, thanks for the suggestion, I'll give that a try.
Quote: Also, in terms of the services you've got configured to always run, you might have a look at this:
And I'll take a look at that while I'm at it, although the box runs with heaps of spare CPU and memory (until, that is, SB takes all the CPU when it has a fit!). With it being a pretty standard Ubuntu setup, I would assume others would have complained, but it's still worth a try.
Quote: Finally, do you have a spare hard disk kicking around? Could you configure this box with a new OS drive with a fresh, clean install of Ubuntu 12.04 server?
I don't; however, I could create a new VM and do it that way. I don't mind the time, but I'm not sure what I'll achieve, which is making me hesitant. I have had this issue on Ubuntu 10.04 and 12, and recently my hard disk failed (so yes, I have SMART and fsck checked the new disk (100% OK)), so I ended up doing another fresh install of Ubuntu and SB on the new disk. You know, thinking about it, I wonder if there's something about the hardware SB doesn't like - but surely dual core and 4GB memory wouldn't worry it? Hmmmm.


