PDA

View Full Version : 100% CPU usage after mains powerdown of SB.. using 5.1.1 onFC1



Cormac Long
2004-05-04, 08:32
Folks,
I'm running 5.1.5 slimserver on Fedora Core 1 and notice that it goes
into 100% CPU usage sometimes when I poweroff my squeezebox at the
mains.. its not a cache/tag updating issue. Its definitely caused by the
hard switch-off of the squeezebox. This was also the case in all
versions of the server I've used (which is since 5.1.x).

Being so ever energy conscious.. I usually power down the stereo, tv,
etc before hitting the sack.. SB is in standby mode with the clock
display ticking nicely in the background when this happens.. its more or
less at that point that the CPU usage soars on the PC with slimserver.pl
being the culprit.

It doesn't always happen.. but at the moment its veering toward 30-40%
of the time.

Any ideas how to rectify this or debug it further?

Regards,
Cormac

dean
2004-05-04, 12:45
Hi Cormac,

How long do you get into a CPU hogging state? I know of one bug like
you describe, but after a minute or so the CPU should slow back down.

-dean

On May 4, 2004, at 8:32 AM, Cormac Long wrote:

> Folks,
> I'm running 5.1.5 slimserver on Fedora Core 1 and notice that it
> goes into 100% CPU usage sometimes when I poweroff my squeezebox at
> the mains.. its not a cache/tag updating issue. Its definitely caused
> by the hard switch-off of the squeezebox. This was also the case in
> all versions of the server I've used (which is since 5.1.x).
>
> Being so ever energy conscious.. I usually power down the stereo, tv,
> etc before hitting the sack.. SB is in standby mode with the clock
> display ticking nicely in the background when this happens.. its more
> or less at that point that the CPU usage soars on the PC with
> slimserver.pl being the culprit.
>
> It doesn't always happen.. but at the moment its veering toward
> 30-40% of the time.
>
> Any ideas how to rectify this or debug it further?
>
>
>
> Regards,
> Cormac
>

Cormac Long
2004-05-05, 02:10
Since I started noticing the 100%, I've never seen it back off and left
it alone for well over 2 hours one evening to see if the issue would
relax.. its always persisted at 100% until I did a stop or restart of
the service.

On Tue, 2004-05-04 at 20:45, dean blackketter wrote:

> Hi Cormac,
>
> How long do you get into a CPU hogging state? I know of one bug like
> you describe, but after a minute or so the CPU should slow back down.
>
> -dean
>
> On May 4, 2004, at 8:32 AM, Cormac Long wrote:
>
> > Folks,
> > I'm running 5.1.5 slimserver on Fedora Core 1 and notice that it
> > goes into 100% CPU usage sometimes when I poweroff my squeezebox at
> > the mains.. its not a cache/tag updating issue. Its definitely caused
> > by the hard switch-off of the squeezebox. This was also the case in
> > all versions of the server I've used (which is since 5.1.x).
> >
> > Being so ever energy conscious.. I usually power down the stereo, tv,
> > etc before hitting the sack.. SB is in standby mode with the clock
> > display ticking nicely in the background when this happens.. its more
> > or less at that point that the CPU usage soars on the PC with
> > slimserver.pl being the culprit.
> >
> > It doesn't always happen.. but at the moment its veering toward
> > 30-40% of the time.
> >
> > Any ideas how to rectify this or debug it further?
> >
> >
> >
> > Regards,
> > Cormac
> >

Adam Spiers
2004-05-05, 03:14
Cormac Long (cormac_long (AT) yahoo (DOT) co.uk) wrote:
> Since I started noticing the 100%, I've never seen it back off and left
> it alone for well over 2 hours one evening to see if the issue would
> relax.. its always persisted at 100% until I did a stop or restart of
> the service.

Can you run strace and ltrace on the running server and see what it's
up to which would cause the 100% ?

Cormac Long
2004-05-06, 03:11
ltrace didn't give any output.. but strace produces the following
repetitively..

write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
temporarily unavailable)
write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
temporarily unavailable)
write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
temporarily unavailable)
write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
temporarily unavailable)
write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
temporarily unavailable)
write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
temporarily unavailable)
write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
temporarily unavailable)
write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
temporarily unavailable)
write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
temporarily unavailable)
write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
temporarily unavailable)
write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
temporarily unavailable)
write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
temporarily unavailable)
write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
temporarily unavailable)
write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
temporarily unavailable)
write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
temporarily unavailable)
write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
temporarily unavailable)
.....
..... just keeps repeating this

The above was after putting the player into standby and powering the
mains off a minute later. It took 5 attempts to bring about the 100% CPU
scenario.. so unlike some nights where it happens every time, it now
seems that this is not always the case.

Another repeat of the scenario also produced a continuous write call but
the string was longer (didn't grab the output, sorry).. that time, I had
the player powered up and playing music when I powered it off.

In case the above is not pinpointing the loop location, Is there a way
to force an abort and grab a core (or Perl-equivalent of same) that can
pinpoint the exact call stack etc?

On Wed, 2004-05-05 at 11:14, Adam Spiers wrote:

> Cormac Long (cormac_long (AT) yahoo (DOT) co.uk) wrote:
> > Since I started noticing the 100%, I've never seen it back off and left
> > it alone for well over 2 hours one evening to see if the issue would
> > relax.. its always persisted at 100% until I did a stop or restart of
> > the service.
>
> Can you run strace and ltrace on the running server and see what it's
> up to which would cause the 100% ?
>

Adam Spiers
2004-05-06, 04:20
Cormac Long (cormac_long (AT) yahoo (DOT) co.uk) wrote:
> ltrace didn't give any output.. but strace produces the following
> repetitively..
>
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource temporarily unavailable)

[snipped]

Yeah, this is what I get too. I'm pretty sure this is a "failure to
gracefully handle network issues" type of bug.

> .... just keeps repeating this
>
> The above was after putting the player into standby and powering the
> mains off a minute later. It took 5 attempts to bring about the 100% CPU
> scenario.. so unlike some nights where it happens every time, it now
> seems that this is not always the case.
>
> Another repeat of the scenario also produced a continuous write call but
> the string was longer (didn't grab the output, sorry).. that time, I had
> the player powered up and playing music when I powered it off.
>
> In case the above is not pinpointing the loop location, Is there a way
> to force an abort and grab a core (or Perl-equivalent of same) that can
> pinpoint the exact call stack etc?

You can get a hint by looking at what the filehandle corresponds to; try

$ lsof -p $server_pid -a -d 14

dean
2004-05-06, 07:17
Please try the latest nightly release, here:

http://www.slimdevices.com/downloads/nightly/latest

And let us know if this resolves the issue.

Thanks,

dean

On May 6, 2004, at 3:11 AM, Cormac Long wrote:

> ltrace didn't give any output.. but strace produces the following
> repetitively..
>
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
> temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
> temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
> temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
> temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
> temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
> temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
> temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
> temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
> temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
> temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
> temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
> temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
> temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
> temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
> temporarily unavailable)
> write(14, "\3 \3\4", 4) = -1 EAGAIN (Resource
> temporarily unavailable)
> ....
> .... just keeps repeating this
>
> The above was after putting the player into standby and powering the
> mains off a minute later. It took 5 attempts to bring about the 100%
> CPU scenario.. so unlike some nights where it happens every time, it
> now seems that this is not always the case.
>
> Another repeat of the scenario also produced a continuous write call
> but the string was longer (didn't grab the output, sorry).. that time,
> I had the player powered up and playing music when I powered it off.
>
> In case the above is not pinpointing the loop location, Is there a
> way to force an abort and grab a core (or Perl-equivalent of same)
> that can pinpoint the exact call stack etc?
>
> On Wed, 2004-05-05 at 11:14, Adam Spiers wrote:
> Cormac Long (cormac_long (AT) yahoo (DOT) co.uk) wrote:
> > Since I started noticing the 100%, I've never seen it back off and
> left
> > it alone for well over 2 hours one evening to see if the issue would
> > relax.. its always persisted at 100% until I did a stop or restart of
> > the service.
>
> Can you run strace and ltrace on the running server and see what it's
> up to which would cause the 100% ?
>

Cormac Long
2004-05-09, 13:47
So far so good.. haven't once recreated a 100%.. installed 2004_05_07 on
Friday.. been a well-behaved process since!
Will keep an eye and get back to the thread if it pops its head up
again.

From me and my CPU, man thanks for all the assistance.

On Thu, 2004-05-06 at 15:17, dean blackketter wrote:

> Please try the latest nightly release, here:
>
> http://www.slimdevices.com/downloads/nightly/latest
>
> And let us know if this resolves the issue.
>
> Thanks,
>
> dean
>




Regards,
Cormac