
Help me figure out this problem with my server



MeSue
2011-09-03, 08:52
I have had this problem with my server for a while and have never been able to figure it out. Sometimes it doesn't happen for months at a time but now it has started happening regularly again and I need to get to the bottom of it because it is really annoying.

I have a headless HP MediaSmart Server. It connects to a Gigabit switch which is connected through a Buffalo router. What happens is every so often the server just disappears off the network. All the lights on the server are normal, but I can tell it is offline because WHS puts icons on the client machines and they turn gray, plus all the shares become inaccessible. And of course, if I am playing music, it stops.

Often it also knocks out the Internet connection for the entire network shortly after it goes offline. I used to be able to just wait it out and after 15-20 minutes it would all come back on its own, but now that doesn't seem to happen (I have left it overnight and it never came back). So I have to force a shutdown on the server by holding the button down and restarting it. Another weird thing that happens is if I unplug the server's Ethernet cable from the Gigabit switch, it reboots the server machine. That has happened with two different switches so it is not a problem with the switch.

I have looked through the event viewer and Squeezebox server logs to try to figure out what is happening at these times, but I can't find anything that indicates what could be causing it. It is happening at all different times of day, too.
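Since the dropouts happen at all different times of day, one way to line them up against the event viewer is to log them from a client machine. Here is a rough Python sketch of that idea, not something from the WHS tools. It assumes the server answers TCP connections on the Windows file-sharing port (445) when it is healthy; the function names and the 30-second polling interval are made up for illustration.

```python
import socket
import time
from datetime import datetime


def is_reachable(host, port=445, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 445 (SMB) is an assumption; any port the server normally answers on works.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def outage_windows(samples):
    """Collapse (timestamp, is_up) samples into (start, end) outage windows.

    An outage still in progress at the last sample gets end=None.
    """
    windows = []
    start = None
    for stamp, is_up in samples:
        if not is_up and start is None:
            start = stamp  # first failed check of a new outage
        elif is_up and start is not None:
            windows.append((start, stamp))  # server came back
            start = None
    if start is not None:
        windows.append((start, None))
    return windows


def monitor(host, interval=30.0):
    """Poll forever, printing a line each time the server changes state."""
    last = None
    while True:
        up = is_reachable(host)
        if up != last:
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {host} {'UP' if up else 'DOWN'}")
            last = up
        time.sleep(interval)
```

Running `monitor("your-server-name")` on a client (the hostname is a placeholder) gives a list of DOWN/UP timestamps that can be compared with whatever the server's event log shows just before each DOWN.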

The only thing I can figure is that the NIC in this machine is somehow faulty. Does that make sense? What else could it be? I'm afraid to tinker with any of the NIC settings because there is no monitor connection so the network is the only way I can access the machine.

I can't just swap out the NIC in this machine, because it's built in and there are no expansion ports. So I was thinking of getting a USB Gigabit adapter to see if it makes the problem go away. Something like this: http://www.monoprice.com/products/product.asp?c_id=103&cp_id=10311&cs_id=1031102&p_id=5345&seq=1&format=2

Any other ideas?

slate
2011-09-03, 10:27
To keep you busy until someone who is familiar with WHS comes along ;-)

The usual suspects:
- Have you checked HP.com for a new BIOS and drivers?
- The first part of your problem made me think of some power settings. Try to go to Advanced power settings and check for network related settings.

I guess that you do have it set to wake up on WOL packets

Edit: BTW I recall you were one of the first 7.6.0 upgraders to get burned; glad to see that you made it to 7.6.1

epoch1970
2011-09-03, 11:09
I don't know this machine but perhaps it is normal behavior that in absence of an uplink the machine reboots. I've had headless machines with a watchdog set to reboot when the network seems down, as a missing link is unlikely while a wedged OS might happen.
Usually the watchdog has a hardware portion (look at the BIOS) and when such hardware is detected, an OS task is started.
When the server is ok and you disconnect the cable, does it reboot as well ?
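To make the watchdog idea concrete, here is a minimal Python sketch of the software side: count consecutive failed link checks and ask for a reboot once a threshold is reached. This is only an illustration of the general pattern, not how the HP MediaSmart (or any particular BIOS) actually does it; the class name and the threshold are invented for the example.

```python
class LinkWatchdog:
    """Ask for a reboot after too many consecutive failed link checks."""

    def __init__(self, max_failures=5):
        self.max_failures = max_failures  # hypothetical threshold
        self.failures = 0

    def observe(self, link_up):
        """Record one check; return True when a reboot should be triggered."""
        if link_up:
            self.failures = 0  # any successful check resets the counter
            return False
        self.failures += 1
        return self.failures >= self.max_failures
```

A watchdog like this would explain a reboot some time after the cable is pulled, but not an immediate one, so the timing of the reboot is worth noting.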

Around packet fragmentation: I would look at the ethernet settings in the server and make sure that the MTU is set to the standard 1500-byte frame size. If it is, I'd try setting the card to fast-ethernet speed 100Mbps duplex, and see what happens.
If the switch is manageable, I'd look at the queuing policy on ports. If the switch has to connect 100Mbps and 1000Mbps ports there is a chance its buffers fill up and then it locks up.
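A quick way to check the MTU point from a Windows client without touching the NIC settings is a ping with the Don't Fragment flag set. The small Python sketch below only works out the payload size and builds the command; the 20-byte IPv4 header and 8-byte ICMP header are the standard sizes (no IP options), and the helper names are made up.

```python
IP_HEADER_BYTES = 20   # IPv4 header without options
ICMP_HEADER_BYTES = 8  # ICMP echo request header


def max_ping_payload(mtu=1500):
    """Largest ICMP echo payload that fits in one unfragmented packet."""
    return mtu - IP_HEADER_BYTES - ICMP_HEADER_BYTES


def windows_df_ping_command(host, mtu=1500):
    """Build a Windows ping command: -f sets Don't Fragment, -l sets payload size."""
    return ["ping", "-f", "-l", str(max_ping_payload(mtu)), "-n", "1", host]
```

For a standard 1500-byte MTU this works out to `ping -f -l 1472 -n 1 <server>`. If that goes through but a payload one byte larger is reported as needing fragmentation, the path MTU is the expected 1500.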

Perhaps the cause is in the power feed. I've had routers or other embedded devices wedge at random due to issues with mains power. Once I cured a router from locking up simply by adding a 2-meter extension cord between the AC adapter and the wall plug...

Also, I'd try changing / swapping the ethernet cables and ports used in the switch and see if something changes.

One last thing I'd try is to run something like Ubuntu Live CD on the server and try stress-testing. If the problem is with the OS, Linux will react differently -worse or better- and this could shed some light on the issue.

That's all I can think of now. Good luck.

pski
2011-09-03, 14:31
When this happens to me, the reason is apparent.

Since I have several machines and devices on two different two-hop loops (1 at 100 and the other at 1 gig,) I lose devices geographically. Normally this is caused by a switch having a brain-fart. Repeated brain-farts have been followed by hard failures for two of the three switches I've had to replace. Surge (lightning so close the phones ring) got the other one.

Try to re-power the switch.

Also, early 1gig ethernet was twitchy about working on cat 5. I use only cat 6 on gig loops.

P

MeSue
2011-09-03, 19:23
Thanks for the suggestions, everyone!

I thought it might be the switch so I replaced it about 2 months ago. Also swapped out the cables, but I don't have any Cat6. I will see about getting some.

I found a USB Ethernet adapter I already had, so I am going to try switching to that tomorrow. It's not Gigabit speed, but it should help me pin down whether or not it has something to do with the built-in NIC.

MeSue
2011-09-03, 20:19
Hmmm... in reading posts on some WHS forums, I found out about a SMART addin for WHS. I installed it, and it is reporting one of my disks having critical health (53 bad sectors) and another one potentially failing (1 bad sector). I am running a chkdsk repair overnight.

I guess I will hold off on installing that USB adapter until I see if repairing, or if necessary, replacing, these disks changes the situation.

I remember I replaced one of the disks at one point and it may have been why the problem went away for a while. Wish I could remember the circumstances...

bobkoure
2011-09-06, 06:14
As a BTW, there are USB-to-VGA/HDMI adapters available for around $40.
I'd think about getting one of these setup before I started messing with NIC settings.
Google shopping search for usb-vga adapters (http://www.google.com/search?num=50&hl=en&safe=active&tbs=p_ord%3Ap&tbm=shop&q=%22usb+to+vga%22+adapter&btnG=Search&oq=%22usb+to+vga%22+adapter&aq=f&aqi=g3g-m4&aql=&gs_sm=s&gs_upl=36180l39907l0l47155l2l2l0l0l0l0l347l540l0.1 .0.1l2l0)
I've used a few of these over the years (mostly for driving projectors from netbooks that couldn't produce the required resolution). The adapters come with a display driver, which may be all you need for WHS (not sure how much of the 2k3 display subsystem they've ripped out...).

MeSue
2011-09-06, 07:43
Thanks, Bob. I may need to resort to getting one of those, or the diagnostic cable that's made especially for the HP MSS.

I removed the critical drive with 53 bad sectors and I still had the problem last night. Now the drive that had 1 bad sector is up to 3 and another drive is showing 1 bad sector. This is getting worrying.

I don't know... it may just be time to build a new server.

bobkoure
2011-09-06, 08:06
You've looked at the error log?
Anything incriminating? Any warnings or errors about anything at all?
If they've hidden the event viewer from you, try start / run / "eventvwr.msc /s" (without the quotes).
If that doesn't work, check system32 for eventvwr.msc. If it isn't there, I can email you a 2k3 version that should work just fine.

Squeezed_Rotel
2011-09-06, 08:31
"I don't know... it may just be time to build a new server."

I had too many issues with my MediaSmart. I only owned it for a few months and then built my own and never looked back.

MeSue
2011-09-06, 18:38
I found errors in the event log that led me to some evidence that the on-board NIC in this server has issues when combined with certain routers and switches. Not enough info out there to know what combos work and what don't, unfortunately.

I've ordered a USB NIC and Cat6 to replace the Cat5e and hopefully that will eliminate the problem. In the meantime, I've installed the slower USB NIC I already have and it hasn't happened since. Only one day, but I am hopeful that this is the right solution.

Am I going to lose much speed on a USB Gigabit adapter versus the on-board one?
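For a rough feel of the ceiling, here is some back-of-the-envelope arithmetic in Python. It assumes the adapter plugs into a USB 2.0 Hi-Speed port (480 Mbit/s signaling rate), which is an assumption about this setup rather than something stated in the thread, and it ignores protocol overhead, so real-world numbers will be lower.

```python
USB2_HIGH_SPEED_MBIT = 480    # USB 2.0 Hi-Speed signaling rate (assumed port type)
GIGABIT_ETHERNET_MBIT = 1000  # Gigabit Ethernet line rate
FAST_ETHERNET_MBIT = 100      # Fast Ethernet line rate


def mbit_to_mbyte_per_s(mbit):
    """Convert megabits per second to megabytes per second."""
    return mbit / 8


def link_ceiling_mbit(*segments):
    """A chain of links can go no faster than its slowest segment."""
    return min(segments)


usb_gigabit = link_ceiling_mbit(USB2_HIGH_SPEED_MBIT, GIGABIT_ETHERNET_MBIT)
print(usb_gigabit, "Mbit/s ceiling, about", mbit_to_mbyte_per_s(usb_gigabit), "MB/s")
```

So on paper a Gigabit adapter on USB 2.0 tops out below the on-board Gigabit port but well above a 100 Mbit adapter. Whether that shows up in practice depends on what else limits the transfer, such as the disks.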

MeSue
2011-09-13, 17:40
Just an update on this issue...

I didn't have one incident of my problem for a week after installing a USB Ethernet adapter in the EX470. Today I received the USB Gigabit Ethernet adapter and installed it with no problems. My music transfers are much speedier now. I can tell no difference between the speed of this and the onboard NIC.

I also replaced all my cables for the Gigabit connections with Cat6 cables, and I replaced my ancient Netgear powerline adapters with faster ones (200M) from Monoprice. These are a great deal, BTW. Half the price of most other powerline adapters. I'd had my eye on them, but was waiting until I had more to order from Monoprice.

If anyone else with an HP MediaSmart needs to replace the NIC, I can say that the one from Monoprice (linked in my first post) works with the XP2K 32-bit driver. You can find a link to the drivers from the Monoprice reviews.