[Tfug] Machine crashing woes

John Gruenenfelder johng at as.arizona.edu
Tue Oct 28 17:54:35 MST 2008


I recently upgraded my desktop machine, which turned my old desktop into my
new fileserver/Mythbox.

The problem is that in its new role it has become rather unstable and I'm
having a really hard time figuring out why.

Here are some symptoms:

1) Two years ago, when I upgraded the CPU from an Athlon64 to an Athlon64 X2,
I lost the ability to reboot.  The machine must be shut down and powered back
on to restart it.  Annoying, but not critical.

2) At boot, when scanning the SATA bus, sometimes it seems that the BIOS
cannot find the two connected drives and keeps rebooting (this reboot does
work for some reason) until it succeeds.

3) During the Linux kernel boot, at approx. +5 seconds, the kernel's SATA
driver scans the bus and sometimes it too has trouble.  Power cycling can fix
this.  When I let it keep trying, it eventually succeeded: it skipped sda,
found sdb, and then a few seconds later found sda, though this caused the
RAID-1 array to need a resync.
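
Since a late-appearing sda knocks the RAID-1 array out of sync, it may help to
check the array state after each boot.  The following is only a rough sketch,
assuming Linux software RAID (md) and the usual /proc/mdstat layout of an
"mdN : ..." line followed by a status line such as "[2/2] [UU]".  The function
name degraded_arrays is made up for illustration.

```python
import re


def degraded_arrays(mdstat_text):
    """Return {array_name: member_status} for md arrays missing a member.

    Assumes the usual /proc/mdstat layout: a header line like
    'md0 : active raid1 ...' followed by a line containing '[2/1] [_U]'.
    An underscore in the status string marks a member that is not in sync.
    """
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\w+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        status = re.search(r"\[\d+/\d+\]\s+\[([U_]+)\]", line)
        if status and current and "_" in status.group(1):
            result[current] = status.group(1)
    return result
```

Feeding it the contents of /proc/mdstat right after boot would show whether
the array came up with both members or is running on one drive.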


Those are the only odd things.  I never see anything in the kernel logs
indicating hardware problems.  It will occasionally just lock up hard and even
Magic SysRq won't work.

If it were a faulty drive, couldn't the kernel at least partially recover?  At
the very least, shouldn't I see some log messages?  And even if a drive dies,
it might hose the system but should not cause a hard lock, right?
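
A hard lock leaves nothing behind, but the boot-time SATA stalls do complete,
so they may leave traces worth searching for.  Below is a minimal sketch that
filters saved kernel log text (for example, the output of dmesg) for lines
from the ataN ports that mention link trouble.  The phrases in the pattern are
my assumption about what libata tends to print; they are not an exhaustive or
authoritative list, and ata_trouble_lines is a hypothetical helper name.

```python
import re

# Assumed phrases that libata commonly logs when a link misbehaves.
# Adjust the list to whatever your kernel version actually prints.
ATA_TROUBLE = re.compile(
    r"\bata\d+(\.\d+)?: .*"
    r"(link down|resetting link|COMRESET failed|exception Emask"
    r"|failed command|link is slow)",
    re.IGNORECASE,
)


def ata_trouble_lines(log_text):
    """Return the log lines that mention an ataN port and a trouble phrase."""
    return [line for line in log_text.splitlines() if ATA_TROUBLE.search(line)]
```

An empty result does not prove the hardware is fine, since a machine that
locks hard may never get the chance to write the message out.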

Temperatures do not seem overly high.  About 36C for the drives, 37-42C for
the CPU, system/case at ~44C.

It has been suggested that power draw might be an issue, but I'm not sure.
The desktop configuration had one HDD, one audio card, and a decent video
card.  Mythbox config has two HDDs, one MPEG card, and the same video card.
The box is a small Shuttle case and those don't typically have powerful PSUs,
but I'm not using even 1/20 the capability of the video card so it seems that
power usage shouldn't be a problem.

Without the kernel to give me some hints I'm at something of a loss as to what
the problem is.  If it *is* a drive, I need to find out soon so I can RMA it
(both drives are new).  If it's the MB... I don't know.  Not sure I can afford
to replace it just yet.

My next course of action is to swap sda and sdb.  Then I can maybe see if the
kernel boot SATA stall occurs on sdb instead of sda.  I hate hardware
issues...
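
One catch with swapping is that the sda/sdb names follow the port, not the
physical drive, so it helps to record which serial number sits behind each
name before and after the swap.  This is a sketch only, assuming a udev-based
system that populates /dev/disk/by-id with symlinks to the device nodes; the
function name drives_by_id is made up for illustration.

```python
import os


def drives_by_id(by_id_dir="/dev/disk/by-id"):
    """Map kernel device names (e.g. 'sda') to their by-id link names.

    Assumes udev maintains symlinks in by_id_dir whose names embed the
    drive model and serial number.  Partition links ('-part1', ...) are
    skipped so only whole-disk entries remain.
    """
    mapping = {}
    for name in sorted(os.listdir(by_id_dir)):
        if "-part" in name:
            continue
        path = os.path.join(by_id_dir, name)
        if not os.path.islink(path):
            continue
        device = os.path.basename(os.path.realpath(path))
        mapping.setdefault(device, []).append(name)
    return mapping
```

If the stall follows the serial number to the other port, the drive is the
likelier suspect; if it stays with the port, the cable or the motherboard is.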


-- 
--John Gruenenfelder    Systems Manager, MKS Imaging Technology, LLC.
Try Weasel Reader for PalmOS  --  http://weaselreader.org
"This is the most fun I've had without being drenched in the blood
of my enemies!"
        --Sam of Sam & Max



