[Tfug] Persistent Linux and X Crashes. How to track down?

Jeremy D Rogers jdrogers at optics.arizona.edu
Wed Aug 2 08:01:55 MST 2006


> Chad Woolley wrote:
[snip]
> > The problem is I keep having persistent OS crashes/lockups.  I
[snip]

That, IMHO, is the most frustrating thing to deal with. I've had 2
machines in the past 3 years with almost identical symptoms. Neither
was dual boot, so I couldn't test linux vs win, but I'm pretty
confident it was hardware. In one case, the lockups only happened when
I was BOTH playing MP3s AND doing large file transfers. I even
got it to go away for a while by playing with the kernel's default
schedulers, but in the end I stopped it by not using the on-board
ethernet on the motherboard.  Putting a PCI card in solved the lockups.

The other box, I never figured out, but interestingly it was also an
ASUS motherboard. I wound up trading it to a friend and he has had no
problems.
I think that part of the problem is that hardware has gotten very
complex and it can be very difficult to get everything to play nicely
together.

I also have a thinkpad that is crashing often now, but it's been
heavily abused, and the lockups only happen when I push on certain
parts of the case.

> > On the most recent one, I could SSH but not start X.  The xorg log
> > said nothing more informative than "a crash happened".  I didn't know
> > what other logs to look in.

Good logs to check are /var/log/syslog and /var/log/kern.log, but
unfortunately, if it's like my experience, there won't be much useful
stuff in there.
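
For what it's worth, a quick way to skim those logs is to grep for the
usual suspects. This is only a sketch: scan_logs is a made-up helper
name, the keyword list is just my guess at what tends to show up around
a crash, and the two paths are the ones mentioned above (your distro may
keep them somewhere else).

```shell
# scan_logs: hypothetical helper, prints up to the last 20 lines in each
# readable log that mention a common crash-related keyword.
scan_logs() {
    for f in "$@"; do
        # skip logs that are missing or not readable by this user
        [ -r "$f" ] || continue
        # -i: ignore case, -H: prefix matches with the file name
        grep -i -H -E 'oops|panic|segfault|machine check|i/o error|lockup' "$f" | tail -n 20
    done
}

# Paths taken from the text above; adjust for your system.
scan_logs /var/log/syslog /var/log/kern.log
```

If that prints nothing, it only means none of those words appear, not
that the logs are clean.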

> > Any ideas?  I would think that linux should be more stable, but it
[snip]
> > sometimes I still can't even ssh or ping.

I think Linux is more stable by default, and can be MADE much more
stable by tweaking. Unfortunately, you can't do much about hardware.
Also, on rare occasions, Windows drivers are aware of some dumb
hardware bug and work around it where Linux doesn't. Of course, the
reverse is true in other cases.

If you can ssh into the box and want to try restarting X, I usually
just kill the login manager with 'killall kdm' or 'killall gdm' and
then once I know it's dead, start kdm as root: 'kdm'.
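
Spelled out, the kill / wait / restart steps look roughly like this.
Treat it as a sketch, not a recipe: restart_dm is a name I made up, the
10 second wait is arbitrary, and it assumes killall and pgrep are
installed and that you run it as root.

```shell
# restart_dm: hypothetical helper. Pass the login manager name (kdm or gdm).
restart_dm() {
    dm="$1"
    # ask the login manager to exit; ignore the error if it isn't running
    killall "$dm" 2>/dev/null || true
    tries=0
    # wait until no process with exactly that name is left (about 10 s max)
    while pgrep -x "$dm" >/dev/null; do
        tries=$((tries + 1))
        if [ "$tries" -ge 10 ]; then
            echo "$dm is still running, giving up" >&2
            return 1
        fi
        sleep 1
    done
    # start the login manager again
    "$dm"
}

# Example (as root): restart_dm kdm
```

The function is only defined here, not called, so nothing gets killed
until you run it yourself.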

On 8/2/06, Ronald Sutherland <rsutherland at epccs.com> wrote:
> Sounds like hot hardware to me... open the case blow the dust off
[snip]
> damaged, bits will flip even if cold. Anyway good luck... and stay cool.

Also great advice. Do you run AC or swamp cooler? :-)



