server crash after karmic upgrade

CLIFFORD ILKAY clifford_ilkay at dinamis.com
Sun Mar 21 10:26:56 UTC 2010


On 03/21/2010 04:53 AM, Patton Echols wrote:
> On 03/19/2010 02:12 AM, Patton Echols wrote:
>> Yesterday I upgraded my home server to Karmic.
>> The upgrade seems to have completed correctly.  However, when I came
>> home today, the server was off line, could not be pinged, no samba,
>> no ssh, no intranet pages.  The only way to restart was from the power
>> switch.  After restart, everything worked fine.  There were some
>> upgrades I applied when I ssh'ed in to the box, all seemed well.
>> About an hour later, it was off line again.
>>
>> At this point, I don't even know the right questions to ask or what
>> log files to check.  Any suggestions?  Tips for where to find more info?
>>
>
> Not even a guess where to start looking?

I'm running Fedora 12 on my desktop with the open source nVidia driver. 
The two latest kernel updates have both been broken for me so I'm using 
the kernel prior to those upgrades. While the symptoms weren't exactly 
the same as yours, they were strange. The system would boot fine. I 
could start KDE. Anywhere from minutes to hours later, I would lose 
control over the keyboard. The numlock indicator would stay on 
regardless, toggling capslock had no effect, and none of the SysRq 
tricks worked. All I could do was ssh into the box from my notebook and 
run init 6. The kernel update was four days ago. The system has been 
running continuously since then after rebooting with the older kernel.

Assuming you didn't purge the older kernels, you might want to try 
booting using an older kernel and see how it goes. If that doesn't work, 
I'd be suspicious of your hardware. We had a server that exhibited 
unpredictable shutdowns and it turned out to be bad capacitors on the 
motherboard. A machine I had once exhibited similar problems. It turned 
out to be a dying hard disk drive. It just so happened that sectors on 
which parts of the OS were stored were defective and that would cause 
random shutdowns. Replacing the disk drive fixed the problem. I've also 
seen bad RAM causing all sorts of weird problems. You can try running 
memtest86 to check the RAM and something like DFT (Drive Fitness Test) 
to check the disk. Your hard disk drive manufacturer might have a 
diagnostic utility on their web site.
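
Before rebooting, it may help to confirm that an older kernel is still 
installed. The following is only a rough sketch, assuming kernels are 
installed as /boot/vmlinuz-<version> (the usual layout on Ubuntu and 
Fedora, but check your own system). The function name kernel_versions 
is made up for illustration.

```python
import glob
import os
import re


def kernel_versions(paths):
    """Return kernel version strings from vmlinuz-* paths, oldest first."""
    versions = []
    for path in paths:
        name = os.path.basename(path)
        # Assumption: kernel images are named vmlinuz-<version>.
        if name.startswith("vmlinuz-"):
            versions.append(name[len("vmlinuz-"):])
    # Sort on the numeric groups so that, for example, -9 sorts before -14.
    return sorted(versions, key=lambda v: [int(n) for n in re.findall(r"\d+", v)])


if __name__ == "__main__":
    for version in kernel_versions(glob.glob("/boot/vmlinuz-*")):
        print(version)
```

If more than one version is printed, the older ones should be available 
to pick from the boot menu.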

Such failures are very difficult to troubleshoot because you usually 
won't find anything useful in the logs. Sometimes all you can do is 
troubleshoot by the process of elimination. For instance, if booting 
from an older kernel doesn't help, you could move the hard disk to 
another machine and see if it misbehaves the same way. If it doesn't, 
you know it's something other than software or the hard disk. If it 
does, at least you've narrowed it down to either a hard disk problem or 
software, which is progress. Good luck.
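
The elimination steps above can be written down as a small decision 
function. This is just a sketch of the reasoning, not a diagnostic 
tool. The function name and the wording of the results are my own 
shorthand.

```python
def next_suspect(older_kernel_fixes, fails_in_other_machine=None):
    """Narrow down the cause from the results of the two tests.

    older_kernel_fixes: True if booting an older kernel stops the crashes.
    fails_in_other_machine: True/False once the hard disk has been tried
        in another machine, or None if that test has not been run yet.
    """
    if older_kernel_fixes:
        return "newer kernel"
    if fails_in_other_machine is None:
        return "unknown: move the hard disk to another machine and retest"
    if fails_in_other_machine:
        # The problem followed the disk, so it is the disk or what is on it.
        return "hard disk or software"
    # The problem stayed with the original machine.
    return "something other than software or the hard disk"


if __name__ == "__main__":
    print(next_suspect(older_kernel_fixes=False, fails_in_other_machine=True))
```
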
-- 
Regards,

Clifford Ilkay
Dinamis
1419-3266 Yonge St.
Toronto, ON
Canada  M4N 3P6

<http://dinamis.com>
+1 416-410-3326



