[Bug 273313] [NEW] TSC Clocksource Unstable Switches To acpi_pm But Server clock freezes/becomes unusable

Félim Whiteley felimwhiteley at gmail.com
Mon Sep 22 20:38:27 UTC 2008


Public bug reported:

[Note: I had updated bug 190414, but it was closed]

Hi There,

I'm having this exact problem on Hardy Server, but this is fairly
critical for me. The same error is logged:

Sep 20 10:22:25 host-01 kernel: [51281.289424] Clocksource tsc unstable (delta = 3323063740502 ns)
Sep 20 10:22:25 host-01 kernel: [51281.299403] Time: acpi_pm clocksource has been installed.
Sep 20 10:22:26 host-01 kernel: [51282.778316] NET: Registered protocol family 17
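
To put the reported delta in perspective, here is a minimal sketch that
pulls the nanosecond figure out of a "Clocksource ... unstable" line and
converts it to a readable duration. The function name and the regular
expression are my own, written against the message format seen above,
not anything provided by the kernel. For the line logged at 10:22:25 the
delta works out to roughly 55 minutes.

```python
import re
from datetime import timedelta

# Pattern written against the kernel message format shown in the log
# excerpt; it is an assumption that other kernels word it the same way.
DELTA_RE = re.compile(r"Clocksource (\w+) unstable \(delta = (-?\d+) ns\)")


def parse_unstable_delta(line):
    """Return (clocksource_name, delta as timedelta), or None if no match."""
    m = DELTA_RE.search(line)
    if m is None:
        return None
    name, delta_ns = m.group(1), int(m.group(2))
    # timedelta has microsecond resolution, so truncate ns to whole us.
    return name, timedelta(microseconds=delta_ns // 1000)


if __name__ == "__main__":
    line = ("Sep 20 10:22:25 host-01 kernel: [51281.289424] "
            "Clocksource tsc unstable (delta = 3323063740502 ns)")
    name, delta = parse_unstable_delta(line)
    print(name, delta)
```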

The problem is that once acpi_pm is installed the clock stays at that
time: either ntp is unable to update the time, or the clock constantly
loses time and ntp only manages to "hold" it at the current value (Sep
20 10:22:26 in this case). This is a real showstopper as it causes all
sorts of problems: Cacti graphs stop logging, since time is effectively
frozen as far as Cacti is concerned, and nagios/cron have problems
scheduling things. The servers (I have two that do this), both running
2.6.24-19-server, become highly unstable: ssh logins fail and the system
becomes highly unresponsive. I haven't pinned down exactly when this
first occurred, but it has basically rendered Hardy Server useless. It
seems to have started in the last couple of weeks, and I'll be digging
through the old logs to see... the problem being that the logs are
pretty useless with the time being completely off!

As regards changing the clocksource, the ones available are:
sudo cat /sys/devices/system/clocksource/clocksource0/available_clocksource
acpi_pm jiffies tsc
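
For anyone wanting to script this check, a rough sketch follows. It
reads the same sysfs directory as the command above; the kernel also
exposes a current_clocksource file alongside available_clocksource, and
writing a name to it as root switches the clocksource at runtime. The
function names are illustrative only, and the base path is a parameter
so the code can be pointed at a test directory.

```python
from pathlib import Path

# Same sysfs directory as in the command above.
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=SYSFS_DIR):
    """Return (current, available) clocksource names read from sysfs."""
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


def set_clocksource(name, base=SYSFS_DIR):
    """Switch clocksource at runtime (needs root); refuse unknown names."""
    _, available = read_clocksources(base)
    if name not in available:
        raise ValueError(f"{name!r} is not one of {available}")
    (Path(base) / "current_clocksource").write_text(name + "\n")


# Only touch sysfs when run directly on a machine that has it.
if __name__ == "__main__" and SYSFS_DIR.exists():
    current, available = read_clocksources()
    print("current:  ", current)
    print("available:", " ".join(available))
```

On the machine described here, the available list would be acpi_pm,
jiffies and tsc, so a request for hpet would be rejected by the check
in set_clocksource rather than written to sysfs.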

I'm trying jiffies, as acpi_pm and tsc appear useless. The processor is
an Intel(R) Xeon(TM) CPU 2.40GHz and the machine is a Dell PowerEdge 2800
server. A third server, an HP ProLiant, hasn't shown the same error, and
it does have the hpet clocksource.


OK, well, jiffies did not work. In fact, when I switched ntp back on and
ran date every 5 secs or so I got this:

user@host-01:~$ date
Sat Sep 20 10:22:51 BST 2008
user@host-01:~$ date
Mon Sep 22 05:40:35 BST 2008
user@host-01:~$ date
Mon Sep 22 05:40:35 BST 2008
user@host-01:~$ date
Mon Sep 22 05:40:35 BST 2008
user@host-01:~$ date
Mon Sep 22 05:40:35 BST 2008

where the actual time should have been 06:00 hrs or later.
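
The manual check above could be automated along these lines. This is
only a sketch: it parses saved `date` output (dropping the timezone
abbreviation, which strptime handles inconsistently across platforms)
and counts consecutive samples in which the reported time did not move
forward. The helper names are made up for illustration, and month and
weekday names are assumed to be in English.

```python
from datetime import datetime


def parse_date_output(line):
    """Parse `date` output such as 'Sat Sep 20 10:22:51 BST 2008'."""
    fields = line.split()
    # Drop the 5th field (timezone abbreviation) before parsing.
    stamp = " ".join(fields[:4] + fields[5:])
    return datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")


def count_stalls(lines):
    """Count consecutive samples where the clock failed to advance."""
    times = [parse_date_output(line) for line in lines]
    return sum(1 for a, b in zip(times, times[1:]) if b <= a)


if __name__ == "__main__":
    samples = [
        "Sat Sep 20 10:22:51 BST 2008",
        "Mon Sep 22 05:40:35 BST 2008",
        "Mon Sep 22 05:40:35 BST 2008",
        "Mon Sep 22 05:40:35 BST 2008",
        "Mon Sep 22 05:40:35 BST 2008",
    ]
    print(count_stalls(samples))  # prints 3
```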

This also seems to lock up the machine. I tried a remote reboot, and
while the ssh terminal was fairly responsive, some commands didn't seem
to complete and had to be killed with Ctrl-C. This is exactly what
happened on the tsc/acpi_pm clocksource as well.

I've also read that forcing the CPU to stay at full speed can stop it,
which would be an OK temporary solution for me as it's more important
that the server works, but it seems that scaling isn't available in
Server? Either that or it's handled differently from the scaling
governors used before:

user@host-01:~$ sudo ls -l /sys/devices/system/cpu/cpu0/
total 0
-r-------- 1 root root 4096 2008-09-22 11:43 crash_notes
drwxr-xr-x 2 root root 0 2008-09-22 11:42 topology
user@host-01:~$ sudo ls -l /sys/devices/system/cpu/cpu0/topology/
total 0
-r--r--r-- 1 root root 4096 2008-09-22 11:43 core_id
-r--r--r-- 1 root root 4096 2008-09-22 11:42 core_siblings
-r--r--r-- 1 root root 4096 2008-09-22 11:43 physical_package_id
-r--r--r-- 1 root root 4096 2008-09-22 11:43 thread_siblings

I've set clocksource=acpi_pm at boot to see if starting out on it rather
than switching from TSC solves the issue; I'll update as soon as I have
info. Just to add: this never occurs right after or during boot, and it
can take several hours to occur. I haven't spotted a pattern yet but
I'll keep my eyes open.

Attached Dmesg with acpi_pm enabled in grub.

Despite acpi_pm being enabled at boot it still seems to use TSC (you'll
have to pardon my not understanding the inner workings of this), as I
get the following:

Sep 22 12:03:57 host-01 kernel: [ 1007.064201] Clocksource tsc unstable (delta = 140599784626 ns)

It doesn't log the switch to acpi_pm, but the clock has already
started to wander after being up for only a short while.
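
One way to check whether the boot parameter actually took effect is to
compare the clocksource= value on the kernel command line with what the
kernel reports as the active clocksource. The sketch below does only
that comparison; it does not explain why the tsc message still appears.
The function names and the sample command line in the comments are
hypothetical, not taken from this machine.

```python
from pathlib import Path

CMDLINE = Path("/proc/cmdline")
CURRENT = Path(
    "/sys/devices/system/clocksource/clocksource0/current_clocksource")


def requested_clocksource(cmdline):
    """Return the value of the last clocksource= parameter, or None."""
    value = None
    for token in cmdline.split():
        if token.startswith("clocksource="):
            value = token.split("=", 1)[1]
    return value


def override_took_effect(cmdline, current):
    """True if a clocksource= override is present and is the active one."""
    requested = requested_clocksource(cmdline)
    return requested is not None and requested == current.strip()


# Only read the real files when run directly on a machine that has them.
if __name__ == "__main__" and CMDLINE.exists() and CURRENT.exists():
    cmdline = CMDLINE.read_text()  # e.g. "root=... ro clocksource=acpi_pm"
    current = CURRENT.read_text()
    print("requested:", requested_clocksource(cmdline))
    print("active:   ", current.strip())
    print("override in effect:", override_took_effect(cmdline, current))
```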

PS: Apologies, I know I've just reposted what was in my last comments
to 190414, but I didn't want to leave anything out, and it might help
in following it through....

** Affects: linux-meta (Ubuntu)
     Importance: Undecided
         Status: New

-- 
TSC Clocksource Unstable Switches To acpi_pm But Server clock freezes/becomes unusable
https://bugs.launchpad.net/bugs/273313
You received this bug notification because you are a member of Kernel
Bugs, which is subscribed to linux-meta in ubuntu.



