[Bug 613273] Re: kernel panic on ec2 in system_call_fastpath

Stefan Bader stefan.bader at canonical.com
Wed Jun 29 10:21:32 UTC 2011


This problem seems to have intensified on my Oneiric test system, and we
were finally able to track it down to
/usr/share/initramfs-tools/scripts/init-bottom/udev. In there the boot
process tries to stop udevd, then moves all the special filesystems
(/dev, /proc, and /sys) over to the new rootfs, and finally switches to
that before restarting udevd. However, udevd is still launching
processes to create devnodes at that point. And it seems that in some
rare cases the pkill (SIGTERM) fails to really kill all of the udevd
processes, which leads to a situation where the initramfs cannot be
completely nuked, and that triggers a panic.

In Oneiric udevadm has a more sensible way to stop udevd, which also
waits until udev has actually stopped (udevadm control --exit). That is
not possible, though, with the versions of udev in Natty and Maverick.
There it is at least possible to keep udevd from starting new processes
(udevadm control --stop-exec-queue). Using that before the pkill would
prevent a lot of those ugly "workers have been killed" and "/dev/null
not found" messages. Unfortunately there still seemed to be a (much
smaller) chance to hit the problem where udevd does not stop on SIGTERM.
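
To make the ordering concrete, here is a rough sketch of stopping the
exec queue before sending the SIGTERM. It is not the actual init-bottom
script: the function name and the UDEVADM / PKILL override variables are
made up for illustration, so that the sequence can be dry-run without
touching a real udevd.

```shell
#!/bin/sh
# Sketch only, not the shipped init-bottom/udev script.
# UDEVADM and PKILL are hypothetical overrides for dry runs.
UDEVADM="${UDEVADM:-udevadm}"
PKILL="${PKILL:-pkill}"

quiesce_and_term_udevd() {
    # Ask udevd to stop executing queued events (best effort).
    "$UDEVADM" control --stop-exec-queue || true
    # Then send the default SIGTERM, as the script did before.
    "$PKILL" udevd || true
}
```

The function would be called at the point where the script currently
stops udevd, before the special filesystems are moved. As noted, this
only narrows the window; it does not guarantee that udevd is gone.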

So I am not sure which path is the better / simpler one to implement:
have the ability to use --exit backported from newer udev packages, or
possibly retry the pkill a few times and, if that does not remove the
udev processes, switch to a more brutal signal before finally giving
up... But either way it is not a kernel problem but one on the udev or
initramfs-tools side.
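
For the second option, a minimal sketch of the retry-then-escalate idea
could look like the following. Everything here is an assumption for
illustration: the function name, the retry count, the one-second wait,
the PKILL / PGREP / SLEEP override variables, and the availability of
pgrep in the initramfs. It is not a tested patch.

```shell
#!/bin/sh
# Sketch only: retry SIGTERM a few times, then fall back to SIGKILL.
# PKILL, PGREP and SLEEP are hypothetical overrides for dry runs.
PKILL="${PKILL:-pkill}"
PGREP="${PGREP:-pgrep}"
SLEEP="${SLEEP:-sleep}"

stop_udevd_with_retry() {
    tries="${1:-5}"    # number of SIGTERM attempts (arbitrary default)
    while [ "$tries" -gt 0 ]; do
        "$PKILL" -TERM udevd || true
        "$SLEEP" 1
        # pgrep exits non-zero when nothing matches, i.e. udevd is gone.
        "$PGREP" udevd >/dev/null || return 0
        tries=$((tries - 1))
    done
    # SIGTERM did not work; switch to the more brutal signal.
    "$PKILL" -KILL udevd || true
    "$SLEEP" 1
    "$PGREP" udevd >/dev/null || return 0
    # Give up and let the caller decide what to do.
    return 1
}
```

A caller would check the return value and only go on to move the
filesystems and switch root when it is 0.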

** Changed in: linux (Ubuntu)
       Status: Triaged => Invalid

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to initramfs-tools in Ubuntu.
https://bugs.launchpad.net/bugs/613273

Title:
  kernel panic on ec2 in system_call_fastpath

Status in “initramfs-tools” package in Ubuntu:
  Triaged
Status in “linux” package in Ubuntu:
  Invalid

Bug description:
  In testing alpha-3 build 20100813.2, I found the following kernel oops:
  [    0.473264] Kernel panic - not syncing: Attempted to kill init!
  [    0.473277] Pid: 1, comm: run-init Not tainted 2.6.35-14-virtual #19-Ubuntu
  [    0.473284] Call Trace:
  [    0.473294]  [<ffffffff815a0109>] panic+0x90/0x111
  [    0.473303]  [<ffffffff81006b1b>] ? __raw_callee_save_xen_irq_enable+0x11/0x26
  [    0.473312]  [<ffffffff8106276d>] forget_original_parent+0x33d/0x350
  [    0.473318]  [<ffffffff81061b94>] ? put_files_struct+0xc4/0xf0
  [    0.473325]  [<ffffffff8106279b>] exit_notify+0x1b/0x190
  [    0.473330]  [<ffffffff810640a5>] do_exit+0x1c5/0x3f0
  [    0.473336]  [<ffffffff81006afd>] ? __raw_callee_save_xen_irq_disable+0x11/0x1e
  [    0.473343]  [<ffffffff810643d7>] sys_exit+0x17/0x20
  [    0.473349]  [<ffffffff8100a0f2>] system_call_fastpath+0x16/0x1b

  This happened only once in my testing so far today (dozens of boots).
  I'm attaching the console log of the failed instance.  The instance that reported this bug is the same AMI/region but a different instance.

  ProblemType: Bug
  DistroRelease: Ubuntu 10.10
  Package: linux-image-2.6.35-14-virtual 2.6.35-14.19
  Regression: Yes
  Reproducible: No
  ProcVersionSignature: User Name 2.6.35-14.19-virtual 2.6.35
  Uname: Linux 2.6.35-14-virtual x86_64
  AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access /dev/snd/: No such file or directory
  AplayDevices: Error: [Errno 2] No such file or directory
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory
  CurrentDmesg: [   12.900041] eth0: no IPv6 routers present
  Date: Wed Aug  4 01:19:11 2010
  Ec2AMI: ami-09c3bc5b
  Ec2AMIManifest: ubuntu-images-testing-ap-southeast-1/ubuntu-maverick-daily-amd64-server-20100803.2.manifest.xml
  Ec2AvailabilityZone: ap-southeast-1b
  Ec2InstanceType: m1.large
  Ec2Kernel: aki-11d5aa43
  Ec2Ramdisk: unavailable
  Frequency: Once a week.
  Lspci:
   
  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  ProcCmdLine: root=LABEL=uec-rootfs ro
  ProcEnviron:
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcModules: acpiphp 18752 0 - Live 0xffffffffa0000000
  SourcePackage: linux

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/initramfs-tools/+bug/613273/+subscriptions



