[Bug 1822118] Re: Kernel Panic while rebooting cloud instance

Colin Ian King 1822118 at bugs.launchpad.net
Thu Sep 26 13:59:31 UTC 2019


I kicked off another round of ~20K reboot tests on Standard_B2S
instances and hit hangs again:

IP addr	MAC addr	Kernel	Reboots
104.42.3.161	00:0d:3a:37:82:ee	5.0.0-1020-azure	100
13.91.5.23	00:0d:3a:5a:74:23	5.0.0-1020-azure	57   [ HANG ]
13.91.5.222	00:0d:3a:5a:75:1a	5.0.0-1020-azure	100
13.64.117.146	00:0d:3a:5a:74:da	5.0.0-1020-azure	100
13.64.117.17	00:0d:3a:37:67:0e	5.0.0-1020-azure	100
13.91.6.207	00:0d:3a:3a:cc:2c	5.0.0-1020-azure	100
40.78.30.129	00:0d:3a:36:6e:eb	5.0.0-1020-azure	100
104.210.36.238	00:0d:3a:5a:73:da	5.0.0-1020-azure	100
13.91.6.143	00:0d:3a:3a:c8:ec	5.0.0-1020-azure	100
40.83.249.58	00:0d:3a:3a:c0:7a	5.0.0-1020-azure	100
104.45.216.53	00:0d:3a:3b:8a:55	5.0.0-1020-azure	100
104.210.42.18	00:0d:3a:5a:73:5c	5.0.0-1020-azure	100
40.78.27.21	00:0d:3a:3a:c9:19	5.0.0-1020-azure	100
40.83.252.110	00:0d:3a:5a:79:93	5.0.0-1020-azure	100
13.64.119.204	00:0d:3a:5a:7e:bc	5.0.0-1020-azure	100
			
104.210.34.4	00:0d:3a:31:18:ee	5.0.0-1020-azure	250
138.91.197.202	00:0d:3a:31:1d:c1	5.0.0-1020-azure	94   [ HANG ]
138.91.196.241	00:0d:3a:31:15:2b	5.0.0-1020-azure	250
104.210.33.44	00:0d:3a:31:16:f3	5.0.0-1020-azure	250
40.83.248.76	00:0d:3a:32:af:a7	5.0.0-1020-azure	250
40.83.253.204	00:0d:3a:32:ba:09	5.0.0-1020-azure	250
168.62.202.8	00:0d:3a:32:a0:11	5.0.0-1020-azure	250
40.83.249.8	00:0d:3a:32:bd:ce	5.0.0-1020-azure	250
40.83.249.93	00:0d:3a:32:b7:32	5.0.0-1020-azure	250
40.83.253.187	00:0d:3a:32:b9:cd	5.0.0-1020-azure	250
23.99.9.88	00:0d:3a:37:96:c9	5.0.0-1020-azure	250
104.40.29.184	00:0d:3a:36:9f:e0	5.0.0-1020-azure	250
137.135.40.122	00:0d:3a:36:9f:eb	5.0.0-1020-azure	250
137.135.49.43	00:0d:3a:36:92:aa	5.0.0-1020-azure	250
138.91.251.8	00:0d:3a:37:9e:ef	5.0.0-1020-azure	250

13.64.146.175	00:0d:3a:31:de:ee	5.0.0-1020-azure	500
104.42.23.145	00:0d:3a:31:da:d7	5.0.0-1020-azure	500
104.42.29.99	00:0d:3a:31:d4:4f	5.0.0-1020-azure	500
40.78.106.12	00:0d:3a:31:d9:8a	5.0.0-1020-azure	500
138.91.233.210	00:0d:3a:31:df:84	5.0.0-1020-azure	500
104.42.25.30	00:0d:3a:31:c9:a4	5.0.0-1020-azure	500
13.64.150.69	00:0d:3a:31:dd:47	5.0.0-1020-azure	321   [ HANG ]
104.42.25.23	00:0d:3a:31:d3:c9	5.0.0-1020-azure	500
104.42.24.176	00:0d:3a:31:d8:36	5.0.0-1020-azure	500
13.64.79.133	00:0d:3a:31:d5:b4	5.0.0-1020-azure	500
104.42.29.146	00:0d:3a:31:de:73	5.0.0-1020-azure	500
104.42.19.191	00:0d:3a:31:d4:78	5.0.0-1020-azure	500
40.118.249.118	00:0d:3a:31:db:20	5.0.0-1020-azure	500
40.112.219.112	00:0d:3a:31:dc:da	5.0.0-1020-azure	500
104.42.17.115	00:0d:3a:31:d3:21	5.0.0-1020-azure	500
40.83.212.164	00:0d:3a:5a:ab:48	5.0.0-1020-azure	500
52.160.123.4	00:0d:3a:36:0d:6a	5.0.0-1020-azure	500
52.160.83.37	00:0d:3a:5a:ab:79	5.0.0-1020-azure	500
52.160.122.92	00:0d:3a:36:00:4c	5.0.0-1020-azure	500
52.160.122.71	00:0d:3a:36:0f:bd	5.0.0-1020-azure	500
52.160.123.12	00:0d:3a:36:04:39	5.0.0-1020-azure	500
104.210.60.218	00:0d:3a:36:b6:25	5.0.0-1020-azure	500
52.160.123.221	00:0d:3a:5a:a9:a3	5.0.0-1020-azure	500
52.160.123.234	00:0d:3a:5a:a7:1c	5.0.0-1020-azure	500
104.210.61.139	00:0d:3a:37:b7:84	5.0.0-1020-azure	500
104.210.61.43	00:0d:3a:36:b5:96	5.0.0-1020-azure	500
40.83.212.185	00:0d:3a:5a:af:9c	5.0.0-1020-azure	500
52.160.82.111	00:0d:3a:5a:a9:a9	5.0.0-1020-azure	500
52.160.82.167	00:0d:3a:5a:a7:17	5.0.0-1020-azure	500
104.210.61.135 	00:0d:3a:36:b3:97	5.0.0-1020-azure	500
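
For anyone who wants to tally the table above, here is a minimal sketch.
It assumes whitespace-separated columns with the reboot count in the
fourth field and a trailing "[ HANG ]" marker on hung instances; the
function name and the SAMPLE excerpt (the header plus the first three
rows) are only illustrative. By my count the full table comes to 60
instances, 19872 completed reboots and 3 hangs.

```python
# Sketch: summarise a reboot-test table like the one above.
# Assumed layout: IP, MAC, kernel, reboot count, optional "[ HANG ]".

SAMPLE = """\
IP addr         Mac Addr            Kernel              Reboots
104.42.3.161    00:0d:3a:37:82:ee   5.0.0-1020-azure    100
13.91.5.23      00:0d:3a:5a:74:23   5.0.0-1020-azure    57   [ HANG ]
13.91.5.222     00:0d:3a:5a:75:1a   5.0.0-1020-azure    100
"""


def summarize(table):
    """Return (instances, completed_reboots, hangs) for the table text."""
    instances = reboots = hangs = 0
    for line in table.splitlines():
        fields = line.split()
        # Skip the header, blank separators, and anything else that does
        # not carry a numeric reboot count in the fourth column.
        if len(fields) < 4 or not fields[3].isdigit():
            continue
        instances += 1
        reboots += int(fields[3])
        if "HANG" in fields[4:]:
            hangs += 1
    return instances, reboots, hangs


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

For a hung instance the count is the number of reboots completed before
the hang, so the total slightly understates the attempts made.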

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1822118

Title:
  Kernel Panic while rebooting cloud instance

Status in linux-azure package in Ubuntu:
  Incomplete
Status in systemd package in Ubuntu:
  New

Bug description:
  Description:   When an Azure cloud instance is rebooted, it may never
  recover, leaving the instance broken indefinitely.

  In my case, it was a kernel panic. See specifics below.

  
  Series: Disco
  Instance Size: Basic_A3
  Region: (Default) US-WEST-2
  Kernel Version: 4.18.0-1013-azure #13-Ubuntu SMP Thu Feb 28 22:54:16 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  
  I had a simple script to reboot an instance a set number of times (I
  chose 50). The machine would power cycle by issuing a "reboot" from
  the terminal prompt, just as a user would. Once the machine came up,
  the script captured dmesg and other bits, then rebooted again until it
  reached 50.

  After the 4th attempt, my script timed out. I took a look at the
  instance console log and found the following on the console.

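A reboot loop of this kind could be sketched roughly as follows. This is
not the script used for the report; it is a minimal sketch that assumes
the loop is driven from a second host over the ssh command-line client,
and that a changed /proc/sys/kernel/random/boot_id means the machine came
back up. The names ssh_runner and reboot_cycle, the timeouts, and the use
of sudo are all hypothetical.

```python
import subprocess
import time
from typing import Callable, Sequence, Tuple

# A runner takes a remote argv and returns (exit status, stdout).
Runner = Callable[[Sequence[str]], Tuple[int, str]]

BOOT_ID = ["cat", "/proc/sys/kernel/random/boot_id"]


def ssh_runner(host: str, timeout: float = 30.0) -> Runner:
    """Build a runner that executes commands on `host` via the ssh CLI."""
    def run(argv: Sequence[str]) -> Tuple[int, str]:
        try:
            proc = subprocess.run(
                ["ssh", "-o", "ConnectTimeout=10", host, *argv],
                capture_output=True, text=True, timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return 255, ""
        return proc.returncode, proc.stdout
    return run


def reboot_cycle(run: Runner, count: int = 50, boot_timeout: float = 300.0,
                 poll: float = 5.0, sleep=time.sleep, clock=time.monotonic):
    """Reboot up to `count` times.

    Returns (completed, logs): the number of reboots that came back up and
    the dmesg output captured after each one. Stops early if the machine
    does not return within `boot_timeout` seconds.
    """
    logs = []
    for _ in range(count):
        status, before = run(BOOT_ID)
        if status != 0:
            break  # machine unreachable before we even asked for a reboot
        run(["sudo", "reboot"])  # connection usually drops; status ignored
        deadline = clock() + boot_timeout
        while True:
            status, after = run(BOOT_ID)
            if status == 0 and after != before:
                break  # new boot ID: the instance is back
            if clock() >= deadline:
                return len(logs), logs  # treat as a hang and give up
            sleep(poll)
        _, dmesg = run(["dmesg"])
        logs.append(dmesg)
    return len(logs), logs
```

Usage would look like `reboot_cycle(ssh_runner("user@host"), count=50)`.
Passing the runner, sleep and clock as parameters keeps the loop testable
without touching a real machine.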
  
  [  OK  ] Reached target Reboot.
  /shutdown: error while loading shared libra[   89.498980] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
  [   89.498980]
  [   89.500042] CPU: 0 PID: 1 Comm: shutdown Not tainted 4.18.0-1013-azure #13-Ubuntu
  [   89.508026] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007  06/02/2017
  [   89.508026] Call Trace:
  [   89.508026]  dump_stack+0x63/0x8a
  [   89.508026]  panic+0xe7/0x247
  [   89.508026]  do_exit.cold.23+0x26/0x75
  [   89.508026]  do_group_exit+0x43/0xb0
  [   89.508026]  __x64_sys_exit_group+0x18/0x20
  [   89.508026]  do_syscall_64+0x5a/0x110
  [   89.508026]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
  [   89.508026] RIP: 0033:0x7f7bf0154d86
  [   89.508026] Code: Bad RIP value.
  [   89.508026] RSP: 002b:00007ffd6be693b8 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
  [   89.508026] RAX: ffffffffffffffda RBX: 00007f7bf015e420 RCX: 00007f7bf0154d86
  [   89.508026] RDX: 000000000000007f RSI: 000000000000003c RDI: 000000000000007f
  [   89.508026] RBP: 00007f7bef9449c0 R08: 00000000000000e7 R09: 00000000ffffffff
  [   89.508026] R10: 00007ffd6be6974c R11: 0000000000000206 R12: 0000000000000018
  [   89.508026] R13: 00007f7bef944ac8 R14: 00007f7bef944a00 R15: 0000000000000000
  [   89.508026] Kernel Offset: 0x16000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
  [   89.508026] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
  [   89.508026]  ]---

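The exitcode value in the panic line can be decoded as a sketch, assuming
it uses the usual Linux wait-status layout (terminating signal in the low
7 bits, exit status in bits 8 to 15). The helper name below is
illustrative only.

```python
def decode_wait_status(status):
    """Split a Linux wait status into (exit_status, signal)."""
    signal = status & 0x7F              # 0 means a normal exit, not a signal
    exit_status = (status >> 8) & 0xFF  # value handed to exit()/exit_group()
    return exit_status, signal


if __name__ == "__main__":
    # exitcode=0x00007f00 taken from the panic message above
    print(decode_wait_status(0x00007F00))
```

Under that assumption 0x00007f00 decodes to exit status 127 with no
signal, which matches RDI (0x7f) and ORIG_RAX (0xe7, exit_group) in the
register dump. Status 127 is what the dynamic loader conventionally
returns when a shared library cannot be loaded, which would fit the
truncated "error while loading shared libra" message: PID 1 (shutdown)
seems to have exited on its own rather than been killed.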
  
  This only occurred once in my testing.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1822118/+subscriptions


