[Bug 1739585] Re: L2 guest failed to boot under nested KVM: entry failed, hardware error 0x0

ChristianEhrhardt 1739585 at bugs.launchpad.net
Thu Dec 21 10:35:56 UTC 2017


Tried a local repro with

HW: Xenial + Ocata
L1: Bionic
L2: Bionic

But in my case it just works.
We already knew that it is HW related.

There was a set of similar issues in 2014/2015.
Back then it was around kernels 3.10/3.13 at RH.
See:
https://bugzilla.redhat.com/show_bug.cgi?id=1086058
https://bugzilla.redhat.com/show_bug.cgi?id=1069089
https://www.spinics.net/lists/kvm/msg102458.html

Back then it was related to CPU features being passed through that should not have been, which then failed on L2.
Chances are high that it is in this area again.

OTOH my defaults do not specify a CPU and only have the base features set.
So the guests on both levels look like:
# no cpu
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>

IIRC OpenStack will try to use a host-model or even host-passthrough spec.
That might enable "too much" and thereby break.
Since those definitions, and what is meant to be detected/added, are version dependent, that might even explain why you think you only see it on newer libvirt/qemu.
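
To see which CPU mode a guest was actually defined with, one can pull the mode attribute out of its domain XML. This is only a minimal sketch, assuming the XML comes from `virsh dumpxml`; the helper name `cpu_mode_of` is made up for illustration, and "unset" is just the label it prints when there is no `<cpu mode=...>` element at all (as in my guests above).

```shell
#!/bin/sh
# cpu_mode_of: hypothetical helper, reads libvirt domain XML on stdin and
# prints the mode attribute of the <cpu> element, or "unset" if there is none.
cpu_mode_of() {
    mode=$(sed -n "s/.*<cpu[[:space:]][^>]*mode=['\"]\([^'\"]*\)['\"].*/\1/p" | head -n 1)
    echo "${mode:-unset}"
}

# Usage on a hypervisor (the domain name is a placeholder):
#   virsh dumpxml <domain> | cpu_mode_of
```

Running this on L0 for the L1 guest and on L1 for the L2 guest should show whether host-model or host-passthrough is in play on either level.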


I checked my Test env:

## HW / L0 ##

/sys/module/kvm_intel/parameters/emulate_invalid_guest_state : Y
/sys/module/kvm_intel/parameters/enable_apicv : N
/sys/module/kvm_intel/parameters/enable_shadow_vmcs : N
/sys/module/kvm_intel/parameters/ept : Y
/sys/module/kvm_intel/parameters/eptad : Y
/sys/module/kvm_intel/parameters/fasteoi : Y
/sys/module/kvm_intel/parameters/flexpriority : Y
/sys/module/kvm_intel/parameters/nested : Y
/sys/module/kvm_intel/parameters/ple_gap : 128
/sys/module/kvm_intel/parameters/ple_window : 4096
/sys/module/kvm_intel/parameters/ple_window_grow : 2
/sys/module/kvm_intel/parameters/ple_window_max : 1073741823
/sys/module/kvm_intel/parameters/ple_window_shrink : 0
/sys/module/kvm_intel/parameters/pml : N
/sys/module/kvm_intel/parameters/unrestricted_guest : Y
/sys/module/kvm_intel/parameters/vmm_exclusive : Y
/sys/module/kvm_intel/parameters/vpid : Y

## L1 ##

$ for i in /sys/module/kvm_intel/parameters/*; do echo "$i : $(cat $i)"; done
/sys/module/kvm_intel/parameters/emulate_invalid_guest_state : Y
/sys/module/kvm_intel/parameters/enable_apicv : N
/sys/module/kvm_intel/parameters/enable_shadow_vmcs : N
/sys/module/kvm_intel/parameters/ept : Y
/sys/module/kvm_intel/parameters/eptad : N
/sys/module/kvm_intel/parameters/fasteoi : Y
/sys/module/kvm_intel/parameters/flexpriority : Y
/sys/module/kvm_intel/parameters/nested : Y
/sys/module/kvm_intel/parameters/ple_gap : 0
/sys/module/kvm_intel/parameters/ple_window : 4096
/sys/module/kvm_intel/parameters/ple_window_grow : 2
/sys/module/kvm_intel/parameters/ple_window_max : 1073741823
/sys/module/kvm_intel/parameters/ple_window_shrink : 0
/sys/module/kvm_intel/parameters/pml : N
/sys/module/kvm_intel/parameters/preemption_timer : Y
/sys/module/kvm_intel/parameters/unrestricted_guest : Y
/sys/module/kvm_intel/parameters/vpid : Y

$ cat /proc/cpuinfo
model name      : QEMU Virtual CPU version 2.5+
flags           : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl xtopology cpuid pni vmx cx16 x2apic hypervisor lahf_lm tpr_shadow vnmi flexpriority ept vpid


## L2 ##
model name      : QEMU Virtual CPU version 2.5+
flags           : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl xtopology cpuid pni vmx cx16 x2apic hypervisor lahf_lm cpuid_fault tpr_shadow vnmi flexpriority ept vpid
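
Differences between two such flag lists are easy to miss by eye. A minimal sketch for comparing them, assuming the `flags` line of each level has been saved to a file; the function names `flags_of` and `flag_diff` and the file names are made up for illustration.

```shell
#!/bin/sh
# flags_of: hypothetical helper, takes a file holding a "flags : ..." line
# from /proc/cpuinfo and prints one flag per line, sorted.
flags_of() {
    sed -n 's/^flags[[:space:]]*:[[:space:]]*//p' "$1" | head -n 1 | tr -s ' ' '\n' | sort -u
}

# flag_diff: hypothetical helper, prints flags present in only one of the
# two files, prefixed with "<" (first file only) or ">" (second file only).
flag_diff() {
    a=$(mktemp) && b=$(mktemp) || return 1
    flags_of "$1" > "$a"
    flags_of "$2" > "$b"
    # comm -23: lines only in the first file; comm -13: only in the second.
    comm -23 "$a" "$b" | sed 's/^/< /'
    comm -13 "$a" "$b" | sed 's/^/> /'
    rm -f "$a" "$b"
}

# Usage (save the flags on each level, then compare the files):
#   grep -m1 '^flags' /proc/cpuinfo > l1.flags    # on L1
#   grep -m1 '^flags' /proc/cpuinfo > l2.flags    # on L2
#   flag_diff l1.flags l2.flags
```

For the two lists above the only difference is cpuid_fault, which shows up on L2 but not on L1.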


@James - could you report the XML of the guests on both levels?
@James - could you report the kvm_intel module args on both levels?
@James - could you report the CPU features on all three levels (if you reach the third at all)?
@James - if you think the former stack (e.g. zesty) in L1 did not show the issue, we might eventually want the same data with that in L1.
@James - how much control do you have over those models/features to try iterating on them?
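
To make collecting that data less tedious, something like the following could be run on each level. It is only a sketch: the function names `dump_kvm_params` and `collect_level` are made up, and it assumes an Intel host where the kvm_intel module is loaded (pass another parameter directory otherwise).

```shell
#!/bin/sh
# dump_kvm_params: hypothetical helper, prints "path : value" for every
# readable file in a module parameter directory (defaults to kvm_intel).
dump_kvm_params() {
    dir=${1:-/sys/module/kvm_intel/parameters}
    for f in "$dir"/*; do
        [ -r "$f" ] || continue
        echo "$f : $(cat "$f")"
    done
}

# collect_level: hypothetical wrapper, prints the module parameters plus
# the CPU model and flags of the level it is run on.
collect_level() {
    echo "## $1 ##"
    dump_kvm_params
    grep -m1 '^model name' /proc/cpuinfo
    grep -m1 '^flags' /proc/cpuinfo
}

# Usage, once per level:
#   collect_level L1 > report-L1.txt
# The guest XML has to be taken one level up (the domain name is a placeholder):
#   virsh dumpxml <domain> > guest-L1.xml
```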

** Bug watch added: Red Hat Bugzilla #1086058
   https://bugzilla.redhat.com/show_bug.cgi?id=1086058

** Bug watch added: Red Hat Bugzilla #1069089
   https://bugzilla.redhat.com/show_bug.cgi?id=1069089

** Changed in: cloud-archive
       Status: New => Incomplete

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1739585

Title:
  L2 guest failed to boot under nested KVM: entry failed, hardware error
  0x0

Status in Ubuntu Cloud Archive:
  Incomplete

Bug description:
  During testing of the Queens b2 milestone, I see this particular error
  when the test cloud attempts to boot instances on specific hosts on
  our cloud.

  The base cloud is running:

    4.4.0-72-generic

  The test instance on the cloud saw the same issue with these kernels:

    4.10.0-42-generic
    4.4.0-97-generic

  I don't think we're seeing the same issue with pre-bionic versions of
  libvirt/qemu on these hosts.

  Error from libvirt qemu instance log:

  KVM: entry failed, hardware error 0x0
  EAX=00000000 EBX=00000000 ECX=00000000 EDX=000306d2
  ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
  EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
  ES =0000 00000000 0000ffff 00009300
  CS =f000 ffff0000 0000ffff 00009b00
  SS =0000 00000000 0000ffff 00009300
  DS =0000 00000000 0000ffff 00009300
  FS =0000 00000000 0000ffff 00009300
  GS =0000 00000000 0000ffff 00009300
  LDT=0000 00000000 0000ffff 00008200
  TR =0000 00000000 0000ffff 00008b00
  GDT=     00000000 0000ffff
  IDT=     00000000 0000ffff
  CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
  DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
  DR6=00000000ffff0ff0 DR7=0000000000000400
  EFER=0000000000000000
  Code=ff ff 66 5b 66 83 c4 08 66 5b 66 5e 66 c3 cd 19 cb cd 18 cb <ea> 5b e0 00 f0 30 36 2f 32 33 2f 39 39 00 fc 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

  Error on host:

  [22353622.446568] nested_vmx_exit_handled failed vm entry 7

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: qemu-system-x86 1:2.10+dfsg-0ubuntu5~cloud0 [origin: Canonical]
  ProcVersionSignature: Ubuntu 4.10.0-42.46~16.04.1-generic 4.10.17
  Uname: Linux 4.10.0-42-generic x86_64
  ApportVersion: 2.20.1-0ubuntu2.14
  Architecture: amd64
  CrashDB:
   {
                  "impl": "launchpad",
                  "project": "cloud-archive",
                  "bug_pattern_url": "http://people.canonical.com/~ubuntu-archive/bugpatterns/bugpatterns.xml",
               }
  Date: Thu Dec 21 09:58:30 2017
  Ec2AMI: ami-00000259
  Ec2AMIManifest: FIXME
  Ec2AvailabilityZone: nova
  Ec2InstanceType: m1.medium
  Ec2Kernel: unavailable
  Ec2Ramdisk: unavailable
  KvmCmdLine: COMMAND         STAT  EUID  RUID   PID  PPID %CPU COMMAND
  Lsusb:
   Bus 001 Device 002: ID 0627:0001 Adomax Technology Co., Ltd 
   Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
  MachineType: OpenStack Foundation OpenStack Nova
  ProcEnviron:
   TERM=screen
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.10.0-42-generic root=UUID=d7006b2f-ace6-464d-8b21-17180b3ed360 ro console=tty1 console=ttyS0
  SourcePackage: qemu
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 04/01/2014
  dmi.bios.vendor: SeaBIOS
  dmi.bios.version: 1.10.1-1ubuntu1~cloud0
  dmi.chassis.type: 1
  dmi.chassis.vendor: QEMU
  dmi.chassis.version: pc-i440fx-zesty
  dmi.modalias: dmi:bvnSeaBIOS:bvr1.10.1-1ubuntu1~cloud0:bd04/01/2014:svnOpenStackFoundation:pnOpenStackNova:pvr15.0.7:cvnQEMU:ct1:cvrpc-i440fx-zesty:
  dmi.product.name: OpenStack Nova
  dmi.product.version: 15.0.7
  dmi.sys.vendor: OpenStack Foundation

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1739585/+subscriptions


