[Bug 1665143] Re: commissioning does not discover block devices on HP ProLiant DL360 Gen9 servers

John George john.george at canonical.com
Thu Feb 23 17:54:48 UTC 2017


Please mark this bug critical, as it blocks all integration testing.

** Description changed:

- Block devices are not discovered on all 7 of our HP ProLiant DL360 Gen9
- servers. These servers are used with daily MAAS testing, so are proven
- to normally commission and deploy. We also experienced an instance of
- this failure last week with maas 2.1.3.
+ The udev package provides /lib/udev/rules.d/60-persistent-storage.rules
+ which creates two symlinks for nvme devices, under /dev/disk/by-id/. The
+ first link name includes the device wwid and the second includes the
+ device model/serial. The commission script select the first link
+ discovered and subsequently attempts to store it in a FilePath field,
+ which allows for 100 characters. Since the wwid link is greater than 100
+ characters an exception is thrown, causing not only the nvme device not
+ to be registered but all other storage devices as well.
+ 
+ This issue has blocked all test runs performed by the CDO-QA test
+ infrastructure, since every run installs MAAS on a fresh machine and
+ commissions new nodes. The failure is seen when installing from either
+ ppa:maas/next (2.2.0~beta2) or ppa:maas/stable (2.1.3+bzr5573).
+ 
  
  ubuntu at meowth:~$ dpkg -l '*maas*'|cat
  Desired=Unknown/Install/Remove/Purge/Hold
  | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
  |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
  ||/ Name                            Version                              Architecture Description
  +++-===============================-====================================-============-=================================================
  ii  maas                            2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          "Metal as a Service" is a physical cloud and IPAM
  ii  maas-cli                        2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS client and command-line interface
  un  maas-cluster-controller         <none>                               <none>       (no description available)
  ii  maas-common                     2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS server common files
  ii  maas-dhcp                       2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS DHCP server
  ii  maas-dns                        2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS DNS server
  ii  maas-proxy                      2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS Caching Proxy
  ii  maas-rack-controller            2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          Rack Controller for MAAS
  ii  maas-region-api                 2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          Region controller API service for MAAS
  ii  maas-region-controller          2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          Region Controller for MAAS
  un  maas-region-controller-min      <none>                               <none>       (no description available)
  un  python-django-maas              <none>                               <none>       (no description available)
  un  python-maas-client              <none>                               <none>       (no description available)
  un  python-maas-provisioningserver  <none>                               <none>       (no description available)
  ii  python3-django-maas             2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS server Django web framework (Python 3)
  ii  python3-maas-client             2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS python API client (Python 3)
  ii  python3-maas-provisioningserver 2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS server provisioning libraries (Python 3)
  
  After re-commissioning one of the servers with ssh enabled, the attached
  log files were collected. Note that block devices are visible from the
  shell, and even the commissioning output found in
  /tmp/user_data.sh.IK9yVp/out/00-maas-07-block-devices lists devices (see
  attached), whereas the GUI shows this file as a 0-byte file (see
  screenshot).
  
  There are 'HTTP Error 500: INTERNAL SERVER ERROR' errors in
  cloud-init-output.log
  
  ubuntu at azurill:~$ uname -a
  Linux azurill 4.8.0-34-generic #36~16.04.1-Ubuntu SMP Wed Dec 21 18:55:08 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
  
- ubuntu at azurill:~$ lsblk
- NAME    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
- sdb       8:16   0 465.7G  0 disk 
- sdc       8:32   0 179.1M  1 disk /media/root-ro
- sda       8:0    0 465.7G  0 disk 
- ├─sda2    8:2    0 465.2G  0 part 
- └─sda1    8:1    0   512M  0 part
+ ubuntu at azurill:~$ sudo lsblk  --exclude 1,2,7 -d -P -o NAME,RO,RM,MODEL,ROTA
+ NAME="sdb" RO="0" RM="0" MODEL="LOGICAL VOLUME  " ROTA="1"
+ NAME="sdc" RO="1" RM="0" MODEL="VIRTUAL-DISK    " ROTA="1"
+ NAME="sda" RO="0" RM="0" MODEL="LOGICAL VOLUME  " ROTA="1"
+ NAME="nvme0n1" RO="0" RM="0" MODEL="INTEL SSDPEDME400G4

** Description changed:

  The udev package provides /lib/udev/rules.d/60-persistent-storage.rules
  which creates two symlinks for nvme devices, under /dev/disk/by-id/. The
  first link name includes the device wwid and the second includes the
- device model/serial. The commission script select the first link
+ device model/serial. The commission script selects the first link
  discovered and subsequently attempts to store it in a FilePath field,
  which allows for 100 characters. Since the wwid link is greater than 100
  characters an exception is thrown, causing not only the nvme device not
- to be registered but all other storage devices as well.
+ to be registered but all other storage devices as well. Although
+ commissioning completes there is no storage assigned, which makes
+ deployment of the node impossible.
  
  This issue has blocked all test runs performed by the CDO-QA test
  infrastructure, since every run installs MAAS on a fresh machine and
  commissions new nodes. The failure is seen when installing from either
  ppa:maas/next (2.2.0~beta2) or ppa:maas/stable (2.1.3+bzr5573).
- 
  
  ubuntu at meowth:~$ dpkg -l '*maas*'|cat
  Desired=Unknown/Install/Remove/Purge/Hold
  | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
  |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
  ||/ Name                            Version                              Architecture Description
  +++-===============================-====================================-============-=================================================
  ii  maas                            2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          "Metal as a Service" is a physical cloud and IPAM
  ii  maas-cli                        2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS client and command-line interface
  un  maas-cluster-controller         <none>                               <none>       (no description available)
  ii  maas-common                     2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS server common files
  ii  maas-dhcp                       2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS DHCP server
  ii  maas-dns                        2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS DNS server
  ii  maas-proxy                      2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS Caching Proxy
  ii  maas-rack-controller            2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          Rack Controller for MAAS
  ii  maas-region-api                 2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          Region controller API service for MAAS
  ii  maas-region-controller          2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          Region Controller for MAAS
  un  maas-region-controller-min      <none>                               <none>       (no description available)
  un  python-django-maas              <none>                               <none>       (no description available)
  un  python-maas-client              <none>                               <none>       (no description available)
  un  python-maas-provisioningserver  <none>                               <none>       (no description available)
  ii  python3-django-maas             2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS server Django web framework (Python 3)
  ii  python3-maas-client             2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS python API client (Python 3)
  ii  python3-maas-provisioningserver 2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS server provisioning libraries (Python 3)
  
  After re-commissioning one of the servers with ssh enabled, the attached
  log files were collected. Note that block devices are visible from the
  shell, and even the commissioning output found in
  /tmp/user_data.sh.IK9yVp/out/00-maas-07-block-devices lists devices (see
  attached), whereas the GUI shows this file as a 0-byte file (see
  screenshot).
  
  There are 'HTTP Error 500: INTERNAL SERVER ERROR' errors in
  cloud-init-output.log
  
  ubuntu at azurill:~$ uname -a
  Linux azurill 4.8.0-34-generic #36~16.04.1-Ubuntu SMP Wed Dec 21 18:55:08 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
  
  ubuntu at azurill:~$ sudo lsblk  --exclude 1,2,7 -d -P -o NAME,RO,RM,MODEL,ROTA
  NAME="sdb" RO="0" RM="0" MODEL="LOGICAL VOLUME  " ROTA="1"
  NAME="sdc" RO="1" RM="0" MODEL="VIRTUAL-DISK    " ROTA="1"
  NAME="sda" RO="0" RM="0" MODEL="LOGICAL VOLUME  " ROTA="1"
  NAME="nvme0n1" RO="0" RM="0" MODEL="INTEL SSDPEDME400G4

** Summary changed:

- commissioning does not discover block devices on HP ProLiant DL360 Gen9 servers
+ Commission scripts select the wrong nvme device link, then fail to report any storage

** Also affects: udev (Ubuntu)
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to udev in Ubuntu.
https://bugs.launchpad.net/bugs/1665143

Title:
  Commission scripts select the wrong nvme device link, then fail to
  report any storage

Status in MAAS:
  Fix Committed
Status in MAAS 2.1 series:
  Won't Fix
Status in udev package in Ubuntu:
  New

Bug description:
  The udev package provides
  /lib/udev/rules.d/60-persistent-storage.rules, which creates two
  symlinks for nvme devices under /dev/disk/by-id/. The first link name
  includes the device wwid and the second includes the device
  model/serial. The commission script selects the first link discovered
  and then attempts to store it in a FilePath field, which allows for
  100 characters. Because the wwid link is longer than 100 characters,
  an exception is thrown, and neither the nvme device nor any other
  storage device is registered. Although commissioning completes, no
  storage is assigned, which makes deployment of the node impossible.
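
  The selection problem described above can be illustrated with a
  minimal Python sketch. It is not the MAAS commissioning code or the
  committed fix; the function name pick_id_path, the constant
  MAX_ID_PATH_LENGTH, and the example link names are hypothetical and
  only the 100-character limit comes from this report. The idea is to
  prefer the first by-id link that fits the field instead of blindly
  taking the first one discovered.

```python
# Limit taken from the bug description: the FilePath field allows 100 characters.
MAX_ID_PATH_LENGTH = 100


def pick_id_path(links, max_length=MAX_ID_PATH_LENGTH):
    """Return the first link that fits within max_length, or None.

    `links` is an iterable of /dev/disk/by-id/ paths in discovery order.
    Returning None lets the caller skip the id path for one device
    instead of failing the whole block-device report.
    """
    for link in links:
        if len(link) <= max_length:
            return link
    return None


if __name__ == "__main__":
    # Hypothetical link names, for illustration only.
    long_wwid_link = "/dev/disk/by-id/nvme-" + "x" * 120
    model_serial_link = "/dev/disk/by-id/nvme-EXAMPLE_MODEL_EXAMPLESERIAL"

    # The over-long wwid link is skipped; the model/serial link is chosen.
    print(pick_id_path([long_wwid_link, model_serial_link]))
```

  Under these assumptions an over-long link would cost at most the id
  path of a single device rather than the registration of every
  storage device on the node.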

  This issue has blocked all test runs performed by the CDO-QA test
  infrastructure, since every run installs MAAS on a fresh machine and
  commissions new nodes. The failure is seen when installing from either
  ppa:maas/next (2.2.0~beta2) or ppa:maas/stable (2.1.3+bzr5573).

  ubuntu at meowth:~$ dpkg -l '*maas*'|cat
  Desired=Unknown/Install/Remove/Purge/Hold
  | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
  |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
  ||/ Name                            Version                              Architecture Description
  +++-===============================-====================================-============-=================================================
  ii  maas                            2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          "Metal as a Service" is a physical cloud and IPAM
  ii  maas-cli                        2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS client and command-line interface
  un  maas-cluster-controller         <none>                               <none>       (no description available)
  ii  maas-common                     2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS server common files
  ii  maas-dhcp                       2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS DHCP server
  ii  maas-dns                        2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS DNS server
  ii  maas-proxy                      2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS Caching Proxy
  ii  maas-rack-controller            2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          Rack Controller for MAAS
  ii  maas-region-api                 2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          Region controller API service for MAAS
  ii  maas-region-controller          2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          Region Controller for MAAS
  un  maas-region-controller-min      <none>                               <none>       (no description available)
  un  python-django-maas              <none>                               <none>       (no description available)
  un  python-maas-client              <none>                               <none>       (no description available)
  un  python-maas-provisioningserver  <none>                               <none>       (no description available)
  ii  python3-django-maas             2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS server Django web framework (Python 3)
  ii  python3-maas-client             2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS python API client (Python 3)
  ii  python3-maas-provisioningserver 2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS server provisioning libraries (Python 3)

  After re-commissioning one of the servers with ssh enabled, the
  attached log files were collected. Note that block devices are visible
  from the shell, and even the commissioning output found in
  /tmp/user_data.sh.IK9yVp/out/00-maas-07-block-devices lists devices
  (see attached), whereas the GUI shows this file as a 0-byte file (see
  screenshot).

  There are 'HTTP Error 500: INTERNAL SERVER ERROR' errors in
  cloud-init-output.log

  ubuntu at azurill:~$ uname -a
  Linux azurill 4.8.0-34-generic #36~16.04.1-Ubuntu SMP Wed Dec 21 18:55:08 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

  ubuntu at azurill:~$ sudo lsblk  --exclude 1,2,7 -d -P -o NAME,RO,RM,MODEL,ROTA
  NAME="sdb" RO="0" RM="0" MODEL="LOGICAL VOLUME  " ROTA="1"
  NAME="sdc" RO="1" RM="0" MODEL="VIRTUAL-DISK    " ROTA="1"
  NAME="sda" RO="0" RM="0" MODEL="LOGICAL VOLUME  " ROTA="1"
  NAME="nvme0n1" RO="0" RM="0" MODEL="INTEL SSDPEDME400G4

To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/1665143/+subscriptions


