[Bug 1665143] Re: commissioning does not discover block devices on HP ProLiant DL360 Gen9 servers
John George
john.george at canonical.com
Thu Feb 23 17:54:48 UTC 2017
Please mark this bug critical, as it blocks all integration testing.
** Description changed:
- Block devices are not discovered on all 7 of our HP ProLiant DL360 Gen9
- servers. These servers are used with daily MAAS testing, so are proven
- to normally commission and deploy. We also experienced an instance of
- this failure last week with maas 2.1.3.
+ The udev package provides /lib/udev/rules.d/60-persistent-storage.rules
+ which creates two symlinks for nvme devices, under /dev/disk/by-id/. The
+ first link name includes the device wwid and the second includes the
+ device model/serial. The commission script select the first link
+ discovered and subsequently attempts to store it in a FilePath field,
+ which allows for 100 characters. Since the wwid link is greater than 100
+ characters an exception is thrown, causing not only the nvme device not
+ to be registered but all other storage devices as well.
+
+ This issue has blocked all test runs performed by the CDO-QA test
+ infrastructure, since every run installs MAAS on a fresh machine and
+ commissions new nodes. The failure is seen when installing from either
+ ppa:maas/next (2.2.0~beta2) or ppa:maas/stable (2.1.3+bzr5573).
+
ubuntu@meowth:~$ dpkg -l '*maas*' | cat
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                            Version                              Architecture Description
+++-===============================-====================================-============-=================================================
ii  maas                            2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          "Metal as a Service" is a physical cloud and IPAM
ii  maas-cli                        2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS client and command-line interface
un  maas-cluster-controller         <none>                               <none>       (no description available)
ii  maas-common                     2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS server common files
ii  maas-dhcp                       2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS DHCP server
ii  maas-dns                        2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS DNS server
ii  maas-proxy                      2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS Caching Proxy
ii  maas-rack-controller            2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          Rack Controller for MAAS
ii  maas-region-api                 2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          Region controller API service for MAAS
ii  maas-region-controller          2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          Region Controller for MAAS
un  maas-region-controller-min      <none>                               <none>       (no description available)
un  python-django-maas              <none>                               <none>       (no description available)
un  python-maas-client              <none>                               <none>       (no description available)
un  python-maas-provisioningserver  <none>                               <none>       (no description available)
ii  python3-django-maas             2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS server Django web framework (Python 3)
ii  python3-maas-client             2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS python API client (Python 3)
ii  python3-maas-provisioningserver 2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS server provisioning libraries (Python 3)
After re-commissioning one of the servers with ssh enabled, the attached
log files were collected. From the shell it can be seen that block
devices are discovered, and even the commissioning output found in
/tmp/user_data.sh.IK9yVp/out/00-maas-07-block-devices lists devices
(see attached), whereas this file is shown as a 0-byte file in the GUI
(see screenshot).
There are 'HTTP Error 500: INTERNAL SERVER ERROR' errors in
cloud-init-output.log.
ubuntu@azurill:~$ uname -a
Linux azurill 4.8.0-34-generic #36~16.04.1-Ubuntu SMP Wed Dec 21 18:55:08 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
- ubuntu@azurill:~$ lsblk
- NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
- sdb 8:16 0 465.7G 0 disk
- sdc 8:32 0 179.1M 1 disk /media/root-ro
- sda 8:0 0 465.7G 0 disk
- ├─sda2 8:2 0 465.2G 0 part
- └─sda1 8:1 0 512M 0 part
+ ubuntu@azurill:~$ sudo lsblk --exclude 1,2,7 -d -P -o NAME,RO,RM,MODEL,ROTA
+ NAME="sdb" RO="0" RM="0" MODEL="LOGICAL VOLUME " ROTA="1"
+ NAME="sdc" RO="1" RM="0" MODEL="VIRTUAL-DISK " ROTA="1"
+ NAME="sda" RO="0" RM="0" MODEL="LOGICAL VOLUME " ROTA="1"
+ NAME="nvme0n1" RO="0" RM="0" MODEL="INTEL SSDPEDME400G4
** Description changed:
The udev package provides /lib/udev/rules.d/60-persistent-storage.rules
which creates two symlinks for nvme devices, under /dev/disk/by-id/. The
first link name includes the device wwid and the second includes the
- device model/serial. The commission script select the first link
+ device model/serial. The commission script selects the first link
discovered and subsequently attempts to store it in a FilePath field,
which allows for 100 characters. Since the wwid link is greater than 100
characters an exception is thrown, causing not only the nvme device not
- to be registered but all other storage devices as well.
+ to be registered but all other storage devices as well. Although
+ commissioning completes there is no storage assigned, which makes
+ deployment of the node impossible.
This issue has blocked all test runs performed by the CDO-QA test
infrastructure, since every run installs MAAS on a fresh machine and
commissions new nodes. The failure is seen when installing from either
ppa:maas/next (2.2.0~beta2) or ppa:maas/stable (2.1.3+bzr5573).
-
** Summary changed:
- commissioning does not discover block devices on HP ProLiant DL360 Gen9 servers
+ Commission scripts select the wrong nvme device link, then fail to report any storage
** Also affects: udev (Ubuntu)
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to udev in Ubuntu.
https://bugs.launchpad.net/bugs/1665143
Title:
Commission scripts select the wrong nvme device link, then fail to
report any storage
Status in MAAS:
Fix Committed
Status in MAAS 2.1 series:
Won't Fix
Status in udev package in Ubuntu:
New
Bug description:
The udev package provides
/lib/udev/rules.d/60-persistent-storage.rules, which creates two
symlinks for each nvme device under /dev/disk/by-id/. The first link
name includes the device wwid and the second includes the device
model/serial. The commission script selects the first link discovered
and then attempts to store it in a FilePath field, which allows for
100 characters. Because the wwid link is longer than 100 characters,
an exception is thrown, so neither the nvme device nor any other
storage device is registered. Although commissioning completes, no
storage is assigned, which makes deployment of the node impossible.
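As an illustration of the failure mode described above, the following minimal Python sketch picks the first /dev/disk/by-id/ link that fits within the 100-character limit, instead of taking the first link unconditionally. It is not the MAAS fix: the function name pick_id_path, the constant MAX_ID_PATH_LENGTH, and both example link names are hypothetical.

```python
MAX_ID_PATH_LENGTH = 100  # limit quoted in the bug description


def pick_id_path(links, max_length=MAX_ID_PATH_LENGTH):
    """Return the first link no longer than max_length, or None if none fits."""
    for link in links:
        if len(link) <= max_length:
            return link
    return None


# Made-up link names, shaped like the two kinds of symlink described above:
# a long wwid-style name and a shorter model/serial-style name.
wwid_link = "/dev/disk/by-id/nvme-nvme.8086-" + "ab" * 60
model_serial_link = "/dev/disk/by-id/nvme-INTEL_SSDPEDME400G4_EXAMPLESERIAL"

chosen = pick_id_path([wwid_link, model_serial_link])
print(chosen)
```

With these inputs the wwid-style link is skipped because it exceeds the limit, and the model/serial-style link is returned. Returning None when nothing fits lets the caller register the device without an id path rather than fail for every device.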
This issue has blocked all test runs performed by the CDO-QA test
infrastructure, since every run installs MAAS on a fresh machine and
commissions new nodes. The failure is seen when installing from either
ppa:maas/next (2.2.0~beta2) or ppa:maas/stable (2.1.3+bzr5573).
ubuntu@meowth:~$ dpkg -l '*maas*' | cat
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                            Version                              Architecture Description
+++-===============================-====================================-============-=================================================
ii  maas                            2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          "Metal as a Service" is a physical cloud and IPAM
ii  maas-cli                        2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS client and command-line interface
un  maas-cluster-controller         <none>                               <none>       (no description available)
ii  maas-common                     2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS server common files
ii  maas-dhcp                       2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS DHCP server
ii  maas-dns                        2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS DNS server
ii  maas-proxy                      2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS Caching Proxy
ii  maas-rack-controller            2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          Rack Controller for MAAS
ii  maas-region-api                 2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          Region controller API service for MAAS
ii  maas-region-controller          2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          Region Controller for MAAS
un  maas-region-controller-min      <none>                               <none>       (no description available)
un  python-django-maas              <none>                               <none>       (no description available)
un  python-maas-client              <none>                               <none>       (no description available)
un  python-maas-provisioningserver  <none>                               <none>       (no description available)
ii  python3-django-maas             2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS server Django web framework (Python 3)
ii  python3-maas-client             2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS python API client (Python 3)
ii  python3-maas-provisioningserver 2.2.0~beta2+bzr5717-0ubuntu1~16.04.1 all          MAAS server provisioning libraries (Python 3)
After re-commissioning one of the servers with ssh enabled, the
attached log files were collected. From the shell it can be seen that
block devices are discovered, and even the commissioning output found
in /tmp/user_data.sh.IK9yVp/out/00-maas-07-block-devices lists devices
(see attached), whereas this file is shown as a 0-byte file in the GUI
(see screenshot).
There are 'HTTP Error 500: INTERNAL SERVER ERROR' errors in
cloud-init-output.log.
ubuntu@azurill:~$ uname -a
Linux azurill 4.8.0-34-generic #36~16.04.1-Ubuntu SMP Wed Dec 21 18:55:08 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
ubuntu@azurill:~$ sudo lsblk --exclude 1,2,7 -d -P -o NAME,RO,RM,MODEL,ROTA
NAME="sdb" RO="0" RM="0" MODEL="LOGICAL VOLUME " ROTA="1"
NAME="sdc" RO="1" RM="0" MODEL="VIRTUAL-DISK " ROTA="1"
NAME="sda" RO="0" RM="0" MODEL="LOGICAL VOLUME " ROTA="1"
NAME="nvme0n1" RO="0" RM="0" MODEL="INTEL SSDPEDME400G4
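The key="value" pairs that lsblk -P prints above can be split with Python's shlex module. This is a minimal sketch, not necessarily how the commissioning script does it: parse_lsblk_pairs is a hypothetical helper, and it assumes that a line with an unbalanced quote (such as one cut off mid-value) should be skipped rather than abort the whole run.

```python
import shlex

# Sample input copied from the lsblk output above; the last line is cut off.
SAMPLE = """\
NAME="sdb" RO="0" RM="0" MODEL="LOGICAL VOLUME " ROTA="1"
NAME="sdc" RO="1" RM="0" MODEL="VIRTUAL-DISK " ROTA="1"
NAME="sda" RO="0" RM="0" MODEL="LOGICAL VOLUME " ROTA="1"
NAME="nvme0n1" RO="0" RM="0" MODEL="INTEL SSDPEDME400G4
"""


def parse_lsblk_pairs(output):
    """Turn `lsblk -P` output into a list of dicts, one per complete line."""
    devices = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            # shlex honours the double quotes, so values may contain spaces.
            tokens = shlex.split(line)
        except ValueError:
            # Unbalanced quote (assumed to mean a truncated line): skip it.
            continue
        device = {}
        for token in tokens:
            key, sep, value = token.partition("=")
            if sep:
                device[key] = value
        devices.append(device)
    return devices


devices = parse_lsblk_pairs(SAMPLE)
for device in devices:
    print(device["NAME"], device["MODEL"].strip())
```

Skipping a malformed line keeps one bad record from discarding the rest, which is the opposite of the behaviour reported in this bug, where a single over-long link prevented every storage device from being registered.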
To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/1665143/+subscriptions