[Bug 2121812] Re: [SRU] cinder-netapp driver fails to start when a node is down
Nick Rosbrook
2121812 at bugs.launchpad.net
Wed Oct 15 12:57:53 UTC 2025
Hello Mostafa, or anyone else affected,
Accepted cinder into noble-proposed. The package will build now and be
available at
https://launchpad.net/ubuntu/+source/cinder/2:24.2.0-0ubuntu2 in a few
hours, and then in the -proposed repository.
Please help us by testing this new package. See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Your feedback will aid us in getting this
update out to other Ubuntu users.
If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, what testing has been
performed on the package and change the tag from
verification-needed-noble to verification-done-noble. If it does not fix
the bug for you, please add a comment stating that, and change the tag to
verification-failed-noble. In either case, without details of your
testing we will not be able to proceed.
Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in
advance for helping!
N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.
** Changed in: cinder (Ubuntu Noble)
Status: New => Fix Committed
** Tags added: verification-needed-noble
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/2121812
Title:
[SRU] cinder-netapp driver fails to start when a node is down
Status in Cinder:
In Progress
Status in Ubuntu Cloud Archive:
New
Status in Ubuntu Cloud Archive caracal series:
New
Status in Ubuntu Cloud Archive epoxy series:
New
Status in cinder package in Ubuntu:
Fix Released
Status in cinder source package in Noble:
Fix Committed
Status in cinder source package in Plucky:
Fix Committed
Status in cinder source package in Questing:
Fix Released
Bug description:
We're trying to use cinder-netapp, and the driver fails to start with the following error.
https://pastebin.ubuntu.com/p/dwMdrVwdtf/
After enabling debugging, I noticed that one node doesn't report its model.
Checking on the NetApp side, that node is down due to a hardware issue, and another node is taking over for it.
If I add `or ''` to get_cluster_nodes_info, it works fine.
In the current setup, if any node goes down and the driver restarts,
Cinder won't be able to recover.
***************************
[SRU]
[Impact]
The cinder-volume service fails to start when the cinder-netapp driver is used and some NetApp nodes are in maintenance state.
During driver initialization, the service queries the NetApp server for information about all nodes. Typically the node
name, model, and certain other attributes are expected. However, if a node is in maintenance mode, the
model information is missing.
The cinder-volume service does not properly handle a model value of None, so the service goes into a failed state.
The fix checks whether the model value is None and assigns an empty string as the default value. In addition, a warning message is
logged for missing model values.
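
As a rough illustration of the behaviour described above (not the actual cinder patch), the following sketch shows how a None model triggers the TypeError and how defaulting to an empty string with a warning avoids it. The helper names normalize_node_model and is_aff_model, the dictionary layout, and the model string "AFF-A400" are assumptions made up for this example.

```python
import logging

LOG = logging.getLogger(__name__)


def normalize_node_model(node):
    """Return the node's model as a string, using '' when it is missing."""
    model = node.get("model")
    if model is None:
        # Mirrors the described fix: warn and fall back to an empty string.
        LOG.warning("Node %s reported no model; using an empty string.",
                    node.get("name"))
        model = ""
    return model


def is_aff_model(model):
    # Hypothetical membership test of the kind that raises
    # "TypeError: argument of type 'NoneType' is not iterable"
    # when model is None.
    return "AFF" in model


# Illustrative node records; the second one stands in for a node that is down.
nodes = [
    {"name": "node-01", "model": "AFF-A400"},
    {"name": "node-02", "model": None},
]

models = [normalize_node_model(n) for n in nodes]
flags = [is_aff_model(m) for m in models]
```

Calling is_aff_model(None) directly raises the TypeError, which is the unpatched situation; passing the value through normalize_node_model first keeps the check working and leaves the result unchanged for nodes that do report a model.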
[Test Case]
Testing the bug normally requires NetApp storage nodes. Instead, we developed a
small Python script that responds to the couple of NetApp requests that
are required to reproduce the bug.
Here are the reproducer steps:
1. Deploy regress-stack (https://github.com/canonical/regress-stack)
2. Install the packages required to set up the cinder service
sudo snap install astral-uv --classic
sudo apt-get update
uvx pre-commit install
sudo apt install dpkg-dev python3-dev python-apt-dev -y
uv sync
sudo apt install ceph mysql-server rabbitmq-server keystone cinder-api cinder-scheduler cinder-volume -y
3. Run the regress-stack setup step
uv run regress-stack setup
4. Simulate the NetApp server
The simulation code responds to a couple of initial NetApp requests, and you can
see at L#33 that one of the nodes has no model information.
Python code: https://pastebin.ubuntu.com/p/7pBSXzFSGY/
python <netapp.py>
5. Update cinder.conf to add the NetApp configuration
https://pastebin.ubuntu.com/p/xK52rT5f7V/ (modified cinder configs)
systemctl restart cinder-volume.service
6. Check the cinder-volume logs
Non-Working case:
Should fail with error `TypeError: argument of type 'NoneType' is not iterable`
Working case:
Should see a printout in logs: `Reported ONTAPI Version: 1.261`
(At this point, code execution has passed the point where the bug occurs)
Note: The service will still fail to start, with the following error, because the
simulation does not support all the NetApp API calls.
cinder.volume.drivers.netapp.dataontap.client.api.NaApiError:
NetApp API failed. Reason - 400:BAD REQUEST
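
The log check in step 6 can be sketched as a small helper that classifies a log excerpt by the two messages quoted above. The function name classify_log and its return values are assumptions for illustration; how the log text is obtained is left to the tester.

```python
# Messages quoted in the test case above.
FAIL_MARKER = "TypeError: argument of type 'NoneType' is not iterable"
PASS_MARKER = "Reported ONTAPI Version"


def classify_log(text):
    """Return 'failed', 'fixed', or 'unknown' for a cinder-volume log excerpt."""
    if FAIL_MARKER in text:
        return "failed"
    if PASS_MARKER in text:
        return "fixed"
    return "unknown"
```

Only the log output produced after the most recent service restart should be passed in; otherwise an older TypeError would mask a later successful run. The NaApiError mentioned in the note is expected with the simulation and is deliberately not treated as a failure here.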
[Regression Potential]
To mitigate regression potential, the fix has been tested with real hardware for jammy caracal. The unit tests have also been updated to cover the bug. The existing logic is unchanged when the NetApp server provides a model.
[Discussion]
n/a
To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/2121812/+subscriptions