[Bug 1931004] Re: Add support for Pacific to RBD driver
OpenStack Infra
1931004 at bugs.launchpad.net
Fri Sep 17 15:59:42 UTC 2021
Reviewed: https://review.opendev.org/c/openstack/cinder/+/808475
Committed: https://opendev.org/openstack/cinder/commit/07ead73eec0ac6b962b533b07861d6a81226fa37
Submitter: "Zuul (22348)"
Branch: stable/victoria
commit 07ead73eec0ac6b962b533b07861d6a81226fa37
Author: Jon Bernard <jobernar at redhat.com>
Date: Wed Apr 14 11:14:13 2021 -0400
RBD: use correct stripe unit in clone operation
The recent release of Ceph Pacific saw a change to the clone() logic
where invalid values of stripe unit would cause an error to be returned
where previous versions would correct the value at runtime. This
becomes a problem when creating a volume from an image, where the source
RBD image may have a larger stripe unit than cinder's RBD driver is
configured for. When this happens, clone() is called with a stripe unit
that is too small given that of the source image and the clone fails.
The RBD driver in Cinder has a configuration parameter
'rbd_store_chunk_size' that stores the preferred object size for cloned
images. If clone() is called without a stripe_unit passed in, the
stripe unit defaults to the object size, which is 4MB by default. The
issue arises when creating a volume from a Glance image, where Glance is
creating images with a default stripe unit of 8MB (distinctly larger
than that of Cinder). If we do not consider the incoming stripe unit
and select the larger of the two, Ceph cannot clone an RBD image with a
smaller stripe unit and raises an error.
This patch adds a function in our driver's clone logic to select the
larger of the two stripe unit values so that the appropriate stripe unit
is chosen.
It should also be noted that we're determining the correct stripe unit,
but using the 'order' argument to clone(). Ceph will set the stripe
unit equal to the object size (order) by default and we rely on this
behaviour for the following reason: passing stripe-unit alone or with
an order argument causes an invalid argument exception to be raised in
pre-Pacific releases of Ceph, as its argument parsing appears to have
limitations.
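
As an illustration of the selection described in this commit message, here is a minimal sketch. It is not the code from the patch: the helper names select_stripe_unit and order_from_size are hypothetical, and the sketch assumes the configured chunk size is given in MiB and the parent's stripe unit in bytes, both powers of two.

```python
MiB = 1024 ** 2


def select_stripe_unit(configured_chunk_size_mib, parent_stripe_unit):
    """Return the larger of the configured chunk size and the parent's
    stripe unit, both expressed in bytes (hypothetical helper)."""
    return max(configured_chunk_size_mib * MiB, parent_stripe_unit)


def order_from_size(size_bytes):
    """Return log2(size_bytes), i.e. the 'order' value for a clone.

    Assumes size_bytes is a positive power of two.
    """
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("size must be a positive power of two")
    return size_bytes.bit_length() - 1


# Cinder configured for 4 MiB, source image striped at 8 MiB:
# the larger value wins, so the clone would use order 23.
order = order_from_size(select_stripe_unit(4, 8 * MiB))
```

Only the resulting order would be passed on to clone(), which matches the commit's note that stripe unit is left to default to the object size.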
Closes-Bug: #1931004
Change-Id: Iec111ab83e9ed8182c9679c911e3d90927d5a7c3
(cherry picked from commit 49a2c85eda9fd3cddc75fd904fe62c87a6b50735)
(cherry picked from commit 5db58159feec3d2d39d1abf3637310f5ac60a3cf)
Conflicts:
cinder/tests/unit/volume/drivers/test_rbd.py
** Tags added: in-stable-victoria
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to glance in Ubuntu.
https://bugs.launchpad.net/bugs/1931004
Title:
Add support for Pacific to RBD driver
Status in Cinder:
Fix Released
Status in Ubuntu Cloud Archive:
New
Status in Ubuntu Cloud Archive wallaby series:
New
Status in Ubuntu Cloud Archive xena series:
New
Status in glance package in Ubuntu:
Confirmed
Status in glance source package in Hirsute:
Confirmed
Status in glance source package in Impish:
Confirmed
Bug description:
When using ceph pacific, volume-from-image operations where both
glance and cinder are configured to use RBD result in an exception
when calling clone():
rbd.InvalidArgument: [errno 22] RBD invalid argument (error
creating clone)
ERROR cinder.volume.manager Traceback (most recent call last):
ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task
ERROR cinder.volume.manager result = task.execute(**arguments)
ERROR cinder.volume.manager File "/opt/stack/cinder/cinder/volume/flows/manager/create_volume.py", line 1132, in execute
ERROR cinder.volume.manager model_update = self._create_from_image(context,
ERROR cinder.volume.manager File "/opt/stack/cinder/cinder/utils.py", line 638, in _wrapper
ERROR cinder.volume.manager return r.call(f, *args, **kwargs)
ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 411, in call
ERROR cinder.volume.manager return self.__call__(*args, **kwargs)
ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 423, in __call__
ERROR cinder.volume.manager do = self.iter(retry_state=retry_state)
ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 360, in iter
ERROR cinder.volume.manager return fut.result()
ERROR cinder.volume.manager File "/usr/lib64/python3.9/concurrent/futures/_base.py", line 438, in result
ERROR cinder.volume.manager return self.__get_result()
ERROR cinder.volume.manager File "/usr/lib64/python3.9/concurrent/futures/_base.py", line 390, in __get_result
ERROR cinder.volume.manager raise self._exception
ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 426, in __call__
ERROR cinder.volume.manager result = fn(*args, **kwargs)
ERROR cinder.volume.manager File "/opt/stack/cinder/cinder/volume/flows/manager/create_volume.py", line 998, in _create_from_image
ERROR cinder.volume.manager model_update, cloned = self.driver.clone_image(context,
ERROR cinder.volume.manager File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 1571, in clone_image
ERROR cinder.volume.manager volume_update = self._clone(volume, pool, image, snapshot)
ERROR cinder.volume.manager File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 1023, in _clone
ERROR cinder.volume.manager self.RBDProxy().clone(src_client.ioctx,
ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/eventlet/tpool.py", line 190, in doit
ERROR cinder.volume.manager result = proxy_call(self._autowrap, f, *args, **kwargs)
ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/eventlet/tpool.py", line 148, in proxy_call
ERROR cinder.volume.manager rv = execute(f, *args, **kwargs)
ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/eventlet/tpool.py", line 129, in execute
ERROR cinder.volume.manager six.reraise(c, e, tb)
ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/six.py", line 719, in reraise
ERROR cinder.volume.manager raise value
ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/eventlet/tpool.py", line 83, in tworker
ERROR cinder.volume.manager rv = meth(*args, **kwargs)
ERROR cinder.volume.manager File "rbd.pyx", line 698, in rbd.RBD.clone
ERROR cinder.volume.manager rbd.InvalidArgument: [errno 22] RBD invalid argument (error creating clone)
ERROR cinder.volume.manager
In Pacific a check was added to make sure during a clone operation
that the child's stripe unit is not less than that of its parent.
Failing this condition returns -EINVAL, which is then raised by
python-rbd as an exception. This maps to the 'order' argument in
clone(), where order is log base 2 of the stripe unit. Ceph's default
is 4 megabytes. The reason we're seeing EINVAL exceptions in the
Pacific CI is that when OpenStack is configured to use Ceph for both
cinder and glance, volume-from-image tests fail because Glance's
default stripe unit is 8 MB (distinctly larger than Cinder's 4 MB).
Cinder's 4 MB results in an order calculation of 22, which is invalid
for clone() (too small) when the parent uses 8 MB.
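
The arithmetic can be checked with a short sketch. This is only a model of the check as described above, not Ceph's actual implementation; the function names order_of and check_clone are hypothetical, and sizes are assumed to be powers of two in bytes.

```python
import errno

MiB = 1024 ** 2


def order_of(size_bytes):
    """log2 of a power-of-two size in bytes."""
    return size_bytes.bit_length() - 1


def check_clone(child_order, parent_stripe_unit):
    """Model of the described check: reject a child whose stripe unit
    (2 ** child_order bytes) is smaller than the parent's."""
    child_stripe_unit = 1 << child_order
    if child_stripe_unit < parent_stripe_unit:
        return -errno.EINVAL
    return 0


cinder_order = order_of(4 * MiB)   # 22
glance_order = order_of(8 * MiB)   # 23

# Cloning an 8 MiB-striped parent with order 22 is rejected,
# while order 23 passes the check.
rejected = check_clone(cinder_order, 8 * MiB)
accepted = check_clone(glance_order, 8 * MiB)
```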
I see two possible solutions and have proposed patches:
1. Increase Cinder's default chunk size to match Glance's. I think
this makes sense for both consistency and performance.
2. When doing a clone(), consider the configured chunk size /and/ the
stripe unit of the parent volume and choose the higher value.
Either of these approaches prevents the failures we're seeing, and I
think they are both useful individually as well.
To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/1931004/+subscriptions