[Bug 1986747] Re: [quincy] invalid osd_class_dir blocks rados client connections

Dan Hill 1986747 at bugs.launchpad.net
Wed Aug 17 19:31:40 UTC 2022


While attempting to recreate this issue, we noticed that a new
Jammy/Quincy installation loads the correct library files!

The freshly deployed Quincy reports the incorrect `osd_class_dir` (lib/x86_64-linux-gnu/rados-classes), but lsof shows the ceph-osd process is loading rados-classes libraries from /usr/lib:
ceph-osd   735836                      64045  mem       REG                8,2     530688   35521076 /usr/lib/x86_64-linux-gnu/rados-classes/libcls_rbd.so.1.0.0

How is this happening?

It turns out that the correct library file is loaded by accident. New
Jammy installations have /lib symlinked to /usr/lib. The ceph-osd
processes have a working directory of "/" so the relative "lib/" path
follows the symlink to the correct "/usr/lib" location.
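
The effect can be reproduced in a scratch directory. The sketch below is only an illustration, not how ceph-osd itself loads classes: the helper names `resolve_relative` and `demo` are made up here, and a temporary directory stands in for "/".

```python
import os
import tempfile


def resolve_relative(root: str, rel_path: str) -> str:
    """Resolve rel_path the way a process with working directory `root` would."""
    return os.path.realpath(os.path.join(root, rel_path))


def demo() -> tuple:
    """Return (found_with_symlink, found_without_symlink) for a fake root."""
    rel = "lib/x86_64-linux-gnu/rados-classes"
    results = []
    for merged in (True, False):
        with tempfile.TemporaryDirectory() as root:
            # The class libraries always live under usr/lib in this fake root.
            os.makedirs(os.path.join(root, "usr", rel))
            if merged:
                # merged-usr layout: lib is a relative symlink to usr/lib
                os.symlink("usr/lib", os.path.join(root, "lib"))
            results.append(os.path.isdir(resolve_relative(root, rel)))
    return tuple(results)
```

With the symlink in place the relative path lands in the real directory; without it the lookup fails, which matches the behaviour described above.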

Since 19.04, merged-usr has been the default for new installations
[0]. The merged-usr feature [1] is what creates the /lib -> /usr/lib
symlink.

The key point is that these symlinks are not created on hosts that have been upgraded:
"Existing systems, upon upgrade, will not be reconfigured for merged-usr."

Hosts without the /lib -> /usr/lib symlink are exposed to the issue
described in this bug. This happens when the host was originally
installed with a release that predates 19.04 (Disco Dingo).
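
A quick way to tell whether a given host is exposed is to check for that symlink. This is a minimal sketch: `is_merged_usr` is a hypothetical helper name, and the `root` parameter exists only so the check can be pointed at a test directory instead of "/".

```python
import os


def is_merged_usr(root: str = "/") -> bool:
    """Return True if <root>/lib is a symlink that resolves to <root>/usr/lib."""
    lib = os.path.join(root, "lib")
    usr_lib = os.path.join(root, "usr", "lib")
    return os.path.islink(lib) and os.path.realpath(lib) == os.path.realpath(usr_lib)


if __name__ == "__main__":
    if is_merged_usr():
        print("merged-usr layout: relative lib/ paths resolve under /usr/lib")
    else:
        print("no /lib -> /usr/lib symlink: host is exposed to this bug")
```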

[0] https://lists.ubuntu.com/archives/ubuntu-devel-announce/2018-November/001253.html
[1] https://www.freedesktop.org/wiki/Software/systemd/TheCaseForTheUsrMerge/

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1986747

Title:
  [quincy] invalid osd_class_dir blocks rados client connections

Status in Ubuntu Cloud Archive:
  New
Status in Ubuntu Cloud Archive yoga series:
  New
Status in ceph package in Ubuntu:
  Confirmed
Status in ceph source package in Jammy:
  Confirmed
Status in ceph source package in Kinetic:
  Confirmed

Bug description:
  Ubuntu packaging is configuring `osd_class_dir` with a relative path
  `CMAKE_INSTALL_LIBDIR` instead of the required absolute path
  `CMAKE_INSTALL_FULL_LIBDIR` [0].
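
  The difference between the two variables can be sketched as follows. This mimics, in Python, the general rule from the GNUInstallDirs documentation (a relative directory is made absolute by prepending the install prefix). The function name `full_libdir` is made up for illustration, and the "/usr" prefix is an assumption inferred from the library path seen on affected hosts.

```python
import posixpath


def full_libdir(prefix: str, libdir: str) -> str:
    """Approximate CMAKE_INSTALL_FULL_LIBDIR: prepend the prefix unless already absolute."""
    return libdir if posixpath.isabs(libdir) else posixpath.join(prefix, libdir)


libdir = "lib/x86_64-linux-gnu"  # relative, like CMAKE_INSTALL_LIBDIR
prefix = "/usr"                  # assumed install prefix

# What the packaging ends up with (relative, depends on the working directory):
class_dir_relative = posixpath.join(libdir, "rados-classes")

# What is required (absolute, independent of the working directory):
class_dir_absolute = posixpath.join(full_libdir(prefix, libdir), "rados-classes")
```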

  The default value for `osd_class_dir` changed in Quincy, starting with
  v17.1.0 [1].

  The ceph-osd service relies on the `osd_class_dir` path to find and load class libraries that extend RADOS [2]. When this is set incorrectly, RADOS clients fail with repeated "Operation not supported" errors (r=-95):
  ```
  2022-08-16T17:42:15.044+0000 7fe375685e40 0 rgw main: ERROR: failed reading data (obj=default.rgw.log:bucket.sync-target-hints.), r=-95
  2022-08-16T17:42:15.048+0000 7fe375685e40 0 rgw main: ERROR: failed to read targets index for bucket=:[]) r=-95
  2022-08-16T17:42:15.048+0000 7fe375685e40 0 rgw main: ERROR: failed to initialize bucket sync policy handler: get_bucket_sync_hints() on bucket=-- returned r=-95
  2022-08-16T17:42:15.048+0000 7fe375685e40 -1 rgw main: ERROR: could not initialize zone policy handler for zone=default
  2022-08-16T17:42:15.048+0000 7fe375685e40 0 rgw main: ERROR: failed to start notify service ((95) Operation not supported
  2022-08-16T17:42:15.048+0000 7fe375685e40 0 rgw main: ERROR: failed to init services (ret=(95) Operation not supported)
  ```

  The ceph-osd service will also report `_load_class` errors:
  ```
  2022-08-16T19:05:55.562+0000 7f4770ff9700 0 _load_class could not stat class lib/x86_64-linux-gnu/rados-classes/libcls_rbd.so: (2) No such file or directory
  ```
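
  To spot this symptom in an OSD log, one option is to look for `_load_class` errors whose path is not absolute. The sketch below assumes the message format shown above; the regular expression and the function name `relative_class_paths` are illustrative and not part of Ceph.

```python
import os
import re

# Matches the path in "_load_class could not stat class <path>: (<errno>) ..."
LOAD_CLASS_ERROR = re.compile(r"_load_class could not stat class (\S+): ")


def relative_class_paths(log_lines):
    """Return class library paths from _load_class errors that are not absolute."""
    found = []
    for line in log_lines:
        match = LOAD_CLASS_ERROR.search(line)
        if match and not os.path.isabs(match.group(1)):
            found.append(match.group(1))
    return found
```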

  Admins can resolve this issue by manually setting `osd_class_dir` to the correct value. Run the following command on a ceph-mon:
  ```
  sudo ceph config set global osd_class_dir /usr/lib/x86_64-linux-gnu/rados-classes
  ```

  Then restart all ceph-osd services to pick up the new `osd_class_dir`
  location.
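
  Before applying the setting, it may be worth confirming on an OSD host that the chosen directory is absolute and actually holds class libraries. This is a rough sanity check, not an official procedure: `looks_like_class_dir` is a hypothetical helper, and the `libcls_*.so` pattern is inferred from the library name in the `_load_class` error above.

```python
import glob
import os


def looks_like_class_dir(path: str) -> bool:
    """Return True if path is absolute and contains at least one libcls_*.so file."""
    if not os.path.isabs(path):
        return False
    return bool(glob.glob(os.path.join(path, "libcls_*.so")))


if __name__ == "__main__":
    candidate = "/usr/lib/x86_64-linux-gnu/rados-classes"
    print(candidate, "ok" if looks_like_class_dir(candidate) else "not usable")
```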

  [0] https://cmake.org/cmake/help/v3.24/module/GNUInstallDirs.html#result-variables
  [1] https://github.com/ceph/ceph/commit/3bee4b02611459b9ae949cebf5967e4d83ef55de
  [2] https://docs.ceph.com/en/latest/dev/osd-class-path/
