[Bug 2112137] Re: ceph-dashboard crashes ceph-mgr when listing images

Fri Aug 15 16:07:51 UTC 2025

This is the output of ceph crash:

root@juju-206dbb-5-lxd-1:~# ceph crash ls
ID                                                                ENTITY                     NEW  
2025-08-15T14:54:50.871612Z_04daea1c-7333-4e54-b02f-99c7c9224f54  mgr.juju-206dbb-10-lxd-10   *   
root@juju-206dbb-5-lxd-1:~# ceph crash info 2025-08-15T14:54:50.871612Z_04daea1c-7333-4e54-b02f-99c7c9224f54
{
    "assert_condition": "object_diff_state.size() == end_object_no - start_object_no",
    "assert_file": "./src/librbd/api/DiffIterate.cc",
    "assert_func": "int librbd::api::DiffIterate<ImageCtxT>::execute() [with ImageCtxT = librbd::ImageCtx]",
    "assert_line": 341,
    "assert_msg": "./src/librbd/api/DiffIterate.cc: In function 'int librbd::api::DiffIterate<ImageCtxT>::execute() [with ImageCtxT = librbd::ImageCtx]' thread 7f6373070640 time 2025-08-15T14:54:50.867562+0000\n./src/librbd/api/DiffIterate.cc: 341: FAILED ceph_assert(object_diff_state.size() == end_object_no - start_object_no)\n",
    "assert_thread_name": "dashboard",
    "backtrace": [
        "/lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f63d1f7f520]",
        "pthread_kill()",
        "raise()",
        "abort()",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x182) [0x7f63d27431e9]",
        "/usr/lib/x86_64-linux-gnu/ceph/libceph-common.so.2(+0x26b34b) [0x7f63d274334b]",
        "/lib/x86_64-linux-gnu/librbd.so.1(+0x1805de) [0x7f63c36d25de]",
        "/lib/x86_64-linux-gnu/librbd.so.1(+0x18148b) [0x7f63c36d348b]",
        "rbd_diff_iterate2()",
        "/lib/python3/dist-packages/rbd.cpython-310-x86_64-linux-gnu.so(+0x6383c) [0x7f63c3e1f83c]",
        "/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0xeb001) [0x7f63d32c0001]",
        "PyVectorcall_Call()",
        "/lib/python3/dist-packages/rbd.cpython-310-x86_64-linux-gnu.so(+0x44dc5) [0x7f63c3e00dc5]",
        "_PyObject_MakeTpCall()",
        "/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0xe242a) [0x7f63d32b742a]",
        "_PyEval_EvalFrameDefault()",
        "/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0x1c24df) [0x7f63d33974df]",
        "/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0xe23e8) [0x7f63d32b73e8]",
        "_PyEval_EvalFrameDefault()",
        "/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0x1c24df) [0x7f63d33974df]",
        "/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0xe23e8) [0x7f63d32b73e8]",
        "_PyEval_EvalFrameDefault()",
        "/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0x1c24df) [0x7f63d33974df]",
        "/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0xe23e8) [0x7f63d32b73e8]",
        "_PyEval_EvalFrameDefault()",
        "/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0x1c24df) [0x7f63d33974df]",
        "/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0xe23e8) [0x7f63d32b73e8]",
        "_PyEval_EvalFrameDefault()",
        "/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0x1c24df) [0x7f63d33974df]",
        "/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0xe23e8) [0x7f63d32b73e8]",
        "_PyEval_EvalFrameDefault()"
    ],
    "ceph_version": "18.2.4",
    "crash_id": "2025-08-15T14:54:50.871612Z_04daea1c-7333-4e54-b02f-99c7c9224f54",
    "entity_name": "mgr.juju-206dbb-10-lxd-10",
    "os_id": "22.04",
    "os_name": "Ubuntu 22.04.5 LTS",
    "os_version": "22.04.5 LTS (Jammy Jellyfish)",
    "os_version_id": "22.04",
    "process_name": "ceph-mgr",
    "stack_sig": "ac76522f812876d6f5a64eab765c475ae18c51a958a9708dd53ee8a13b98cf66",
    "timestamp": "2025-08-15T14:54:50.871612Z",
    "utsname_hostname": "juju-206dbb-10-lxd-10",
    "utsname_machine": "x86_64",
    "utsname_release": "5.15.0-151-generic",
    "utsname_sysname": "Linux",
    "utsname_version": "#161-Ubuntu SMP Tue Jul 22 14:25:40 UTC 2025"
}
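The report above is plain JSON, so the fields that matter for triage (the failed assertion, the thread that hit it, and the resolved stack frames) can be pulled out with a few lines of Python. The following is only a minimal sketch: the function name summarize_crash and the SAMPLE constant are hypothetical, and SAMPLE is a trimmed copy of the report above, not the full output.

```python
import json


def summarize_crash(report_json: str) -> dict:
    """Condense a `ceph crash info` JSON report into a few triage fields."""
    report = json.loads(report_json)
    frames = []
    for frame in report.get("backtrace", []):
        if frame.startswith("/"):
            continue  # unresolved frame: only a library path and an offset
        if frames and frames[-1] == frame:
            continue  # collapse consecutive repeats of the same symbol
        frames.append(frame)
    return {
        "entity": report.get("entity_name"),
        "version": report.get("ceph_version"),
        "thread": report.get("assert_thread_name"),
        # Missing assert_* keys simply show up as "None" in this string.
        "assert": "{}:{}: {}".format(
            report.get("assert_file"),
            report.get("assert_line"),
            report.get("assert_condition"),
        ),
        "frames": frames,
    }


# Trimmed copy of the crash report above (illustration only).
SAMPLE = """
{
    "assert_condition": "object_diff_state.size() == end_object_no - start_object_no",
    "assert_file": "./src/librbd/api/DiffIterate.cc",
    "assert_line": 341,
    "assert_thread_name": "dashboard",
    "backtrace": [
        "pthread_kill()",
        "abort()",
        "/lib/x86_64-linux-gnu/librbd.so.1(+0x1805de) [0x7f63c36d25de]",
        "rbd_diff_iterate2()",
        "/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0xe242a) [0x7f63d32b742a]",
        "_PyEval_EvalFrameDefault()",
        "/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0x1c24df) [0x7f63d33974df]",
        "_PyEval_EvalFrameDefault()"
    ],
    "ceph_version": "18.2.4",
    "entity_name": "mgr.juju-206dbb-10-lxd-10"
}
"""

if __name__ == "__main__":
    for key, value in summarize_crash(SAMPLE).items():
        print(key, "=", value)
```

On a live system the same function could be fed the text printed by ceph crash info for a given crash ID. In this report the summary makes it easy to see that the assertion fired in the dashboard thread, under rbd_diff_iterate2() called from the Python rbd binding.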

** Description changed:

- Using ceph-dashboard, when I try to list all the images in a pool, I get
- an error after a few seconds, and ceph-mgr often crashes and cannot be
- restarted immediately; I have to wait a few minutes.
+ ubuntu: jammy LTS
+ openstack: bobcat
+ ceph: reef
+ juju bundle: 2023.2

- # systemctl status ceph-mgr@juju-206dbb-4-lxd-1.service 
+ Using ceph-dashboard, when I try to list all the images in a pool, I get an error after a few seconds, and ceph-mgr often crashes and cannot be restarted immediately; I have to wait a few minutes.
+ 
+ # systemctl status ceph-mgr@juju-206dbb-4-lxd-1.service
  × ceph-mgr@juju-206dbb-4-lxd-1.service - Ceph cluster manager daemon
-      Loaded: loaded (/lib/systemd/system/ceph-mgr@.service; enabled; vendor preset: enabled)
-      Active: failed (Result: signal) since Thu 2025-05-29 17:26:30 UTC; 26min ago
-     Process: 181713 ExecStart=/usr/bin/ceph-mgr -f --cluster ${CLUSTER} --id juju-206dbb-4-lxd-1 --setuser ceph --setgroup ceph (code=killed, signal=ABRT)
-    Main PID: 181713 (code=killed, signal=ABRT)
-         CPU: 8.134s
+      Loaded: loaded (/lib/systemd/system/ceph-mgr@.service; enabled; vendor preset: enabled)
+      Active: failed (Result: signal) since Thu 2025-05-29 17:26:30 UTC; 26min ago
+     Process: 181713 ExecStart=/usr/bin/ceph-mgr -f --cluster ${CLUSTER} --id juju-206dbb-4-lxd-1 --setuser ceph --setgroup ceph (code=killed, signal=ABRT)
+    Main PID: 181713 (code=killed, signal=ABRT)
+         CPU: 8.134s

  May 29 17:26:30 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Consumed 8.134s CPU time.
  May 29 17:26:30 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Start request repeated too quickly.
  May 29 17:26:30 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Failed with result 'signal'.
  May 29 17:26:30 juju-206dbb-4-lxd-1 systemd[1]: Failed to start Ceph cluster manager daemon.
  May 29 17:30:38 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Start request repeated too quickly.
  May 29 17:30:38 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Failed with result 'signal'.
  May 29 17:30:38 juju-206dbb-4-lxd-1 systemd[1]: Failed to start Ceph cluster manager daemon.
  May 29 17:32:55 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Start request repeated too quickly.
  May 29 17:32:55 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Failed with result 'signal'.
  May 29 17:32:55 juju-206dbb-4-lxd-1 systemd[1]: Failed to start Ceph cluster manager daemon.
- 

  Here is the ceph-mgr log:
  https://dpaste.com/285W6JNB7

  # lsb_release -a
  No LSB modules are available.
  Distributor ID: Ubuntu
  Description:    Ubuntu 22.04.5 LTS
  Release:        22.04
  Codename:       jammy

  # apt list --installed |grep ceph

  WARNING: apt does not have a stable CLI interface. Use with caution in
  scripts.

  ceph-base/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  ceph-common/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  ceph-mds/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  ceph-mgr-dashboard/jammy-updates,now 18.2.4-0ubuntu1~cloud1 all [installed]
  ceph-mgr-modules-core/jammy-updates,now 18.2.4-0ubuntu1~cloud1 all [installed,automatic]
  ceph-mgr/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  ceph-mon/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  ceph-osd/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  ceph-volume/jammy-updates,now 18.2.4-0ubuntu1~cloud1 all [installed,automatic]
  ceph/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed]
  libcephfs2/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  libsqlite3-mod-ceph/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  python3-ceph-argparse/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  python3-ceph-common/jammy-updates,now 18.2.4-0ubuntu1~cloud1 all [installed,automatic]
  python3-cephfs/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
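The "Start request repeated too quickly" lines in the systemctl output above are what systemd logs when a unit exceeds its start rate limit (the StartLimitBurst and StartLimitIntervalSec unit settings). That would explain why ceph-mgr cannot be restarted right after repeated aborts; systemctl reset-failed on the unit clears the counter. As a rough sketch, the snippet below extracts those rate-limit hits from journal text. The names START_LIMIT and start_limit_hits are hypothetical, and the regular expression only assumes the syslog-style line layout shown in this report.

```python
import re

# Matches lines such as:
#   May 29 17:26:30 <host> systemd[1]: <unit>.service: Start request repeated too quickly.
START_LIMIT = re.compile(
    r"^\s*(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+systemd\[\d+\]:\s+"
    r"(?P<unit>\S+\.service): Start request repeated too quickly\.$"
)


def start_limit_hits(lines):
    """Return (timestamp, unit) pairs for every start-rate-limit message."""
    hits = []
    for line in lines:
        match = START_LIMIT.match(line.rstrip())
        if match:
            hits.append((match["stamp"], match["unit"]))
    return hits


# Sample lines taken from the status output in this report.
SAMPLE_LOG = """\
  May 29 17:26:30 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Start request repeated too quickly.
  May 29 17:26:30 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Failed with result 'signal'.
  May 29 17:30:38 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Start request repeated too quickly.
  May 29 17:32:55 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Start request repeated too quickly.
  May 29 17:32:55 juju-206dbb-4-lxd-1 systemd[1]: Failed to start Ceph cluster manager daemon.
""".splitlines()

if __name__ == "__main__":
    for stamp, unit in start_limit_hits(SAMPLE_LOG):
        print(stamp, unit)
```

Each hit marks a start attempt that systemd refused outright, which is separate from the ABRT crash itself.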

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/2112137

Title:
  ceph-dashboard crashes ceph-mgr when listing images

Status in ceph package in Ubuntu:
  New

Bug description:
  ubuntu: jammy LTS
  openstack: bobcat
  ceph: reef
  juju bundle: 2023.2

  Using ceph-dashboard, when I try to list all the images in a pool, I get an error after a few seconds, and ceph-mgr often crashes and cannot be restarted immediately; I have to wait a few minutes.

  # systemctl status ceph-mgr@juju-206dbb-4-lxd-1.service
  × ceph-mgr@juju-206dbb-4-lxd-1.service - Ceph cluster manager daemon
       Loaded: loaded (/lib/systemd/system/ceph-mgr@.service; enabled; vendor preset: enabled)
       Active: failed (Result: signal) since Thu 2025-05-29 17:26:30 UTC; 26min ago
      Process: 181713 ExecStart=/usr/bin/ceph-mgr -f --cluster ${CLUSTER} --id juju-206dbb-4-lxd-1 --setuser ceph --setgroup ceph (code=killed, signal=ABRT)
     Main PID: 181713 (code=killed, signal=ABRT)
          CPU: 8.134s

  May 29 17:26:30 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Consumed 8.134s CPU time.
  May 29 17:26:30 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Start request repeated too quickly.
  May 29 17:26:30 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Failed with result 'signal'.
  May 29 17:26:30 juju-206dbb-4-lxd-1 systemd[1]: Failed to start Ceph cluster manager daemon.
  May 29 17:30:38 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Start request repeated too quickly.
  May 29 17:30:38 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Failed with result 'signal'.
  May 29 17:30:38 juju-206dbb-4-lxd-1 systemd[1]: Failed to start Ceph cluster manager daemon.
  May 29 17:32:55 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Start request repeated too quickly.
  May 29 17:32:55 juju-206dbb-4-lxd-1 systemd[1]: ceph-mgr@juju-206dbb-4-lxd-1.service: Failed with result 'signal'.
  May 29 17:32:55 juju-206dbb-4-lxd-1 systemd[1]: Failed to start Ceph cluster manager daemon.

  Here is the ceph-mgr log:
  https://dpaste.com/285W6JNB7

  # lsb_release -a
  No LSB modules are available.
  Distributor ID: Ubuntu
  Description:    Ubuntu 22.04.5 LTS
  Release:        22.04
  Codename:       jammy

  # apt list --installed |grep ceph

  WARNING: apt does not have a stable CLI interface. Use with caution in
  scripts.

  ceph-base/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  ceph-common/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  ceph-mds/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  ceph-mgr-dashboard/jammy-updates,now 18.2.4-0ubuntu1~cloud1 all [installed]
  ceph-mgr-modules-core/jammy-updates,now 18.2.4-0ubuntu1~cloud1 all [installed,automatic]
  ceph-mgr/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  ceph-mon/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  ceph-osd/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  ceph-volume/jammy-updates,now 18.2.4-0ubuntu1~cloud1 all [installed,automatic]
  ceph/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed]
  libcephfs2/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  libsqlite3-mod-ceph/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  python3-ceph-argparse/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
  python3-ceph-common/jammy-updates,now 18.2.4-0ubuntu1~cloud1 all [installed,automatic]
  python3-cephfs/jammy-updates,now 18.2.4-0ubuntu1~cloud1 amd64 [installed,automatic]
