[Bug 2000630] Re: "crash" module is always on but not properly configured
Nobuto Murata
2000630 at bugs.launchpad.net
Tue Oct 24 07:28:52 UTC 2023
Opening a packaging task. With the charm patches applied, posting the
crash data itself now succeeds.
However, the ceph-crash process cannot move the posted crash to the
"posted/" directory due to a permission issue.
Oct 24 07:21:39 more-llama ceph-crash[27895]: ERROR:ceph-crash:Error
scraping /var/lib/ceph/crash: [Errno 13] Permission denied:
'/var/lib/ceph/crash/2023-10-24T07:15:21.937207Z_f08b6b76-fae0-458a-969b-e105dab0b327'
->
'/var/lib/ceph/crash/posted/2023-10-24T07:15:21.937207Z_f08b6b76-fae0-458a-969b-e105dab0b327'
# ll /var/lib/ceph/crash/
total 28
drwxr-xr-x 7 ceph ceph 4096 Oct 24 07:15 ./
drwxr-x--- 15 ceph ceph 4096 Oct 24 04:21 ../
drwx------ 2 ceph ceph 4096 Oct 24 07:15 2023-10-24T07:15:21.937207Z_f08b6b76-fae0-458a-969b-e105dab0b327/
drwx------ 2 ceph ceph 4096 Oct 24 07:15 2023-10-24T07:15:21.937914Z_f9eccf77-39fa-440f-8b26-c410edece34a/
drwx------ 2 ceph ceph 4096 Oct 24 07:15 2023-10-24T07:15:51.704821Z_05bd68eb-4da6-4e6c-a7fa-3399c6a9d1cc/
drwx------ 2 ceph ceph 4096 Oct 24 07:15 2023-10-24T07:15:51.705072Z_47e7d9a9-8f61-402e-b62f-d2abbea11735/
drwxr-xr-x 2 root root 4096 May 26 14:42 posted/
# dpkg -S /var/lib/ceph/crash/posted/
ceph-base: /var/lib/ceph/crash/posted
/var/lib/ceph/crash/posted/ should be owned by ceph:ceph instead of
root:root, because the ceph-crash process runs as the ceph user and
needs write access to that directory to move posted crashes into it.
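A minimal sketch of a check for this condition, assuming a POSIX shell and GNU stat. The helper name check_posted_owner is hypothetical and not part of any Ceph package. It only reports the mismatch; the chown in the comment is a possible manual workaround, not the packaging fix.

```shell
#!/bin/sh
# Hypothetical helper: report whether a directory is owned by the
# expected user:group (defaults to ceph:ceph).
# Usage: check_posted_owner DIR [USER:GROUP]
check_posted_owner() {
    dir=$1
    expected=${2:-ceph:ceph}
    # GNU stat: %U is the owner name, %G is the group name.
    actual=$(stat -c '%U:%G' "$dir") || return 2
    if [ "$actual" = "$expected" ]; then
        echo "ok: $dir is owned by $actual"
    else
        echo "mismatch: $dir is owned by $actual, expected $expected"
        return 1
    fi
}

# Example (on an affected unit):
#   check_posted_owner /var/lib/ceph/crash/posted
# Possible manual workaround until the package is fixed (assumption):
#   chown ceph:ceph /var/lib/ceph/crash/posted
```

On the unit shown above, this would be expected to report a mismatch, since posted/ is listed as root:root.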
# apt policy ceph-base
ceph-base:
  Installed: 17.2.6-0ubuntu0.22.04.1
  Candidate: 17.2.6-0ubuntu0.22.04.1
  Version table:
 *** 17.2.6-0ubuntu0.22.04.1 500
        500 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     17.2.5-0ubuntu0.22.04.3 500
        500 http://archive.ubuntu.com/ubuntu jammy-security/main amd64 Packages
     17.1.0-0ubuntu3 500
        500 http://archive.ubuntu.com/ubuntu jammy/main amd64 Packages
** Also affects: ceph (Ubuntu)
Importance: Undecided
Status: New
** Changed in: charm-ceph-mon
Status: Fix Committed => Fix Released
** Changed in: charm-ceph-mon/quincy.2
Status: Fix Committed => Fix Released
** Changed in: charm-ceph-osd
Status: Fix Committed => Fix Released
** Changed in: charm-ceph-osd/quincy.2
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/2000630
Title:
"crash" module is always on but not properly configured
Status in Ceph Monitor Charm:
Fix Released
Status in Ceph Monitor Charm quincy.2 series:
Fix Released
Status in Ceph OSD Charm:
Fix Released
Status in Ceph OSD Charm quincy.2 series:
Fix Released
Status in ceph package in Ubuntu:
New
Bug description:
cloud:focal-yoga (quincy)
$ juju ssh ceph-mon/leader -- sudo ceph version
ceph version 17.2.0 (43e2e60a7559d3f46c9d53f1ca875fd499a1e35e) quincy (stable)
How to reproduce:
1. Make sure the "crash" module is on (it is one of the "always on" modules).
https://docs.ceph.com/en/quincy/mgr/crash/
$ juju ssh ceph-mon/leader -- sudo ceph mgr module ls | grep crash
crash on (always on)
2. Intentionally crash a ceph-osd process (this example uses
SIGSEGV).
$ juju ssh ceph-osd/leader -- sudo pkill --signal SIGSEGV ceph-osd
3. Make sure that both a regular crash file for apport *and* a set
of files for the Ceph crash module are generated.
# ll -h /var/crash/
total 121M
drwxrwxrwt 2 root root 4.0K Dec 28 10:42 ./
drwxr-xr-x 13 root root 4.0K Dec 12 21:41 ../
-rw-r----- 1 ceph ceph 121M Dec 28 10:42 _usr_bin_ceph-osd.64045.crash
# ll -h /var/lib/ceph/crash/*
'/var/lib/ceph/crash/2022-12-28T10:42:04.661282Z_51be6c87-4a42-4fbb-afe5-264d94cd6c79':
total 1.6M
drwx------ 2 ceph ceph 4.0K Dec 28 10:42 ./
drwxr-xr-x 4 ceph ceph 4.0K Dec 28 10:42 ../
-r--r--r-- 1 ceph ceph 0 Dec 28 10:42 done
-rw-r--r-- 1 ceph ceph 1.6M Dec 28 10:42 log
-rw------- 1 ceph ceph 926 Dec 28 10:42 meta
/var/lib/ceph/crash/posted:
total 8.0K
drwxr-xr-x 2 root root 4.0K Sep 13 17:47 ./
drwxr-xr-x 4 ceph ceph 4.0K Dec 28 10:42 ../
4. Check syslog for failures to post the crash to the MON units.
Dec 28 10:51:18 famous-skunk ceph-crash[10667]: WARNING:ceph-crash:post
/var/lib/ceph/crash/2022-12-28T10:42:04.661282Z_51be6c87-4a42-4fbb-afe5-264d94cd6c79
as client.crash.famous-skunk failed: (None,
b'2022-12-28T10:51:18.368+0000 7f427dc2f700 -1 auth: unable to find a
keyring on
/etc/ceph/ceph.client.crash.famous-skunk.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin:
(2) No such file or directory\n2022-12-28T10:51:18.368+0000
7f427dc2f700 -1 AuthRegistry(0x7f427805f4f0) no keyring found at
/etc/ceph/ceph.client.crash.famous-skunk.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,
disabling cephx\n2022-12-28T10:51:18.376+0000 7f427c9cd700 -1 auth:
unable to find a keyring on
/etc/ceph/ceph.client.crash.famous-skunk.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin:
(2) No such file or directory\n2022-12-28T10:51:18.376+0000
7f427c9cd700 -1 AuthRegistry(0x7f4278065748) no keyring found at
/etc/ceph/ceph.client.crash.famous-skunk.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,
disabling cephx\n2022-12-28T10:51:18.376+0000 7f427c9cd700 -1 auth:
unable to find a keyring on
/etc/ceph/ceph.client.crash.famous-skunk.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin:
(2) No such file or directory\n2022-12-28T10:51:18.376+0000
7f427c9cd700 -1 AuthRegistry(0x7f427c9cc000) no keyring found at
/etc/ceph/ceph.client.crash.famous-skunk.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,
disabling cephx\n[errno 2] RADOS object not found (error connecting to
the cluster)\n')
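The four keyring paths named in the log can be checked directly on the unit. A minimal sketch, assuming a POSIX shell; the helper name find_keyring is hypothetical, and the candidate paths and their order are simply copied from the log message above rather than taken from the Ceph source.

```shell
#!/bin/sh
# Hypothetical helper: print the first existing keyring among the paths
# listed in the log message, for a given client name.
# Usage: find_keyring CLIENT_NAME [CONF_DIR]
find_keyring() {
    name=$1
    conf_dir=${2:-/etc/ceph}
    for path in \
        "$conf_dir/ceph.$name.keyring" \
        "$conf_dir/ceph.keyring" \
        "$conf_dir/keyring" \
        "$conf_dir/keyring.bin"
    do
        if [ -f "$path" ]; then
            echo "$path"
            return 0
        fi
    done
    echo "no keyring found for $name in $conf_dir" >&2
    return 1
}

# Example, using the client name from the log:
#   find_keyring client.crash.famous-skunk
```

If none of the four files exists, as in the log, the function prints a message to stderr and returns 1.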