[Bug 2039955] Re: Opening NFS tab in the dashboard leads to ceph mgr crash - orchestrator._interface.NoOrchestrator: No orchestrator configured
Samuel Walladge
2039955 at bugs.launchpad.net
Mon Nov 20 05:38:52 UTC 2023
Definitely an upstream issue, not related to the ceph-dashboard charm.
Exploring the ceph repository:
`src/pybind/mgr/dashboard/controllers/nfs.py`
```
@Endpoint()
@ReadPermission
def status(self):
    status = {'available': True, 'message': None}
    try:
        # this is where the call happens that causes the crash - the crash
        # is coming from ceph though, not the fault of this code.
        # NOTE: running `sudo ceph nfs cluster ls` prints:
        #   Error ENOENT: No orchestrator configured (try `ceph orch set backend`)
        # but does not show a traceback.
        # This may be limited to the python api?
        mgr.remote('nfs', 'cluster_ls')
    except (ImportError, RuntimeError) as error:
        logger.exception(error)
        status['available'] = False
        status['message'] = str(error)  # type: ignore
    return status
```
When the orchestrator is not present, we see this traceback:
```
{
    "archived": "2023-11-20 04:58:57.151697",
    "backtrace": [
        " File \"/usr/share/ceph/mgr/nfs/module.py\", line 169, in cluster_ls\n return available_clusters(self)",
        " File \"/usr/share/ceph/mgr/nfs/utils.py\", line 38, in available_clusters\n completion = mgr.describe_service(service_type='nfs')",
        " File \"/usr/share/ceph/mgr/orchestrator/_interface.py\", line 1488, in inner\n completion = self._oremote(method_name, args, kwargs)",
        " File \"/usr/share/ceph/mgr/orchestrator/_interface.py\", line 1555, in _oremote\n raise NoOrchestrator()",
        "orchestrator._interface.NoOrchestrator: No orchestrator configured (try `ceph orch set backend`)"
    ],
    "ceph_version": "17.2.6",
    "crash_id": "2023-11-20T04:47:16.737623Z_8a944527-1cc1-4ed5-b58b-86bf97bcf3b1",
    "entity_name": "mgr.juju-108031-1-lxd-1",
    "mgr_module": "nfs",
    "mgr_module_caller": "ActivePyModule::dispatch_remote cluster_ls",
    "mgr_python_exception": "NoOrchestrator",
    "os_id": "22.04",
    "os_name": "Ubuntu 22.04.3 LTS",
    "os_version": "22.04.3 LTS (Jammy Jellyfish)",
    "os_version_id": "22.04",
    "process_name": "ceph-mgr",
    "stack_sig": "b01db59d356dd52f69bfb0b128a216e7606f54a60674c3c82711c23cf64832ce",
    "timestamp": "2023-11-20T04:47:16.737623Z",
    "utsname_hostname": "juju-108031-1-lxd-1",
    "utsname_machine": "x86_64",
    "utsname_release": "5.15.0-88-generic",
    "utsname_sysname": "Linux",
    "utsname_version": "#98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023"
}
```
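Interestingly, the dashboard's `except (ImportError, RuntimeError)` clause does appear to fire (the dashboard shows "Remote method threw exception: ..."), yet a crash entry is still recorded. That would be consistent with the remote dispatch layer logging any exception escaping the callee module before re-raising it to the caller. A minimal self-contained sketch of that mechanism, using stand-in names (`dispatch_remote` and `crash_log` here are toy models, not the real mgr internals):

```python
import traceback

crash_log = []  # stand-in for what `ceph crash ls` reports

def dispatch_remote(fn):
    """Toy model of remote dispatch: record any exception escaping the
    callee as a 'crash', then re-raise it to the caller as a RuntimeError."""
    try:
        return fn()
    except Exception as e:
        crash_log.append(traceback.format_exc())  # crash recorded here
        raise RuntimeError(f"Remote method threw exception: {e}") from e

def cluster_ls():
    # Simulate the unconfigured-orchestrator failure.
    raise Exception("No orchestrator configured (try `ceph orch set backend`)")

def status():
    # Mirrors the dashboard's status() endpoint above.
    result = {'available': True, 'message': None}
    try:
        dispatch_remote(cluster_ls)
    except (ImportError, RuntimeError) as error:
        result['available'] = False
        result['message'] = str(error)
    return result
```

So the caller can degrade gracefully, but a crash entry is logged on the callee side anyway.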
I guess this is the part that maps directly to the `cluster_ls` method:
```
"mgr_module_caller": "ActivePyModule::dispatch_remote cluster_ls",
```
This is `cluster_ls`, in `src/pybind/mgr/nfs/module.py`.
```
# this raises an error, causing a module crash, if the orchestrator is not available
def cluster_ls(self) -> List[str]:
    return available_clusters(self)
```
^ This is the root of the traceback we're seeing.
I guess the reason we're seeing a crash is that this method doesn't catch any errors raised by `available_clusters`.
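One defensive variant would be to catch the error inside `cluster_ls` and report "no clusters" instead of letting the exception escape to the remote dispatcher. This is just a sketch of the pattern, not upstream code, and the `NoOrchestrator`/`available_clusters` definitions below are stand-ins so the snippet runs on its own:

```python
from typing import List

class NoOrchestrator(Exception):
    """Stand-in for orchestrator._interface.NoOrchestrator."""

def available_clusters() -> List[str]:
    # Simulate the orchestrator being unconfigured.
    raise NoOrchestrator("No orchestrator configured")

def cluster_ls() -> List[str]:
    # Swallow the orchestrator error so the remote caller sees an
    # empty cluster list rather than a recorded module crash.
    try:
        return available_clusters()
    except NoOrchestrator:
        return []
```

The downside is that the caller can no longer distinguish "no clusters" from "orchestrator not configured".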
For reference, other methods I've checked here handle the error.
For example (in `src/pybind/mgr/nfs/cluster.py`, called from the
`ceph nfs cluster ls` handler `_cmd_nfs_cluster_ls()` in `src/pybind/mgr/nfs/module.py`):
```
def list_nfs_cluster(self) -> List[str]:
    try:
        return available_clusters(self.mgr)
    except Exception as e:
        log.exception("Failed to list NFS Cluster")
        raise ErrorResponse.wrap(e)
```
I tried the same pattern of catching the error and raising `ErrorResponse` within `cluster_ls`,
but that still resulted in a crash:
```
{
    "backtrace": [
        " File \"/usr/share/ceph/mgr/nfs/module.py\", line 173, in cluster_ls\n return available_clusters(self)",
        " File \"/usr/share/ceph/mgr/nfs/utils.py\", line 38, in available_clusters\n completion = mgr.describe_service(service_type='nfs')",
        " File \"/usr/share/ceph/mgr/orchestrator/_interface.py\", line 1488, in inner\n completion = self._oremote(method_name, args, kwargs)",
        " File \"/usr/share/ceph/mgr/orchestrator/_interface.py\", line 1555, in _oremote\n raise NoOrchestrator()",
        "orchestrator._interface.NoOrchestrator: No orchestrator configured (try `ceph orch set backend`)",
        "\nThe above exception was the direct cause of the following exception:\n",
        "Traceback (most recent call last):",
        " File \"/usr/share/ceph/mgr/nfs/module.py\", line 175, in cluster_ls\n raise ErrorResponse.wrap(e)",
        "object_format.ErrorResponse: No orchestrator configured (try `ceph orch set backend`)"
    ],
    "ceph_version": "17.2.6",
    "crash_id": "2023-11-20T04:59:04.018086Z_2a16b6a4-85e5-49ee-93f0-c1b552f1df06",
    "entity_name": "mgr.juju-108031-1-lxd-1",
    "mgr_module": "nfs",
    "mgr_module_caller": "ActivePyModule::dispatch_remote cluster_ls",
    "mgr_python_exception": "ErrorResponse",
    "os_id": "22.04",
    "os_name": "Ubuntu 22.04.3 LTS",
    "os_version": "22.04.3 LTS (Jammy Jellyfish)",
    "os_version_id": "22.04",
    "process_name": "ceph-mgr",
    "stack_sig": "6a64a2a392fc0ad969c705c51ccec3206fab079f3c53ef566d1ed1d6f5088851",
    "timestamp": "2023-11-20T04:59:04.018086Z",
    "utsname_hostname": "juju-108031-1-lxd-1",
    "utsname_machine": "x86_64",
    "utsname_release": "5.15.0-88-generic",
    "utsname_sysname": "Linux",
    "utsname_version": "#98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023"
}
```
I'm not sure what pattern is required for this kind of remote module method call, where it's not a CLI command.
We still need to convey an error response to the remote caller (e.g. ceph-dashboard in this case),
but without "crashing".
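One pattern that might work here is for the remote method to return an explicit (value, error) pair instead of raising across the remote boundary. This is purely a sketch (the tuple protocol and all names below are my own invention, not a ceph API; `NoOrchestrator` and `available_clusters` are stand-ins so the snippet is self-contained):

```python
from typing import List, Optional, Tuple

class NoOrchestrator(Exception):
    """Stand-in for orchestrator._interface.NoOrchestrator."""

def available_clusters() -> List[str]:
    # Simulate the unconfigured orchestrator.
    raise NoOrchestrator("No orchestrator configured")

def cluster_ls_safe() -> Tuple[Optional[List[str]], Optional[str]]:
    # Never raise across the remote boundary: return the cluster list
    # on success, or (None, message) on failure, letting the caller decide.
    try:
        return available_clusters(), None
    except NoOrchestrator as e:
        return None, str(e)
```

The caller (e.g. the dashboard's `status()` endpoint) would then check the error slot and set `available`/`message` accordingly, with no exception ever escaping the nfs module.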
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/2039955
Title:
Opening NFS tab in the dashboard leads to ceph mgr crash -
orchestrator._interface.NoOrchestrator: No orchestrator configured
Status in Ceph Dashboard Charm:
New
Status in ceph package in Ubuntu:
New
Bug description:
Whenever the NFS tab in the Ceph dashboard is opened, a NoOrchestrator
exception is raised and recorded as a ceph mgr module crash
(although it's not an actual process crash).
Other tabs that require the orchestrator handle the situation well:
they print the following message and no exception is raised.
====
Orchestrator is not available
Orchestrator is unavailable: No orchestrator configured (try `ceph orch set backend`)
Please consult the documentation on how to configure and enable the management functionality.
====
With the NFS tab, however, an exception is raised.
https://dashboard.example.com:8443/#/nfs
====
NFS-Ganesha is not configured
Remote method threw exception: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/nfs/module.py", line 169, in cluster_ls
    return available_clusters(self)
  File "/usr/share/ceph/mgr/nfs/utils.py", line 38, in available_clusters
    completion = mgr.describe_service(service_type='nfs')
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 1488, in inner
    completion = self._oremote(method_name, args, kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 1555, in _oremote
    raise NoOrchestrator()
orchestrator._interface.NoOrchestrator: No orchestrator configured (try `ceph orch set backend`)
Please consult the documentation on how to configure and enable the management functionality.
====
# ceph health
HEALTH_WARN 2 mgr modules have recently crashed
# ceph crash ls
ID                                                                ENTITY                   NEW
2023-10-20T00:40:40.362363Z_2f461bb5-343c-4cb4-8134-99ae29ddc60c  mgr.juju-ffeb43-0-lxd-0  *
2023-10-20T02:24:37.980204Z_9bf106e2-0dd2-4a88-b0f4-647dfa82697f  mgr.juju-ffeb43-0-lxd-0  *
# ceph crash info 2023-10-20T00:40:40.362363Z_2f461bb5-343c-4cb4-8134-99ae29ddc60c
{
    "backtrace": [
        " File \"/usr/share/ceph/mgr/nfs/module.py\", line 169, in cluster_ls\n return available_clusters(self)",
        " File \"/usr/share/ceph/mgr/nfs/utils.py\", line 38, in available_clusters\n completion = mgr.describe_service(service_type='nfs')",
        " File \"/usr/share/ceph/mgr/orchestrator/_interface.py\", line 1488, in inner\n completion = self._oremote(method_name, args, kwargs)",
        " File \"/usr/share/ceph/mgr/orchestrator/_interface.py\", line 1555, in _oremote\n raise NoOrchestrator()",
        "orchestrator._interface.NoOrchestrator: No orchestrator configured (try `ceph orch set backend`)"
    ],
    "ceph_version": "17.2.6",
    "crash_id": "2023-10-20T00:40:40.362363Z_2f461bb5-343c-4cb4-8134-99ae29ddc60c",
    "entity_name": "mgr.juju-ffeb43-0-lxd-0",
    "mgr_module": "nfs",
    "mgr_module_caller": "ActivePyModule::dispatch_remote cluster_ls",
    "mgr_python_exception": "NoOrchestrator",
    "os_id": "22.04",
    "os_name": "Ubuntu 22.04.3 LTS",
    "os_version": "22.04.3 LTS (Jammy Jellyfish)",
    "os_version_id": "22.04",
    "process_name": "ceph-mgr",
    "stack_sig": "b01db59d356dd52f69bfb0b128a216e7606f54a60674c3c82711c23cf64832ce",
    "timestamp": "2023-10-20T00:40:40.362363Z",
    "utsname_hostname": "juju-ffeb43-0-lxd-0",
    "utsname_machine": "x86_64",
    "utsname_release": "5.15.0-87-generic",
    "utsname_sysname": "Linux",
    "utsname_version": "#97-Ubuntu SMP Mon Oct 2 21:09:21 UTC 2023"
}
ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: ceph-mgr-dashboard 17.2.6-0ubuntu0.22.04.1
ProcVersionSignature: Ubuntu 5.15.0-87.97-generic 5.15.122
Uname: Linux 5.15.0-87-generic x86_64
ApportVersion: 2.20.11-0ubuntu82.5
Architecture: amd64
CasperMD5CheckResult: unknown
CloudArchitecture: x86_64
CloudID: lxd
CloudName: lxd
CloudPlatform: lxd
CloudSubPlatform: LXD socket API v. 1.0 (/dev/lxd/sock)
Date: Fri Oct 20 09:49:25 2023
PackageArchitecture: all
ProcEnviron:
TERM=screen-256color
PATH=(custom, no user)
LANG=C.UTF-8
SHELL=/bin/bash
SourcePackage: ceph
UpgradeStatus: No upgrade log present (probably fresh install)
To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-ceph-dashboard/+bug/2039955/+subscriptions