[Bug 2152460] [NEW] Ceph 20.2.0 - cephadm cluster bootstrap fails due to hardcoded uid and gid
Alan Baghumian
2152460 at bugs.launchpad.net
Wed May 13 09:56:05 UTC 2026
Public bug reported:
After installing cephadm on Resolute:
ii cephadm 20.2.0-0ubuntu2 all
Trying to bootstrap a cluster:
$ sudo cephadm bootstrap --mon-ip 10.3.1.190
Creating directory /etc/ceph for ceph.conf
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chrony.service is enabled and running
Repeating the final host check...
docker (/usr/bin/docker) is present
systemctl is present
lvcreate is present
Unit chrony.service is enabled and running
Host looks OK
Cluster fsid: e4638df2-4eaf-11f1-8000-fa163ebbf652
Verifying IP 10.3.1.190 port 3300 ...
Verifying IP 10.3.1.190 port 6789 ...
Mon IP `10.3.1.190` is in CIDR network `10.3.0.0/21`
Mon IP `10.3.1.190` is in CIDR network `10.3.0.0/21`
Mon IP `10.3.1.190` is in CIDR network `10.3.0.1/32`
Mon IP `10.3.1.190` is in CIDR network `10.3.0.1/32`
Mon IP `10.3.1.190` is in CIDR network `10.3.1.100/32`
Mon IP `10.3.1.190` is in CIDR network `10.3.1.100/32`
Internal network (--cluster-network) has not been provided, OSD replication will default to the public_network
Pulling container image quay.io/ceph/ceph:v20...
Ceph version: ceph version 20.2.1 (6a49aff47758778a5f5951e731d437c317f72fb2) tentacle (stable)
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Non-zero exit code 1 from install -d -m0770 -o 167 -g 167 /var/run/ceph/e4638df2-4eaf-11f1-8000-fa163ebbf652
install: stderr install: invalid user: '167'
RuntimeError: Failed command: install -d -m0770 -o 167 -g 167 /var/run/ceph/e4638df2-4eaf-11f1-8000-fa163ebbf652: install: invalid user: '167'
***************
Cephadm hit an issue during cluster installation. Current cluster files will be deleted automatically.
To disable this behaviour you can pass the --no-cleanup-on-failure flag. In case of any previous
broken installation, users must use the following command to completely delete the broken cluster:
> cephadm rm-cluster --force --zap-osds --fsid <fsid>
for more information please refer to https://docs.ceph.com/en/latest/cephadm/operations/#purging-a-cluster
***************
Deleting cluster with fsid: e4638df2-4eaf-11f1-8000-fa163ebbf652
Traceback (most recent call last):
  File "/usr/sbin/cephadm", line 5288, in <module>
    main()
    ~~~~^^
  File "/usr/sbin/cephadm", line 5276, in main
    r = ctx.func(ctx)
  File "/usr/sbin/cephadm", line 2524, in _rollback
    return func(ctx)
  File "/usr/sbin/cephadm", line 434, in _default_image
    return func(ctx)
  File "/usr/sbin/cephadm", line 2684, in command_bootstrap
    make_var_run(ctx, fsid, uid, gid)
    ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
  File "/usr/sbin/cephadm", line 508, in make_var_run
    call_throws(ctx, ['install', '-d', '-m0770', '-o', str(uid), '-g', str(gid),
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    '/var/run/ceph/%s' % fsid])
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cephadmlib/call_wrappers.py", line 307, in call_throws
    raise RuntimeError(
    f'Failed command: {" ".join(command)}: {s}'
    )
RuntimeError: Failed command: install -d -m0770 -o 167 -g 167 /var/run/ceph/e4638df2-4eaf-11f1-8000-fa163ebbf652: install: invalid user: '167'
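The "invalid user: '167'" message suggests that the host cannot resolve 167 to an account. One way to check that on the affected machine is the small Python sketch below. The helper name resolve_ids is made up for illustration and is not part of cephadm; it only reports whether the numeric IDs have passwd and group entries.

```python
import grp
import pwd


def resolve_ids(uid: int, gid: int) -> dict:
    """Return the names mapped to uid/gid, or None where no entry exists.

    Hypothetical helper for illustration; not part of cephadm.
    """
    try:
        user = pwd.getpwuid(uid).pw_name
    except KeyError:
        user = None  # no passwd entry for this uid
    try:
        group = grp.getgrgid(gid).gr_name
    except KeyError:
        group = None  # no group entry for this gid
    return {"uid": uid, "user": user, "gid": gid, "group": group}


if __name__ == "__main__":
    # 167 is the uid/gid taken from the failing install command above.
    print(resolve_ids(167, 167))
```

If both "user" and "group" come back as None, the host has no account or group with ID 167, which would be consistent with the error above.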
On RHEL, uid/gid 167 is reserved for the ceph user; Ubuntu has no such
reservation, so this needs to be fixed in the next SRU.
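For illustration only, here is a minimal sketch of creating the run directory without going through the install command, assuming the goal is simply a 0770 directory owned by the numeric uid/gid. It is not the fix adopted by cephadm or Ubuntu. The function reuses the name make_var_run from the traceback, but its signature (a base directory instead of ctx) is an assumption made so the example can run on its own.

```python
import os
import tempfile


def make_var_run(base: str, fsid: str, uid: int, gid: int) -> str:
    """Create <base>/<fsid> with mode 0770, owned by the numeric uid:gid.

    Hypothetical stand-in for cephadm's make_var_run; the real function
    takes a ctx argument and shells out to install.
    """
    path = os.path.join(base, fsid)
    os.makedirs(path, exist_ok=True)
    # chown(2) takes raw numeric IDs and never consults the passwd or
    # group databases, so an unmapped uid such as 167 is accepted.
    os.chown(path, uid, gid)
    # Set the mode explicitly so the result does not depend on the umask.
    os.chmod(path, 0o770)
    return path


if __name__ == "__main__":
    # Demo in a scratch directory with the caller's own IDs, so no root
    # privileges are needed. The real target would be /var/run/ceph.
    with tempfile.TemporaryDirectory() as base:
        p = make_var_run(base, "e4638df2-4eaf-11f1-8000-fa163ebbf652",
                         os.getuid(), os.getgid())
        print(p, oct(os.stat(p).st_mode & 0o777))
```

Changing ownership to an ID other than the caller's own requires root, which cephadm bootstrap already runs as.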
Thanks,
Alan
** Affects: ceph (Ubuntu)
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/2152460
Title:
Ceph 20.2.0 - cephadm cluster bootstrap fails due to hardcoded uid and
gid
Status in ceph package in Ubuntu:
New