[Bug 1928259] Re: Package upgrade won't restart services
Andreas Hasenack
1928259 at bugs.launchpad.net
Fri Jun 11 20:56:38 UTC 2021
Bionic verification
"TEST A" verification was done following steps from bug #1927745, where
the lack of the restart was first found, and that fix is included in
this upload as well.
ubuntu at b-gssd-restart-1928259-1927745-A:~$ apt-cache policy nfs-common
nfs-common:
Installed: 1:1.3.4-2.1ubuntu5.3
Candidate: 1:1.3.4-2.1ubuntu5.3
Version table:
*** 1:1.3.4-2.1ubuntu5.3 500
500 http://br.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
500 http://br.archive.ubuntu.com/ubuntu bionic-security/main amd64 Packages
100 /var/lib/dpkg/status
1:1.3.4-2.1ubuntu5 500
500 http://br.archive.ubuntu.com/ubuntu bionic/main amd64 Packages
Problem quickly reproduced:
ubuntu at b-gssd-restart-1928259-1927745-a:~$ sudo ./bz1419280_test_threads
Iter 1
calling stat on '/mnt/test_krb5/foo' with uids 9995 through 10035
reproduced the bug after 1 iterations
ubuntu at b-gssd-restart-1928259-1927745-a:~$ ps axw|grep stat_as
8012 pts/0 D 0:00 ./stat_as /mnt/test_krb5/foo 9995 10035
8036 pts/0 D 0:00 ./stat_as /mnt/test_krb5/foo 9995 10035
8091 pts/0 S+ 0:00 grep --color=auto stat_as
In this state, I installed the fixed packages from proposed. But first, let's
get the rpc.gssd pid:
ubuntu at b-gssd-restart-1928259-1927745-a:~$ ps axw | grep rpc\\.gssd
7854 ? Ss 0:00 /usr/sbin/rpc.gssd
Now upgrade:
ubuntu at b-gssd-restart-1928259-1927745-a:~$ sudo apt install nfs-common
Reading package lists... Done
(...)
Do you want to continue? [Y/n]
Get:1 http://br.archive.ubuntu.com/ubuntu bionic-proposed/main amd64 nfs-common amd64 1:1.3.4-2.1ubuntu5.5 [206 kB]
Get:2 http://br.archive.ubuntu.com/ubuntu bionic-proposed/main amd64 nfs-kernel-server amd64 1:1.3.4-2.1ubuntu5.5 [93.8 kB]
Fetched 299 kB in 0s (1479 kB/s)
(Reading database ... 64831 files and directories currently installed.)
Preparing to unpack .../nfs-common_1%3a1.3.4-2.1ubuntu5.5_amd64.deb ...
Unpacking nfs-common (1:1.3.4-2.1ubuntu5.5) over (1:1.3.4-2.1ubuntu5.3) ...
Preparing to unpack .../nfs-kernel-server_1%3a1.3.4-2.1ubuntu5.5_amd64.deb ...
Unpacking nfs-kernel-server (1:1.3.4-2.1ubuntu5.5) over (1:1.3.4-2.1ubuntu5.3) ...
Setting up nfs-common (1:1.3.4-2.1ubuntu5.5) ...
Setting up nfs-kernel-server (1:1.3.4-2.1ubuntu5.5) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
Processing triggers for ureadahead (0.100.0-21) ...
Processing triggers for systemd (237-3ubuntu10.47) ...
rpc.gssd was restarted automatically (note the new pid, 8886 instead of 7854):
ubuntu at b-gssd-restart-1928259-1927745-a:~$ ps axw | grep rpc\\.gssd
8886 ? Ss 0:00 /usr/sbin/rpc.gssd
We also got rid of the stuck stat_as processes:
ubuntu at b-gssd-restart-1928259-1927745-a:~$ ps axw|grep stat_as
9550 pts/0 S+ 0:00 grep --color=auto stat_as
TEST B
ubuntu at b-gssd-restart-1928259-1927745-b:~$ diff -u pstree.old pstree.new
ubuntu at b-gssd-restart-1928259-1927745-b:~$ l pstree.*
-rw-rw-r-- 1 ubuntu ubuntu 633 Jun 11 20:55 pstree.new
-rw-rw-r-- 1 ubuntu ubuntu 633 Jun 11 20:55 pstree.old
ubuntu at b-gssd-restart-1928259-1927745-b:~$ apt-cache policy nfs-common
nfs-common:
Installed: 1:1.3.4-2.1ubuntu5.5
Candidate: 1:1.3.4-2.1ubuntu5.5
Version table:
*** 1:1.3.4-2.1ubuntu5.5 500
500 http://br.archive.ubuntu.com/ubuntu bionic-proposed/main amd64 Packages
100 /var/lib/dpkg/status
1:1.3.4-2.1ubuntu5.3 500
500 http://br.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
500 http://br.archive.ubuntu.com/ubuntu bionic-security/main amd64 Packages
1:1.3.4-2.1ubuntu5 500
500 http://br.archive.ubuntu.com/ubuntu bionic/main amd64 Packages
Bionic verification succeeded.
** Tags removed: verification-needed-bionic
** Tags added: verification-done-bionic
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to nfs-utils in Ubuntu.
https://bugs.launchpad.net/bugs/1928259
Title:
Package upgrade won't restart services
Status in nfs-utils package in Ubuntu:
Fix Released
Status in nfs-utils source package in Bionic:
Fix Committed
Status in nfs-utils source package in Focal:
Fix Committed
Status in nfs-utils source package in Groovy:
Fix Committed
Status in nfs-utils source package in Hirsute:
Fix Committed
Status in nfs-utils package in Debian:
New
Status in nfs-utils package in Fedora:
Confirmed
Bug description:
[Impact]
To get the fixes provided by a package update, the affected services shipped in it need to be restarted. When that restart does not happen, the system keeps running the old binaries, which still have the bug(s).
This bug was found while testing the fix for #1927745, which affected
rpc.gssd, one of the services shipped in nfs-common. Without the
restart, systems that installed the update are still affected by the
bug.
[Test Plan]
To keep the test simple, we are not going to mount an NFSv4 share using Kerberos. A minimal configuration that gets rpc.gssd running is enough to demonstrate the behavior before and after the fix.
For more thorough testing, which includes actually mounting an NFSv4
export with Kerberos, follow the test instructions of bug #1927745.
You will see that the manual restart those instructions include after
the package update, added because of this bug, is no longer needed.
TEST (A)
# create a VM for the affected ubuntu release under test, login and run:
sudo touch /etc/krb5.keytab
sudo chmod 0600 /etc/krb5.keytab
# install nfs-common
sudo apt install nfs-common -y
# note message about nfs-utils.service being disabled/static:
nfs-utils.service is a disabled or a static unit, not starting it.
# Manually start rpc-gssd. It will start, but since we have an empty
# krb5.keytab file, it won't work. That's ok, we are not actually going to
# mount nfsv4
sudo systemctl start rpc-gssd.service
# Check it's running, and make note of its pid:
pidof rpc.gssd
2994
# reinstall nfs-common
sudo apt install --reinstall nfs-common
# note rpc-gssd wasn't restarted
pidof rpc.gssd
2994
# install the fixed nfs-common package. Notice that the message about not starting a disabled or static unit no longer appears:
sudo apt install nfs-common
# this time, rpc.gssd is restarted
pidof rpc.gssd
5000
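The pid comparison in TEST (A) can be scripted. The following is a minimal sketch, not part of the original test plan; the function name restart_status is hypothetical, and the usage lines assume a system where nfs-common is installed and rpc.gssd is running.

```shell
# Hypothetical helper: report whether a daemon was restarted, given the
# pid recorded before the upgrade ($1) and the pid observed afterwards ($2).
restart_status() {
    old_pid="$1"
    new_pid="$2"
    if [ -z "$new_pid" ]; then
        # No pid afterwards means the daemon is gone, not restarted.
        echo "not running"
    elif [ "$old_pid" = "$new_pid" ]; then
        echo "not restarted"
    else
        echo "restarted"
    fi
}

# Usage sketch (needs root and the package, so it is left commented out):
#   old=$(pidof rpc.gssd)
#   sudo apt install nfs-common
#   restart_status "$old" "$(pidof rpc.gssd)"
```

With the pids from the steps above, 2994 before and 2994 after the reinstall is reported as not restarted, while 2994 before and 5000 after the fixed package is reported as restarted.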
TEST (B)
This test is to confirm no new services are started after the fixed package is installed for the first time.
# create a VM for the affected ubuntu release under test, login and run:
sudo touch /etc/krb5.keytab
sudo chmod 0600 /etc/krb5.keytab
# install nfs-common that has the bug
sudo apt install nfs-common -y
# take a snapshot of running processes
pstree > pstree.old
# purge the nfs-common package
sudo apt purge nfs-common -y
# install the new nfs-common package
sudo apt install nfs-common -y
# take a new pstree snapshot and compare with the old one
pstree > pstree.new
diff -u pstree.old pstree.new
There should be no difference.
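For unattended runs, the final comparison in TEST (B) could be wrapped in a small check. This is only a sketch: the function name compare_snapshots is hypothetical, and it assumes the two snapshot files were produced with pstree as in the steps above.

```shell
# Hypothetical helper: compare two pstree snapshots, taken before and
# after installing the fixed package, and print a one-line verdict.
compare_snapshots() {
    old="$1"
    new="$2"
    # cmp -s is silent and only sets the exit status.
    if cmp -s "$old" "$new"; then
        echo "no difference"
    else
        echo "process tree changed"
    fi
}

# Usage sketch:
#   compare_snapshots pstree.old pstree.new
```

If the verdict is "process tree changed", running diff -u on the two files shows which processes appeared or disappeared.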
[Where problems could occur]
Also known as "I'm doing an unconditional start in postinst, what could go wrong":
- starting services on first install that the previous package did not start
- a systemd behavior change or bug that makes PartOf units suddenly react to "start" as well, instead of just "restart" and "stop" as documented
- starting services that are not configured, so the start fails and breaks postinst (but we have the proverbial || true to avoid that)
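The last risk and its "|| true" guard can be illustrated in isolation. This is a sketch of the general shell pattern only, not the actual maintainer script (see the linked MP for that); failing_start and run_postinst_sketch are hypothetical names, and failing_start stands in for a start request on an unconfigured service.

```shell
# Stand-in for a start request that fails, for example because the
# service is not configured. Hypothetical; always returns non-zero.
failing_start() {
    return 1
}

# Maintainer scripts typically run under "set -e", so an unguarded
# failure would abort the script and break the package installation.
run_postinst_sketch() {
    (
        set -e
        # The "|| true" guard swallows the failure so the script goes on.
        failing_start || true
        echo "postinst completed"
    )
}

run_postinst_sketch
```

Without the guard, the subshell would stop at failing_start and never reach the final echo.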
[Other Info]
This fix is a bit awkward, but I think it's in line with the SRU spirit of making the most predictable change, one that is simple and easy to understand.
See the linked MP for an explanation of this fix, why it works, and other tests I did:
https://code.launchpad.net/~ahasenack/ubuntu/+source/nfs-utils/+git/nfs-utils/+merge/403288
[Original Description]
Upgrading the nfs-common debian package will not restart its services.
Specifically, the package tries to restart "nfs-utils.service", which is a "fake" service meant to coordinate all the other daemons that make up a modern NFS server. This service, however, as it is, cannot be enabled:
$ sudo systemctl enable nfs-utils.service
The unit files have no installation config (WantedBy, RequiredBy, Also, Alias
settings in the [Install] section, and DefaultInstance for template units).
This means they are not meant to be enabled using systemctl.
Possible reasons for having this kind of units are:
1) A unit may be statically enabled by being symlinked from another unit's
.wants/ or .requires/ directory.
2) A unit's purpose may be to act as a helper for some other unit which has
a requirement dependency on it.
3) A unit may be started when needed via activation (socket, path, timer,
D-Bus, udev, scripted systemctl call, ...).
4) In case of template units, the unit is meant to be enabled with some
instance name specified
Granted, d/rules of the nfs-utils package doesn't even try:
dh_systemd_enable -p nfs-common nfs-client.target
dh_systemd_enable -p nfs-kernel-server nfs-server.service
dh_installinit -pnfs-common -R
dh_systemd_start -p nfs-common --restart-after-upgrade nfs-utils.service
dh_systemd_start -p nfs-kernel-server --restart-after-upgrade nfs-server.service
We can see that it tries to start and restart nfs-utils.service, but that does not work for disabled or not-running units. deb-systemd-invoke refuses to act on them, as the comments in its source explain:
# If the job is disabled and is not currently running, the job is not started or restarted.
# However, if the job is disabled but has been forced into the running state, we *do* stop
# and restart it since this is expected behaviour for the admin who forced the start.
# We don't autostart static units either.
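The rule described in those comments can be modeled as a small decision function. This is a simplified sketch of the quoted comment only, not the real deb-systemd-invoke code; the name invoke_decision is hypothetical, and its inputs are assumed to be the strings printed by "systemctl is-enabled" and "systemctl is-active".

```shell
# Simplified model of the quoted rule; NOT the real deb-systemd-invoke logic.
# $1: output of "systemctl is-enabled UNIT" (e.g. enabled, disabled, static)
# $2: output of "systemctl is-active UNIT"  (e.g. active, inactive)
invoke_decision() {
    enabled_state="$1"
    active_state="$2"
    case "$enabled_state" in
        disabled|static)
            # Only units already forced into the running state are acted on.
            if [ "$active_state" = "active" ]; then
                echo "act"
            else
                echo "skip"
            fi
            ;;
        *)
            echo "act"
            ;;
    esac
}

# Usage sketch on a live system:
#   invoke_decision "$(systemctl is-enabled nfs-utils.service)" \
#                   "$(systemctl is-active nfs-utils.service)"
```

For nfs-utils.service, which is static and inactive after a fresh install, this model yields "skip", which matches the "not starting it" message reported in this bug.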
The above can be seen while attempting a fresh install (or even upgrade) of nfs-common:
(...)
Setting up nfs-common (1:1.3.4-2.5ubuntu6) ...
Creating config file /etc/idmapd.conf with new version
Adding system user `statd' (UID 113) ...
Adding new user `statd' (UID 113) with group `nogroup' ...
Not creating home directory `/var/lib/nfs'.
Created symlink /etc/systemd/system/multi-user.target.wants/nfs-client.target → /lib/systemd/system/nfs-client.target.
Created symlink /etc/systemd/system/remote-fs.target.wants/nfs-client.target → /lib/systemd/system/nfs-client.target.
nfs-utils.service is a disabled or a static unit, not starting it.
^^^^^^^^^^^^^^^^^
$ systemctl status nfs-utils.service
● nfs-utils.service - NFS server and client services
Loaded: loaded (/lib/systemd/system/nfs-utils.service; static)
Active: inactive (dead)
This was found while testing the fix for bug #1927745. In that bug,
the affected service is rpc.gssd and it's critical that it be
restarted, but it's not happening. It will only be restarted if
nfs-utils.service is already "started".
I'm marking this bug as "high" because it prevents valid fixes from
being deployed after just upgrading a package.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/1928259/+subscriptions