[Bug 1670811] Re: Multipath services fails to start on Ubuntu 17.04 on boot and kdump (initramfs)
Mauricio Faria de Oliveira
mauricfo at linux.vnet.ibm.com
Thu Apr 6 13:14:35 UTC 2017
Hi Christian,
> > Can you please elaborate a bit more on that, for my own education?
Upfront, thanks for the explanation. It really helped to think a bit
more on the code and go figure out a few things.
I'm writing up an explanation below of why I still think it cannot
happen. It's _not_ that I want to insist on making my point, because
we don't even need that anymore with the much simpler and more elegant
fix of changing the /var/run & /run paths :) It's more that I enjoyed
checking it out in the code.
> pid="$(pidof multipathd)"
> # lets assume it gets pid 1001 being the current main PID and there are
> three sibling processes of a sort of multipath-helper binary with pids:
> 1002, 1003, 1004
AFAIK, multipathd only spawns threads (with the same PID as the parent),
not other processes (i.e., with different PIDs). The only fork()
calls are made when daemonize()'ing the multipathd main() into child().
After that, only one PID exists (despite several threads running, e.g.,
uevent listener, path checkers).
We can verify that: with a running/functional multipathd there's only
a single process running, but all the path checking/uevent listening is
going on all the time.
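
As a minimal sketch of that check (Linux only; count_threads is a hypothetical helper name, not something shipped with multipath-tools): each thread of a process shows up as an entry under /proc/<pid>/task, while pidof lists only process IDs. So a single PID with several task entries would be consistent with the threads-only picture.

```shell
#!/bin/sh
# Count the threads of a process by listing /proc/<pid>/task (Linux only).
# count_threads is an illustrative helper, not part of multipath-tools.
# Prints 0 if the process does not exist.
count_threads() {
    ls "/proc/$1/task" 2>/dev/null | wc -l
}

# Example, assuming multipathd is running:
#   pidof multipathd                      # one PID expected
#   count_threads "$(pidof multipathd)"   # several threads expected
```
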
> out="$(/sbin/multipathd -k'shutdown')"
> # here the shutdown might send signals to all, and helpers 1002, 1003
> # But due to an error on terminating 1004 it sends back to its parent that
> it has to reload
Even considering the PIDs could be different (which doesn't seem to be the
case), the child() loop needs just one evaluation of (running_state != DAEMON_SHUTDOWN)
to exit the loop in which it could process any other events,
and then it goes right into tearing down the threads and the process.
And the only place where it could get a different PID would be daemonize()
which has fork() calls, but it's only called from main().
> # main 1001 re-execs itself and 1001 goes away but a new 1005 appears and
> spawns two new helpers 1006, 1007.
> # now you end up with main 1005, helper 1004, 1006, 1007
> # shutdown will reply some error to "out" but you will not know exactly
> This is just a case for such an issue - and as I said only theoretical, but
> essentially it is a chance to use an out of date value.
Sure, I understand. It's been a great exercise analyzing it. Thanks.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1670811
Title:
Multipath services fails to start on Ubuntu 17.04 on boot and kdump
(initramfs)
Status in multipath-tools package in Ubuntu:
Fix Released
Status in multipath-tools source package in Zesty:
Fix Released
Bug description:
---Problem Description---
Multipath services fails to start on Ubuntu 17.04 with SAN multipath devices.
root at ltciofvtr-s824-lp8:~# service multipath-tools status
* multipathd.service - Device-Mapper Multipath Device Controller
Loaded: loaded (/lib/systemd/system/multipathd.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2017-03-07 07:00:43 CST; 5min ago
Main PID: 690 (code=exited, status=1/FAILURE)
Mar 07 07:00:43 ltciofvtr-s824-lp8 systemd[1]: Starting Device-Mapper Multipath Device Controller...
Mar 07 07:00:43 ltciofvtr-s824-lp8 multipathd[690]: process is already running
Mar 07 07:00:43 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Main process exited, code=exited, status=1/FAILURE
Mar 07 07:00:43 ltciofvtr-s824-lp8 systemd[1]: Failed to start Device-Mapper Multipath Device Controller.
Mar 07 07:00:43 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Unit entered failed state.
Mar 07 07:00:43 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Failed with result 'exit-code'.
---uname output---
Linux ltciofvtr-s824-lp8 4.10.0-8-generic #10-Ubuntu SMP Mon Feb 13 14:00:06 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
Machine Type = IBM,8286-42A LPAR
---Steps to Reproduce---
# service multipath-tools status
There are failures to start the multipathd.socket unit, which reports the socket address busy [1].
There are also failures to start the multipathd.service unit, which reports that the process is already running [2].
Those correlate to a failure to stop multipathd at the initramfs-tools local-bottom hook,
because the pid file changed from /var/run to /run, due to a build-time evaluation in multipath-tools' Makefile.inc
that apparently evaluates differently on 17.04 or has changed with the recent merge with Debian. [3, 4]
So, let's make the stop/shutdown/kill more independent of the initramfs filesystem structure.
Since we can assume multipathd is running, use its shutdown command.
And if that fails, try SIGINT.
And if that fails, use SIGKILL.
A fallout of this is that multipathd sometimes takes a while to handle the non-KILL methods,
and when multipathd.socket was started very quickly afterward, it failed because the
unix socket of the initramfs multipathd was still open.
So, wait a little for it to close. It almost always happens immediately, but keep a 10-second retry/timeout handler there
just to ensure the shutdown a bit more. The impact of not shutting down correctly is that multipathd is not started by the rootfs/systemd units. That matters because /etc/fstab might point to or wait for mpath devices, and if those are not available, the local-fs unit fails and puts the system into the rescue/emergency shell.
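
The escalation and the wait described above could look roughly like the sketch below. This is an illustration only and not the attached patch: the function names (wait_gone, stop_daemon, wait_socket_closed), the 5-second per-step waits, and the use of /proc/net/unix to spot the abstract socket are all assumptions made for the example. The 10-second socket timeout and the socket name come from the description and the unit status output.

```shell
#!/bin/sh
# Illustrative sketch only; not the patch attached to this bug.
# All function names and the per-step timeouts are assumptions.

# Poll until process $1 is gone; give up after $2 one-second retries.
wait_gone() {
    tries=$2
    while kill -0 "$1" 2>/dev/null; do
        [ "$tries" -gt 0 ] || return 1
        sleep 1
        tries=$((tries - 1))
    done
    return 0
}

# Stop process $1: run the graceful command given in the remaining
# arguments, then fall back to SIGINT, then to SIGKILL.
stop_daemon() {
    pid=$1; shift
    "$@" >/dev/null 2>&1 || true
    wait_gone "$pid" 5 && return 0
    kill -INT "$pid" 2>/dev/null
    wait_gone "$pid" 5 && return 0
    kill -KILL "$pid" 2>/dev/null
    wait_gone "$pid" 5
}

# Poll until abstract unix socket $1 no longer appears in /proc/net/unix;
# give up after $2 one-second retries. If the file cannot be read, the
# socket is treated as closed.
wait_socket_closed() {
    tries=$2
    while grep -qF "@$1" /proc/net/unix 2>/dev/null; do
        [ "$tries" -gt 0 ] || return 1
        sleep 1
        tries=$((tries - 1))
    done
    return 0
}

# Example, assuming multipathd is running in the initramfs:
#   stop_daemon "$(pidof multipathd)" multipathd -k'shutdown' &&
#       wait_socket_closed /org/kernel/linux/storage/multipathd 10
```

Taking the PID as an argument rather than reading a pid file keeps the sketch independent of whether the file lives in /run or /var/run.
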
1) multipathd.socket
[ 36.465332] systemd[1]: Failed to listen on multipathd control socket.
[FAILED] Failed to listen on multipathd control socket.
See 'systemctl status multipathd.socket' for details.
# systemctl status multipathd.socket --no-pager -l
* multipathd.socket - multipathd control socket
Loaded: loaded (/lib/systemd/system/multipathd.socket; static; vendor preset: enabled)
Active: failed (Result: resources)
Listen: @/org/kernel/linux/storage/multipathd (Stream)
# journalctl -b -x
...
Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: multipathd.socket: Failed to listen on sockets: Address already in use
Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: Failed to listen on multipathd control socket.
Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: multipathd.socket: Unit entered failed state.
...
2) multipathd.service
[FAILED] Failed to start Device-Mapper Multipath Device Controller.
See 'systemctl status multipathd.service' for details.
Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: Starting Device-Mapper Multipath Device Controller...
Mar 07 11:02:04 ltciofvtr-s824-lp8 multipathd[861]: process is already running
Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Main process exited, code=exited, status=1/FAILURE
Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: Failed to start Device-Mapper Multipath Device Controller.
Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Unit entered failed state.
Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Failed with result 'exit-code'.
[3] at the local-bottom hook, break=post-multipath:
(initramfs) multipathd -k'show daemon'
pid 259 idle
(initramfs) ls /var/run/multipathd.pid
ls: /var/run/multipathd.pid: No such file or directory
(initramfs) ls /run/multipathd.pid
/run/multipathd.pid
(initramfs) exit
cat: can't open '/var/run/multipathd.pid': No such file or directory
[4] build-time evaluation in Makefile.inc:
ifndef RUN
ifeq ($(shell test -L /var/run -o ! -d /var/run && echo 1),1)
RUN=run
else
RUN=var/run
endif
endif
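
To see what that test yields on a given system, the same condition can be run by hand. The sketch below is a shell rendering of the Makefile.inc logic quoted above; the run_dir name and its optional root argument are additions for experimenting and do not exist in multipath-tools.

```shell
#!/bin/sh
# Mirror the Makefile.inc check: pick "run" when <root>/var/run is a
# symlink or is not a directory, otherwise "var/run".
# run_dir and its optional root argument are hypothetical additions.
run_dir() {
    root=${1:-}
    if [ -L "$root/var/run" ] || [ ! -d "$root/var/run" ]; then
        echo run
    else
        echo var/run
    fi
}

# Example: run_dir            # checks /var/run on the current system
```

Since the Makefile evaluates this on the build machine, the result reflects the build environment's /var/run, not the layout inside the initramfs.
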
Attaching the patch for multipath-tools on 17.04 that resolves this problem.
With it applied, the multipathd.socket and .service units started
successfully on both normal boot and kdump boot scenarios.
Can you please consider it for Zesty? Thanks
# systemctl status multipathd.socket | head -n4
* multipathd.socket - multipathd control socket
Loaded: loaded (/lib/systemd/system/multipathd.socket; static; vendor preset: enabled)
Active: active (running) since Tue 2017-03-07 12:37:54 CST; 30s ago
Listen: @/org/kernel/linux/storage/multipathd (Stream)
# systemctl status multipathd.service | head -n3
* multipathd.service - Device-Mapper Multipath Device Controller
Loaded: loaded (/lib/systemd/system/multipathd.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2017-03-07 12:37:55 CST; 42s ago
@taco-screen-team
Could you please assign this bug to @cyphermox or @paelzer?
Thanks!
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1670811/+subscriptions