[Bug 1670811] Re: Multipath services fails to start on Ubuntu 17.04 on boot and kdump (initramfs)
ChristianEhrhardt
1670811 at bugs.launchpad.net
Fri Mar 31 06:48:34 UTC 2017
On Thu, Mar 30, 2017 at 6:25 PM, Mauricio Faria de Oliveira <
mauricfo at linux.vnet.ibm.com> wrote:
[...]
> And the whole shutdown is skipped in case 'multipathd -kshutdown' failed
> and PID is null (as kill_stage does not progress into 1,2,3)
>
> +if [ "$out" = 'ok' ] \
> +|| ( [ -n "$pid" ] && /bin/kill -SIGINT $pid ) \
> +|| ( [ -n "$pid" ] && /bin/kill -SIGKILL $pid ); then
> + kill_stage=1
> +fi
>
> Maybe I'm missing something in your point?
>
I think you are fine; I just expected the logic to be inverted.
If there is no pid, this essentially evaluates as:
=> "check ok" || false || false
And that should be fine: if none of these succeed, it does not go into stage 1
below.
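
To convince myself of the no-pid case, here is a minimal sketch of the same `||` chain. The function name try_shutdown and its two arguments are hypothetical stand-ins for the "out" and "pid" variables in the patch, and it uses the shell's own kill rather than /bin/kill:

```shell
# try_shutdown OUT PID: hypothetical wrapper around the condition from the
# patch, so the chain can be exercised without a running multipathd.
try_shutdown() {
    out="$1"
    pid="$2"
    kill_stage=0
    if [ "$out" = 'ok' ] \
    || ( [ -n "$pid" ] && kill -INT "$pid" 2>/dev/null ) \
    || ( [ -n "$pid" ] && kill -KILL "$pid" 2>/dev/null ); then
        kill_stage=1
    fi
    echo "$kill_stage"
}

try_shutdown 'fail' ''   # shutdown failed, no pid: prints 0
try_shutdown 'ok' ''     # shutdown succeeded: prints 1
```

With an empty pid, neither kill is ever attempted, so kill_stage only advances when the shutdown command itself reported 'ok'.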
> > Also there might be some theoretical cases where the
> > +out="$(/sbin/multipathd -k'shutdown')"
> > As a side effect could make the pid change, so please re-arrange the
> > pidof and the shutdown.
>
> I'm afraid I don't see that theoretical case happening.
> Can you please elaborate a bit more on that, for my own education?
>
pid="$(pidof multipathd)"
# let's assume it gets pid 1001 as the current main PID, and there are
# three sibling processes of some multipath-helper binary with pids
# 1002, 1003, 1004
out="$(/sbin/multipathd -k'shutdown')"
# here the shutdown might send signals to all, and helpers 1002, 1003 exit
# But due to an error on terminating 1004 it reports back to its parent that
# it has to reload
# main 1001 re-execs itself and 1001 goes away, but a new 1005 appears and
# spawns two new helpers 1006, 1007.
# now you end up with main 1005, helpers 1004, 1006, 1007
# shutdown will reply some error to "out" but you will not know exactly
This is just one example of such an issue and, as I said, only theoretical, but
essentially it is a chance to use an out-of-date value.
By changing the order you are at least a bit better off; only "a bit", because if
you start to consider asynchronous behavior it gets worse again.
In some sense juggling pids is a bit like a potentially racy "use after
free", and since we are sending kill signals we might at least close the
small gaps we do see.
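
The ordering effect can be simulated without multipathd. Everything in this sketch is hypothetical: daemon_pid, request_shutdown and lookup_pid only mimic the 1001 -> 1005 scenario above, they are not the real commands:

```shell
# Simulated daemon state (hypothetical): the main PID changes as a side
# effect of the shutdown request, as in the 1001 -> 1005 scenario.
daemon_pid=1001
request_shutdown() { daemon_pid=1005; out='fail'; }
lookup_pid()       { pid="$daemon_pid"; }

# Original order: PID is read first, so it is stale after the request.
lookup_pid
request_shutdown
stale_pid="$pid"

# Reordered: PID is read after the shutdown request.
daemon_pid=1001
request_shutdown
lookup_pid
fresh_pid="$pid"

echo "stale=$stale_pid fresh=$fresh_pid"
```

In the reordered variant a later kill would at least target the PID that exists after the shutdown attempt, although a change after the lookup is still possible.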
[... thanks for the great explanation that was following ...]
Yeah, I think the current case is safe; nevertheless, IMHO reordering two
lines can't hurt, so that this does not turn into a bug in 2022.
I'd agree it would be a waste if it took more effort than reordering two lines.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1670811
Title:
Multipath services fails to start on Ubuntu 17.04 on boot and kdump
(initramfs)
Status in multipath-tools package in Ubuntu:
Fix Released
Status in multipath-tools source package in Zesty:
Fix Released
Bug description:
---Problem Description---
Multipath services fails to start on Ubuntu 17.04 with SAN multipath devices.
root@ltciofvtr-s824-lp8:~# service multipath-tools status
* multipathd.service - Device-Mapper Multipath Device Controller
Loaded: loaded (/lib/systemd/system/multipathd.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2017-03-07 07:00:43 CST; 5min ago
Main PID: 690 (code=exited, status=1/FAILURE)
Mar 07 07:00:43 ltciofvtr-s824-lp8 systemd[1]: Starting Device-Mapper Multipath Device Controller...
Mar 07 07:00:43 ltciofvtr-s824-lp8 multipathd[690]: process is already running
Mar 07 07:00:43 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Main process exited, code=exited, status=1/FAILURE
Mar 07 07:00:43 ltciofvtr-s824-lp8 systemd[1]: Failed to start Device-Mapper Multipath Device Controller.
Mar 07 07:00:43 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Unit entered failed state.
Mar 07 07:00:43 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Failed with result 'exit-code'.
---uname output---
Linux ltciofvtr-s824-lp8 4.10.0-8-generic #10-Ubuntu SMP Mon Feb 13 14:00:06 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
Machine Type = IBM,8286-42A LPAR
---Steps to Reproduce---
# service multipath-tools status
There are failures to start the multipathd.socket unit, which reports the socket address busy [1].
There are also failures to start the multipathd.service unit, which reports that the process is already running [2].
Those correlate to a failure to stop multipathd at the initramfs-tools local-bottom hook,
because the pid file changed from /var/run to /run, due to a build-time evaluation in multipath-tools' Makefile.inc
that apparently is evaluated differently on 17.04 or has changed with the recent merge with Debian. [3, 4]
So, let's make the stop/shutdown/kill more independent of the initramfs filesystem structure.
Since we can assume multipathd is running, use its shutdown command.
And if that fails, try SIGINT.
And if that fails, use SIGKILL.
A fallout of this is that sometimes multipathd takes a while to handle the non-KILL methods,
and when multipathd.socket was started very quickly afterward, it failed because the
unix socket of the initramfs multipathd was still open.
So, wait a little for it to close. Almost always this happens immediately, but keep a 10-sec retry/timeout handler there
just to ensure the shutdown a bit more, since the impact of not shutting down correctly is that multipathd does not get started by the rootfs/systemd units. That is important because /etc/fstab might point to/wait for mpath devices, and if those are not available, the local-fs unit fails and puts the system into the rescue/emergency shell.
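
The retry/timeout idea can be sketched roughly as follows. This is not the attached patch: wait_until and socket_closed are hypothetical helpers, and checking /proc/net/unix for the abstract socket name is an assumption about how one might detect that the socket is gone.

```shell
# wait_until ATTEMPTS CMD...: retry CMD once per second until it succeeds
# or ATTEMPTS attempts have been made.
wait_until() {
    attempts="$1"; shift
    tries=0
    while ! "$@"; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$attempts" ]; then
            return 1
        fi
        sleep 1
    done
    return 0
}

# Hypothetical check: succeed once the abstract socket name no longer
# appears in /proc/net/unix (also succeeds if that file is unreadable).
socket_closed() {
    ! grep -q '@/org/kernel/linux/storage/multipathd' /proc/net/unix 2>/dev/null
}

wait_until 10 socket_closed || echo 'multipathd socket still open' >&2
```

In the common case the first check succeeds and no sleep happens at all; the loop only costs time when the daemon is slow to exit.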
1) multipathd.socket
[ 36.465332] systemd[1]: Failed to listen on multipathd control socket.
[FAILED] Failed to listen on multipathd control socket.
See 'systemctl status multipathd.socket' for details.
# systemctl status multipathd.socket --no-pager -l
* multipathd.socket - multipathd control socket
Loaded: loaded (/lib/systemd/system/multipathd.socket; static; vendor preset: enabled)
Active: failed (Result: resources)
Listen: @/org/kernel/linux/storage/multipathd (Stream)
# journalctl -b -x
...
Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: multipathd.socket: Failed to listen on sockets: Address already in use
Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: Failed to listen on multipathd control socket.
Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: multipathd.socket: Unit entered failed state.
...
2) multipathd.service
[FAILED] Failed to start Device-Mapper Multipath Device Controller.
See 'systemctl status multipathd.service' for details.
Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: Starting Device-Mapper Multipath Device Controller...
Mar 07 11:02:04 ltciofvtr-s824-lp8 multipathd[861]: process is already running
Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Main process exited, code=exited, status=1/FAILURE
Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: Failed to start Device-Mapper Multipath Device Controller.
Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Unit entered failed state.
Mar 07 11:02:04 ltciofvtr-s824-lp8 systemd[1]: multipathd.service: Failed with result 'exit-code'.
[3] at the local-bottom hook, break=post-multipath:
(initramfs) multipathd -k'show daemon'
pid 259 idle
(initramfs) ls /var/run/multipathd.pid
ls: /var/run/multipathd.pid: No such file or directory
(initramfs) ls /run/multipathd.pid
/run/multipathd.pid
(initramfs) exit
cat: can't open '/var/run/multipathd.pid': No such file or directory
[4] build-time evaluation in Makefile.inc:
ifndef RUN
ifeq ($(shell test -L /var/run -o ! -d /var/run && echo 1),1)
RUN=run
else
RUN=var/run
endif
endif
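
To illustrate how that test decides between the two values, here is a small sketch that applies the same condition to an arbitrary path. The helper name run_dir_for and the temporary directory layout are hypothetical; which case actually applied on the 17.04 build hosts is not established here.

```shell
# run_dir_for PATH: hypothetical helper that applies the same test as the
# Makefile.inc snippet to PATH instead of /var/run.
run_dir_for() {
    # Equivalent to: test -L PATH -o ! -d PATH
    if [ -L "$1" ] || [ ! -d "$1" ]; then
        echo 'run'
    else
        echo 'var/run'
    fi
}

tmp="$(mktemp -d)"
mkdir "$tmp/real"
ln -s real "$tmp/link"

run_dir_for "$tmp/real"      # real directory
run_dir_for "$tmp/link"      # symlink to a directory
run_dir_for "$tmp/missing"   # path does not exist

rm -rf "$tmp"
```

Only the real directory yields var/run; a symlink or a missing path yields run, so the result depends on the filesystem layout of the machine doing the build.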
Attaching the patch for multipath-tools on 17.04 that resolves this problem.
With it applied, the multipathd.socket and .service units started
successfully on both normal boot and kdump boot scenarios.
Can you please consider it for Zesty? Thanks
# systemctl status multipathd.socket | head -n4
* multipathd.socket - multipathd control socket
Loaded: loaded (/lib/systemd/system/multipathd.socket; static; vendor preset: enabled)
Active: active (running) since Tue 2017-03-07 12:37:54 CST; 30s ago
Listen: @/org/kernel/linux/storage/multipathd (Stream)
# systemctl status multipathd.service | head -n3
* multipathd.service - Device-Mapper Multipath Device Controller
Loaded: loaded (/lib/systemd/system/multipathd.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2017-03-07 12:37:55 CST; 42s ago
@taco-screen-team
Could you please assign this bug to @cyphermox or @paelzer?
Thanks!
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1670811/+subscriptions