[Bug 1435706] Re: DevLossTO, FastIoFailTO settings do not match multipath.conf expected values

Tore Anderson tore at fud.no
Wed Aug 26 07:17:58 UTC 2015


I verified that this bug is *NOT* fixed by testing an identical
configuration (as minimal as possible) on both Ubuntu Trusty and
Scientific Linux 6 (a RHEL6 clone). The test machine is a Cisco B200M2
blade server with the Cisco VIC FCoE HBA (fnic.ko driver). The storage
array is an EMC VNX5300, which is reached via FCoE (inside the Cisco
UCS infrastructure) and then a traditional FC fabric.

The following console output was captured with Trusty installed and
fully upgraded. After creating /etc/multipath.conf with the contents
shown, I ran update-initramfs and rebooted the system to make sure the
settings had taken effect. As the output shows, the dev_loss_tmo and
fast_io_fail_tmo settings are *NOT* applied:

=-=-=-=-=-=-=-=
tore at ucstest-osl2:~$ cat /etc/multipath.conf
devices {
        device {
                vendor                  ".*"
                product                 ".*"
                fast_io_fail_tmo        3
                dev_loss_tmo            2147483647
        }
}

multipaths {
        multipath {
                wwid 3600601603a71320022967e0a1f38e411
                alias bootvolume
        }
}
tore at ucstest-osl2:~$ sudo multipath -ll
bootvolume (3600601603a71320022967e0a1f38e411) dm-0 DGC,VRAID
size=50G features='1 queue_if_no_path' hwhandler='1 emc' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| |- 0:0:1:0 sdb 8:16 active ready running
| `- 1:0:1:0 sdd 8:48 active ready running
`-+- policy='round-robin 0' prio=0 status=enabled
  |- 1:0:0:0 sdc 8:32 active ready running
  `- 0:0:0:0 sda 8:0  active ready running
tore at ucstest-osl2:~$ grep . /sys/class/fc_remote_ports/rport-*/*tmo
/sys/class/fc_remote_ports/rport-0:0-0/dev_loss_tmo:30
/sys/class/fc_remote_ports/rport-0:0-0/fast_io_fail_tmo:off
/sys/class/fc_remote_ports/rport-0:0-1/dev_loss_tmo:30
/sys/class/fc_remote_ports/rport-0:0-1/fast_io_fail_tmo:off
/sys/class/fc_remote_ports/rport-0:0-2/dev_loss_tmo:30
/sys/class/fc_remote_ports/rport-0:0-2/fast_io_fail_tmo:off
/sys/class/fc_remote_ports/rport-1:0-0/dev_loss_tmo:30
/sys/class/fc_remote_ports/rport-1:0-0/fast_io_fail_tmo:off
/sys/class/fc_remote_ports/rport-1:0-1/dev_loss_tmo:30
/sys/class/fc_remote_ports/rport-1:0-1/fast_io_fail_tmo:off
/sys/class/fc_remote_ports/rport-1:0-2/dev_loss_tmo:30
/sys/class/fc_remote_ports/rport-1:0-2/fast_io_fail_tmo:off
tore at ucstest-osl2:~$ uname -r
3.13.0-62-generic
tore at ucstest-osl2:~$ md5sum /etc/multipath.conf
27a62898e80a0bcd7e62b5f2e8d675ff  /etc/multipath.conf
tore at ucstest-osl2:~$ echo 3 | sudo tee /sys/class/fc_remote_ports/rport-*/fast_io_fail_tmo
3
tore at ucstest-osl2:~$ echo 2147483647 | sudo tee /sys/class/fc_remote_ports/rport-*/dev_loss_tmo
2147483647
tore at ucstest-osl2:~$ grep . /sys/class/fc_remote_ports/rport-*/*tmo
/sys/class/fc_remote_ports/rport-0:0-0/dev_loss_tmo:2147483647
/sys/class/fc_remote_ports/rport-0:0-0/fast_io_fail_tmo:3
/sys/class/fc_remote_ports/rport-0:0-1/dev_loss_tmo:2147483647
/sys/class/fc_remote_ports/rport-0:0-1/fast_io_fail_tmo:3
/sys/class/fc_remote_ports/rport-0:0-2/dev_loss_tmo:2147483647
/sys/class/fc_remote_ports/rport-0:0-2/fast_io_fail_tmo:3
/sys/class/fc_remote_ports/rport-1:0-0/dev_loss_tmo:2147483647
/sys/class/fc_remote_ports/rport-1:0-0/fast_io_fail_tmo:3
/sys/class/fc_remote_ports/rport-1:0-1/dev_loss_tmo:2147483647
/sys/class/fc_remote_ports/rport-1:0-1/fast_io_fail_tmo:3
/sys/class/fc_remote_ports/rport-1:0-2/dev_loss_tmo:2147483647
/sys/class/fc_remote_ports/rport-1:0-2/fast_io_fail_tmo:3
tore at ucstest-osl2:~$ dpkg -l multipath-tools
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                   Version                  Architecture             Description
+++-======================================-========================-========================-=================================================================================
ii  multipath-tools                        0.4.9-3ubuntu7.4         amd64                    maintain multipath block
=-=-=-=-=-=-=-=
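
The manual comparison in the transcript above can be scripted. The
following is only a minimal sketch, assuming the sysfs layout shown in
the grep output; the function names and the EXPECTED table are made up
for illustration and are not part of multipath-tools. It reads the two
timeout attributes of every rport and lists those that differ from the
values in multipath.conf:

```python
from pathlib import Path

# Expected values, copied from the multipath.conf shown in the transcript.
EXPECTED = {"fast_io_fail_tmo": "3", "dev_loss_tmo": "2147483647"}


def read_rport_timeouts(base="/sys/class/fc_remote_ports"):
    """Return {rport_name: {attribute: value}} for every rport under base."""
    result = {}
    for rport in sorted(Path(base).glob("rport-*")):
        values = {}
        for attr in EXPECTED:
            attr_file = rport / attr
            if attr_file.is_file():
                values[attr] = attr_file.read_text().strip()
        result[rport.name] = values
    return result


def find_mismatches(actual, expected=EXPECTED):
    """List (rport, attribute, actual, expected) for each differing value."""
    mismatches = []
    for rport, values in actual.items():
        for attr, want in expected.items():
            got = values.get(attr)
            if got != want:
                mismatches.append((rport, attr, got, want))
    return mismatches


if __name__ == "__main__":
    for rport, attr, got, want in find_mismatches(read_rport_timeouts()):
        print(f"{rport}: {attr} is {got}, expected {want}")
```

Run on the Trusty system before the manual "echo ... | sudo tee" step,
a script like this would be expected to report every rport, since all
of them show 30 and "off". It only reads sysfs and does not change any
setting.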

The next transcript shows the same multipath.conf file (identical
md5sum) in use on SL6. In this case the sysfs settings *ARE* applied
when the multipath map is registered (no reboot required):

=-=-=-=-=-=-=-=
[root at ucstest-osl2 ~]# uname -r
2.6.32-358.23.2.el6.x86_64
[root at ucstest-osl2 ~]# rpm -qa device-mapper-multipath
device-mapper-multipath-0.4.9-80.el6.x86_64
[root at ucstest-osl2 ~]# grep . /sys/class/fc_remote_ports/rport-*/*tmo
/sys/class/fc_remote_ports/rport-1:0-0/dev_loss_tmo:30
/sys/class/fc_remote_ports/rport-1:0-0/fast_io_fail_tmo:off
/sys/class/fc_remote_ports/rport-1:0-1/dev_loss_tmo:30
/sys/class/fc_remote_ports/rport-1:0-1/fast_io_fail_tmo:off
/sys/class/fc_remote_ports/rport-1:0-2/dev_loss_tmo:30
/sys/class/fc_remote_ports/rport-1:0-2/fast_io_fail_tmo:off
/sys/class/fc_remote_ports/rport-2:0-0/dev_loss_tmo:30
/sys/class/fc_remote_ports/rport-2:0-0/fast_io_fail_tmo:off
/sys/class/fc_remote_ports/rport-2:0-1/dev_loss_tmo:30
/sys/class/fc_remote_ports/rport-2:0-1/fast_io_fail_tmo:off
/sys/class/fc_remote_ports/rport-2:0-2/dev_loss_tmo:30
/sys/class/fc_remote_ports/rport-2:0-2/fast_io_fail_tmo:off
[root at ucstest-osl2 ~]# md5sum /etc/multipath.conf
27a62898e80a0bcd7e62b5f2e8d675ff  /etc/multipath.conf
[root at ucstest-osl2 ~]# multipath -v 2
Aug 26 07:06:34 | 35000c50042a362cb: ignoring map
create: bootvolume (3600601603a71320022967e0a1f38e411) undef DGC,VRAID
size=50G features='0' hwhandler='0' wp=undef
|-+- policy='round-robin 0' prio=1 status=undef
| `- 1:0:0:0 sdb 8:16 undef ready running
|-+- policy='round-robin 0' prio=1 status=undef
| `- 1:0:1:0 sdc 8:32 undef ready running
|-+- policy='round-robin 0' prio=1 status=undef
| `- 2:0:0:0 sdd 8:48 undef ready running
`-+- policy='round-robin 0' prio=1 status=undef
  `- 2:0:1:0 sde 8:64 undef ready running
[root at ucstest-osl2 ~]# grep . /sys/class/fc_remote_ports/rport-*/*tmo
/sys/class/fc_remote_ports/rport-1:0-0/dev_loss_tmo:30
/sys/class/fc_remote_ports/rport-1:0-0/fast_io_fail_tmo:off
/sys/class/fc_remote_ports/rport-1:0-1/dev_loss_tmo:2147483647
/sys/class/fc_remote_ports/rport-1:0-1/fast_io_fail_tmo:3
/sys/class/fc_remote_ports/rport-1:0-2/dev_loss_tmo:2147483647
/sys/class/fc_remote_ports/rport-1:0-2/fast_io_fail_tmo:3
/sys/class/fc_remote_ports/rport-2:0-0/dev_loss_tmo:30
/sys/class/fc_remote_ports/rport-2:0-0/fast_io_fail_tmo:off
/sys/class/fc_remote_ports/rport-2:0-1/dev_loss_tmo:2147483647
/sys/class/fc_remote_ports/rport-2:0-1/fast_io_fail_tmo:3
/sys/class/fc_remote_ports/rport-2:0-2/dev_loss_tmo:2147483647
/sys/class/fc_remote_ports/rport-2:0-2/fast_io_fail_tmo:3
=-=-=-=-=-=-=-=
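
To compare the two grep snapshots above without reading them line by
line, the pasted output can be parsed and diffed. This is a rough
sketch, assuming the "path:value" format that grep prints here; the
function names are hypothetical:

```python
def parse_grep_output(text):
    """Turn `grep . .../rport-*/*tmo` output into {rport: {attribute: value}}."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("/sys/class/fc_remote_ports/"):
            continue  # skip prompts and anything that is not a sysfs line
        # rport names contain ':' themselves, so split on the last colon only.
        path, _, value = line.rpartition(":")
        rport, attr = path.split("/")[-2:]
        table.setdefault(rport, {})[attr] = value
    return table


def changed_rports(before, after):
    """Return the sorted rport names whose values differ between snapshots."""
    return sorted(r for r in after if after[r] != before.get(r))
```

Fed the SL6 output from before and after "multipath -v 2", this should
list the four rports whose values changed (rport-1:0-1, rport-1:0-2,
rport-2:0-1 and rport-2:0-2) and leave out rport-1:0-0 and rport-2:0-0,
which still show 30 and "off".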

Tore

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1435706

Title:
  DevLossTO, FastIoFailTO settings do not match multipath.conf expected
  values

Status in multipath-tools package in Ubuntu:
  Fix Released
Status in multipath-tools source package in Trusty:
  Triaged
Status in multipath-tools source package in Vivid:
  Fix Committed

Bug description:
  [Impact]
  This bug impacts multipath users who need to tweak timeout values for DevLoss and FastIoFail for performance reasons.

  [Test Case]
  On a multipath system, attempt to modify DevLossTO or FastIoFailTO, then verify that the values got applied with 'multipath -l'. See below.

  [Regression Potential]
  Users who have already modified these values but have not noticed they did not properly apply may notice a change in behavior on device failure.

  ---

  Problem Description
  =========================================
  DevLossTO, FastIoFailTO settings do not match multipath.conf expected values

  ---uname output---
  Linux ilp1fc85apA4.tuc.stglabs.ibm.com 3.13.0-24-generic #46-Ubuntu SMP Thu Apr 10 19:09:21 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

  Machine Type = p7 8247

  Steps to Reproduce
  ===================================
  Verify that the DevLossTO and FastIoFailTO settings match the values expected from multipath.conf

  == Comment: #31 - Thadeu Lima De Souza Cascardo <thadeul at br.ibm.com> - 2015-03-20 10:57:20 ==
  OK.

  From the point of view of multipathd, everything seems correct, by
  looking at the logs.

  I even parsed syslog and the output of getHBAInfo in order to find
  inconsistencies, and the inconsistency is between what multipathd
  logged as configured for a given target, and what its rport reports at
  getHBAInfo.

  So, either multipathd is not configuring the timeouts even though it
  has the right configuration, or something else is changing those
  timeouts.

  The other problem is that multipathd does not include the dev_loss_tmo
  configuration for 2145, as can be seen from list config. So it might
  not be parsing the configuration correctly, or there could be a
  problem with the configuration itself.

  At this point, to move forward, I would like to take a look at your
  system, try reconfiguring, and look at some strace output of
  multipathd to check for writes into sysfs.

  == Comment: #34 - Thadeu Lima De Souza Cascardo <thadeul at br.ibm.com> - 2015-03-20 15:56:46 ==
  OK, so I investigated in the system and read some of the code and checked changelog.

  It looks like Ubuntu is shipping a fairly old version of
  multipath-tools, which is understandable, since multipath-tools is
  not very good at making frequent releases, so one needs to either
  ship a version closer to upstream git or carry a large set of patches.

  One of the missing patches is the one attached next. Without it, any
  device included in the built-in hardware table will have some of its
  attributes from the config file ignored. That is the case with 2145.
  So, we lose the dev_loss_tmo setting for that device.

  Cascardo.
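
The behaviour one would expect for a device that appears both in the
built-in hardware table and in multipath.conf can be illustrated as
follows. This is a toy model only, not multipath-tools code: the
dictionaries, the attribute values and the function name are all
hypothetical, and only the product string 2145 comes from the comment
above. The point is that the config file entry should be overlaid on
the built-in entry, so that no user-supplied attribute is dropped:

```python
# Hypothetical built-in hardware table entry; the attribute values are
# illustrative only and are not copied from multipath-tools.
BUILTIN_2145 = {
    "vendor": "IBM",
    "product": "2145",
    "path_grouping_policy": "group_by_prio",
}

# Hypothetical device section from multipath.conf for the same hardware.
USER_2145 = {
    "vendor": "IBM",
    "product": "2145",
    "fast_io_fail_tmo": "5",
    "dev_loss_tmo": "120",
}


def merge_device_entry(builtin, user):
    """Overlay user settings on the built-in entry; user values win."""
    merged = dict(builtin)
    merged.update(user)
    return merged


if __name__ == "__main__":
    for key, value in sorted(merge_device_entry(BUILTIN_2145, USER_2145).items()):
        print(f"{key} = {value}")
```

In this model the merged entry keeps the built-in path_grouping_policy
and gains both timeout settings. The bug described above corresponds to
some of the user-supplied keys being lost at this step.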

  == Comment: #38 - Thadeu Lima De Souza Cascardo <thadeul at br.ibm.com> - 2015-03-20 16:25:39 ==
  The bug this patch fixes would explain why fast_io_fail_tmo is not correctly set in some cases, but not dev_loss_tmo. So, probably, there is another missing patch here. I would like to experiment with the two patches I mentioned, however. Let's try to do this on Monday?

  Cascardo.

