[Bug 1099875] [NEW] multipathd ignores dev_loss_tmo and fast_io_fail_tmo settings

Tore Anderson tore at fud.no
Tue Jan 15 15:00:57 UTC 2013


Public bug reported:

My device section of /etc/multipath.conf contains the following (I'll
attach the complete file in a bit):

fast_io_fail_tmo 3
dev_loss_tmo 2147483647

This is also visible in the output from multipathd -k"show config", so
it's being correctly parsed. However, the settings appear to be
completely ignored by multipathd, as the corresponding sysfs settings
don't get updated:

$ grep . /sys/class/fc_remote_ports/rport-*/*tmo
/sys/class/fc_remote_ports/rport-2:0-0/dev_loss_tmo:30
/sys/class/fc_remote_ports/rport-2:0-0/fast_io_fail_tmo:off
/sys/class/fc_remote_ports/rport-3:0-0/dev_loss_tmo:30
/sys/class/fc_remote_ports/rport-3:0-0/fast_io_fail_tmo:off

These are the kernel's defaults. I can easily set them manually:

$ echo 3 | tee /sys/class/fc_remote_ports/rport-*/fast_io_fail_tmo
3
$ echo 2147483647 | tee /sys/class/fc_remote_ports/rport-*/dev_loss_tmo
2147483647
$ grep . /sys/class/fc_remote_ports/rport-*/*tmo
/sys/class/fc_remote_ports/rport-2:0-0/dev_loss_tmo:2147483647
/sys/class/fc_remote_ports/rport-2:0-0/fast_io_fail_tmo:3
/sys/class/fc_remote_ports/rport-3:0-0/dev_loss_tmo:2147483647
/sys/class/fc_remote_ports/rport-3:0-0/fast_io_fail_tmo:3

However, this won't survive a reboot, and since SAN paths may appear at
any time, adding the above to /etc/rc.local is also not a good way to
fix it.
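
For anyone reproducing this, the manual check and workaround above can
be wrapped in a small script. This is only a sketch: the function names
apply_rport_tmo and check_rport_tmo and the "sysfs root" argument are
made up for illustration and are not part of multipath-tools. It writes
fast_io_fail_tmo before dev_loss_tmo, the same order as in the session
above.

```shell
#!/bin/sh
# Sketch only: helper names and the root-directory argument are
# illustrative assumptions, not anything shipped by multipath-tools.

# Write the two timeouts to every FC remote port found under <root>.
apply_rport_tmo() {
    root=$1
    fast_io_fail=$2
    dev_loss=$3
    for rport in "$root"/class/fc_remote_ports/rport-*; do
        [ -d "$rport" ] || continue
        # Same order as the manual session: fast_io_fail_tmo first.
        echo "$fast_io_fail" > "$rport/fast_io_fail_tmo"
        echo "$dev_loss" > "$rport/dev_loss_tmo"
    done
}

# Print one line per remote port whose values differ from the expected
# ones; return non-zero if any port differs.
check_rport_tmo() {
    root=$1
    fast_io_fail=$2
    dev_loss=$3
    status=0
    for rport in "$root"/class/fc_remote_ports/rport-*; do
        [ -d "$rport" ] || continue
        actual_fail=$(cat "$rport/fast_io_fail_tmo")
        actual_loss=$(cat "$rport/dev_loss_tmo")
        if [ "$actual_fail" != "$fast_io_fail" ] ||
           [ "$actual_loss" != "$dev_loss" ]; then
            echo "MISMATCH ${rport##*/}: fast_io_fail_tmo=$actual_fail dev_loss_tmo=$actual_loss"
            status=1
        fi
    done
    return $status
}
```

On a real system the root would be /sys and the writes need root
privileges, e.g. apply_rport_tmo /sys 3 2147483647 followed by
check_rport_tmo /sys 3 2147483647. This does not address the bug
itself: remote ports that appear later still come up with the kernel
defaults.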

I've also attempted to move the settings to the defaults section and the
individual path sections. No change in behaviour.

This bug prevents dm-multipath from quickly moving I/O away from a
recently failed SAN path, instead stalling all I/O for 30 seconds. This
defeats the purpose of using multipathing in the first place, which is
to have highly available I/O access so that production can continue
uninterrupted even if parts of the SAN fabric fail.

The same settings work on RHEL6, btw.

** Affects: multipath-tools (Ubuntu)
     Importance: Undecided
         Status: New

** Attachment added: "multipath.conf from impacted Precise server"
   https://bugs.launchpad.net/bugs/1099875/+attachment/3484064/+files/multipath.conf

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1099875
