[Bug 644489] Re: constantly changes /dev/disk/by-id/{scsi, wwn}-* LUN symlinks with multipathing

Peter Petrakis peter.petrakis at canonical.com
Tue Jun 21 19:00:44 UTC 2011


After chatting with Douglas Gilbert on this I now have a better
understanding of the problem. He has raised this very issue
before on linux-scsi, and it never reached a final resolution.

http://kerneltrap.org/mailarchive/linux-scsi/2010/2/15/6778453

When the file descriptor to the sd device is closed, the change
event occurs, presumably because it was opened with O_RDWR
to begin with, even though nothing actually changed.

By changing that flag to O_RDONLY, no udev events are generated
when the fd closes. It doesn't address the root cause but it's
sufficient to get us unjammed safely while we continue to work
on a better solution. That solution may be adjusting the multipath
priority checkers to use the corresponding sg devices when 
presented with an sd device. Either way, there's more work to do.

Since all the priority checkers amount to different degrees of
SCSI inquiry and vendor-specific SAN interrogation commands, I
believe we're safe with the O_RDONLY approach. All data
direction flags for SG_IO in this case are SG_DXFER_FROM_DEV
or SG_DXFER_NONE.
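
To illustrate the read-only approach, here is a minimal C sketch of a
standard INQUIRY issued through SG_IO on a descriptor opened with
O_RDONLY. It is not the code from the attached debdiff; the helper
names (open_path_readonly, prepare_inquiry, inquiry_readonly), the
5 second timeout, and the sense buffer size are assumptions made for
illustration only.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <scsi/sg.h>

#define INQUIRY_CMD     0x12   /* SCSI INQUIRY opcode */
#define INQUIRY_CDB_LEN 6      /* INQUIRY uses a 6-byte CDB */

/* Hypothetical helper: open a path device for interrogation only.
 * O_RDONLY is used instead of O_RDWR so that, per the observation
 * above, closing the fd should not trigger a udev change event. */
static int open_path_readonly(const char *dev)
{
    return open(dev, O_RDONLY);
}

/* Hypothetical helper: fill in an sg_io_hdr for a standard INQUIRY.
 * Data only flows from the device to the host. */
static void prepare_inquiry(struct sg_io_hdr *hdr, unsigned char *cdb,
                            unsigned char *resp, unsigned char resp_len,
                            unsigned char *sense, unsigned char sense_len)
{
    memset(cdb, 0, INQUIRY_CDB_LEN);
    cdb[0] = INQUIRY_CMD;
    cdb[4] = resp_len;                    /* allocation length */

    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id    = 'S';           /* required by the sg v3 interface */
    hdr->dxfer_direction = SG_DXFER_FROM_DEV;
    hdr->cmd_len         = INQUIRY_CDB_LEN;
    hdr->cmdp            = cdb;
    hdr->dxfer_len       = resp_len;
    hdr->dxferp          = resp;
    hdr->mx_sb_len       = sense_len;
    hdr->sbp             = sense;
    hdr->timeout         = 5000;          /* milliseconds (assumed value) */
}

/* Hypothetical helper: returns 0 if the INQUIRY completed cleanly,
 * -1 on any open, ioctl, or SCSI-level failure. */
static int inquiry_readonly(const char *dev, unsigned char *resp,
                            unsigned char resp_len)
{
    unsigned char cdb[INQUIRY_CDB_LEN];
    unsigned char sense[32];
    struct sg_io_hdr hdr;
    int fd, rc;

    fd = open_path_readonly(dev);
    if (fd < 0)
        return -1;

    prepare_inquiry(&hdr, cdb, resp, resp_len, sense, sizeof(sense));
    rc = ioctl(fd, SG_IO, &hdr);
    close(fd);

    if (rc < 0 || hdr.status != 0 ||
        hdr.host_status != 0 || hdr.driver_status != 0)
        return -1;
    return 0;
}
```

A caller would pass a path device such as /dev/sde and a response
buffer of at least 36 bytes. Whether a given vendor-specific command
is accepted on a read-only descriptor is not verified here.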


** Patch added: "multipath-tools-eliminate-udev-change-events-lp644489.debdiff"
   https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/644489/+attachment/2177364/+files/multipath-tools-eliminate-udev-change-events-lp644489.debdiff

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to udev in Ubuntu.
https://bugs.launchpad.net/bugs/644489

Title:
  constantly changes /dev/disk/by-id/{scsi,wwn}-* LUN symlinks with
  multipathing

Status in “multipath-tools” package in Ubuntu:
  In Progress
Status in “udev” package in Ubuntu:
  Invalid

Bug description:
  Binary package hint: udev

  udevd constantly changes LUN device node symlinks (devices/LUNs, not
  the partition nodes) in /dev/disk/by-id. udevd uses ~15% of CPU, and
  system time accounts for ~50-60%.

  For example:

  [jwm at syslog01.roch.ny:pts/0 /dev/disk/by-id> ls -l wwn-0x60a98000486e5339576f596675735354 wwn-0x60a98000486e5339576f596675744c36 scsi-360a98000486e5339576f596675735354 scsi-360a98000486e5339576f596675744c36; sleep 1; echo '======'; ls -l wwn-0x60a98000486e5339576f596675735354 wwn-0x60a98000486e5339576f596675744c36 scsi-360a98000486e5339576f596675735354 scsi-360a98000486e5339576f596675744c36
  lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675735354 -> ../../sde
  lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675744c36 -> ../../sdf
  lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675735354 -> ../../sde
  lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675744c36 -> ../../sdf
  ======
  lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675735354 -> ../../sdg
  lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675744c36 -> ../../sdh
  lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675735354 -> ../../sdg
  lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675744c36 -> ../../sdh

  All other device nodes stay the same, such as the device nodes for the
  partitions:

  [jwm at syslog01.roch.ny:pts/0 /dev/disk/by-id> ls -l scsi-360a98000486e5339576f596675735354-part1; sleep 1; echo '======'; ls -l scsi-360a98000486e5339576f596675735354-part1
  lrwxrwxrwx 1 root root 10 2010-09-21 15:47 scsi-360a98000486e5339576f596675735354-part1 -> ../../sdg1
  ======
  lrwxrwxrwx 1 root root 10 2010-09-21 15:47 scsi-360a98000486e5339576f596675735354-part1 -> ../../sdg1

  
  I'm not entirely sure whether this is udev's problem or something related to multipathing. Our most recent experience with multipathing is the last LTS release (hardy), which doesn't exhibit this behavior given similar configurations.

  
  [jwm at syslog01.roch.ny:pts/0 ~> sudo multipath -ll
  rootvol (360a98000486e5339576f596675735354) dm-1 NETAPP  ,LUN           
  [size=36G][features=1 queue_if_no_path][hwhandler=0]
  \_ round-robin 0 [prio=8][active]
   \_ 2:0:2:0 sda 8:0   [active][ready]
   \_ 3:0:2:0 sde 8:64  [active][ready]
  \_ round-robin 0 [prio=2][enabled]
   \_ 3:0:3:0 sdg 8:96  [active][ready]
   \_ 2:0:3:0 sdc 8:32  [active][ready]
  syslog-data (360a98000486e5339576f596675744c36) dm-0 NETAPP  ,LUN           
  [size=1.0T][features=1 queue_if_no_path][hwhandler=0]
  \_ round-robin 0 [prio=8][active]
   \_ 2:0:2:1 sdb 8:16  [active][ready]
   \_ 3:0:2:1 sdf 8:80  [active][ready]
  \_ round-robin 0 [prio=2][enabled]
   \_ 3:0:3:1 sdh 8:112 [active][ready]
   \_ 2:0:3:1 sdd 8:48  [active][ready]
  [jwm at syslog01.roch.ny:pts/0 ~> cat /etc/multipath.conf 
  multipaths {
          multipath {
                  wwid            360a98000486e5339576f596675735354
                  alias           rootvol
          }
          multipath {
                  wwid            360a98000486e5339576f596675744c36
                  alias           syslog-data
          }
  }

  devices {
          device {
                  vendor                  "NETAPP  "
                  product                 "LUN "
                  path_checker            tur
                  path_grouping_policy    group_by_prio
                  prio_callout            "/sbin/mpath_prio_netapp /dev/%n"
                  failback                immediate
                  rr_min_io               128
                  no_path_retry           queue
          }
  }

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/644489/+subscriptions



