[Bug 644489] Re: constantly changes /dev/disk/by-id/{scsi,wwn}-* LUN symlinks with multipathing
Peter Petrakis
peter.petrakis at canonical.com
Tue Jun 21 19:00:44 UTC 2011
After chatting with Douglas Gilbert on this I now have a better
understanding of the problem. This very issue has been raised by
him before on linux-scsi and it never saw final resolution.
http://kerneltrap.org/mailarchive/linux-scsi/2010/2/15/6778453
When the file descriptor to the sd device is closed, the change
event occurs, presumably because it was opened with O_RDWR
to begin with, though nothing actually changed.
By changing that flag to O_RDONLY, no udev events are generated
when the fd closes. It doesn't address the root cause but it's
sufficient to get us unjammed safely while we continue to work
on a better solution. That solution may be adjusting the multipath
priority checkers to use the corresponding sg devices when
presented with an sd device. Either way, there's more work to do.
Since all the priority checkers amount to different degrees of
SCSI inquiry and vendor-specific SAN interrogation commands, I
believe we're safe with the O_RDONLY approach. All data
direction flags for SG_IO in this case are SG_DXFER_FROM_DEV
or SG_DXFER_NONE.
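
For illustration, the O_RDONLY approach described above can be sketched
roughly as follows. This is not the code from the attached debdiff; the
function names build_inquiry and inquiry_readonly, the 5 second timeout
and the 32-byte sense buffer are assumptions made up for this example.
It only shows that a standard INQUIRY moves data from the device to the
host (SG_DXFER_FROM_DEV), so the descriptor can be opened read-only.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

#define INQUIRY_OPCODE  0x12    /* standard SCSI INQUIRY */
#define INQUIRY_CDB_LEN 6

/* Hypothetical helper: fill an sg_io_hdr for a standard INQUIRY.
 * Data flows only from the device, so no write access is needed. */
void build_inquiry(struct sg_io_hdr *hdr, unsigned char *cdb,
                   unsigned char *resp, unsigned char resp_len,
                   unsigned char *sense, unsigned char sense_len)
{
    assert(resp_len > 0);

    memset(hdr, 0, sizeof(*hdr));
    memset(cdb, 0, INQUIRY_CDB_LEN);
    cdb[0] = INQUIRY_OPCODE;
    cdb[4] = resp_len;                  /* allocation length */

    hdr->interface_id = 'S';
    hdr->cmd_len = INQUIRY_CDB_LEN;
    hdr->cmdp = cdb;
    hdr->dxfer_direction = SG_DXFER_FROM_DEV;
    hdr->dxfer_len = resp_len;
    hdr->dxferp = resp;
    hdr->mx_sb_len = sense_len;
    hdr->sbp = sense;
    hdr->timeout = 5000;                /* milliseconds (assumed value) */
}

/* Hypothetical helper: open the device read-only, issue the INQUIRY,
 * and close the descriptor. Returns 0 on success, -1 on any error. */
int inquiry_readonly(const char *dev, unsigned char *resp,
                     unsigned char resp_len)
{
    unsigned char cdb[INQUIRY_CDB_LEN];
    unsigned char sense[32];
    struct sg_io_hdr hdr;
    int fd, rc;

    fd = open(dev, O_RDONLY);           /* O_RDONLY instead of O_RDWR */
    if (fd < 0)
        return -1;

    build_inquiry(&hdr, cdb, resp, resp_len, sense, sizeof(sense));
    rc = ioctl(fd, SG_IO, &hdr);
    close(fd);

    return rc < 0 ? -1 : 0;
}
```

A caller would pass a path such as /dev/sde and a response buffer of at
least 36 bytes. Whether a given vendor-specific command is accepted on a
read-only descriptor may depend on the kernel's command filtering and
the caller's privileges, so that part remains an assumption here.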
** Patch added: "multipath-tools-eliminate-udev-change-events-lp644489.debdiff"
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/644489/+attachment/2177364/+files/multipath-tools-eliminate-udev-change-events-lp644489.debdiff
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to udev in Ubuntu.
https://bugs.launchpad.net/bugs/644489
Title:
constantly changes /dev/disk/by-id/{scsi,wwn}-* LUN symlinks with
multipathing
Status in “multipath-tools” package in Ubuntu:
In Progress
Status in “udev” package in Ubuntu:
Invalid
Bug description:
Binary package hint: udev
udevd constantly changes LUN device node symlinks (devices/LUNs, not
the partition nodes) in /dev/disk/by-id. udevd uses ~15% of CPU, and
system time accounts for ~50-60%.
For example:
[jwm at syslog01.roch.ny:pts/0 /dev/disk/by-id> ls -l wwn-0x60a98000486e5339576f596675735354 wwn-0x60a98000486e5339576f596675744c36 scsi-360a98000486e5339576f596675735354 scsi-360a98000486e5339576f596675744c36; sleep 1; echo '======'; ls -l wwn-0x60a98000486e5339576f596675735354 wwn-0x60a98000486e5339576f596675744c36 scsi-360a98000486e5339576f596675735354 scsi-360a98000486e5339576f596675744c36
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675735354 -> ../../sde
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675744c36 -> ../../sdf
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675735354 -> ../../sde
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675744c36 -> ../../sdf
======
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675735354 -> ../../sdg
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675744c36 -> ../../sdh
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675735354 -> ../../sdg
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675744c36 -> ../../sdh
All other device nodes stay the same, such as the device nodes for the
partitions:
[jwm at syslog01.roch.ny:pts/0 /dev/disk/by-id> ls -l scsi-360a98000486e5339576f596675735354-part1; sleep 1; echo '======'; ls -l scsi-360a98000486e5339576f596675735354-part1
lrwxrwxrwx 1 root root 10 2010-09-21 15:47 scsi-360a98000486e5339576f596675735354-part1 -> ../../sdg1
======
lrwxrwxrwx 1 root root 10 2010-09-21 15:47 scsi-360a98000486e5339576f596675735354-part1 -> ../../sdg1
I'm not entirely sure whether this is udev's problem or something related to multipathing. Our most recent experience with multipathing is the last LTS release (hardy), which doesn't exhibit this behavior given similar configurations.
[jwm at syslog01.roch.ny:pts/0 ~> sudo multipath -ll
rootvol (360a98000486e5339576f596675735354) dm-1 NETAPP ,LUN
[size=36G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=8][active]
\_ 2:0:2:0 sda 8:0 [active][ready]
\_ 3:0:2:0 sde 8:64 [active][ready]
\_ round-robin 0 [prio=2][enabled]
\_ 3:0:3:0 sdg 8:96 [active][ready]
\_ 2:0:3:0 sdc 8:32 [active][ready]
syslog-data (360a98000486e5339576f596675744c36) dm-0 NETAPP ,LUN
[size=1.0T][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=8][active]
\_ 2:0:2:1 sdb 8:16 [active][ready]
\_ 3:0:2:1 sdf 8:80 [active][ready]
\_ round-robin 0 [prio=2][enabled]
\_ 3:0:3:1 sdh 8:112 [active][ready]
\_ 2:0:3:1 sdd 8:48 [active][ready]
[jwm at syslog01.roch.ny:pts/0 ~> cat /etc/multipath.conf
multipaths {
multipath {
wwid 360a98000486e5339576f596675735354
alias rootvol
}
multipath {
wwid 360a98000486e5339576f596675744c36
alias syslog-data
}
}
devices {
device {
vendor "NETAPP "
product "LUN "
path_checker tur
path_grouping_policy group_by_prio
prio_callout "/sbin/mpath_prio_netapp /dev/%n"
failback immediate
rr_min_io 128
no_path_retry queue
}
}
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/644489/+subscriptions