[Bug 644489] Re: constantly changes /dev/disk/by-id/{scsi,wwn}-* LUN symlinks with multipathing
Peter Petrakis
peter.petrakis at canonical.com
Tue Jun 21 20:34:32 UTC 2011
The "also affects udev" task should be removed.
** Changed in: udev (Ubuntu Lucid)
Status: New => Invalid
** Changed in: udev (Ubuntu Maverick)
Status: New => Invalid
** Changed in: udev (Ubuntu Natty)
Status: New => Invalid
** Package changed: udev (Ubuntu) => ubuntu
** Description changed:
+ = SRU Justification =
+
+ == Impact ==
+ Multipath-tools is inadvertently generating UDEV CHANGE events for the SD
+ block devices under its control. These change events feed back into the udev
+ rules, increasing CPU utilization and breaking the multipath aliasing feature,
+ which allows one to rename a multipath path from a series of letters and
+ numbers to a human-readable label. It gives users the impression that
+ the SAN is unstable.
+
+ == Solution ==
+ Change the open() flags in the priority checkers from read-write to read-only. This
+ stops sd from generating a change event after the file descriptor has been
+ closed.
+
+ Patch: https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/644489/+attachment/2177364/+files/multipath-tools-eliminate-udev-change-events-lp644489.debdiff
+
+ == Reproduction ==
+
+ Reproduction is easy and doesn't even require a SAN. Since we're dealing with simple SCSI
+ inquiry commands, any block device will do. Simply install multipath-tools and
+ execute one of the priority checkers like so:
+
+ /sbin/mpath_prio_emc /dev/sda
+
+ Also, have a window open monitoring udev with udevadm monitor, and ensure
+ no change events for that block device are occurring beforehand.
+
+ TEST CASE:
+ root@kickseed:~# udevadm monitor &
+ [1] 16950
+ root@kickseed:~# monitor will print the received events for:
+ UDEV - the event which udev sends out after rule processing
+ KERNEL - the kernel uevent
+ root@kickseed:~#
+ root@kickseed:~# /sbin/mpath_prio_emc /dev/sda
+ query command indicates error0
+ root@kickseed:~# KERNEL[1308688009.806317] change /devices/pci0000:00/0000:00:07.0/0000:04:00.0/host0/port-0:0/expander-0:0/port-0:0:1/end_device-0:0:1/target0:0:0/0:0:0:0/block/sda (block)
+ UDEV [1308688009.823569] change /devices/pci0000:00/0000:00:07.0/0000:04:00.0/host0/port-0:0/expander-0:0/port-0:0:1/end_device-0:0:1/target0:0:0/0:0:0:0/block/sda (block)
+
+ root@kickseed:~#
+ root@kickseed:~#
+ root@kickseed:~# /sbin/mpath_prio_alua /dev/sda
+ 130
+
+ mpath_prio_alua doesn't generate any change events since its open
+ flags do not include O_RDWR to begin with.
+
+ == regression potential ==
+ None, it's broken to begin with.
+ --------------------------
+
Binary package hint: udev
udevd constantly changes LUN device node symlinks (devices/LUNs, not the
partition nodes) in /dev/disk/by-id. udevd uses ~15% of CPU and system
time is using ~50-60%.
For example:
[jwm@syslog01.roch.ny:pts/0 /dev/disk/by-id> ls -l wwn-0x60a98000486e5339576f596675735354 wwn-0x60a98000486e5339576f596675744c36 scsi-360a98000486e5339576f596675735354 scsi-360a98000486e5339576f596675744c36; sleep 1; echo '======'; ls -l wwn-0x60a98000486e5339576f596675735354 wwn-0x60a98000486e5339576f596675744c36 scsi-360a98000486e5339576f596675735354 scsi-360a98000486e5339576f596675744c36
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675735354 -> ../../sde
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675744c36 -> ../../sdf
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675735354 -> ../../sde
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675744c36 -> ../../sdf
======
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675735354 -> ../../sdg
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675744c36 -> ../../sdh
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675735354 -> ../../sdg
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675744c36 -> ../../sdh
All other device nodes stay the same, such as the device nodes for the
partitions:
[jwm@syslog01.roch.ny:pts/0 /dev/disk/by-id> ls -l scsi-360a98000486e5339576f596675735354-part1; sleep 1; echo '======'; ls -l scsi-360a98000486e5339576f596675735354-part1
lrwxrwxrwx 1 root root 10 2010-09-21 15:47 scsi-360a98000486e5339576f596675735354-part1 -> ../../sdg1
======
lrwxrwxrwx 1 root root 10 2010-09-21 15:47 scsi-360a98000486e5339576f596675735354-part1 -> ../../sdg1
-
- I'm not entirely sure whether this is udev's problem or something related to multipathing. Our most recent experience with multipathing is the last LTS release (hardy), which doesn't exhibit this behavior given similar configurations.
-
+ I'm not entirely sure whether this is udev's problem or something
+ related to multipathing. Our most recent experience with multipathing is
+ the last LTS release (hardy), which doesn't exhibit this behavior given
+ similar configurations.
[jwm@syslog01.roch.ny:pts/0 ~> sudo multipath -ll
- rootvol (360a98000486e5339576f596675735354) dm-1 NETAPP ,LUN
+ rootvol (360a98000486e5339576f596675735354) dm-1 NETAPP ,LUN
[size=36G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=8][active]
- \_ 2:0:2:0 sda 8:0 [active][ready]
- \_ 3:0:2:0 sde 8:64 [active][ready]
+ \_ 2:0:2:0 sda 8:0 [active][ready]
+ \_ 3:0:2:0 sde 8:64 [active][ready]
\_ round-robin 0 [prio=2][enabled]
- \_ 3:0:3:0 sdg 8:96 [active][ready]
- \_ 2:0:3:0 sdc 8:32 [active][ready]
- syslog-data (360a98000486e5339576f596675744c36) dm-0 NETAPP ,LUN
+ \_ 3:0:3:0 sdg 8:96 [active][ready]
+ \_ 2:0:3:0 sdc 8:32 [active][ready]
+ syslog-data (360a98000486e5339576f596675744c36) dm-0 NETAPP ,LUN
[size=1.0T][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=8][active]
- \_ 2:0:2:1 sdb 8:16 [active][ready]
- \_ 3:0:2:1 sdf 8:80 [active][ready]
+ \_ 2:0:2:1 sdb 8:16 [active][ready]
+ \_ 3:0:2:1 sdf 8:80 [active][ready]
\_ round-robin 0 [prio=2][enabled]
- \_ 3:0:3:1 sdh 8:112 [active][ready]
- \_ 2:0:3:1 sdd 8:48 [active][ready]
- [jwm@syslog01.roch.ny:pts/0 ~> cat /etc/multipath.conf
+ \_ 3:0:3:1 sdh 8:112 [active][ready]
+ \_ 2:0:3:1 sdd 8:48 [active][ready]
+ [jwm@syslog01.roch.ny:pts/0 ~> cat /etc/multipath.conf
multipaths {
- multipath {
- wwid 360a98000486e5339576f596675735354
- alias rootvol
- }
- multipath {
- wwid 360a98000486e5339576f596675744c36
- alias syslog-data
- }
+ multipath {
+ wwid 360a98000486e5339576f596675735354
+ alias rootvol
+ }
+ multipath {
+ wwid 360a98000486e5339576f596675744c36
+ alias syslog-data
+ }
}
devices {
- device {
- vendor "NETAPP "
- product "LUN "
- path_checker tur
- path_grouping_policy group_by_prio
- prio_callout "/sbin/mpath_prio_netapp /dev/%n"
- failback immediate
- rr_min_io 128
- no_path_retry queue
- }
+ device {
+ vendor "NETAPP "
+ product "LUN "
+ path_checker tur
+ path_grouping_policy group_by_prio
+ prio_callout "/sbin/mpath_prio_netapp /dev/%n"
+ failback immediate
+ rr_min_io 128
+ no_path_retry queue
+ }
}
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to udev in Ubuntu.
https://bugs.launchpad.net/bugs/644489
Title:
constantly changes /dev/disk/by-id/{scsi,wwn}-* LUN symlinks with
multipathing
Status in Ubuntu:
Invalid
Status in “multipath-tools” package in Ubuntu:
Fix Released
Status in The Lucid Lynx:
Invalid
Status in “multipath-tools” source package in Lucid:
Confirmed
Status in The Maverick Meerkat:
Invalid
Status in “multipath-tools” source package in Maverick:
Confirmed
Status in The Natty Narwhal:
Invalid
Status in “multipath-tools” source package in Natty:
Confirmed
Bug description:
= SRU Justification =
== Impact ==
Multipath-tools is inadvertently generating UDEV CHANGE events for the SD
block devices under its control. These change events feed back into the udev
rules, increasing CPU utilization and breaking the multipath aliasing feature,
which allows one to rename a multipath path from a series of letters and
numbers to a human-readable label. It gives users the impression that
the SAN is unstable.
== Solution ==
Change the open() flags in the priority checkers from read-write to read-only. This
stops sd from generating a change event after the file descriptor has been
closed.
Patch: https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/644489/+attachment/2177364/+files/multipath-tools-eliminate-udev-change-events-lp644489.debdiff
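
The effect of the flag change can be illustrated with a minimal Python sketch. This is not the multipath-tools patch itself (the priority checkers are separate programs and the real change is in the debdiff above); the helper names open_read_only and access_mode are hypothetical, and a temporary regular file stands in for a block device such as /dev/sda so that no root access is needed. It only shows how to open a path read-only and confirm that the descriptor carries no write access.

```python
import fcntl
import os
import tempfile


def open_read_only(path):
    """Open path with O_RDONLY, so the descriptor never has write access."""
    return os.open(path, os.O_RDONLY)


def access_mode(fd):
    """Return the access-mode bits (O_RDONLY, O_WRONLY or O_RDWR) of an open fd."""
    return fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_ACCMODE


if __name__ == "__main__":
    # A temporary regular file stands in for /dev/sda (assumption for the demo).
    with tempfile.NamedTemporaryFile() as tmp:
        fd = open_read_only(tmp.name)
        try:
            if access_mode(fd) == os.O_RDONLY:
                print("descriptor is read-only")
        finally:
            os.close(fd)
```

A checker that only sends inquiry commands should not need write access, which is why switching to a read-only open is expected to be safe here.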
== Reproduction ==
Reproduction is easy and doesn't even require a SAN. Since we're dealing with simple SCSI
inquiry commands, any block device will do. Simply install multipath-tools and
execute one of the priority checkers like so:
/sbin/mpath_prio_emc /dev/sda
Also, have a window open monitoring udev with udevadm monitor, and ensure
no change events for that block device are occurring beforehand.
TEST CASE:
root@kickseed:~# udevadm monitor &
[1] 16950
root@kickseed:~# monitor will print the received events for:
UDEV - the event which udev sends out after rule processing
KERNEL - the kernel uevent
root@kickseed:~#
root@kickseed:~# /sbin/mpath_prio_emc /dev/sda
query command indicates error0
root@kickseed:~# KERNEL[1308688009.806317] change /devices/pci0000:00/0000:00:07.0/0000:04:00.0/host0/port-0:0/expander-0:0/port-0:0:1/end_device-0:0:1/target0:0:0/0:0:0:0/block/sda (block)
UDEV [1308688009.823569] change /devices/pci0000:00/0000:00:07.0/0000:04:00.0/host0/port-0:0/expander-0:0/port-0:0:1/end_device-0:0:1/target0:0:0/0:0:0:0/block/sda (block)
root@kickseed:~#
root@kickseed:~#
root@kickseed:~# /sbin/mpath_prio_alua /dev/sda
130
mpath_prio_alua doesn't generate any change events since its open
flags do not include O_RDWR to begin with.
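
To check the test case without reading the monitor output by eye, a small filter along the following lines could be used. It is only a sketch: the function name change_events and the regular expression are assumptions based on the two event lines shown in the test case above, not on any documented udevadm output format.

```python
import re

# Matches lines such as the two shown in the test case:
#   KERNEL[1308688009.806317] change /devices/.../block/sda (block)
#   UDEV [1308688009.823569] change /devices/.../block/sda (block)
EVENT_RE = re.compile(
    r"(?P<source>KERNEL|UDEV)\s*\[(?P<time>\d+\.\d+)\]\s+"
    r"(?P<action>\w+)\s+(?P<devpath>\S+)\s+\((?P<subsystem>[^)]+)\)"
)


def change_events(lines, device):
    """Return (source, devpath) for every 'change' event on the named device."""
    found = []
    for line in lines:
        # search() rather than match(), because a shell prompt may precede the event.
        match = EVENT_RE.search(line)
        if match is None:
            continue
        if match["action"] == "change" and match["devpath"].endswith("/" + device):
            found.append((match["source"], match["devpath"]))
    return found
```

Feeding it the captured monitor output and the device name "sda" should return one KERNEL and one UDEV entry for the mpath_prio_emc run, and an empty list for a checker that opens the device read-only.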
== regression potential ==
None, it's broken to begin with.
--------------------------
Binary package hint: udev
udevd constantly changes LUN device node symlinks (devices/LUNs, not
the partition nodes) in /dev/disk/by-id. udevd uses ~15% of CPU and
system time is using ~50-60%.
For example:
[jwm@syslog01.roch.ny:pts/0 /dev/disk/by-id> ls -l wwn-0x60a98000486e5339576f596675735354 wwn-0x60a98000486e5339576f596675744c36 scsi-360a98000486e5339576f596675735354 scsi-360a98000486e5339576f596675744c36; sleep 1; echo '======'; ls -l wwn-0x60a98000486e5339576f596675735354 wwn-0x60a98000486e5339576f596675744c36 scsi-360a98000486e5339576f596675735354 scsi-360a98000486e5339576f596675744c36
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675735354 -> ../../sde
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675744c36 -> ../../sdf
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675735354 -> ../../sde
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675744c36 -> ../../sdf
======
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675735354 -> ../../sdg
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 scsi-360a98000486e5339576f596675744c36 -> ../../sdh
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675735354 -> ../../sdg
lrwxrwxrwx 1 root root 9 2010-09-21 16:12 wwn-0x60a98000486e5339576f596675744c36 -> ../../sdh
All other device nodes stay the same, such as the device nodes for the
partitions:
[jwm@syslog01.roch.ny:pts/0 /dev/disk/by-id> ls -l scsi-360a98000486e5339576f596675735354-part1; sleep 1; echo '======'; ls -l scsi-360a98000486e5339576f596675735354-part1
lrwxrwxrwx 1 root root 10 2010-09-21 15:47 scsi-360a98000486e5339576f596675735354-part1 -> ../../sdg1
======
lrwxrwxrwx 1 root root 10 2010-09-21 15:47 scsi-360a98000486e5339576f596675735354-part1 -> ../../sdg1
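
The before/after comparison done by hand above can be approximated with a short script. This is a sketch only; the helper names read_links and changed_links are hypothetical, and the one-second gap between snapshots simply mirrors the "sleep 1" used in the commands above.

```python
import os


def read_links(directory):
    """Map each symlink name in directory to the target it currently points at."""
    links = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.islink(path):
            links[name] = os.readlink(path)
    return links


def changed_links(before, after):
    """Return {name: (old_target, new_target)} for links whose target differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }
```

Taking one snapshot of /dev/disk/by-id with read_links, sleeping for a second, taking another, and passing both to changed_links should list the whole-LUN scsi-* and wwn-* links on an affected system while leaving out the stable -part1 links.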
I'm not entirely sure whether this is udev's problem or something
related to multipathing. Our most recent experience with multipathing
is the last LTS release (hardy), which doesn't exhibit this behavior
given similar configurations.
[jwm@syslog01.roch.ny:pts/0 ~> sudo multipath -ll
rootvol (360a98000486e5339576f596675735354) dm-1 NETAPP ,LUN
[size=36G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=8][active]
\_ 2:0:2:0 sda 8:0 [active][ready]
\_ 3:0:2:0 sde 8:64 [active][ready]
\_ round-robin 0 [prio=2][enabled]
\_ 3:0:3:0 sdg 8:96 [active][ready]
\_ 2:0:3:0 sdc 8:32 [active][ready]
syslog-data (360a98000486e5339576f596675744c36) dm-0 NETAPP ,LUN
[size=1.0T][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=8][active]
\_ 2:0:2:1 sdb 8:16 [active][ready]
\_ 3:0:2:1 sdf 8:80 [active][ready]
\_ round-robin 0 [prio=2][enabled]
\_ 3:0:3:1 sdh 8:112 [active][ready]
\_ 2:0:3:1 sdd 8:48 [active][ready]
[jwm@syslog01.roch.ny:pts/0 ~> cat /etc/multipath.conf
multipaths {
multipath {
wwid 360a98000486e5339576f596675735354
alias rootvol
}
multipath {
wwid 360a98000486e5339576f596675744c36
alias syslog-data
}
}
devices {
device {
vendor "NETAPP "
product "LUN "
path_checker tur
path_grouping_policy group_by_prio
prio_callout "/sbin/mpath_prio_netapp /dev/%n"
failback immediate
rr_min_io 128
no_path_retry queue
}
}
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+bug/644489/+subscriptions