[Bug 1535898] Re: Trusty & Vivid multipath-tools (multipathd) seg-fault core dump
Louis Bouchard
louis.bouchard at canonical.com
Thu Jan 21 13:44:08 UTC 2016
To add to Mathieu's comment, here is the backtrace from one of the
recent cores that we got:
(gdb) bt
#0 0x00007f17a122a0d5 in __GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00007f17a122d83b in __GI_abort () at abort.c:91
#2 0x00007f17a126732e in __libc_message (do_abort=2, fmt=0x7f17a13715d8 "*** glibc detected *** %s: %s: 0x%s ***\n")
at ../sysdeps/unix/sysv/linux/libc_fatal.c:201
#3 0x00007f17a1271b26 in malloc_printerr (action=3, str=0x7f17a13717c8 "double free or corruption (fasttop)",
ptr=<optimized out>) at malloc.c:5051
#4 0x00007f17a15c8f27 in free_multipath (mpp=0x7f177c009160, free_paths=0) at structs.c:174
#5 0x00007f17a15ec09a in _remove_map (mpp=0x7f177c009160, vecs=0xbaea70, stop_waiter=1, purge_vec=1) at structs_vec.c:143
#6 0x00007f17a15ec0f8 in remove_map_and_stop_waiter (mpp=0x7f177c009160, vecs=0xbaea70, purge_vec=1) at structs_vec.c:156
#7 0x00000000004075f5 in mpvec_garbage_collector (vecs=0xbaea70) at main.c:949
#8 0x00007f177c007060 in ?? ()
#9 0x0000000000baea70 in ?? ()
#10 0x00007f177c009160 in ?? ()
#11 0x0000000200000003 in ?? ()
#12 0x00007f17a2274e20 in ?? ()
#13 0x00000000004080f0 in checkerloop (ap=0x7f17a13717c8) at main.c:1162
#14 0x0000000000000000 in ?? ()
(gdb) f 4
#4 0x00007f17a15c8f27 in free_multipath (mpp=0x7f177c009160, free_paths=0) at structs.c:174
warning: Source file is more recent than executable.
174 FREE(mpp->dmi);
(gdb) l
169 FREE(mpp->alias);
170 mpp->alias = NULL;
171 }
172
173 if (mpp->dmi) {
174 FREE(mpp->dmi);
175 mpp->dmi = NULL;
176 }
177
178 /*
(gdb)
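For reference, here is a minimal sketch of why the guard at structs.c:173-176 does not by itself rule out a double free. It is an illustration under assumptions, not the multipath-tools code and not a confirmed root cause: struct map, toy_alloc, toy_free and release_dmi are hypothetical stand-ins, and the toy allocator only counts a second release instead of corrupting the heap. Clearing the field after FREE() protects against a second release through the same struct, but not through another struct that still holds a copy of the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy allocator (hypothetical): a fixed pool with a "live" flag per slot,
 * so that a second release is counted instead of being undefined behaviour. */
struct slot { int live; int payload; };

static struct slot pool[4];
static int double_releases;

static struct slot *toy_alloc(void)
{
    for (size_t i = 0; i < sizeof(pool) / sizeof(pool[0]); i++) {
        if (!pool[i].live) {
            pool[i].live = 1;
            return &pool[i];
        }
    }
    return NULL;
}

static void toy_free(struct slot *s)
{
    assert(s != NULL);
    if (!s->live) {
        double_releases++;   /* with real free() this is where glibc aborts */
        return;
    }
    s->live = 0;
}

/* Hypothetical stand-in for the structure that owns a dmi pointer. */
struct map { struct slot *dmi; };

/* Same shape as the guard in the listing: release, then clear the field. */
static void release_dmi(struct map *m)
{
    if (m->dmi) {
        toy_free(m->dmi);
        m->dmi = NULL;
    }
}

/* Returns how many double releases were seen in this scenario. */
int aliased_release_demo(void)
{
    double_releases = 0;

    struct map a = { toy_alloc() };
    struct map b = { a.dmi };   /* second holder of the same pointer */

    release_dmi(&a);            /* first release: fine */
    release_dmi(&a);            /* no-op, a.dmi was cleared */
    release_dmi(&b);            /* b.dmi was never cleared: second release */

    return double_releases;
}
```

Calling aliased_release_demo() from a small main() reports one double release, which comes from the aliased holder and not from the repeated call on the same struct.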
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1535898
Title:
Trusty & Vivid multipath-tools (multipathd) seg-fault core dump
Status in multipath-tools package in Ubuntu:
Incomplete
Bug description:
We have a problem with multipath-tools.
Usually, after a path removal and a re-scan, the multipathd process
dies.
I created 2 hosts:
iscsi-server
iscsi-client
They are connected through 4 NICs and use a simple multibus multipath
configuration. With that setup I was able to confirm that there is a
regression in multipath-tools.
It looks like it comes from the patches brought from upstream:
0017-multipath-get-right-sysfs-value-for-checker_timeout.patch
0018-multipath-handle-offlined-paths.patch
#
# from here
#
0019-multipath-fix-scsi-timeout-code.patch
0020-multipath-make-tgt_node_name-work-for-iscsi-devices.patch
0021-multipath-cleanup-dev_loss_tmo-issues.patch
0022-Fix-for-setting-0-to-fast_io_fail.patch
0023-Fix-fast_io_fail-capping.patch
0024-multipath-enable-getting-uevents-through-libudev.patch
0025-Use-devpath-as-argument-for-sysfs-functions.patch
0026-multipathd-remove-references-to-sysfs_device.patch
0027-multipathd-use-struct-path-as-argument-for-event-pro.patch
0028-Add-global-udev-reference-pointer-to-config.patch
0029-Use-udev-enumeration-during-discovery.patch
0030-use-struct-udev_device-during-discovery.patch
0031-More-debugging-output-when-synchronizing-path-states.patch
0032-Use-struct-udev_device-instead-of-sysdev.patch
0033-discovery-Fixup-cciss-discovery.patch
0035-Use-udev-devices-during-discovery.patch
0036-Remove-all-references-to-hand-craftes-sysfs-code.patch
#
# to here
#
# 0037-multipath-libudev-cleanup-and-bugfixes.patch
# 0038-multipath-check-if-a-device-belongs-to-multipath.patch
# 0039-multipath-and-wwids_file-multipath.conf-option.patch
# 0040-multipath-Check-blacklists-as-soon-as-possible.patch
# 0041-add-wwids-file-cleanup-options.patch
# 0042-add-find_multipaths-option.patch
# 0043-alloc-keywords.patch
# lp1503305_libmultipath_info_on_1st_path_down_dbd131e.patch
Something in the range 19-36 (marked above) caused the regression.
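One way to narrow the 19-36 range further would be a binary search over the series. The sketch below is only an illustration under assumptions: first_bad_prefix, crashes_fn and fake_crashes are hypothetical names, the patches are assumed to be applied in order as a prefix of the series, and the crash is assumed to stay reproducible once the offending patch is in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns nonzero when a build containing the
 * series up to and including patch number `upto` reproduces the crash. */
typedef int (*crashes_fn)(size_t upto, void *ctx);

/*
 * Binary search for the lowest patch number whose prefix crashes.
 * Assumed preconditions: the prefix ending at `good` does not crash,
 * the prefix ending at `bad` does, good < bad, and every prefix longer
 * than a crashing one also crashes.
 */
size_t first_bad_prefix(size_t good, size_t bad, crashes_fn crashes, void *ctx)
{
    assert(good < bad);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        if (crashes(mid, ctx))
            bad = mid;    /* crash already present at mid */
        else
            good = mid;   /* still fine at mid */
    }
    return bad;
}

/* Stand-in predicate for trying the search without real builds:
 * pretends the crash arrives with the patch number stored in *ctx. */
int fake_crashes(size_t upto, void *ctx)
{
    size_t culprit = *(size_t *)ctx;
    return upto >= culprit;
}
```

With 18 as the last good patch and 36 as the first known bad one, the search needs at most five package builds. In a real run the callback would build the package with the given prefix and run the path removal test case.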
Whenever I build the package (for trusty) with those patches included,
I can produce a core dump that indicates a possible double free or
NULL dereference related to a path removal (which is why I can
reproduce it with the test case). Unfortunately, the crash usually
happens inside malloc() or somewhere else in glibc.
Using valgrind I was able to verify some free() errors:
==30415== Invalid free() / delete / delete[] / realloc()
==30415== at 0x4C2BDEC: free (vg_replace_malloc.c:473)
==30415== by 0x54E243C: vector_del_slot (vector.c:95)
==30415== by 0x550A516: _remove_map (structs_vec.c:139)
==30415== by 0x550A5C3: _remove_maps (structs_vec.c:170)
==30415== by 0x550A64B: remove_maps (structs_vec.c:181)
==30415== by 0x40713F: configure (main.c:1153)
==30415== by 0x407A74: child (main.c:1419)
==30415== by 0x40837D: main (main.c:1618)
These errors match a multipathd core dump I got from another user,
where the invalid free also came from _remove_map.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1535898/+subscriptions