[Bug 1535898] Re: Trusty & Vivid multipath-tools (multipathd) seg-fault core dump
Mathieu Trudel-Lapierre
mathieu.tl at gmail.com
Wed Jan 20 19:23:08 UTC 2016
The crash you're hitting is quite different from what valgrind is
finding (though I agree that valgrind is technically correct in pointing
this out, and some of it appears to be addressed upstream).
Program terminated with signal SIGSEGV, Segmentation fault.
#0 malloc_consolidate (av=av@entry=0x7f9aa4000020) at malloc.c:4151
[Current thread is 1 (LWP 3163)]
(gdb) bt full
#0 malloc_consolidate (av=av@entry=0x7f9aa4000020) at malloc.c:4151
fb = <optimized out>
maxfb = 0x7f9aa4000070
p = 0x7f9aa4000078
nextp = 0x7f9aa40009a0
unsorted_bin = 0x7f9aa4000078
first_unsorted = <optimized out>
nextchunk = 0xff3548000a18
size = 140302153157024
nextsize = <optimized out>
prevsize = <optimized out>
nextinuse = <optimized out>
bck = <optimized out>
fwd = <optimized out>
__func__ = "malloc_consolidate"
#1 0x00007f9acb099df8 in _int_malloc (av=0x7f9aa4000020, bytes=16384) at malloc.c:3423
nb = 16400
idx = 114
bin = <optimized out>
victim = <optimized out>
size = <optimized out>
victim_index = <optimized out>
remainder = <optimized out>
remainder_size = <optimized out>
block = <optimized out>
bit = <optimized out>
map = <optimized out>
fwd = <optimized out>
bck = <optimized out>
errstr = 0x0
__func__ = "_int_malloc"
#2 0x00007f9acb09c7b0 in __GI___libc_malloc (bytes=16384) at malloc.c:2891
ar_ptr = 0x7f9aa4000020
victim = 0x511
__func__ = "__libc_malloc"
#3 0x00007f9acbaa94d7 in dm_task_run () from /tmp/apport_sandbox_S4eo5o/lib/x86_64-linux-gnu/libdevmapper.so.1.02.1
No symbol table info available.
#4 0x00007f9acb3eed9a in dm_map_present (str=0x7f9aa4000ef0 "lun01") at devmapper.c:304
r = 0
dmt = 0x7f9aa40008e0
info = {exists = -871807232, suspended = 32666, live_table = -888551504, inactive_table = 32666, open_count = -871809664, event_nr = 32666, major = 3423160064, minor = 32666,
read_only = -871809840, target_count = 32666}
#5 0x0000000000404a77 in ev_add_map (dev=0x7f9ac40020fb "dm-3", alias=0x7f9aa4000ef0 "lun01", vecs=0xb6b6b0) at main.c:256
refwwid = 0x600000000 <error: Cannot access memory at address 0x600000000>
mpp = 0x7f9aa4000ef0
map_present = 32666
r = 1
#6 0x0000000000404a3c in uev_add_map (uev=0x7f9ac4002020, vecs=0xb6b6b0) at main.c:243
alias = 0x7f9aa4000ef0 "lun01"
major = -1
minor = -1
rc = 32666
#7 0x00000000004061ed in uev_trigger (uev=0x7f9ac4002020, trigger_data=0xb6b6b0) at main.c:755
r = 0
vecs = 0xb6b6b0
#8 0x00007f9acb40d29d in service_uevq (tmpq=0x7f9acc093de0) at uevent.c:118
uev = 0x7f9ac4002020
tmp = 0x7f9acc093de0
#9 0x00007f9acb40d4ac in uevent_dispatch (uev_trigger=0x406130 <uev_trigger>, trigger_data=0xb6b6b0) at uevent.c:167
uevq_tmp = {next = 0x7f9acc093de0, prev = 0x7f9acc093de0}
#10 0x0000000000406436 in uevqloop (ap=0xb6b6b0) at main.c:814
No locals.
#11 0x00007f9acbcc3182 in start_thread (arg=0x7f9acc094700) at pthread_create.c:312
__res = <optimized out>
pd = 0x7f9acc094700
now = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140302824851200, 5524701056917599093, 0, 0, 140302824851904, 140302824851200, -5503902022748048523, -5503898131477662859},
mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
pagesize_m1 = <optimized out>
sp = <optimized out>
freesize = <optimized out>
__PRETTY_FUNCTION__ = "start_thread"
#12 0x00007f9acb11447d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
No locals.
That makes me somewhat suspicious of the state of the kernel multipath
modules. Which of the dm-* modules are loaded? Does anything show up
in /var/log/syslog? It would be quite helpful if you could attach
syslog to this bug report.
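
As a side note on frame #0 of the backtrace above, the printed locals can
be cross-checked with plain arithmetic. The sketch below only uses values
copied from the gdb output. The helper is_canonical_x86_64 is a
hypothetical name introduced here, and it assumes the usual 48-bit
canonical address form on x86_64. The result is consistent with the
chunk's size field holding a pointer value (heap metadata corruption),
but it does not show where the corruption happened, and locals in
optimized code may be stale.

```python
# Values copied verbatim from frame #0 (malloc_consolidate) of the backtrace.
p = 0x7f9aa4000078
nextp = 0x7f9aa40009a0
size = 140302153157024
nextchunk = 0xff3548000a18


def is_canonical_x86_64(addr: int) -> bool:
    """Return True if bits 63..47 are all equal (assumed 48-bit canonical form)."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF


# The "size" of the chunk is numerically identical to the pointer nextp.
print(hex(size), size == nextp)

# nextchunk is p + size, i.e. a pointer added to a pointer.
print(hex(p + size), p + size == nextchunk)

# The resulting address is not canonical, so dereferencing it faults.
print(is_canonical_x86_64(nextchunk))

# 32666 (seen in several locals of frames #4 to #6) is 0x7f9a, the upper
# half of the heap pointers above. This is consistent with those locals
# being uninitialized stack contents at the time of the crash.
print(32666 == p >> 32)
```

If these checks hold, the SIGSEGV in malloc_consolidate would be a symptom
of an earlier heap corruption rather than the place where it occurred.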
** Changed in: multipath-tools (Ubuntu)
Status: In Progress => Incomplete
** Changed in: multipath-tools (Ubuntu)
Assignee: (unassigned) => Mathieu Trudel-Lapierre (mathieu-tl)
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1535898
Title:
Trusty & Vivid multipath-tools (multipathd) seg-fault core dump
Status in multipath-tools package in Ubuntu:
Incomplete
Bug description:
We have a problem with multipath-tools.
Usually, after a path removal and a re-scan, the multipathd process
dies.
I created 2 hosts:
iscsi-server
iscsi-client
They are connected through 4 NICs and use a simple multibus multipath
setup. With that I was able to confirm that there is a regression in
multipath-tools.
It looks like it was introduced by patches brought in from upstream:
0017-multipath-get-right-sysfs-value-for-checker_timeout.patch
0018-multipath-handle-offlined-paths.patch
#
# from here
#
0019-multipath-fix-scsi-timeout-code.patch
0020-multipath-make-tgt_node_name-work-for-iscsi-devices.patch
0021-multipath-cleanup-dev_loss_tmo-issues.patch
0022-Fix-for-setting-0-to-fast_io_fail.patch
0023-Fix-fast_io_fail-capping.patch
0024-multipath-enable-getting-uevents-through-libudev.patch
0025-Use-devpath-as-argument-for-sysfs-functions.patch
0026-multipathd-remove-references-to-sysfs_device.patch
0027-multipathd-use-struct-path-as-argument-for-event-pro.patch
0028-Add-global-udev-reference-pointer-to-config.patch
0029-Use-udev-enumeration-during-discovery.patch
0030-use-struct-udev_device-during-discovery.patch
0031-More-debugging-output-when-synchronizing-path-states.patch
0032-Use-struct-udev_device-instead-of-sysdev.patch
0033-discovery-Fixup-cciss-discovery.patch
0035-Use-udev-devices-during-discovery.patch
0036-Remove-all-references-to-hand-craftes-sysfs-code.patch
#
# to here
#
# 0037-multipath-libudev-cleanup-and-bugfixes.patch
# 0038-multipath-check-if-a-device-belongs-to-multipath.patch
# 0039-multipath-and-wwids_file-multipath.conf-option.patch
# 0040-multipath-Check-blacklists-as-soon-as-possible.patch
# 0041-add-wwids-file-cleanup-options.patch
# 0042-add-find_multipaths-option.patch
# 0043-alloc-keywords.patch
# lp1503305_libmultipath_info_on_1st_path_down_dbd131e.patch
The patches in the range 0019-0036 (between the markers above) caused
the regression.
Whenever I build the package (for trusty) with those patches included,
I can produce a core dump that indicates a possible double free or
NULL dereference related to path removal (which is why I can
reproduce it with the test case). Unfortunately, it usually crashes
inside malloc() or elsewhere in glibc.
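
One way the 0019-0036 range could be narrowed further is a bisection over
the patch series. This is only a minimal sketch under assumptions: the
names first_bad_patch and crashes_with_prefix are hypothetical, the
patches are assumed to apply and build in series order, the build is
assumed good with none of them and bad with all of them, and the
predicate would have to build the package with the given prefix applied
and run the path-removal test.

```python
from typing import Callable, Sequence


def first_bad_patch(
    patches: Sequence[str],
    crashes_with_prefix: Callable[[Sequence[str]], bool],
) -> str:
    """Return the first patch whose inclusion makes the test crash.

    Assumes the empty prefix is good and the full list is bad.
    """
    lo, hi = 0, len(patches)  # invariant: patches[:lo] good, patches[:hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if crashes_with_prefix(patches[:mid]):
            hi = mid  # the culprit is within the first mid patches
        else:
            lo = mid  # the first mid patches are fine
    return patches[hi - 1]


# Demo with placeholder names and a stand-in predicate. A real predicate
# would rebuild the package and run the reproducer.
series = ["patch-a", "patch-b", "patch-c", "patch-d", "patch-e"]
print(first_bad_patch(series, lambda prefix: "patch-d" in prefix))
```

With the 17 patches listed between the markers, this would need about
five build-and-test rounds instead of one per patch.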
Using valgrind I was able to verify some free() errors:
==30415== Invalid free() / delete / delete[] / realloc()
==30415== at 0x4C2BDEC: free (vg_replace_malloc.c:473)
==30415== by 0x54E243C: vector_del_slot (vector.c:95)
==30415== by 0x550A516: _remove_map (structs_vec.c:139)
==30415== by 0x550A5C3: _remove_maps (structs_vec.c:170)
==30415== by 0x550A64B: remove_maps (structs_vec.c:181)
==30415== by 0x40713F: configure (main.c:1153)
==30415== by 0x407A74: child (main.c:1419)
==30415== by 0x40837D: main (main.c:1618)
These match exactly a multipathd core dump I got from another user
(there, too, the invalid free came from _remove_map).
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1535898/+subscriptions