[Bug 1020210] Re: Race condition using ATOMIC_FASTBINS in _int_free causes crash or heap corruption
Michael Truog
1020210 at bugs.launchpad.net
Sat Nov 30 21:05:54 UTC 2013
I believe I am seeing this bug in action on Ubuntu 12.04.3 AMD64. Some
gdb backtrace info is pasted below. I am surprised glibc malloc has been
broken for more than a year. I will definitely avoid it in the future,
and the bug also shows the potential for instability at the OS level:
##t 3 crash SIGSEGV
#0 malloc_consolidate (av=0x7fa89c000020) at malloc.c:4272
#1 0x00007fa8d08ceb89 in malloc_consolidate (av=0x7fa89c000020)
    at malloc.c:4247
#2 _int_free (av=0x7fa89c000020, p=<optimized out>, have_lock=0)
    at malloc.c:4178
#3 0x000000000041a8f8 in __gnu_cxx::new_allocator<unsigned long>::deallocate (
    this=0x7fa89c000ae8, __p=0x7fa89c068bc0)
    at /usr/include/c++/4.6/ext/new_allocator.h:98
#4 0x000000000041a484 in std::_Deque_base<unsigned long, std::allocator<unsigned long> >::_M_deallocate_node (this=0x7fa89c000ae8, __p=0x7fa89c068bc0)
    at /usr/include/c++/4.6/bits/stl_deque.h:531
##f 0
##p *av
*av = {mutex = 1, flags = 3, fastbinsY = {0x0, 0x7fa89c182920, 0x7fa89c15ed80,
    0x0, 0x7fa89c16b170, 0x7fa89c05aa80, 0x7fa89c185340, 0x0, 0x0, 0x0},
  top = 0x7fa89c18bb30, last_remainder = 0x7fa89c141170, bins = {
    0x7fa89c141170, ...},
  binmap = {262428, 8, 1, 2}, next = 0x7fa8b0000020, next_free = 0x0,
  system_mem = 1634304, max_system_mem = 1634304}
##t 2
#0 sYSMALLOc (av=<optimized out>, nb=528) at malloc.c:2756
#1 _int_malloc (av=<optimized out>, bytes=512) at malloc.c:3924
#2 0x00007fa8d08d1f95 in __GI___libc_malloc (bytes=512) at malloc.c:2924
#3 0x00007fa8d01cfd8b in j__udyLAllocJLL5 () from /usr/lib/libJudy.so.1
#4 0x00007fa8d01cc9ae in ?? () from /usr/lib/libJudy.so.1
#5 0x00007fa8d01ca3ee in ?? () from /usr/lib/libJudy.so.1
#6 0x00007fa8d01ca3ee in ?? () from /usr/lib/libJudy.so.1
#7 0x00007fa8d01ca3ee in ?? () from /usr/lib/libJudy.so.1
#8 0x00007fa8d01cd83f in JudyLIns () from /usr/lib/libJudy.so.1
The malloc.c line numbers in these backtraces look erroneous: they do not
match the source code from http://packages.ubuntu.com/precise/libc6 .
Unfortunately, that must just be a problem with the libc6-dbg package
data.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to eglibc in Ubuntu.
https://bugs.launchpad.net/bugs/1020210
Title:
Race condition using ATOMIC_FASTBINS in _int_free causes crash or heap
corruption
Status in Embedded GLIBC:
Incomplete
Status in “eglibc” package in Ubuntu:
Confirmed
Bug description:
We have an application which makes heavy allocation and de-allocation
demands from multiple threads. We run this application continuously
on many servers, and once every several CPU months or years, we were
getting a crash in _int_free that did not look like vanilla heap
corruption. I believe I have narrowed it down to a race condition in
_int_free due to the ATOMIC_FASTBINS feature. Basically, in the
lockless fastbin path of _int_free, a chunk is pulled into a local
variable with the intent to add it to the fastbins list. However, the
heap consolidation/trim code can race with this: it can coalesce the
entire block and/or give it back to the OS before _int_free has had a
chance to store the chunk into the fastbins list.
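
To make the window concrete, here is a minimal sketch of the kind of
lockless push described above, written with C11 atomics rather than
glibc's internal macros. It is not the eglibc source: struct chunk,
fastbin_t, fastbin_push and fastbin_take_all are hypothetical names
chosen for illustration, and the comments only mark where, per this
report, a concurrent consolidation or trim could interfere.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a malloc chunk; not glibc's malloc_chunk. */
struct chunk {
    size_t size;
    struct chunk *fd;   /* next chunk in the bin's singly linked list */
};

/* Hypothetical stand-in for one fastbin list head. */
typedef _Atomic(struct chunk *) fastbin_t;

/* Push p onto the bin without taking a lock, using a CAS retry loop.
 *
 * Per the report, the dangerous window is everything before the CAS
 * succeeds: the freeing thread already holds p in a local variable,
 * but nothing stops another thread from consolidating or trimming the
 * region that contains p in the meantime. In this sketch that would
 * turn the write to p->fd into a write to memory the bin no longer
 * owns. */
void fastbin_push(fastbin_t *fb, struct chunk *p)
{
    struct chunk *old = atomic_load_explicit(fb, memory_order_relaxed);
    do {
        /* Same spirit as a "double free (fasttop)" check: the chunk
         * being freed must not already be the head of the bin. */
        assert(old != p);
        p->fd = old;
    } while (!atomic_compare_exchange_weak_explicit(
                 fb, &old, p,
                 memory_order_release, memory_order_relaxed));
}

/* Detach the whole list in one step, roughly what a consolidation
 * pass does before it walks and merges the chunks. */
struct chunk *fastbin_take_all(fastbin_t *fb)
{
    return atomic_exchange_explicit(fb, (struct chunk *)NULL,
                                    memory_order_acquire);
}
```

Run single-threaded, pushing three chunks and then calling
fastbin_take_all returns them in last-in, first-out order. The CAS only
protects the list head; it says nothing about whether the memory behind
p is still valid, which is the gap this bug is about.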
The problem is very challenging to reproduce in situ, but using gdb I
have a recipe which demonstrates the crash 100% of the time on my
12.04 x64 system running eglibc 2.15. It relies on malloc_trim,
although in our in situ data, the consolidation is triggered as a
result of a normal free. malloc_trim is just easier to control.
While I am not a glibc developer, I could not see any easy way to fix
the situation short of disabling ATOMIC_FASTBINS.
I am attaching the reproduction source. Other pertinent information
follows:
> jpieper@calculon:~/downloads$ lsb_release -rd
> Description:    Ubuntu 12.04 LTS
> Release:        12.04
> jpieper@calculon:~/downloads$ apt-cache policy libc6
> libc6:
>   Installed: 2.15-0ubuntu10
>   Candidate: 2.15-0ubuntu10
>   Version table:
>  *** 2.15-0ubuntu10 0
>         500 http://us.archive.ubuntu.com/ubuntu/ precise/main amd64 Packages
>         100 /var/lib/dpkg/status
What I expect: I expect the attached application, when run using the gdb script in the comments, to complete with no failures.
What happened: A SIGSEGV after the final continue.