[Bug 1020210] Re: Race condition using ATOMIC_FASTBINS in _int_free causes crash or heap corruption

Michael Truog 1020210 at bugs.launchpad.net
Sat Nov 30 21:05:54 UTC 2013


I believe I am seeing this bug in action on Ubuntu 12.04.3 AMD64; gdb
backtrace output is pasted below.  I am surprised glibc malloc has been
broken for more than a year.  I will definitely avoid it in the future,
but the bug also shows the potential for instability at the OS level:

##t 3 crash SIGSEGV
#0  malloc_consolidate (av=0x7fa89c000020) at malloc.c:4272
#1  0x00007fa8d08ceb89 in malloc_consolidate (av=0x7fa89c000020)
    at malloc.c:4247
#2  _int_free (av=0x7fa89c000020, p=<optimized out>, have_lock=0)
    at malloc.c:4178
#3  0x000000000041a8f8 in __gnu_cxx::new_allocator<unsigned long>::deallocate (
    this=0x7fa89c000ae8, __p=0x7fa89c068bc0)
    at /usr/include/c++/4.6/ext/new_allocator.h:98
#4  0x000000000041a484 in std::_Deque_base<unsigned long, std::allocator<unsigned long> >::_M_deallocate_node (this=0x7fa89c000ae8, __p=0x7fa89c068bc0)
    at /usr/include/c++/4.6/bits/stl_deque.h:531
##f 0
##p *av
*av = {mutex = 1, flags = 3, fastbinsY = {0x0, 0x7fa89c182920, 0x7fa89c15ed80,
    0x0, 0x7fa89c16b170, 0x7fa89c05aa80, 0x7fa89c185340, 0x0, 0x0, 0x0},
  top = 0x7fa89c18bb30, last_remainder = 0x7fa89c141170, bins = {
    0x7fa89c141170, ...},
  binmap = {262428, 8, 1, 2}, next = 0x7fa8b0000020, next_free = 0x0,
  system_mem = 1634304, max_system_mem = 1634304}
##t 2
#0  sYSMALLOc (av=<optimized out>, nb=528) at malloc.c:2756
#1  _int_malloc (av=<optimized out>, bytes=512) at malloc.c:3924
#2  0x00007fa8d08d1f95 in __GI___libc_malloc (bytes=512) at malloc.c:2924
#3  0x00007fa8d01cfd8b in j__udyLAllocJLL5 () from /usr/lib/libJudy.so.1
#4  0x00007fa8d01cc9ae in ?? () from /usr/lib/libJudy.so.1
#5  0x00007fa8d01ca3ee in ?? () from /usr/lib/libJudy.so.1
#6  0x00007fa8d01ca3ee in ?? () from /usr/lib/libJudy.so.1
#7  0x00007fa8d01ca3ee in ?? () from /usr/lib/libJudy.so.1
#8  0x00007fa8d01cd83f in JudyLIns () from /usr/lib/libJudy.so.1

The malloc.c line numbers in the backtraces do not match the source code
I examined from http://packages.ubuntu.com/precise/libc6 , so they look
erroneous.  I assume that is just a problem with the libc6-dbg package
data.

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to eglibc in Ubuntu.
https://bugs.launchpad.net/bugs/1020210

Title:
  Race condition using ATOMIC_FASTBINS in _int_free causes crash or heap
  corruption

Status in Embedded GLIBC:
  Incomplete
Status in “eglibc” package in Ubuntu:
  Confirmed

Bug description:
  We have an application which makes heavy allocation and de-allocation
  demands from multiple threads.  We run this application continuously
  on many servers, and once every several CPU months or years, we were
  getting a crash in _int_free that did not look like vanilla heap
  corruption.  I believe I have narrowed it down to a race condition in
  _int_free due to the ATOMIC_FASTBINS feature.  Basically, in the
  lockless FASTBIN _int_free path, a chunk is pulled into a local
  variable with the intent to add it to the fastbins list.  However, the
  heap consolidation/trim code can race with this, and can coalesce the
  entire block and/or give it back to the OS before _int_free has a
  chance to store it in the fastbins list.
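
  The lock-free push described above can be sketched roughly as follows.
  This is a simplified model, not the eglibc source: struct chunk,
  fastbin_t, fastbin_push and fastbin_pop are hypothetical names chosen
  for illustration, and C11 atomics stand in for the allocator's own
  compare-and-exchange macros.  The comment in fastbin_push marks the
  window in which, per the description, consolidation or trimming on
  another thread could invalidate the chunk.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a free chunk; only the forward link matters. */
struct chunk {
    struct chunk *fd;   /* next chunk in the singly linked bin */
};

/* Hypothetical stand-in for one fastbin: an atomic pointer to the head. */
typedef _Atomic(struct chunk *) fastbin_t;

/* Push p onto the bin without taking a lock. */
static void fastbin_push(fastbin_t *bin, struct chunk *p)
{
    assert(p != NULL);
    struct chunk *old = atomic_load(bin);
    do {
        /* Race window: this store writes into p's memory with no lock
           held.  Nothing in this loop stops another thread from
           coalescing or releasing the region that contains p before
           the compare-and-swap below succeeds. */
        p->fd = old;
    } while (!atomic_compare_exchange_weak(bin, &old, p));
}

/* Pop the head chunk, or return NULL when the bin is empty.  Intended
   for single-threaded use in this sketch: a concurrent pop written this
   way would be exposed to the ABA problem. */
static struct chunk *fastbin_pop(fastbin_t *bin)
{
    struct chunk *head = atomic_load(bin);
    while (head != NULL &&
           !atomic_compare_exchange_weak(bin, &head, head->fd)) {
        /* a failed exchange reloads head; retry with the new value */
    }
    return head;
}
```

  Pushing two chunks and popping them returns them in last-in,
  first-out order.  The sketch only shows why the push needs the chunk's
  memory to stay valid for the whole loop; it does not reproduce the
  crash.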

  The problem is very challenging to reproduce in situ, but using gdb I
  have a recipe which demonstrates the crash 100% of the time on my
  12.04 x64 system running eglibc 2.15.  It relies on malloc_trim,
  although in our in situ data, the consolidation is triggered as a
  result of a normal free.  malloc_trim is just easier to control.
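
  For readers unfamiliar with malloc_trim, here is a minimal sketch of
  calling it after a burst of frees.  It is not the attached
  reproduction and is not expected to trigger the race by itself;
  trim_after_frees is a hypothetical helper, and the code assumes a
  glibc-style <malloc.h> that declares malloc_trim.

```c
#include <malloc.h>   /* malloc_trim (glibc extension) */
#include <stdlib.h>

/* Hypothetical helper: allocate `count` blocks of `size` bytes, free
   them all, then ask the allocator to return unused memory to the OS.
   Returns the result of malloc_trim (1 if memory was released, 0 if
   not), or -1 if any allocation failed. */
static int trim_after_frees(size_t count, size_t size)
{
    void **blocks = malloc(count * sizeof *blocks);
    if (blocks == NULL)
        return -1;

    size_t n = 0;
    for (; n < count; n++) {
        blocks[n] = malloc(size);
        if (blocks[n] == NULL)
            break;
    }
    int all_allocated = (n == count);

    /* Free whatever was allocated; small blocks typically land in the
       fastbins first. */
    for (size_t i = 0; i < n; i++)
        free(blocks[i]);
    free(blocks);

    if (!all_allocated)
        return -1;

    /* pad = 0: keep no extra slack at the top of the heap. */
    return malloc_trim(0);
}
```

  In the scenario described above, a call like this would run on one
  thread while another thread is partway through a lock-free free.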

  While I am not a glibc developer, I could not see any easy ways to fix
  the situation shy of disabling ATOMIC_FASTBINS.

  I am attaching the reproduction source.  Other pertinent information
  follows:

  > jpieper@calculon:~/downloads$ lsb_release -rd
  > Description:	Ubuntu 12.04 LTS
  > Release:	12.04

  > jpieper@calculon:~/downloads$ apt-cache policy libc6
  > libc6:
  >   Installed: 2.15-0ubuntu10
  >   Candidate: 2.15-0ubuntu10
  >   Version table:
  >  *** 2.15-0ubuntu10 0
  >        500 http://us.archive.ubuntu.com/ubuntu/ precise/main amd64 Packages
  >        100 /var/lib/dpkg/status

  What I expect: I expect the attached application, when run using the gdb script in the comments, to complete with no failures.
  What happened: A SIGSEGV after the final continue.

To manage notifications about this bug go to:
https://bugs.launchpad.net/eglibc/+bug/1020210/+subscriptions


