[Bug 1888857] Re: deadlock in pthread_cond_signal under high contention
Anton Nikolaevsky
1888857 at bugs.launchpad.net
Sat Jul 25 11:53:51 UTC 2020
Thank you for the prompt reply!
I've installed the packages and left the test application running for the weekend. I'll write you back on Monday.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to glibc in Ubuntu.
https://bugs.launchpad.net/bugs/1888857
Title:
deadlock in pthread_cond_signal under high contention
Status in glibc package in Ubuntu:
New
Bug description:
Hello!
I'm working on a large C++-based cross-platform project. I noticed that on arm64-based systems some of my processes sporadically became paralyzed by a deadlock hitting all the threads posting to a single boost::asio::io_service. I investigated the deadlock condition further and reduced the problem to the simple test application available at https://github.com/bezkrovatki/deadlock_in_pthread_cond_signal
There you will find the test source code, a detailed description, the deadlock call stacks for all threads, and their compact view as a call graph.
In short, the test has threads of two types:
(1) producers - Np threads calling pthread_cond_signal after unlocking a mutex at a rate of Rp calls per second;
(2) consumers - Nc threads calling pthread_cond_wait at a rate of Rc calls per second.
Np, Rp and Rc can be specified with command line parameters; Nc is equal to the number of CPU cores of the particular system running the test. Once started on an arm64-based multi-core device, the test eventually gets all its threads blocked if Np, Rp and Rc are high enough to keep contention around the pthread_cond_signal calls.
The deadlock can be worked around by
* reducing the probability of concurrent pthread_cond_signal calls by tuning Np, Rp and Rc;
* moving the pthread_cond_signal call under the lock
Moreover, the deadlock can be broken by ptrace: attaching a
debugger, generating a dump with Google Breakpad, etc. makes the
process revive. One time I was able to wake the process from the
deadlock with SIGSTOP/SIGCONT; however, the healing effect was very
limited and the process returned to the deadlocked state in a few
seconds.
I would like to note that a problem with similar-looking symptoms
was reported and fixed in the kernel several years ago (see
https://groups.google.com/forum/#!topic/mechanical-sympathy/QbmpZxp6C64 and
https://github.com/torvalds/linux/commit/76835b0ebf8a7fe85beb03c75121419a7dec52f0).
However, I believe this time the problem is on the NPTL implementation
side because:
* 100% of the observed deadlocks, both in our product and in the tests,
appear to have the same structure: a single producer blocked in
__condvar_quiesce_and_switch_g1, all other producers blocked in
__condvar_acquire_lock, all consumers blocked in
__pthread_cond_wait_common;
* mutex misbehavior was never observed either in the tests or in my
project;
* wakeups by ptrace/signal simply mean that waiting on a futex got
interrupted, and on the next iteration (if any) at least one of these
call paths made progress after observing changed global state; this
can be a side effect of a race in userland just as well as in the
kernel;
* when I put the pthread_cond_signal call under the lock, the mutex
object becomes even more contended than the condvar's internal
signalling data, and yet I cannot reproduce the problem.
I looked at the nptl source code
(https://elixir.bootlin.com/glibc/glibc-2.27/source/nptl/pthread_cond_common.c#L280)
a bit. I am not deeply familiar with the implemented algorithm and
its dark corners, but judging from the source code the observed
deadlock looks quite plausible to me.
1) All producers (signalling threads) except one are blocked in
__condvar_acquire_lock at pthread_cond_common.c:280; they are waiting for
the single signalling thread that was lucky enough to acquire
the internal data lock.
2) According to the comments lavishly sown around the code, that
"lucky" signalling thread waits for some of the consumers (waiting
threads) to leave the G1 group, so that it can close the group and
perform the group switch in
__condvar_quiesce_and_switch_g1 at pthread_cond_common.c:412.
3) And all consumers (waiting threads) wait, of course; they wait
for the producers to send a signal, see
__pthread_cond_wait_common at pthread_cond_wait.c:502.
4) And if you watch the code around __pthread_cond_wait_common at pthread_cond_wait.c:502 carefully, you can see that when the wait for signals on the futex gets interrupted, the code first wakes our "lucky" thread blocked in __condvar_quiesce_and_switch_g1 at pthread_cond_common.c:412 (by calling __condvar_dec_grefs at pthread_cond_wait.c:149), and only then re-evaluates the condition and, if necessary, returns to waiting on the futex.
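Putting the four points together, the suspected cycle can be sketched schematically. This is a paraphrase of the call stacks described above, not the literal glibc code:

```
producer A      blocked in __condvar_quiesce_and_switch_g1:
                closed G1 and futex-waits for the remaining G1 waiters to leave
producers B..N  blocked in __condvar_acquire_lock:
                futex-wait until A releases the condvar-internal lock
consumers       blocked in __pthread_cond_wait_common:
                futex-wait for a signal that no producer can currently send

cycle: A waits on the consumers, the consumers wait on a signal from
the producers, and the producers wait on A.  Any futex interruption
(ptrace, debugger attach, SIGSTOP/SIGCONT) forces a re-check of the
shared state, which is why it can temporarily break the cycle.
```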
This fact can explain how ptrace or a signal is able to break the deadlock.
--
I posted the bug report here because glibc's wiki strongly recommends starting with the distribution's bug tracker. All the arm64-based devices I tested were running Ubuntu 18.04.
ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: libc6 2.27-3ubuntu1
Uname: Linux 4.9.187-52 aarch64
ApportVersion: 2.20.9-0ubuntu7.15
Architecture: arm64
Date: Fri Jul 24 14:05:57 2020
Dependencies:
gcc-8-base 8.3.0-6ubuntu1~18.04.1
libc6 2.27-3ubuntu1
libgcc1 1:8.3.0-6ubuntu1~18.04.1
ProcEnviron:
TERM=rxvt-unicode-256color
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=C.UTF-8
SHELL=/bin/bash
SourcePackage: glibc
UpgradeStatus: No upgrade log present (probably fresh install)
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1888857/+subscriptions
More information about the foundations-bugs mailing list