[Bug 1961697] Re: Transaction ID collisions cause slow DNS lookups in getaddrinfo

Bug Watch Updater 1961697 at bugs.launchpad.net
Tue Feb 22 18:41:16 UTC 2022


Launchpad has imported 6 comments from the remote bug at
https://sourceware.org/bugzilla/show_bug.cgi?id=26600.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2020-09-11T13:01:21+00:00 Florian Weimer wrote:

If the A and AAAA queries have equal transaction IDs, the initial AAAA
response is not recognized as valid, resulting in timeouts and
retransmits.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1961697/comments/0

------------------------------------------------------------------------
On 2020-09-11T13:02:20+00:00 Florian Weimer wrote:

This bug is distinct from bug 19691 in the sense that it is possible to
fix it without reworking the buffer management.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1961697/comments/1

------------------------------------------------------------------------
On 2020-09-11T13:05:49+00:00 Florian Weimer wrote:

Patch posted:
https://sourceware.org/pipermail/libc-alpha/2020-September/117547.html

Reply at:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1961697/comments/2

------------------------------------------------------------------------
On 2020-10-14T09:34:14+00:00 Cvs-commit wrote:

The master branch has been updated by Florian Weimer
<fw at sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=f1f00c072138af90ae6da180f260111f09afe7a3

commit f1f00c072138af90ae6da180f260111f09afe7a3
Author: Florian Weimer <fweimer at redhat.com>
Date:   Wed Oct 14 10:54:39 2020 +0200

    resolv: Handle transaction ID collisions in parallel queries (bug 26600)
    
    If the transaction IDs are equal, the old check attributed both
    responses to the first query, not recognizing the second response.
    This fixes bug 26600.
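
A minimal sketch of the mismatch described in the commit message above, in Python rather than glibc's C. It is an illustration only and not the code from the patch: the Message class and both matching functions are hypothetical names, and the assumption is that a response can be told apart by the question it echoes (name and record type) in addition to its transaction ID.

```python
from dataclasses import dataclass

A, AAAA = 1, 28  # DNS QTYPE values for A and AAAA records


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for a DNS query or response."""
    txid: int   # 16-bit DNS transaction ID
    qname: str  # name in the question section
    qtype: int  # record type in the question section


def match_by_id_only(queries, response):
    """Return the index of the first query with the same ID, else None."""
    for i, q in enumerate(queries):
        if q.txid == response.txid:
            return i
    return None


def match_by_id_and_question(queries, response):
    """Return the index of the query with the same ID, name and type."""
    for i, q in enumerate(queries):
        if (q.txid, q.qname.lower(), q.qtype) == (
            response.txid, response.qname.lower(), response.qtype
        ):
            return i
    return None


# Parallel A and AAAA queries that happen to share one transaction ID.
queries = [Message(0x1234, "example.com", A),
           Message(0x1234, "example.com", AAAA)]
responses = [Message(0x1234, "example.com", A),
             Message(0x1234, "example.com", AAAA)]

id_only = [match_by_id_only(queries, r) for r in responses]
id_and_question = [match_by_id_and_question(queries, r) for r in responses]

print(id_only)          # [0, 0]: both responses credited to the A query
print(id_and_question)  # [0, 1]: each response credited to its own query
```

With ID-only matching the AAAA query never appears answered, which is consistent with the timeout and retransmit reported in this bug.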

Reply at:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1961697/comments/3

------------------------------------------------------------------------
On 2020-10-14T09:34:46+00:00 Florian Weimer wrote:

Fixed for glibc 2.33.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1961697/comments/4

------------------------------------------------------------------------
On 2020-11-10T16:00:06+00:00 Cvs-commit wrote:

The release/2.32/master branch has been updated by Florian Weimer
<fw at sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=2dfa659a66f20facc4082207884c20e986ddecee

commit 2dfa659a66f20facc4082207884c20e986ddecee
Author: Florian Weimer <fweimer at redhat.com>
Date:   Wed Oct 14 10:54:39 2020 +0200

    resolv: Handle transaction ID collisions in parallel queries (bug 26600)
    
    If the transaction IDs are equal, the old check attributed both
    responses to the first query, not recognizing the second response.
    This fixes bug 26600.
    
    (cherry picked from commit f1f00c072138af90ae6da180f260111f09afe7a3)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1961697/comments/5


** Changed in: glibc
       Status: Unknown => Fix Released

** Changed in: glibc
   Importance: Unknown => Medium

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to glibc in Ubuntu.
https://bugs.launchpad.net/bugs/1961697

Title:
  Transaction ID collisions cause slow DNS lookups in getaddrinfo

Status in GLibC:
  Fix Released
Status in glibc package in Ubuntu:
  Confirmed

Bug description:
  When resolving DNS names with getaddrinfo(), I have seen it hang for
  5 seconds and then retry and succeed. The issue is that glibc issues
  both an A and an AAAA query on the same socket, and in some
  circumstances the two can be sent with the same DNS transaction ID as
  well.

  I verified this with a packet capture. It shows the A and AAAA
  queries for a name going out with the same DNS transaction ID and the
  responses arriving; nothing then happens for five seconds, after
  which the same DNS query is sent again. On the glibc side, I
  interrupted the process with gdb and confirmed that it is blocked
  waiting for the DNS response, even though the packet capture shows
  the response has well and truly arrived. I've attached a packet
  capture and a backtrace of the glibc hang.
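
For anyone checking their own capture for the same pattern, here is a rough Python sketch. It assumes the raw DNS payloads (the UDP payload of each query) have already been extracted from the capture by some other tool. The helpers build_query, parse_question and colliding_pairs are hypothetical names written for this example, and the parser only handles uncompressed names as they appear in queries.

```python
import struct
from collections import defaultdict

QTYPE_A, QTYPE_AAAA = 1, 28


def build_query(txid, name, qtype):
    """Build a minimal DNS query payload (used here to make test data)."""
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    # Question: encoded name, root label, QTYPE, QCLASS=IN.
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)


def parse_question(packet):
    """Return (transaction ID, name, qtype) of the first question."""
    (txid,) = struct.unpack_from("!H", packet, 0)
    offset = 12  # the fixed DNS header is 12 bytes long
    labels = []
    while packet[offset] != 0:
        length = packet[offset]
        labels.append(packet[offset + 1:offset + 1 + length].decode("ascii"))
        offset += 1 + length
    qtype, _qclass = struct.unpack_from("!HH", packet, offset + 1)
    return txid, ".".join(labels), qtype


def colliding_pairs(packets):
    """List (txid, name) keys that were used for both an A and an AAAA query."""
    seen = defaultdict(set)
    for packet in packets:
        txid, name, qtype = parse_question(packet)
        seen[(txid, name.lower())].add(qtype)
    return sorted(key for key, types in seen.items()
                  if {QTYPE_A, QTYPE_AAAA} <= types)


packets = [
    build_query(0xBEEF, "example.com", QTYPE_A),
    build_query(0xBEEF, "example.com", QTYPE_AAAA),  # same ID as the A query
    build_query(0x0001, "example.org", QTYPE_A),
    build_query(0x0002, "example.org", QTYPE_AAAA),  # distinct IDs, no collision
]
collisions = colliding_pairs(packets)
print(collisions)  # [(48879, 'example.com')]
```

Only the example.com pair is reported, because its A and AAAA queries share transaction ID 0xBEEF (48879).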

  I believe this is the same issue reported in these places:
      * In RHEL: https://bugzilla.redhat.com/show_bug.cgi?id=1904153
      * Also RHEL: https://bugzilla.redhat.com/show_bug.cgi?id=1903880
      * Upstream: https://sourceware.org/bugzilla/show_bug.cgi?id=26600

  The environment I noticed this bug in was:
      * Docker for Mac on an arm64 M1 MacBook
      * Docker for Mac Linux kernel version is 5.10.76-linuxkit
      * Linux is also arm64, not emulated
      * Container with the buggy DNS environment is Ubuntu bionic (also arm64, not emulated)
      * Glibc 2.27-3ubuntu1.4

  However, one of the Red Hat reporters noticed this issue on m6-series
  EC2 instances in AWS.

  A patch has been provided upstream for this issue:
  https://sourceware.org/pipermail/libc-alpha/2020-September/117547.html

  I applied the upstream patch to glibc 2.27-3ubuntu1.4 and rebuilt the
  package, and the problem went away. I've attached the exact patch I
  applied, since I had to work through some conflicts.

  So, I think that patch just needs to be backported to Bionic and (I
  think) Focal as well. Is that reasonable?

  Thanks!

To manage notifications about this bug go to:
https://bugs.launchpad.net/glibc/+bug/1961697/+subscriptions



