[Bug 1904416] [NEW] gpg-agent 2.x poor performance / futex errors

Andrey Arapov 1904416 at bugs.launchpad.net
Mon Nov 16 14:22:48 UTC 2020


Public bug reported:

Hello,

After upgrading from Ubuntu 16.04 to Ubuntu 18.04, we noticed performance
issues that came along with gpg v2.x.

gpg-agent produces millions of failed futex syscalls within a second or
two when it is put under load, either by SaltStack's salt-master
decrypting pillars (our main use case) or by a direct test with the
"parallel" tool from the moreutils package.

```
$ sudo strace -c -f -p "$(pidof gpg-agent)"
...
...Ctrl+C after a couple of seconds while "gpg -d" commands are running in parallel (see below for details)
...
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.35 2800.145231         305   9194103   2009552 futex
  3.63  105.404136      373774       282           pselect6
  0.01    0.431338         102      4246           read
  0.00    0.104701          12      8490           write
  0.00    0.029085         103       283           accept
  0.00    0.016549          58       284           madvise
  0.00    0.012201          22       567           close
  0.00    0.010979           8      1410           getpid
  0.00    0.010405          12       849           access
  0.00    0.006341          22       284       284 wait4
  0.00    0.004350          15       283           openat
  0.00    0.003764          13       283           clone
  0.00    0.002668           9       283           getsockopt
  0.00    0.002568           9       283           fstat
  0.00    0.002564           9       283           set_robust_list
  0.00    0.001941           7       283           lseek
------ ----------- ----------- --------- --------- ----------------
100.00 2906.188821               9212496   2009836 total
```
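
To put a number on the futex failures, the errors column can be divided by the calls column. The following is only a sketch: the function name futex_error_pct is hypothetical, and it assumes the summary column layout shown above (% time, seconds, usecs/call, calls, errors, syscall).

```shell
# Print the percentage of futex calls that returned an error,
# reading an strace summary table (as shown above) on stdin.
# Assumes the columns: % time, seconds, usecs/call, calls, errors, syscall.
# Rows without errors have only 5 fields, so NF == 6 guards the division.
futex_error_pct() {
  awk '$NF == "futex" && NF == 6 { printf "%.1f\n", 100 * $5 / $4 }'
}
```

Fed the table above (for example from a file saved with strace's "-o" option), it reports 21.9, i.e. roughly one futex call in five returned an error.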

Here are the steps to reproduce the issue.

First, prepare the "enc" file:

```
cat /usr/share/doc/base-files/README | gpg -ear "some-4K-RSA-public-key" > enc
```

Run parallel decryptions using "time" to measure it:

```
time parallel -j 30 sh -c "cat enc | gpg --no-tty -d -q -o /dev/null" -- $(seq 1 3000)
```
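
On systems without the moreutils "parallel", the same load pattern can probably be approximated with "xargs -P". This is a sketch under that assumption, not the command used for the timings in this report; the helper name run_parallel is hypothetical.

```shell
# Run a command COUNT times with at most NJOBS copies in flight.
# Usage: run_parallel NJOBS COUNT command [args...]
# xargs -n 1 appends each sequence number as one extra argument,
# which "sh -c" receives as $0 and the command simply ignores.
run_parallel() {
  njobs=$1
  count=$2
  shift 2
  seq 1 "$count" | xargs -n 1 -P "$njobs" "$@"
}
```

In bash, where "time" is a shell keyword and can wrap a function, the reproduction would then look like: time run_parallel 30 3000 sh -c 'gpg --no-tty -d -q -o /dev/null < enc'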

Running "gpg -d" (GPG v2.x, with gpg-agent) in parallel as described above took:
- 1 minute 18 seconds on a large server (48 cores, gpg-agent 2.2.4-1ubuntu1.2)
- 32 seconds on my laptop (4 cores, gpg-agent 2.2.19-3ubuntu2)

Running the same commands with GPG v1.4.20 (no agent) took:
- 9 seconds on a large server (40 cores, gnupg 1.4.20-1ubuntu3.3)
- 21 seconds on a VM (1 core, gnupg 1.4.20-1ubuntu3.3)

Note: to prevent the "command 'PKDECRYPT' failed: Cannot allocate
memory <gcrypt>" error, gpg-agent is running either with the
"--auto-expand-secmem 0x30000" flag or with "auto-expand-secmem" set in
the ~/.gnupg/gpg-agent.conf file.
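
For anyone reproducing this, the config-file variant of that setting can be applied as sketched below. The helper name enable_secmem_expand is hypothetical, and the 0x30000 value is simply the one from the flag above, assumed to be accepted in the config file in the same way.

```shell
# Append "auto-expand-secmem 0x30000" to gpg-agent.conf unless the
# option is already set. Honors GNUPGHOME, defaulting to ~/.gnupg.
enable_secmem_expand() {
  dir="${GNUPGHOME:-$HOME/.gnupg}"
  conf="$dir/gpg-agent.conf"
  # Create the directory with private permissions if it is missing.
  mkdir -p -m 700 "$dir"
  if ! grep -q '^auto-expand-secmem' "$conf" 2>/dev/null; then
    echo 'auto-expand-secmem 0x30000' >> "$conf"
  fi
}
```

A running agent does not pick the option up by itself; "gpgconf --kill gpg-agent" stops it so that the next gpg invocation starts a fresh agent with the new configuration.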

Our use case is to have SaltStack's salt-master decrypt many pillars for
hundreds of servers, so the Ubuntu 16.04 => 18.04 upgrade severely
degrades SaltStack performance and makes it almost unusable: it becomes
10 times slower. This forces us to look for workarounds such as
increasing "gather_job_timeout", or possibly even rolling back to gpg
v1.x. We are not sure whether Ubuntu Bionic fully supports that without
breaking (we haven't tested the gpg 2.x => 1.x downgrade scenario yet;
any insights are highly appreciated!).

Any suggestions?

Kind regards,
Andrey Arapov

** Affects: ubuntu
     Importance: Undecided
         Status: New

** Affects: gnupg2 (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to gnupg2 in Ubuntu.
https://bugs.launchpad.net/bugs/1904416
