[Bug 2119987] [NEW] haproxy reload triggers OOM-killer for TERMINATED_HTTPS loadbalancers

Wesley Hershberger 2119987 at bugs.launchpad.net
Thu Aug 7 21:31:19 UTC 2025


Public bug reported:

[ Impact ]

Creating a TERMINATED_HTTPS listener in an amphora with >=32GB of memory
triggers the OOM-killer during listener startup (and any subsequent
`systemctl reload` of haproxy in the amphora).

```
os loadbalancer listener create --name thttps_xxlarge --protocol TERMINATED_HTTPS --protocol-port 443 --default-tls-container-ref <URL> --wait xxlarge1
```

This was originally reported in a Caracal cloud using an Ubuntu 22.04
Amphora image.

I've been able to reproduce this reliably in my lab using the latest
devstack and an Ubuntu 24.04 Amphora image.


[ Root Cause ]

Commit 454cff5 (in Zed and later, IIUC) introduced the use of haproxy's
`tune.ssl.cachesize` for TERMINATED_HTTPS listeners [1][2].

The commit does not make clear that during a reload of haproxy
(SIGUSR2), the old worker process stays running until the new worker
process is ready [3][4]. This means that two TLS session caches are
allocated/held simultaneously during a reload of the service [5].

For small Amphorae, this works fine. The memory reserved for the default
connection limit (50000) takes a large enough share out of the 50%
allocation that the new haproxy worker has room to allocate its cache
and coexist with the old worker for some time.

However, as the available memory in the system increases, the session
cache alone approaches 50% of memory, and the worker's total usage
exceeds 50% because something else in the worker also consumes memory
in proportion to the configured cachesize.
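
To make the reload arithmetic concrete, here is a rough Python sketch of
the memory demanded while the old and new workers coexist. The function
name, the `overhead_ratio` parameter, and the example figures are
illustrative assumptions, not Octavia or haproxy code. It ignores
per-connection memory and assumes both workers' caches end up resident.

```python
def peak_reload_fraction(total_mib, cache_fraction, overhead_ratio=0.0, workers=2):
    """Fraction of RAM demanded while `workers` haproxy workers coexist.

    cache_fraction: share of RAM given to one worker's TLS session cache.
    overhead_ratio: extra memory per unit of cache (hypothetical parameter).
    """
    per_worker_mib = total_mib * cache_fraction * (1 + overhead_ratio)
    return workers * per_worker_mib / total_mib


# 32 GiB amphora, cache sized at 50% of RAM, no extra overhead:
print(peak_reload_fraction(32768, 0.5))  # 1.0, i.e. all of RAM

# Any overhead proportional to the cache pushes the demand past 100%:
print(peak_reload_fraction(32768, 0.5, overhead_ratio=0.24) > 1.0)  # True
```

Even with zero overhead, a 50% cache leaves no room for anything else
once a second worker starts, which is why the problem only shows up when
the cache actually approaches its 50% target.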

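I tested 10 nonzero values of tune.ssl.cachesize (plus a baseline of 0)
in an amphora with 32GiB of RAM, reloading the haproxy service each time:

- tune.ssl.cachesize_MiB is `floor(tune.ssl.cachesize * 200 / 2^20)` (200 bytes per block)
- vsz here is the value (in KiB) reported by `ps -ax -o pid,vsz,rss,uss,pmem,args | grep haproxy`
- overhead is `vsz_MiB - tune.ssl.cachesize_MiB - 261` (261 MiB is the vsz with the cache disabled)
- overhead% is `floor((overhead / tune.ssl.cachesize_MiB) * 100)`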

tune.ssl.cachesize | tune.ssl.cachesize_MiB |      vsz |  vsz_MiB | overhead | overhead%
                 0 |                      0 |   267416 |      261 |        0 |        0%
           7741606 |                   1476 |  2142472 |     2092 |      355 |       24%
          15483212 |                   2953 |  4017260 |     3923 |      709 |       24%
          23224818 |                   4429 |  5892180 |     5754 |     1064 |       24%
          30966424 |                   5906 |  7767100 |     7585 |     1418 |       24%
          38708030 |                   7382 |  9642020 |     9416 |     1773 |       24%
          46449636 |                   8859 | 11516940 |    11247 |     2127 |       24%
          54191242 |                  10336 | 13391860 |    13077 |     2480 |       23%
          61932848 |                  11812 | 15266780 |    14908 |     2835 |       24%
          69674454 |                  13289 | 17141700 |    16739 |     3189 |       23%
          77416060 |                  14765 | 19016744 |    18571 |     3545 |       24%

Note that this listener was not configured with a pool, so there was no
load on the system when I gathered this data.
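
For anyone who wants to re-derive the computed columns, the following
Python sketch reproduces them from the raw `tune.ssl.cachesize` and `vsz`
values. The constant and function names are my own; the 200-byte block
size and the 261 MiB baseline are the figures used in this report, and
`vsz` is taken to be in KiB as `ps` reports it.

```python
BLOCK_BYTES = 200    # approximate bytes per cache block, per the haproxy docs
BASELINE_MIB = 261   # vsz of the worker with tune.ssl.cachesize set to 0


def derive(cachesize_blocks, vsz_kib):
    """Return (cachesize_MiB, vsz_MiB, overhead_MiB, overhead_pct) for one row."""
    cache_mib = cachesize_blocks * BLOCK_BYTES // 2**20
    vsz_mib = vsz_kib // 1024
    overhead = vsz_mib - cache_mib - BASELINE_MIB
    pct = overhead * 100 // cache_mib if cache_mib else 0
    return cache_mib, vsz_mib, overhead, pct


# A few rows from the table: (tune.ssl.cachesize, vsz)
rows = [(0, 267416), (7741606, 2142472), (77416060, 19016744)]
for blocks, vsz in rows:
    print(derive(blocks, vsz))
```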

As shown, haproxy consumes additional memory proportional to the size
of the TLS session cache (roughly 24% on top of the cache itself). The
allocation for the cache occurs at [6], referring to [7].

I verified the documentation's assertion that each tune.ssl.cachesize
block is 200 bytes on amd64; sizeof(struct shared_block) is 48 bytes on
the same hardware [8]. 48 bytes per 200-byte block is 24%, which matches
the overhead measured above.

Octavia should allocate closer to 1/3 than 1/2 of memory to the TLS
session cache. I'll test and propose a patch against master shortly.
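
As a back-of-the-envelope check on the 1/3 figure, this Python sketch
computes the largest nominal cache share that still lets two workers
coexist. It assumes each block really costs 200 + 48 bytes (my reading
of the measurements, not something confirmed in the haproxy source) and
that nothing else competes for memory. The names are illustrative.

```python
from fractions import Fraction

BLOCK_BYTES = 200    # documented approximate size of one cache block
HEADER_BYTES = 48    # sizeof(struct shared_block) on amd64, per this report


def max_cache_fraction(workers=2):
    """Largest share of RAM one cache may nominally claim so that
    `workers` caches, including per-block headers, still fit in RAM."""
    per_block = Fraction(BLOCK_BYTES + HEADER_BYTES, BLOCK_BYTES)  # 1.24
    return 1 / (workers * per_block)


limit = max_cache_fraction()
print(limit, float(limit))  # 25/62, about 0.403
```

Under these assumptions the ceiling is about 40%, so 1/3 leaves some
headroom for the baseline process, connection buffers, and the rest of
the system, while 1/2 cannot fit.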

[1] https://opendev.org/openstack/octavia/commit/454cff587ed10b5e504da93b074b77cb85055b13
[2] https://www.haproxy.com/documentation/haproxy-configuration-manual/new/2-8r1/#section-3.2.-tunesslcachesize
[3] https://github.com/haproxy/haproxy/issues/217#issuecomment-544515990
[4] https://manpages.ubuntu.com/manpages/jammy/en/man1/haproxy.1.html
[5] https://opendev.org/openstack/octavia/src/branch/master/octavia/amphorae/backends/agent/api_server/templates/systemd.conf.j2
[6] https://git.launchpad.net/ubuntu/+source/haproxy/tree/src/ssl_sock.c?h=applied/ubuntu/noble-devel#n5346
[7] https://git.launchpad.net/ubuntu/+source/haproxy/tree/src/shctx.c?h=applied/ubuntu/noble-devel#n300
[8] https://git.launchpad.net/ubuntu/+source/haproxy/tree/include/haproxy/shctx-t.h?h=applied/ubuntu/noble-devel#n38

** Affects: cloud-archive
     Importance: Undecided
     Assignee: Wesley Hershberger (whershberger)
         Status: Triaged

** Affects: cloud-archive/caracal
     Importance: Undecided
         Status: Triaged

** Affects: cloud-archive/epoxy
     Importance: Undecided
     Assignee: Wesley Hershberger (whershberger)
         Status: Triaged

** Affects: cloud-archive/flamingo
     Importance: Undecided
     Assignee: Wesley Hershberger (whershberger)
         Status: Triaged

** Affects: octavia
     Importance: Undecided
     Assignee: Wesley Hershberger (whershberger)
         Status: In Progress

** Affects: octavia (Ubuntu)
     Importance: Undecided
     Assignee: Wesley Hershberger (whershberger)
         Status: Triaged

** Affects: octavia (Ubuntu Noble)
     Importance: Undecided
     Assignee: Wesley Hershberger (whershberger)
         Status: Triaged

** Affects: octavia (Ubuntu Plucky)
     Importance: Undecided
     Assignee: Wesley Hershberger (whershberger)
         Status: Triaged

** Affects: octavia (Ubuntu Questing)
     Importance: Undecided
     Assignee: Wesley Hershberger (whershberger)
         Status: Triaged


** Tags: sts

** Also affects: octavia (Ubuntu)
   Importance: Undecided
       Status: New

** Also affects: octavia (Ubuntu Questing)
   Importance: Undecided
       Status: New

** Also affects: octavia (Ubuntu Noble)
   Importance: Undecided
       Status: New

** Also affects: octavia (Ubuntu Plucky)
   Importance: Undecided
       Status: New

** Changed in: octavia
       Status: New => In Progress

** Changed in: octavia
     Assignee: (unassigned) => Wesley Hershberger (whershberger)

** Changed in: octavia (Ubuntu Noble)
       Status: New => Triaged

** Changed in: octavia (Ubuntu Plucky)
       Status: New => Triaged

** Changed in: octavia (Ubuntu Questing)
       Status: New => Triaged

** Changed in: octavia (Ubuntu Noble)
     Assignee: (unassigned) => Wesley Hershberger (whershberger)

** Changed in: octavia (Ubuntu Plucky)
     Assignee: (unassigned) => Wesley Hershberger (whershberger)

** Changed in: octavia (Ubuntu Questing)
     Assignee: (unassigned) => Wesley Hershberger (whershberger)

** Also affects: cloud-archive
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/caracal
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/flamingo
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/epoxy
   Importance: Undecided
       Status: New

** Changed in: cloud-archive/epoxy
       Status: New => Triaged

** Changed in: cloud-archive/caracal
       Status: New => Triaged

** Changed in: cloud-archive/flamingo
       Status: New => Triaged

** Changed in: cloud-archive/epoxy
     Assignee: (unassigned) => Wesley Hershberger (whershberger)

** Changed in: cloud-archive/flamingo
     Assignee: (unassigned) => Wesley Hershberger (whershberger)

https://bugs.launchpad.net/bugs/2119987
