[Bug 2092217] Re: [SRU] Missing Group-By Optimization Backport in Yoga and Antelope

Bryan Fraschetti 2092217 at bugs.launchpad.net
Thu Mar 27 19:03:02 UTC 2025


Hello,

I've performed the SRU verification and all looks good on my end.

First, after deploying a fresh OpenStack cloud, I ran the attached test
script, which benchmarks performance and response times from the Neutron
API service. The following results, taken with only the default ext_net
and private networks, the admin security group, and no RBAC policies,
serve as a baseline:

========================
Performance in virtually empty fresh cloud
========================
Time taken for network list: 1496 ms
Time taken for security group list: 1274 ms
Time taken for network rbac list: 1194 ms
Time taken for network show: 1408 ms

Then I used a script to create 1000 projects, 10 networks, and 10
security groups, and to share each network and security group with each
project. This creates 1000 x (10 + 10) = 20000 RBAC rules, which is
somewhat fewer than the ~30000 in the original bug report.
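
The script I used is not attached, so the following is only a dry-run
sketch of the sharing step. It prints one "openstack network rbac create"
command per (project, resource) pair instead of running it. The names
proj-N, net-N and sg-N are hypothetical placeholders, and the projects,
networks and security groups are assumed to exist already.

```shell
#!/bin/sh
# Dry-run sketch: prints the sharing commands, does not execute them.
# proj-N, net-N and sg-N are hypothetical names, not taken from the report.

gen_rbac_commands() {
    # $1 = number of projects, $2 = number of networks,
    # $3 = number of security groups
    projects=$1
    networks=$2
    groups=$3
    p=1
    while [ "$p" -le "$projects" ]; do
        n=1
        while [ "$n" -le "$networks" ]; do
            echo "openstack network rbac create --type network --action access_as_shared --target-project proj-$p net-$n"
            n=$((n + 1))
        done
        g=1
        while [ "$g" -le "$groups" ]; do
            echo "openstack network rbac create --type security_group --action access_as_shared --target-project proj-$p sg-$g"
            g=$((g + 1))
        done
        p=$((p + 1))
    done
}

# 1000 projects x (10 networks + 10 security groups) = 20000 commands
gen_rbac_commands 1000 10 10 | wc -l
```

Piping the output to "sh" instead of "wc -l" would create the rules for
real, one API call at a time.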

At this point I ran the same test script, which produced the following
performance:

========================
Performance of unpatched with 20000 RBAC rules
========================
Time taken for network list: 9330 ms
Time taken for security group list: 7824 ms
Time taken for network rbac list: 20576 ms
Time taken for network show: 1544 ms

I then enabled the -proposed pocket and ran the same test:

========================
Performance of proposed with 20000 RBAC rules
========================
Time taken for network list: 2400 ms
Time taken for security group list: 1952 ms
Time taken for network rbac list: 13727 ms
Time taken for network show: 1341 ms

As expected, listing the networks and security groups takes
significantly less time with the proposed package, because the query
behind the /network API now returns a lower-cardinality result set. The
original report noted this endpoint as being particularly affected.
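
For a quick comparison of the two 20000-rule runs, the unpatched time can
be divided by the proposed time for each command. The helper name
"speedup" below is just an illustration and is not part of test.sh.

```shell
#!/bin/sh
# Ratio of unpatched to proposed response time, from the figures above.

speedup() {
    # $1 = unpatched time in ms, $2 = proposed time in ms
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.2f\n", before / after }'
}

speedup 9330 2400     # network list          -> 3.89
speedup 7824 1952     # security group list   -> 4.01
speedup 20576 13727   # network rbac list     -> 1.50
speedup 1544 1341     # network show          -> 1.15
```

These are single measurements from one run each, so the ratios are only
indicative.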

Thanks!

** Attachment added: "test.sh"
   https://bugs.launchpad.net/ubuntu/+source/python-neutron-lib/+bug/2092217/+attachment/5867728/+files/test.sh

** Tags removed: verification-needed verification-needed-jammy
** Tags added: verification-done verification-done-jammy

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/2092217

Title:
  [SRU] Missing Group-By Optimization Backport in Yoga and Antelope

Status in Ubuntu Cloud Archive:
  Fix Released
Status in Ubuntu Cloud Archive antelope series:
  New
Status in Ubuntu Cloud Archive bobcat series:
  Fix Released
Status in Ubuntu Cloud Archive caracal series:
  Fix Released
Status in Ubuntu Cloud Archive yoga series:
  New
Status in Ubuntu Cloud Archive zed series:
  Won't Fix
Status in python-neutron-lib package in Ubuntu:
  Fix Released
Status in python-neutron-lib source package in Jammy:
  Fix Committed

Bug description:
  [ Impact ]

    * The Yoga (2.20.0-0ubuntu1) and Antelope (3.4.0-0ubuntu1) releases
  of neutron-lib lack the database query optimization in commit [1],
  which is present upstream. In particular, the commit at [1] minimizes
  the number of objects returned from the database (grouping by resource
  id to prevent returning duplicated rows). The commit is a follow-up to
  the commit at [2] in neutron. Both commits are needed to get the
  performance benefit. Fortunately, neutron commit [2] is present in
  Yoga (2:20.5.0-0ubuntu2) and Antelope (2:22.0.2-0ubuntu1).

    * To summarize, commit [1] is present in Ubuntu's releases of Zed
  (3.1.2-0ubuntu1), Bobcat (3.8.0-0ubuntu1), and Caracal
  (3.11.0-0ubuntu1), but not in Yoga (2.20.0-0ubuntu1) or Antelope
  (3.4.0-0ubuntu1).

    * This proposed SRU patch is a cherry-pick backport of the upstream
  commit [1]
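
  The effect of grouping by resource id can be pictured with a toy
  example. This is not the neutron-lib query itself: the rows, the
  net-N / rbac-X names and the helper functions are all made up for
  illustration. It assumes that a resource joined against its RBAC
  entries comes back once per matching entry unless the rows are
  grouped.

```shell
#!/bin/sh
# Toy illustration of GROUP BY on a resource id; all data is made up.

joined_rows() {
    # Each line: resource id, then one RBAC entry that matched it.
    printf '%s\n' \
        'net-1 rbac-a' \
        'net-1 rbac-b' \
        'net-1 rbac-c' \
        'net-2 rbac-d' \
        'net-2 rbac-e'
}

group_by_id() {
    # Emit each resource id once, the first time it is seen.
    awk '!seen[$1]++ { print $1 }'
}

joined_rows | wc -l                 # 5 rows without grouping
joined_rows | group_by_id | wc -l   # 2 rows with grouping
```

  With thousands of RBAC entries per resource, the ungrouped result
  grows with the number of entries, while the grouped result stays at
  one row per resource.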

  [ Test Plan ]

    * Deploy openstack yoga/antelope on jammy and create as many RBAC
  rules as possible (the original report had 15000 to 30000 on various
  clouds). A simple and programmatic way to generate thousands of rules
  would be to generate thousands of projects, a few admin security
  groups and networks and share them with all projects

    * Measure the time elapsed when executing:
      openstack network rbac list
      openstack security group list
      openstack network list
      openstack network show <name>

    * The expected behaviour is that without the patch the performance
  will degrade as RBAC rules are added, while with the patch there will
  be no significant performance change.
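
  One way to take the measurements above is sketched below. It is not
  the attached test.sh. It assumes GNU date (for %N nanoseconds, as on
  Ubuntu) and sourced credentials for the openstack client. NETWORK is
  a hypothetical variable naming the network to show, defaulting to
  ext_net.

```shell
#!/bin/sh
# Sketch of a timing harness for the four commands in the test plan.
# Assumes GNU date; %N is not available in every date implementation.

time_ms() {
    # Run the given command, discard its output, print elapsed ms.
    start_ns=$(date +%s%N)
    "$@" > /dev/null 2>&1
    end_ns=$(date +%s%N)
    echo $(( ($end_ns - $start_ns) / 1000000 ))
}

report() {
    # $1 = label, remaining arguments = command to time.
    label=$1
    shift
    echo "Time taken for $label: $(time_ms "$@") ms"
}

# Hypothetical variable: which network to pass to "network show".
NETWORK=${NETWORK:-ext_net}

# Only run the measurements where the openstack client is installed.
if command -v openstack > /dev/null 2>&1; then
    report "network list" openstack network list
    report "security group list" openstack security group list
    report "network rbac list" openstack network rbac list
    report "network show" openstack network show "$NETWORK"
fi
```

  Running it once before and once after enabling the patched package
  gives directly comparable numbers.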

  [ Where problems could occur ]

    * While the patch is not expected to introduce a regression, as the
  Ubuntu releases surrounding Yoga and Antelope contain it, the patch
  inherently affects performance. There may be edge cases with a
  reduction in performance. For example, this patch reduces the rows
  returned by the query to the set of distinct rows (eliminating
  duplicates). If there are few RBAC rules, this may be unnecessary
  overhead, as the query may not have produced many duplicate results to
  begin with.

    * This affects the aggregation of SQL queries. If there is an
  integration that doesn't expect this grouping (and for some reason
  expects duplicates), it may be affected.

  [ Other Info ]

    * There may be questions regarding whether there are accompanying
  tests to prove that there is no regression. Those tests are present in
  the associated neutron patch (as per the commit message [3]).

    * Note that I have included an additional patch in the Yoga debdiff.
  This additional patch removes a failing test related to tenant_id.
  It's not actually related to the neutron patch but I incidentally
  discovered it when addressing the missing optimization that is the
  primary concern of this SRU. If one clones the source of the Yoga
  release and then tries to debuild it, the build fails with a single
  error associated with a tenant_id test. This test was removed upstream
  in March 2022 [4], before the Yoga release, shortly after the request
  context in oslo deprecated the same field in January 2022 [5], which
  manifested the test failure.

    * The original report of the poor scaling with RBAC rules that led
  to the development of the change is located at the LP bug [6], which
  may provide some additional context.

  Neutron RBAC performance optimization:
  [1] https://opendev.org/openstack/neutron-lib/commit/829e97024c2b73dd67bfd8a04c65f03be556eec8
  [2] https://opendev.org/openstack/neutron/commit/4c654dc553fa8c52e4459c7ac781dca536dba05b
  [3] https://review.opendev.org/c/openstack/neutron/+/884877
  [6] https://bugs.launchpad.net/neutron/+bug/1918145

  Removal of tenant_id test:
  [4] https://opendev.org/openstack/neutron-lib/commit/a8abe9d592da5bcf065af40e0ba1cd3599ede1e7
  [5] https://review.opendev.org/c/openstack/oslo.context/+/815938
