[Bug 1999551] Re: glibc: backport AArch64 memcmp improvements
Simon Chopin
1999551 at bugs.launchpad.net
Fri Jul 7 08:09:57 UTC 2023
Attached are all the generated plots for the various benchmarks.
My matplotlib skills being rather poor, here's the legend: the y axis is
in %, and we want negative values; the x axis is the size of the buffers
being processed.
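For readers reproducing the plots, here is a minimal sketch of how such a
percentage could be derived. It assumes the y value is the relative change
in timing of the patched libc against the baseline, which is my reading of
"we want negative values" and is not spelled out above. The function and
parameter names are hypothetical, not taken from the benchmark scripts.

```python
def relative_change_percent(baseline, patched):
    """Per-size timing change of `patched` relative to `baseline`, in percent.

    Both arguments map a buffer size (bytes) to a mean timing. A negative
    result means the patched libc was faster for that size.
    Assumption: this is how the y axis is computed; the original post does
    not give the formula.
    """
    return {
        size: 100.0 * (patched[size] - baseline[size]) / baseline[size]
        for size in baseline
    }


if __name__ == "__main__":
    # Illustrative numbers only, not measured data.
    print(relative_change_percent({16: 10.0, 1024: 200.0},
                                  {16: 25.0, 1024: 150.0}))
```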
My conclusions are that the memcmp patch for focal should be rolled
back, but the Jammy results are fairly OK. There's a huge (150%!)
performance regression for very small memmoves (as in < 16 bytes), but I
think it's only showing that the fixed cost of the function has
increased. I also think the "fixed input" part of the benchmarks is
actually hitting a worst-case scenario, as we show no improvement in the
non-random benchmarks (including the large ones). The ones on random
inputs are much more satisfying, with impressive results on Graviton3
and no significant regression on non-SVE machines.
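To illustrate the fixed-cost argument, here is a toy model in which a call
costs a constant overhead plus a per-byte term. The model and all numbers
are assumptions chosen for illustration, not measurements from the attached
results. It only shows that a few extra nanoseconds of fixed cost look
dramatic in percent for tiny buffers and vanish for large ones.

```python
def percent_regression(size, per_byte_ns, old_fixed_ns, new_fixed_ns):
    """Percent slowdown at `size` bytes when only the fixed cost changes.

    Toy model (an assumption, not the real routine): time = fixed + per_byte * size.
    """
    old = old_fixed_ns + per_byte_ns * size
    new = new_fixed_ns + per_byte_ns * size
    return 100.0 * (new - old) / old


if __name__ == "__main__":
    # Hypothetical costs: fixed overhead going from 2 ns to 5 ns.
    for size in (0, 8, 16, 256, 4096):
        print(size, round(percent_regression(size, 0.5, 2.0, 5.0), 2))
```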
Thus, I'm marking the jammy upload as verified.
I'll have to re-do a focal upload without the memcmp patch, though.
** Attachment added: "benchmark plots"
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1999551/+attachment/5684556/+files/results.tar.gz
** Tags removed: verification-needed-jammy
** Tags added: verification-done-jammy
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1999551
Title:
glibc: backport AArch64 mem{cpy,cmp} improvements
Status in glibc package in Ubuntu:
Fix Released
Status in glibc source package in Focal:
Fix Committed
Status in glibc source package in Jammy:
Fix Committed
Status in glibc source package in Kinetic:
Fix Released
Bug description:
[impact]
There have been relatively recent improvements to the memcmp and
memcpy routines for server-grade AArch64 implementations, in particular
AWS's Graviton3.
We'd like to backport those improvements to Jammy and Focal when
appropriate, under the HWE umbrella.
The relevant patches are
https://sourceware.org/git/?p=glibc.git;a=commit;h=9f298bfe1f183804bb54b54ff9071afc0494906c (Jammy & Focal)
https://sourceware.org/git/?p=glibc.git;a=commit;h=b51eb35c572b015641f03e3682c303f7631279b7 (Focal only, already present in Jammy)
In addition, to be able to actually test the changes and their impact on
all architectures, we'll need the following fix:
https://sourceware.org/git/?p=glibc.git;a=commit;h=311a7e0256975275d97077f1af338bc9caf0c837
[test case]
Since those are optimization patches, we'll be relying on the
autopkgtests triggered by the upload for regression detection.
However, we'll also benchmark the optimizations on Graviton AWS
instances as well as various Raspberry Pi models to ensure there is no
severe performance regression on those platforms.
To do the performance test, first install the libc from this PPA:
https://launchpad.net/~schopin/+archive/ubuntu/glibc-benchmark
which contains the current Jammy glibc with the extra fix for benchmarking.
Then, untar the attached archive bench-timing.tar.xz on the target
platform, and follow the instructions from the README.
[Regression potential]
This could potentially impact performance on other, non-server-grade
arm64 platforms such as RPi. Furthermore, there could be unforeseen
issues with the newly optimized routine in edge cases (a recent amd64
optimization had issues on page boundaries, for instance).
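As a rough idea of what a page-boundary check could look like, here is a
sketch that is not part of the glibc test suite or of the attached
benchmark archive. It assumes a Linux-like system where ctypes.CDLL(None)
exposes libc's memcmp and mprotect. Each buffer is placed so that it ends
exactly at the end of a readable page followed by an inaccessible one, so a
routine that reads past the end of its input would crash with SIGSEGV
instead of passing silently. The helper names are hypothetical.

```python
import ctypes
import mmap

PAGE = mmap.PAGESIZE
PROT_NONE = 0  # value of PROT_NONE on Linux

# Assumption: CDLL(None) resolves symbols from the running process' libc.
libc = ctypes.CDLL(None, use_errno=True)
libc.memcmp.argtypes = [ctypes.c_void_p, ctypes.c_void_p, ctypes.c_size_t]
libc.memcmp.restype = ctypes.c_int
libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
libc.mprotect.restype = ctypes.c_int


def guarded_page():
    """Map two pages and make the second one inaccessible."""
    mapping = mmap.mmap(-1, 2 * PAGE)
    view = (ctypes.c_char * (2 * PAGE)).from_buffer(mapping)
    base = ctypes.addressof(view)
    if libc.mprotect(base + PAGE, PAGE, PROT_NONE) != 0:
        raise OSError(ctypes.get_errno(), "mprotect failed")
    # The mapping and view are returned so they stay alive while in use.
    return mapping, view, base


def check_memcmp_at_page_end(max_len=64):
    """Run memcmp on buffers ending right before a guard page.

    Returns the number of lengths checked.
    """
    _m1, _v1, a = guarded_page()
    _m2, _v2, b = guarded_page()
    checked = 0
    for n in range(1, max_len + 1):
        data = bytes((i * 7 + 1) % 251 for i in range(n))
        pa, pb = a + PAGE - n, b + PAGE - n
        ctypes.memmove(pa, data, n)
        ctypes.memmove(pb, data, n)
        assert libc.memcmp(pa, pb, n) == 0
        # Make the last byte of b larger; bytes compare as unsigned char.
        ctypes.memmove(b + PAGE - 1, bytes([data[-1] + 1]), 1)
        assert libc.memcmp(pa, pb, n) < 0
        assert libc.memcmp(pb, pa, n) > 0
        checked += 1
    return checked


if __name__ == "__main__":
    print("lengths checked:", check_memcmp_at_page_end())
```

This only exercises whichever memcmp variant the dynamic loader selected on
the machine it runs on, so it would need to be run on each target platform.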