[Bug 2030515]
Adhemerval Zanella
2030515 at bugs.launchpad.net
Fri Dec 6 13:38:12 UTC 2024
(In reply to Koichiro Iwao from comment #15)
> Hi,
>
> This causes segfault on bhyve hypervisor running on Ryzen processors.
>
> FreeBSD folks are also investigating the issue from the hypervisor's side, but I
> would like to let you know that this change caused an issue, so that you can
> keep an eye on it.
>
> - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279901
> - https://bugs.almalinux.org/view.php?id=489
> - https://bbs.archlinux.org/viewtopic.php?id=295802
As for FreeBSD, I am trying to reproduce it with a different
hypervisor/emulation (in this case qemu/kvm) on a Ryzen 9 5900x (Zen 3
core), but both AlmaLinux 10 Kitten (glibc 2.39) and Debian sid (glibc
2.40) boot and work without any issue.
I also verified on Debian sid that the selected memcpy/memmove is indeed
the one optimized by the glibc change (__memcpy_avx_unaligned_erms).
I even tried running the glibc memcpy/memmove tests in this VM; they
stress many different sizes and alignments across the different
memcpy/memmove implementations.
Also, my daily workstation (Ryzen 9 5900x) uses a recent glibc that
also contains this change, and I haven't seen any memcpy/memmove related
issue.
So I am not sure if this is a glibc issue. Does it only happen on bhyve?
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to glibc in Ubuntu.
https://bugs.launchpad.net/bugs/2030515
Title:
Terrible memcpy performance on Zen 3 when using rep movsb
Status in GLibC:
New
Status in glibc package in Ubuntu:
New
Bug description:
On CPUs that advertise FSRM (fast short rep movsb), glibc 2.35 uses
REP MOVSB for memcpy for sizes above 2112 (up to some threshold that
depends on the cache size). Unfortunately, it seems that Zen 3 (at
least in the microcode we're running) is extremely slow at REP MOVSB
when the data are not well-aligned.
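As a quick check of whether a given Linux machine advertises the
features in question, the flags line of /proc/cpuinfo can be inspected.
The following is only a rough sketch: the helper names are made up for
illustration, and it assumes the kernel reports the features under the
flag names "fsrm" and "erms".

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Match only the plain "flags" key, not e.g. "vmx flags".
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def advertises_fsrm(cpuinfo_text):
    """True if the (assumed) 'fsrm' flag name is present."""
    return "fsrm" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or /proc is unavailable
    flags = cpu_flags(text)
    print("fsrm:", "fsrm" in flags, "erms:", "erms" in flags)
```

This only shows what the CPU (or the hypervisor in front of it)
advertises; it says nothing about which memcpy variant glibc ends up
selecting.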
I've found this using a memcpy benchmark at
https://github.com/ska-sa/katgpucbf/blob/69752be58fb8ab0668ada806e0fd809e782cc58b/scratch/memcpy_loop.cpp
(compiled with the adjacent Makefile). To demonstrate the issue, run
./memcpy_loop -b 2113 -p 1000000 -t mmap -S 0 -D 1 0
This runs:
- 2113-byte memory copies
- 1,000,000 times per timing measurement
- in memory allocated with mmap
- with the source 0 bytes from the start of the page
- with the destination 1 byte from the start of the page
- on core 0.
It reports about 3.2 GB/s. Change the -b argument to 2111 and it
reports over 100 GB/s. So the REP MOVSB case is about 30× slower!
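For scale, the throughput figures can be turned into a time per copy.
This is only back-of-the-envelope arithmetic on the rounded numbers
above, assuming 1 GB means 10^9 bytes; the function name is made up for
illustration.

```python
def ns_per_copy(size_bytes, gb_per_s):
    """Nanoseconds for one copy, assuming 1 GB/s means 1e9 bytes/s."""
    return size_bytes / (gb_per_s * 1e9) * 1e9


# 2113 bytes at about 3.2 GB/s (the REP MOVSB case).
slow = ns_per_copy(2113, 3.2)
# 2111 bytes at 100 GB/s; the report says "over 100 GB/s", so this
# per-copy time is an upper bound.
fast = ns_per_copy(2111, 100.0)

print(f"slow: {slow:.0f} ns per copy")
print(f"fast: at most {fast:.0f} ns per copy")
print(f"ratio: at least {slow / fast:.0f}x")
```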
This will most likely need to be reported and fixed upstream, but I'm
reporting it to Ubuntu first since I don't know if Ubuntu has modified
glibc in any way that would be significant.
See also: https://xuanwo.io/2023/04-rust-std-fs-slower-than-python/
ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: libc6 2.35-0ubuntu3.1
ProcVersionSignature: Ubuntu 5.19.0-46.47~22.04.1-generic 5.19.17
Uname: Linux 5.19.0-46-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.20.11-0ubuntu82.5
Architecture: amd64
CasperMD5CheckResult: unknown
Date: Mon Aug 7 14:02:28 2023
RebootRequiredPkgs: Error: path contained symlinks.
SourcePackage: glibc
UpgradeStatus: No upgrade log present (probably fresh install)
To manage notifications about this bug go to:
https://bugs.launchpad.net/glibc/+bug/2030515/+subscriptions