[Bug 1966064] [NEW] libc does not handle multi-byte grouping chars
Carl-Erik Kopseng
1966064 at bugs.launchpad.net
Wed Mar 23 11:54:54 UTC 2022
Public bug reported:
I first reported this as a bug against `coreutils`, which closed the issue
because the underlying cause lies in libc (*). For the reproduction
case I will nevertheless use `printf` from coreutils, as it calls `printf()`
from libc internally.
As can be seen below, the width specifier for numeric parameters pads
incorrectly when the specified locale is `nb_NO.utf8`. For
instance, the number formatting rules for the US and NO locales both result
in the same number of characters (with ' ' instead of ','), but the
Norwegian version lacks two spaces of padding in the output:
$ LC_NUMERIC=en_US.utf8 printf "%s%'7d%s\n" XXX 1234 XXX
XXX  1,234XXX
$ LC_NUMERIC=nb_NO.utf8 printf "%s%'7d%s\n" XXX 1234 XXX
XXX1 234XXX
According to Pádraig Brady: "The particular issue is the grouping char used
in the nb_NO.utf8 locale is multi-byte. Specifically: e2 80 af. So that
character counts as 3 bytes, and the printf implementation is counting
bytes, not characters, or display cells.
Given the usual consideration is display width, it probably should be
considering display cells, but that's an issue for libc, not coreutils."
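To make the arithmetic in that explanation concrete, here is a minimal Python sketch of the two ways of measuring the field. It is only a model of the behaviour described above, not glibc's code: the helper names (`pad_by_bytes`, `display_cells`, `pad_by_cells`) are made up for illustration, and the cell count is a rough approximation based on `unicodedata`. The bytes e2 80 af are the UTF-8 encoding of U+202F (NARROW NO-BREAK SPACE).

```python
import unicodedata

# U+202F NARROW NO-BREAK SPACE; its UTF-8 encoding is the bytes e2 80 af.
GROUPING = "\u202f"


def pad_by_bytes(text: str, width: int) -> str:
    """Right-align text, measuring its length in UTF-8 bytes (hypothetical helper)."""
    used = len(text.encode("utf-8"))
    return " " * max(width - used, 0) + text


def display_cells(text: str) -> int:
    """Approximate terminal cells: wide/fullwidth chars count 2, combining marks 0."""
    cells = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue
        cells += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return cells


def pad_by_cells(text: str, width: int) -> str:
    """Right-align text, measuring its length in display cells (hypothetical helper)."""
    return " " * max(width - display_cells(text), 0) + text


us = "1,234"                  # en_US grouping: ','
no = "1" + GROUPING + "234"   # nb_NO grouping: U+202F

# Same number of characters, different number of bytes.
print(len(us), len(us.encode("utf-8")))
print(len(no), len(no.encode("utf-8")))

# Byte-based padding to width 7: two spaces for en_US, none for nb_NO.
print("XXX" + pad_by_bytes(us, 7) + "XXX")
print("XXX" + pad_by_bytes(no, 7) + "XXX")

# Cell-based padding restores the two missing spaces.
print("XXX" + pad_by_cells(no, 7) + "XXX")
```

With a width of 7, the Norwegian string already occupies 7 bytes (1 + 3 + 3), so byte-based padding adds nothing, whereas it occupies only 5 cells and would get two spaces if cells were counted.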
Filing this here in the hopes that it will be pushed upstream at some
point.
* see https://debbugs.gnu.org/cgi/bugreport.cgi?bug=50336 for details
ProblemType: Bug
DistroRelease: Ubuntu 21.10
Package: libc6-dev 2.34-0ubuntu3.2
ProcVersionSignature: Ubuntu 5.13.0-35.40-generic 5.13.19
Uname: Linux 5.13.0-35-generic x86_64
ApportVersion: 2.20.11-0ubuntu71
Architecture: amd64
CasperMD5CheckResult: pass
CurrentDesktop: ubuntu:GNOME
Date: Wed Mar 23 12:45:17 2022
InstallationDate: Installed on 2021-08-26 (208 days ago)
InstallationMedia: Ubuntu 21.04 "Hirsute Hippo" - Release amd64 (20210420)
RebootRequiredPkgs: Error: path contained symlinks.
SourcePackage: glibc
UpgradeStatus: Upgraded to impish on 2022-02-08 (42 days ago)
** Affects: glibc (Ubuntu)
Importance: Undecided
Status: New
** Tags: amd64 apport-bug impish wayland-session
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to glibc in Ubuntu.
https://bugs.launchpad.net/bugs/1966064
Title:
libc does not handle multi-byte grouping chars
Status in glibc package in Ubuntu:
New