[Bug 1966064] [NEW] libc does not handle multi-byte grouping chars

Carl-Erik Kopseng 1966064 at bugs.launchpad.net
Wed Mar 23 11:54:54 UTC 2022


Public bug reported:

I first reported this as a bug against `coreutils`, which closed the
issue because the underlying cause lies in libc (*). For the
reproduction case I still use `printf` from coreutils, since it calls
`printf()` from libc internally.

As can be seen below, the width specifier for numeric arguments pads
incorrectly when the specified locale is `nb_NO.utf8`. The number
formatting rules for the US and NO locales both produce the same number
of characters (with ' ' instead of ',' as the grouping character), but
the Norwegian output lacks the two leading spaces of padding:

$ LC_NUMERIC=en_US.utf8 printf "%s%'7d%s\n" XXX 1234 XXX
XXX  1,234XXX

$ LC_NUMERIC=nb_NO.utf8 printf "%s%'7d%s\n" XXX 1234 XXX
XXX1 234XXX

According to Pádraig Brady: "The particular issue is the grouping char
used in the nb_NO.utf8 locale is multi-byte. Specifically: e2 80 af. So
that character counts as 3 bytes, and the printf implementation is
counting bytes, not characters, or display cells.

Given the usual consideration is display width, it probably should be
considering display cells, but that's an issue for libc, not coreutils."
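
To make the arithmetic in that explanation concrete, here is a small
Python sketch of the two ways of counting. It does not call libc at all;
`pad_by_bytes` and `pad_by_chars` are hypothetical helpers written only
for this illustration, and counting code points is used as a stand-in
for display cells (an assumption that happens to hold for these two
strings).

```python
# Illustration only: emulates byte-based vs. character-based padding.
# The helper names are made up for this sketch; they are not libc APIs.

GROUPING = "\u202f"  # NARROW NO-BREAK SPACE, encoded in UTF-8 as e2 80 af


def pad_by_bytes(text: str, width: int) -> str:
    """Right-justify by counting UTF-8 bytes, as described in the quote."""
    missing = max(0, width - len(text.encode("utf-8")))
    return " " * missing + text


def pad_by_chars(text: str, width: int) -> str:
    """Right-justify by counting code points instead of bytes."""
    missing = max(0, width - len(text))
    return " " * missing + text


american = "1,234"                 # 5 characters, 5 bytes
norwegian = "1" + GROUPING + "234"  # 5 characters, 7 bytes

for label, value in (("en_US", american), ("nb_NO", norwegian)):
    print(label,
          "bytes:", len(value.encode("utf-8")),
          "chars:", len(value),
          "by bytes:", repr(pad_by_bytes(value, 7)),
          "by chars:", repr(pad_by_chars(value, 7)))
```

With a field width of 7, the byte count of the Norwegian string already
fills the field, so byte-based padding adds nothing, while
character-based padding adds the two spaces seen in the en_US output.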

Filing this here in the hopes that it will be pushed upstream at some
point.

* see https://debbugs.gnu.org/cgi/bugreport.cgi?bug=50336 for details

ProblemType: Bug
DistroRelease: Ubuntu 21.10
Package: libc6-dev 2.34-0ubuntu3.2
ProcVersionSignature: Ubuntu 5.13.0-35.40-generic 5.13.19
Uname: Linux 5.13.0-35-generic x86_64
ApportVersion: 2.20.11-0ubuntu71
Architecture: amd64
CasperMD5CheckResult: pass
CurrentDesktop: ubuntu:GNOME
Date: Wed Mar 23 12:45:17 2022
InstallationDate: Installed on 2021-08-26 (208 days ago)
InstallationMedia: Ubuntu 21.04 "Hirsute Hippo" - Release amd64 (20210420)
RebootRequiredPkgs: Error: path contained symlinks.
SourcePackage: glibc
UpgradeStatus: Upgraded to impish on 2022-02-08 (42 days ago)

** Affects: glibc (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: amd64 apport-bug impish wayland-session

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to glibc in Ubuntu.
https://bugs.launchpad.net/bugs/1966064
