[Bug 1554647] [NEW] sort -n -t, partially ignores field boundaries

Raul Miller raul.miller at nextag.com
Tue Mar 8 18:26:29 UTC 2016


Public bug reported:

ProblemType: Bug
ApportVersion: 2.14.1-0ubuntu3.19
Architecture: amd64
Date: Tue Mar  8 18:12:44 2016
Dependencies:
 gcc-4.9-base 4.9.3-0ubuntu4
 libacl1 2.2.52-1
 libattr1 1:2.4.47-1ubuntu1
 libc6 2.19-0ubuntu6.7
 libgcc1 1:4.9.3-0ubuntu4
 libpcre3 1:8.31-2ubuntu2.1
 libselinux1 2.2.2-1ubuntu0.1
 multiarch-support 2.19-0ubuntu6.6
DistroRelease: Ubuntu 14.04
Ec2AMI: ami-fce3c696
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-east-1d
Ec2InstanceType: m4.large
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
Package: coreutils 8.21-1ubuntu5.3
PackageArchitecture: amd64
ProcEnviron:
 TERM=screen
 SHELL=/bin/bash
 PATH=(custom, user)
 LANG=en_US.UTF-8
 XDG_RUNTIME_DIR=<set>
ProcVersionSignature: User Name 3.13.0-74.118-generic 3.13.11-ckt30
SourcePackage: coreutils
Tags:  trusty ec2-images
Uname: Linux 3.13.0-74-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
_MarkForUpload: True

$ cat tmp3
1,11111,1
207,970,60
807,120,600
$ sort -n -k2 -t, <tmp3
207,970,60
1,11111,1
807,120,600
$ sort -k2 -t, <tmp3
1,11111,1
807,120,600
207,970,60

Numeric sort places 970 before 120. I believe this is because it
interprets field 2 as having the values 11111,1 and 970,60 and 120,600,
which after comma removal become 111111 and 97060 and 120600. Sorted
numerically, those give 97060 < 111111 < 120600, which matches the
output above.
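
The hypothesis above can be checked by hand with a small sketch. It
builds the suspected key (everything from field 2 to the end of the
line, with commas removed) using cut and tr, and sorts on that key
alone. The function name sort_by_suspected_key is made up for
illustration; this only models the hypothesis and is not how sort is
implemented.

```shell
# Recreate the input file from the report.
printf '%s\n' '1,11111,1' '207,970,60' '807,120,600' > tmp3

# Hypothetical helper: sort lines by the key that numeric sort is
# suspected of using.
sort_by_suspected_key() {
    while IFS= read -r line; do
        # Everything from field 2 to end of line, commas removed.
        key=$(printf '%s\n' "$line" | cut -d, -f2- | tr -d ,)
        # Prefix each line with its key, separated by a tab.
        printf '%s\t%s\n' "$key" "$line"
    done < "$1" | sort -n -k1,1 | cut -f2-
}

sort_by_suspected_key tmp3
```

This prints 207,970,60 then 1,11111,1 then 807,120,600, the same order
that sort -n -k2 -t, produced.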

Non-numeric sort places 120 before 970, but because it compares text
rather than numbers, 11111 appears before both of them. This is not a
bug, and is simply mentioned for context.

Note that this general class of problem also occurs with sort -n -k2,3
-t,

Looking at an older implementation of sort (version 5.93 under osx),
this problem was already present back then, so it has been happening
for quite a long time.

The problem looks like a modularity violation in the implementation of
numeric sort: the numeric comparison does not stop at the field
boundaries set by -t. A fix will probably require re-implementing some
part of that system.

Using sort -g instead of sort -n seems to work around the problem. But
if this is somehow deemed not to be a bug in sort itself, it would still
be a bug in the manual page for sort (which does not mention or even
hint at this issue).
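
For reference, a sketch of two ways to keep the comparison inside
field 2, assuming GNU sort. The first is the -g workaround mentioned
above. The second ends the key explicitly with -k2,2; the sort
documentation says a key given as -k2 runs to the end of the line, so
this variant is an assumption about what was intended and was not tried
in this report.

```shell
# Recreate the input file from the report.
printf '%s\n' '1,11111,1' '207,970,60' '807,120,600' > tmp3

# Workaround from the report: general numeric sort.
sort -g -k2 -t, tmp3

# Alternative (assumption): end the key at field 2 so that later
# fields are not part of the numeric key.
sort -n -k2,2 -t, tmp3
```

Both commands are expected to order the lines by the second field as
120, 970, 11111.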

** Affects: coreutils (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to coreutils in Ubuntu.
https://bugs.launchpad.net/bugs/1554647
