[Bug 1554647] [NEW] sort -n -t, partially ignores field boundaries
Raul Miller
raul.miller at nextag.com
Tue Mar 8 18:26:29 UTC 2016
Public bug reported:
ProblemType: Bug
ApportVersion: 2.14.1-0ubuntu3.19
Architecture: amd64
Date: Tue Mar 8 18:12:44 2016
Dependencies:
gcc-4.9-base 4.9.3-0ubuntu4
libacl1 2.2.52-1
libattr1 1:2.4.47-1ubuntu1
libc6 2.19-0ubuntu6.7
libgcc1 1:4.9.3-0ubuntu4
libpcre3 1:8.31-2ubuntu2.1
libselinux1 2.2.2-1ubuntu0.1
multiarch-support 2.19-0ubuntu6.6
DistroRelease: Ubuntu 14.04
Ec2AMI: ami-fce3c696
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-east-1d
Ec2InstanceType: m4.large
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
Package: coreutils 8.21-1ubuntu5.3
PackageArchitecture: amd64
ProcEnviron:
TERM=screen
SHELL=/bin/bash
PATH=(custom, user)
LANG=en_US.UTF-8
XDG_RUNTIME_DIR=<set>
ProcVersionSignature: User Name 3.13.0-74.118-generic 3.13.11-ckt30
SourcePackage: coreutils
Tags: trusty ec2-images
Uname: Linux 3.13.0-74-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
_MarkForUpload: True
$ cat tmp3
1,11111,1
207,970,60
807,120,600
$ sort -n -k2 -t, <tmp3
207,970,60
1,11111,1
807,120,600
$ sort -k2 -t, <tmp3
1,11111,1
807,120,600
207,970,60
Numeric sort places 970 before 120, I believe because it interprets
field 2 as having the values 11111,1 and 970,60 and 120,600, which after
comma removal become 111111, 97060 and 120600.
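The hypothesis above can be checked without relying on sort's own number parsing. The following is only a sketch: it builds the suspected key by hand (fields 2 and 3 joined with the comma dropped) and sorts on that in the C locale. The variable names and the awk/cut pipeline are illustrative, not part of sort itself.

```shell
# Same three lines as tmp3 in the report.
data='1,11111,1
207,970,60
807,120,600'

# Build the hypothesized key (fields 2 and 3 joined, comma dropped),
# prepend it with a tab, sort on it numerically in the C locale,
# then strip the key again.
emulated=$(printf '%s\n' "$data" |
  awk -F, '{ print $2 $3 "\t" $0 }' |
  LC_ALL=C sort -n -k1,1 |
  cut -f2-)

printf '%s\n' "$emulated"
```

If the hypothesis holds, this prints the lines in the same order as the sort -n -k2 -t, run shown above.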
Non-numeric sort places 120 before 970 but is not a numeric sort so
11111 appears before both of them. This is not a bug, and is simply
mentioned for context.
Note that this general class of problem also occurs with sort -n -k2,3
-t,
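For comparison, here is a sketch of the same sort with the key ended explicitly at field 2. It relies on the documented rule that a -k key with no end position runs to the end of the line, so both -k2 and -k2,3 produce a key that spans a comma on this input. Whether that rule fully accounts for the behavior reported here is an assumption, not something this report confirms.

```shell
# Same three lines as tmp3 in the report.
data='1,11111,1
207,970,60
807,120,600'

# -k2,2 starts and ends the key at field 2; the trailing n makes
# that one key numeric, so no comma is ever part of the key.
bounded=$(printf '%s\n' "$data" | sort -t, -k2,2n)

printf '%s\n' "$bounded"
```

With the key bounded this way, the lines should come out ordered by 120, 970 and 11111.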
An older implementation of sort (version 5.93 under OS X) shows the
same behavior, so this has been happening for quite a long time.
The problem looks like a modularity violation in the implementation of
numeric sort, so a fix will probably require re-implementing some part
of that system.
Using sort -g instead of sort -n seems to work around the problem. But
if this is somehow deemed not to be a bug in sort itself, it would still
be a bug in the manual page for sort (which does not mention or even
hint at this issue).
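A sketch of the -g workaround on the same data follows. It assumes a sort that supports -g (general numeric sort, as in GNU coreutils) and a locale in which the comma is not accepted as part of a floating-point number, so parsing stops at the first comma in the key.

```shell
# Same three lines as tmp3 in the report.
data='1,11111,1
207,970,60
807,120,600'

# -g converts the leading part of the key to a floating-point number;
# the key still runs from field 2 to end of line, but conversion is
# expected to stop at the comma after field 2.
general=$(printf '%s\n' "$data" | sort -g -t, -k2)

printf '%s\n' "$general"
```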
** Affects: coreutils (Ubuntu)
Importance: Undecided
Status: New
https://bugs.launchpad.net/bugs/1554647