[Bug 1672819] Re: exec'ing a setuid binary from a threaded program sometimes fails to setuid
    Michael Hudson-Doyle 
    michael.hudson+lp at canonical.com
       
    Mon Jul  3 00:11:50 UTC 2017
    
    
  
** Changed in: golang-1.6 (Ubuntu Xenial)
       Status: New => In Progress
-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to golang-1.6 in Ubuntu.
https://bugs.launchpad.net/bugs/1672819
Title:
  exec'ing a setuid binary from a threaded program sometimes fails to
  setuid
Status in Linux:
  Unknown
Status in golang-1.6 package in Ubuntu:
  Invalid
Status in linux package in Ubuntu:
  Fix Released
Status in golang-1.6 source package in Xenial:
  In Progress
Status in linux source package in Xenial:
  Fix Released
Status in golang-1.6 source package in Yakkety:
  Invalid
Status in linux source package in Yakkety:
  Fix Released
Status in golang-1.6 source package in Zesty:
  Invalid
Status in linux source package in Zesty:
  Fix Released
Bug description:
  == SRU template for golang-1.6 ==
  [Impact]
  The kernel bug reported below means that occasionally (maybe 1 in 1000 times) the snapd -> snap-confine exec that is part of running a snap fails to honour the setuid bit on the snap-confine binary, so the execution fails. This is extremely confusing for the user of the snap, who just sees a permission denied error with no explanation.
  The kernel bug has been fixed in Xenial+, but not all users of snapd are on Xenial+ kernels (they might be on Trusty or another distribution entirely).
  Backporting this fix means that the snapd in the core snap will get the workaround the next time it is built. Because the snapd in Trusty or the other distro re-execs into the snapd in the core snap before exec'ing snap-confine, users should no longer see the above behaviour.
  [Test case]
  This will be a bit tricky as the kernel bug has been fixed. A xenial container on a trusty host/VM should do the trick. The test case from https://gist.github.com/chipaca/806c90d96c437444f27f45a83d00a813 should be sufficient to demonstrate the bug and then, once golang-1.6 has been upgraded from proposed, the fix.
  [Regression potential]
  If there is a bug in the patch it could cause deadlocks in currently working programs. But the patch is pretty simple and has passed review upstream so I think it should be OK.
  == SRU REQUEST XENIAL, YAKKETY, ZESTY ==
  Due to two race conditions in check_unsafe_exec(), exec'ing a setuid
  binary from a threaded program sometimes fails to setuid.
  == Fix ==
  Sauce patch for Xenial, Yakkety + Zesty:
  https://lists.ubuntu.com/archives/kernel-team/2017-May/084102.html
  This fix re-executes the unsafe check if there is a discrepancy
  between the expected fs count and the found count during the racy
  window while a thread is exec'ing or exiting. The re-check occurs
  very infrequently and avoids a lot of additional locking on
  per-thread structures that would make fork/exec/exit prohibitively
  expensive.
  == Test case ==
  See the example C code in the patch:
  https://lists.ubuntu.com/archives/kernel-team/2017-May/084102.html
  Run the test code as follows: for i in $(seq 1000); do ./a; done
  With the patch, no messages are emitted. Without the patch, one sees
  a message:
  "Failed, got euid 1000 (expecting 0)"
  which shows that the setuid program failed the check_unsafe_exec()
  check because of the race.
  == Regression potential ==
  A bug in the fix could break existing safe exec semantics.
  ====================
  This can be reproduced with
  https://gist.github.com/chipaca/806c90d96c437444f27f45a83d00a813
  With that, and Go 1.8, if you run “make” and then
  for i in `seq 99`; do ./a_go; done
  you'll see a variable number of “GOT 1000” (or whatever your user id
  is). If you don't, add one or two more 9s on there.
  That's a simple go reproducer. You can also use “a_p” instead of
  “a_go” to see one that only uses pthreads. “a_c” is a C version that
  does *not* reproduce the issue.
  But it's not pthreads: if in a_go.go you comment out the “import "C"”,
  you'll still see the “GOT 1000” messages, in a static binary that uses
  no pthreads, just clone(2). You'll also see a bunch of warnings
  because it's not properly handling an EAGAIN from clone, but that's
  unrelated.
  If you pin the process to a single CPU using taskset, you don't get
  the issue from a_go; a_p continues to reproduce the issue. In some
  virtualized environments we haven't been able to reproduce the issue
  either (e.g. some aws instances), but kvm works (you need -smp to see
  the issue from a_go).
  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: linux-image-4.4.0-64-generic 4.4.0-64.85
  ProcVersionSignature: Ubuntu 4.4.0-64.85-generic 4.4.44
  Uname: Linux 4.4.0-64-generic x86_64
  NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
  ApportVersion: 2.20.1-0ubuntu2.5
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/pcmC0D0p:   john       2354 F...m pulseaudio
   /dev/snd/controlC0:  john       2354 F.... pulseaudio
  CurrentDesktop: Unity
  Date: Tue Mar 14 17:17:23 2017
  HibernationDevice: RESUME=UUID=b9fd155b-dcbe-4337-ae77-6daa6569beaf
  InstallationDate: Installed on 2014-04-27 (1051 days ago)
  InstallationMedia: Ubuntu 14.04 LTS "Trusty Tahr" - Release amd64 (20140417)
  MachineType: Dell Inc. Latitude E6510
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.4.0-64-generic root=/dev/mapper/ubuntu--vg-root ro enable_mtrr_cleanup mtrr_spare_reg_nr=8 mtrr_gran_size=32M mtrr_chunk_size=32M quiet splash
  RelatedPackageVersions:
   linux-restricted-modules-4.4.0-64-generic N/A
   linux-backports-modules-4.4.0-64-generic  N/A
   linux-firmware                            1.157.8
  SourcePackage: linux
  SystemImageInfo: Error: command ['system-image-cli', '-i'] failed with exit code 2:
  UpgradeStatus: Upgraded to xenial on 2015-06-18 (634 days ago)
  dmi.bios.date: 12/05/2013
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: A16
  dmi.board.vendor: Dell Inc.
  dmi.chassis.type: 9
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: dmi:bvnDellInc.:bvrA16:bd12/05/2013:svnDellInc.:pnLatitudeE6510:pvr0001:rvnDellInc.:rn:rvr:cvnDellInc.:ct9:cvr:
  dmi.product.name: Latitude E6510
  dmi.product.version: 0001
  dmi.sys.vendor: Dell Inc.
To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1672819/+subscriptions
    
    