Request for help / ideas to debug issue
Alfonso Sanchez-Beato
alfonso.sanchez-beato at canonical.com
Fri Mar 10 13:58:22 UTC 2017
On Fri, Mar 10, 2017 at 10:22 AM, John Lenton <john.lenton at canonical.com>
wrote:
> Hello!
>
> We're seeing a weird issue with either go, pthreads, or the kernel. If
> you're knowledgeable about one or more of those things, could you take
> a look? Thank you.
>
> The issue manifests as nasty warnings from the "snap run" command,
> which is also the first step into a snapped app or service. It looks
> like
>
> runtime/cgo: pthread_create failed: Resource temporarily unavailable
>
> a very stripped-down reproducer is http://pastebin.ubuntu.com/24150663/
>
> build that, run it in a loop, and you'll see a bunch of those messages
> (and some weirder ones, but let's take it one step at a time).
>
> if you comment out the 'import "C"' line the message will change but
> still happen, which makes me think that at least in part this is a Go
> issue (or that we're holding it wrong).
>
> Note that the exec does work; the warning seems to come from a
> different thread than the one doing the Exec (the other clue that
> points in this direction is that sometimes the message is truncated).
> You can verify the fact that it does run by changing the /bin/true to
> /bin/echo os.Args[1], but because this issue is obviously a race
> somewhere, this change makes it less likely to happen (from ~10% down
> to ~0.5% of runs, on my machines).
>
> One thing that makes this harder to debug is that strace'ing the
> process hangs (hard, kill -9 of strace to get out) before reproducing
> the issue. This probably means we need to trace it at a lower level,
> and I don't know enough about tracing a process group from inside the
> kernel to be able to do that; what I can find about kernel-level
> tracing is around syscalls or devices.
>
> Ideas?
>
I found this related thread:
https://groups.google.com/forum/#!msg/golang-nuts/8gszDBRZh_4/lhROTfN9TxIJ
<<
I believe this can happen on GNU/Linux if your program uses cgo and if
thread A is in the Go runtime starting up a new thread B while thread
C is execing a program. The underlying cause is that while one thread
is calling exec the Linux kernel will fail attempts by other threads
to call clone by returning EAGAIN. (Look for uses of the in_exec
field in the kernel sources.)
>>
Adding a short sleep makes the messages go away, for instance:
http://paste.ubuntu.com/24151637/
where the program sleeps for 1 ms before calling Exec. With shorter sleeps
(say, 20 us) the issue still happens.
It looks to me as if, right before running main(), Go creates some threads by
calling clone() and probably hits the race described in that thread. Since
you are calling Exec anyway, I guess the messages are harmless: you do not
need the Go threads. Nonetheless, I think the Go runtime should retry
instead of printing that message.