[Bug 1124384] Re: Configuration reload clears event that others jobs may be waiting on
Stéphane Graber
stgraber at stgraber.org
Tue Apr 23 18:07:01 UTC 2013
So we've spent a good part of the afternoon going through the two cloud-
init bugs with James and came to the conclusion that they are actually
the same bug.
Both initctl reload-configuration and an upstart stateful re-exec
cause upstart to reload its configuration, destroy existing job class
entries and create new ones.
As part of destroying and re-creating the job class entries, upstart decrements the reference counters of some related objects, including emitted events.
As a result, if a job depends on two events, one already emitted and one not yet emitted, and the job that emitted the first event is reloaded, the record of that first event is dropped. The job then fails to start, because only half of its start condition can ever match.
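To make the failure mode concrete, here is a toy model in Python. It is only a sketch and not upstart's actual code: the names JobClass, reload_configuration and simulate are hypothetical, and it models the lost record of the emitted event directly rather than the reference counting that causes it.

```python
class JobClass:
    """Toy job class whose start condition is 'all events in start_on'."""

    def __init__(self, name, start_on):
        self.name = name
        self.start_on = frozenset(start_on)
        self.matched = set()   # events from start_on seen so far
        self.started = False

    def handle(self, event):
        # Record the event if it is part of the start condition,
        # then start the job once every required event has been seen.
        if event in self.start_on:
            self.matched.add(event)
        if self.matched == self.start_on:
            self.started = True


def reload_configuration(job):
    """Toy reload: destroy the entry and build a new one from the same
    definition. The partially matched state is NOT carried over, which is
    the behaviour described above (an assumption of this model)."""
    return JobClass(job.name, job.start_on)


def simulate(reload_between):
    """Emit two events, optionally reloading between them."""
    job = JobClass("waiter", {"first-event", "second-event"})
    job.handle("first-event")
    if reload_between:
        job = reload_configuration(job)
    job.handle("second-event")
    return job.started


if __name__ == "__main__":
    print("started without reload:", simulate(False))
    print("started with reload:   ", simulate(True))
```

In this model the job starts only when no reload happens between the two events; after a reload the first event is forgotten and the start condition never completes.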
The part of the code that causes this issue is post-reexec, which means that once we come up with a fix for this, we'll be able to SRU it and have upstart re-exec itself, applying the fix in the process.
That also means that we can't SRU any of upstart's dependencies until this issue is resolved.
James is currently working on testcases for the various scenarios that
we know we need to support, so we can have comprehensive regression
tests before we attempt to sort this issue. Our current hope is to have
a fix for this by the end of the week.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to upstart in Ubuntu.
https://bugs.launchpad.net/bugs/1124384
Title:
Configuration reload clears event that others jobs may be waiting on
Status in “cloud-init” package in Ubuntu:
Confirmed
Status in “upstart” package in Ubuntu:
Confirmed
Bug description:
Under bug 1080841 we made cloud-init invoke 'initctl
reload-configuration' after it wrote an upstart job. This was necessary
because inotify is not supported on all filesystems (overlayfs being
the one of most current interest).
This seems to be causing upstart some pain, and resulting in cloud-
final (and 'rc') not being run.
Easy user-data to reproduce the problem is:
 #cloud-config-archive
 - content: |
    #cloud-boothook
    #!/bin/sh
    touch /run/cloud-init-upstart-reload # hack, see trunk commit 783
 - content: |
    #!/bin/sh
    echo "==== $(date -R): user-script run ===" | tee /run/user-script.log
 - content: |
    #upstart-job
    description "a test upstart job"
    start on stopped rc RUNLEVEL=[2345]
    console output
    task
    script
      echo "==== $(date -R): upstart job run ===" | tee /run/upstart-job.log
    end script
You should (and do on quantal) end up with 2 files written to /run.
I've verified that the same behavior is true on quantal. If you
change cloud-init to notify upstart about a job immediately after it
writes it, then quantal's upstart gets confused also.
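A quick way to check the outcome of the reproducer after boot is sketched below. It assumes the "2 files" are the two logs written by tee in the user-data above (/run/user-script.log and /run/upstart-job.log); the helper name missing_files is made up for this example.

```python
import os

# Assumed to be the two files the reproducer should leave behind.
EXPECTED = ["/run/user-script.log", "/run/upstart-job.log"]


def missing_files(paths):
    """Return the subset of paths that do not exist."""
    return [p for p in paths if not os.path.exists(p)]


if __name__ == "__main__":
    missing = missing_files(EXPECTED)
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("both files present")
```

If /run/upstart-job.log is reported missing, the upstart job never ran, which would match the symptom described here.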
Related bugs:
* bug 1080841: should reload configuration if an upstart job is added
* bug 1103881: cloud-final is never executed if upstart is upgraded during initialization of the image
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1124384/+subscriptions