[Bug 1073478] Re: [SRU] Update squid3 upstart script to kill it with SIGINT and wait longer

Tiago Stürmer Daitx 1073478 at bugs.launchpad.net
Wed Oct 14 18:00:38 UTC 2015


On the contrary, thanks for bringing that up. I was good to be made aware
of such problems. Just because it worked now on Precise and Trusty didn't
mean I should skip that check, as I might as well run into an obsolete or
deleted feature (not that upstart seem to have those). =)
On Oct 14, 2015 9:56 AM, "Robie Basak" <1073478 at bugs.launchpad.net> wrote:

> On Wed, Oct 14, 2015 at 12:34:33PM -0000, Tiago Stürmer Daitx wrote:
> > According to http://upstart.ubuntu.com/cookbook/#stanzas-by-category
> > both "kill timeout" (all upstart versions) and "kill signal" (since
> > upstart 1.3) should be supported by upstart 1.5 in Precise so this SRU
> > should be fine.
>
> You're right. Sorry for the noise.
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1073478
>
> Title:
>   [SRU] Update squid3 upstart script to kill it with SIGINT and wait
>   longer
>
> Status in squid3 package in Ubuntu:
>   Triaged
>
> Bug description:
>   [Impact]
>
>   Squid 3.1.19-1ubuntu3.12.04.3 on Precise and 3.3.8-1ubuntu6.3 on
>   Trusty do not clean up its PID file, causing logrotate to spit out an
>   error each night when cron runs it.
>
>   The underlying issue here is that the upstart script does not wait
>   long enough for squid3 to finish - one might also argue that it is
>   using the wrong signal if it expects squid to finish quickly.
>
>   By default, when squid3 receives a SIGTERM it will close the socket
>   for incoming connections and will wait for existing connections to
>   complete. The shutdown_lifetime directive configures how long it waits
>   before forcefully closing those open connections and it is set by
>   default to 30 seconds.
>
>   Current setting, SIGKILL happens after 5 seconds (which is the upstart
> default):
>   =====
>   # strace -r -e trace=signal,file -p $(cat /var/run/squid3.pid)
>   Process 20865 attached - interrupt to quit
>        0.000000 --- SIGTERM (Terminated) @ 0 (0) ---
>        0.000440 rt_sigreturn(0xf) = -1 EINTR (Interrupted system call)
>        0.000418 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118,
> ...}) = 0
>        5.006483 +++ killed by SIGKILL +++
>
>   Killing squid with SIGKILL might lead to cache index corruption and a
>   very slow startup the next time it is started: http://lists.squid-
>   cache.org/pipermail/squid-users/2015-July/004752.html
>
>   The first way to avoid this is by increasing upstart's "kill timeout"
>   to a few seconds more then the "shutdown_lifetime" directive.
>
>   With "kill timeout 40" in upstart script, notice the pid file being
> unlinked (the ~30 seconds wait is due to the default value of
> shutdown_lifetime directive):
>   =====
>   # strace -r -e trace=signal,file -p $(cat /var/run/squid3.pid)
>   Process 20805 attached - interrupt to quit
>        0.000000 --- SIGTERM (Terminated) @ 0 (0) ---
>        0.000284 rt_sigreturn(0xf) = -1 EINTR (Interrupted system call)
>        0.000402 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118,
> ...}) = 0
>       31.611726 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118,
> ...}) = 0
>        0.001792 --- SIGCHLD (Child exited) @ 0 (0) ---
>        0.000247 rt_sigreturn(0xffffffffffffffff) = 63
>        0.006027 unlink("/var/run/squid3.pid") = 0
>        0.000539 open("/proc/sys/kernel/ngroups_max", O_RDONLY) = 0
>        0.000387 open("/etc/group", O_RDONLY|O_CLOEXEC) = 0
>   Process 20805 detached
>
>   Still, this will unfortunately lead to a slow shutdown time as
>   reported in http://askubuntu.com/questions/18127/squid3-starts-and-
>   stops-slowly
>
>   To solve this it is better to issue a SIGINT instead of SIGTERM
>   (upstart default) by adding "kill signal SIGINT" to squid's upstart
>   script.
>
>   With "kill signal SIGINT", which does not wait on outstanding
> connections:
>   =====
>   # strace -r -e trace=signal,file -p $(cat /var/run/squid3.pid)
>   Process 20891 attached - interrupt to quit
>        0.000000 --- SIGINT (Interrupt) @ 0 (0) ---
>        0.000312 rt_sigreturn(0x2) = -1 EINTR (Interrupted system call)
>        0.000237 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118,
> ...}) = 0
>        1.123564 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118,
> ...}) = 0
>        0.000754 --- SIGCHLD (Child exited) @ 0 (0) ---
>        0.000103 rt_sigreturn(0xffffffffffffffff) = 0
>        0.002031 unlink("/var/run/squid3.pid") = 0
>        0.000154 open("/proc/sys/kernel/ngroups_max", O_RDONLY) = 0
>        0.000142 open("/etc/group", O_RDONLY|O_CLOEXEC) = 0
>   Process 20891 detached
>
>   The whole stop process is very quick. The downside is that squid will
>   forcefully close all open connections immediately, but that was
>   already happening when SIGKILL was being issued before - so we are
>   actually keeping the same behavior as before and there are no bugs
>   complaining about it. And the PID file gets removed anyway.
>
>   The proposal is to add both "kill signal SIGINT" and "kill timeout 40"
>   to squid, so squid should have enough time to close all open cache
>   index files.
>
>   [Test Case]
>
>   # start squid3
>   squid3 start/running, process 10392
>   # cat /var/run/squid3.pid
>   10392
>   # stop squid3
>   squid3 stop/waiting
>   # cat /var/run/squid3.pid
>   10392
>   # logrotate -f /etc/logrotate.d/squid3
>   squid: ERROR: Could not send signal 10 to process 10392: (3) No such
> process
>   error: error running shared postrotate script for '/var/log/squid3/*.log
> '
>
>   [Regression Potential]
>
>   * The main regression potencial is related to the "kill timeout" value.
> If set too high, a user's system might *seem* too slow to shutdown. Note
> that this should rarely happen: only when squid actually needs a long time
> to write down and close its cache index files (high system load maybe?).
> Overall, whenever a system behaves this way the user probably has other
> problems to solve.
>   * If "kill timeout" value is set too low we might incur in the same
> problem originally reported and end up calling SIGKILL on squid too early,
> probably while it is closing its cache index files, thus corrupting them.
> Anyway, this was probably happening already anyway, as SIGKILL is always
> called after 5 seconds in the current upstart script.
>
>   Overall, there is very low regression potential for this fix.
>
>
>   [Other info]
>
>   # lsb_release -d
>   Description:    Ubuntu 12.04.5 LTS
>   # dpkg -l squid3
>   Desired=Unknown/Install/Remove/Purge/Hold
>   |
> Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
>   |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
>   ||/ Name               Version            Description
>
> +++-==================-==================-====================================================
>   ii  squid3             3.1.19-1ubuntu3.12 Full featured Web Proxy cache
> (HTTP proxy)
>
>   # lsb_release -d
>   Description:  Ubuntu 14.04.3 LTS
>   # dpkg -l squid3
>   Desired=Unknown/Install/Remove/Purge/Hold
>   |
> Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
>   |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
>   ||/ Name                                        Version
>   Architecture               Description
>
> +++-===========================================-==========================-==========================-===========================================================================================
>   ii  squid3                                      3.3.8-1ubuntu6.3
>    amd64                      Full featured Web Proxy cache (HTTP proxy)
>
> To manage notifications about this bug go to:
>
> https://bugs.launchpad.net/ubuntu/+source/squid3/+bug/1073478/+subscriptions
>

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to squid3 in Ubuntu.
https://bugs.launchpad.net/bugs/1073478

Title:
  [SRU] Update squid3 upstart script to kill it with SIGINT and wait
  longer

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/squid3/+bug/1073478/+subscriptions



More information about the Ubuntu-server-bugs mailing list