logrotate configuration seems wrong

John Meinel john at arbash-meinel.com
Mon Sep 15 07:45:11 UTC 2014


I did try running "logrotate.run" directly and saw it do nothing. After
doing "s/512/488/" it did, indeed, create a .1.gz. So my thought that the
bug was the "rotate 1" may actually be wrong, and the bug is just how
"size 512M" interacts with the rsyslog configuration.
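
As a sanity check on the arithmetic, here is a minimal Python sketch. It
assumes logrotate reads the "M" suffix as 1024*1024 bytes and rotates once
the file reaches that size (the exact boundary comparison is an assumption).
The function names are made up for illustration, and the byte counts are the
ones quoted below.

```python
# Sketch only: assumes logrotate treats "M" as 1024 * 1024 bytes and
# rotates once the file is at least that large. Names are illustrative.

MIB = 1024 * 1024

RSYSLOG_LIMIT = 512_000_000   # byte limit from the $outchannel line
OBSERVED_SIZE = 512_000_006   # all-machines.log size reported by ls


def logrotate_threshold(megabytes: int) -> int:
    """Bytes that a 'size <megabytes>M' directive is assumed to mean."""
    return megabytes * MIB


def would_rotate(file_size: int, megabytes: int) -> bool:
    """True if a file of file_size bytes meets the assumed threshold."""
    return file_size >= logrotate_threshold(megabytes)


if __name__ == "__main__":
    for setting in (512, 488):
        print(setting, logrotate_threshold(setting),
              would_rotate(OBSERVED_SIZE, setting))
```

Under those assumptions, "size 512M" (536,870,912 bytes) never fires for a
file that rsyslog caps near 512,000,000 bytes, while "size 488M"
(511,705,088 bytes) does, which matches what I saw.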

John
=:->


On Mon, Sep 15, 2014 at 11:40 AM, John Meinel <john at arbash-meinel.com>
wrote:

> Looking again, I wonder if this is a 1024 vs 1000 bug. Specifically,
> rsyslog's configuration says:
> $outchannel logRotation,{{logDir}}/all-machines.log,512000000,{{logrotateHelperPath}}
>
> and the logrotate configuration says:
> /var/log/juju/all-machines.log {
>     size 512M
> ...
> }
>
> I'm wondering if logrotate is running and saying 512,000,006 <
> 512*1024*1024 (536,870,912), and not rotating anything, and rsyslog is
> refusing to write since the file is too big.
>
> Regardless, I'm surprised to see it just-not-working in practice, as I
> would have thought someone would have actually tested this before landing
> it.
>
> John
> =:->
>
> On Mon, Sep 15, 2014 at 9:38 AM, John Meinel <john at arbash-meinel.com>
> wrote:
>
>> Going further on this, as I'm discovering new oddities:
>>
>> 1) ll /var/log/juju
>> -rw------- 1 syslog adm    512000006 Sep 15 05:16 all-machines.log
>>
>> Seems surprising that it is 5120000* bytes long.
>>
>> 2) tail -f /var/log/juju/all-machines.log
>> nothing happening
>>
>> 3) I see more data coming into /var/log/juju/machine-0.log. Why isn't it
>> ending up in all-machines.log?
>>
>> 4) Given that we are using 2 different rotation mechanisms, did we test
>> that messages that were just written to machine-0.log still get copied to
>> all-machines.log when machine-0.log ends up getting rotated?
>>
>> 5) I'm pretty sure we didn't, because clearly we didn't test that
>> all-machines.log actually gets any more data once it 'fills' up.
>>
>> 6) Note that I do still see rsyslog running and it is using a fair amount
>> of CPU, so it is still getting messages from other units, etc. machine-0
>> agent and mongod are both quite active as well.
>>
>> 7) "copytruncate" seems the wrong setting for interacting with rsyslog. I
>> believe rsyslog is already aware that the file needs to be rotated, and
>> thus it shouldn't be trying to write to the same file handle (and thus we
>> don't need to truncate in place). I'm not 100% sure on the interactions
>> here, but "copytruncate" seems to have an inherent likelihood of dropping
>> data (while you are copying, if any data gets written then you'll miss
>> those last few bytes when you go to truncate, right?)
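
The window described in point 7 can be illustrated with a small Python
sketch. It is not how logrotate is implemented; it only mimics the
copy-then-truncate order with a write landing in between, and every name in
it is hypothetical.

```python
# Sketch only: imitates the copy-then-truncate sequence with a write that
# lands between the two steps. Not logrotate's actual code.
import os
import shutil
import tempfile


def copytruncate_with_late_write(initial: bytes, late: bytes):
    """Return (rotated copy contents, live log contents) after the race."""
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "app.log")
        rotated = log + ".1"

        with open(log, "wb") as f:
            f.write(initial)

        # Step 1: the rotator copies the live log to the backup name.
        shutil.copyfile(log, rotated)

        # In the gap, the application appends one more message.
        with open(log, "ab") as f:
            f.write(late)

        # Step 2: the rotator truncates the live log in place.
        os.truncate(log, 0)

        with open(rotated, "rb") as f:
            kept = f.read()
        with open(log, "rb") as f:
            remaining = f.read()
        return kept, remaining


if __name__ == "__main__":
    kept, remaining = copytruncate_with_late_write(b"old line\n", b"late line\n")
    print("in backup:", kept)
    print("in live log:", remaining)
```

In this sketch the late message ends up in neither file: the copy was taken
before it was written, and the truncate discards it.
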
>>
>> I'm happy to see logrotation being added, but it seems quite half-baked
>> in our current trunk.
>>
>> Am I just doing something wrong? Did someone actually test all of this
>> out and it was working for them?
>>
>> John
>> =:->
>>
>> On Mon, Sep 15, 2014 at 9:13 AM, John Meinel <john at arbash-meinel.com>
>> wrote:
>>
>>> So I was testing scaling today which generally generates huge volumes of
>>> logging (I actually wanted to keep it because I used it for seeing how
>>> everything went, but I understand why we are rotating the logs.)
>>>
>>> However, I found this as it was running:
>>> # ls -sh /var/log/juju
>>> total 909M
>>> 323M all-machines.log
>>> 4.0K ca-cert.pem
>>> 4.0K logrotate.conf
>>> 4.0K logrotate.run
>>> 301M machine-0-2014-09-15T05-02-27.486.log
>>> 301M machine-0-2014-09-15T05-06-53.779.log
>>>  80M machine-0.log
>>> 4.0K rsyslog-cert.pem
>>> 4.0K rsyslog-key.pem
>>>
>>> Notice that there is only 1 all-machines.log that is 300MB in size,
>>> while there are 2 machine-0 logs.
>>>
>>> And when I track down the various configuration files, I find
>>> # cat logrotate.conf
>>>
>>> /var/log/juju/all-machines.log {
>>>     size 512M
>>>     # don't move, but copy-and-truncate so the application won't have
>>>     # to be told that the file has moved.
>>>     copytruncate
>>>     # maximum of one old file
>>>     rotate 1
>>>     # counting old files starts at 1 rather than 0
>>>     start 1
>>>     # use compression
>>>     compress
>>> }
>>>
>>>
>>> I have the feeling that someone didn't realize "rotate 1" means only
>>> keep the original log file. As in, there are *no* backup files.
>>>
>>> Did the person who implemented this actually test it?
>>>
>>> Did we ever fix things so that "juju debug-log" doesn't become
>>> immediately useless once you reach the rotate threshold (i.e. so that it
>>> can look in the backup log files)?
>>>
>>> I can understand not fixing debug-log, but I'm a bit surprised that our
>>> idea of "all-machines.log needs to be rotated" became "all-machines.log
>>> needs to be truncated".
>>>
>>> John
>>> =:->
>>>
>>>
>>
>