[apparmor] [patch] update aa.py is_skippable_file() according to libapparmor
Christian Boltz
apparmor at cboltz.de
Thu Dec 25 00:32:31 UTC 2014
Hello,
Am Mittwoch, 24. Dezember 2014 schrieb Kshitij Gupta:
> On Sun, Dec 7, 2014 at 2:49 AM, Christian Boltz wrote:
> > this patch updates is_skippable_file() to match all extensions that
> > are listed in libapparmor _aa_is_blacklisted() - some extensions
> > were missing in the python code.
> >
> > Also make the code more readable - even if it merges 4 re.search()
> > into one, and add some testcases.
> >
> > Notes:
> > - the original code additionally ignored *.swp. I didn't include
> > that ->
> > *.swp looks like vim swap files which are also dot files
>
> I'm not sure do we want to keep this or not. As stated parser_misc
> doesn't ignore this.
> A abc.txt.swp (open by vim) wont be covered under the dot file?
> Also there are other ways a .swp file maybe created.
vim creates a dot file - "vi abc.txt" creates .abc.txt.swp, and that is
already covered by the "filename starts with a dot" rule.
> We'll get a bug report if it is meant to be there.
;-)
> A user says even mv created a .swp file:
> http://unix.stackexchange.com/questions/27923/what-causes-swap-files-t
> o-be-created
Maybe (I never checked), but moving around profiles in GB size range is
a very rare event ;-) and for smaller files it's unlikely that the tools
see that file in the second it exists.
> - the python code ignores README files, but the C code doesn't
>
> > (do we need to add README in the C code?)
>
> Somebody else can answer this?
John? Tyler? Steve?
> > [ is_skippable_file.diff ]
> >
> > === modified file 'utils/apparmor/aa.py'
> > --- utils/apparmor/aa.py 2014-11-29 12:40:10 +0000
> > +++ utils/apparmor/aa.py 2014-12-06 21:12:49 +0000
> > @@ -2546,13 +2546,16 @@
> >
> > else:
> > return False
> >
> > -# rpm backup files, dotfiles, emacs backup files should not be
> > processed -# The skippable files type needs be synced with apparmor
> > initscript>
> > def is_skippable_file(path):
> > - """Returns True if filename matches something to be skipped"""
> > - if (re.search('(^|/)\.[^/]*$', path) or
> > re.search('\.rpm(save|new)$', path)
> > - or re.search('\.dpkg-(old|new)$', path) or
> > re.search('\.swp$', path)
> > - or path[-1] == '~' or path == 'README'):
> > + """Returns True if filename matches something to be skipped
> > (rpm or dpkg backup files, hidden files etc.)
> > + The list of skippable files needs to be synced with
> > apparmor
> > initscript and libapparmor _aa_is_blacklisted()
> > + path: filename (without directory)"""
> > +
> > + skippable_suffix =
> > '(\.dpkg-new|\.dpkg-old|\.dpkg-dist|\.dpkg-bak|\.rpmnew|\.rpmsave|\.
> > orig|\.rej|~)$'
> > + skippable_files = '^(.*/)?(README|\.[^/]*|)$'
> > # README, dot files and empty filename
>
> I suppose the number of calls to this function will be many as a large
> number of files are parsed/ignored.
Yes, basically number of files in /etc/apparmor.d/ (and subdirectories?)
That are quite some files, but we have places that are called more often
(for example parsing each line in the profile ;-)
> So, might I suggest that we (over?)optimise this?
>
> I can think of two ways:
> 1. Use the regex and use re.compile for all the three regex's outside
> the function call (It shows considerable speedup as tested by
> cProfile in Python2.7)
Indeed, re.compile is a good idea.
Note that the first two are only strings that are then merged into
skippable_files (to keep the regex parts easier readable).
Only skippable_files is used as regex.
> 2. Use simple string methods and leverage os.path submodule. This
> gives probably performance better than uncompiled regex and only a
> bit better than compiled versions.
>
> An alternative version string based:
>
> def is_skippable_file(path):
> basename = os.path.basename(path)
> if not basename or basename[-1] == '~' or basename[0] == '.' or
> basename == 'README':
> return True
> skippable_extension = ['.dpkg-new', '.dpkg-old', '.dpkg-dist',
> '.dpkg-bak', '.rpmnew', '.rpmsave', '.orig', '.rej']
> extension = os.path.splitext(basename)[1]
> if extension in skippable_extension:
> return True
> return False
That gives slightly less readable code IMHO - and the os.path.* calls
also don't come for free. If it doesn't bring a big performance boost,
I'd prefer the regex-based way.
BTW: using basename.endswith() could also be an option, and the python
docs say that it accepts tuplis as suffix since py 2.5.
> Letme know if you want to have a look at the script I used to compare
> the performance.
Yes, that would be interesting.
I'd also be interested in some numbers - how long do the following
variants take?
a) the "old" code we currently have in bzr (hint: it does 4 re.search)
b) with my patch applied (only one re.search, so I'd expect it to be
faster)
c) with my patch applied + using re.compile for skippable_files
c1) with re.compile inside the function (= one re.compile per file)
c2) with re.compile outside the function (= only one re.compile in
total)
d) with the code using os.path.* you proposed
(If you send me the test script, I can do the tests myself - but I won't
object if you do them and send the result ;-)
> + skippable_all = '%s|%s' % (skippable_suffix, skippable_files)
>
> > +
> >
> > + if re.search(skippable_all, path):
> > return True
> >
> > Also maybe add a "return False" at the end ;-)
Python returns None by default, which is "good enough" ;-)
OTOH, it never hurts if the code explicitely states what it does, so
I'll add it in the next patch version.
> > def is_skippable_dir(path):
> > === modified file 'utils/test/test-aa.py'
> > --- utils/test/test-aa.py 2014-11-27 22:20:26 +0000
> > +++ utils/test/test-aa.py 2014-12-06 21:06:15 +0000
...
> The tests look good. I'll hold-back the ACK for the modified
> version.
>
> Please have a look at the suggestions.
>
> Merry Christmas. (Apparently you were expecting an ACK under the
> christmas tree ;-) )
A review with comments is also ok ;-)
Regards,
Christian Boltz
--
[GUI vs. Command-Line] Einen ähnlichen Streit wird es in 20 Jahren
auch geben, wenn die "2D-Screenfanatiker" auf die "VR Fans" losgehen
und wieder ein Streit vom Zaun bricht der an Sinnfreiheit kaum zu
überbieten ist. [Phillip Richdale in suse-linux]
More information about the AppArmor
mailing list