[apparmor] [patch 1/3] add base and capability rule classes

Fri Nov 21 23:38:02 UTC 2014

Hello,

Am Donnerstag, 20. November 2014 schrieb Steve Beattie:
> On Wed, Nov 19, 2014 at 01:40:30AM +0100, Christian Boltz wrote:
> > Am Dienstag, 18. November 2014 schrieb Steve Beattie:
> > > > The layout of the base_rules and capability_rules class is not
> > > > too
> > > > different from the proposal I sent some weeks ago.
> > > > 
> > > > The biggest difference is that the class doesn't store the full
> > > > profile and include list. When I tried to do this, aa-logprof
> > > > started to eat *lots of* memory before I Ctrl-C'd it in a
> > > > deepcopy() operation. Either we find a way to implement this in
> > > > a
> > > > sane way that doesn't eat memory, or we just keep it out ;-)
> > > 
> > > Do you have a code snippet of how you went about this? Keeping
> > > references to things should be relatively lightweight. Perhaps
> > > it's
> > > an artifact of the specific implementation of the global hasher
> > > thingy.
> > 
> > Something like that (IIRC):
> > 
> > class capability_rules(base_rules):
> >     fullprofile = None
> >     includes = None
> >     
> >     def __init__(self, fullprofile, includes):
> >         self.fullprofile = fullprofile
> >         self.includes = includes
> > 
> > then use
> > 
> >     foo = capability_rules(aa[profile][hat], include)
> > 
> > That said - keeping the includes outside isn't as bad as I thought,
> > so don't worry about that too much ;-)
> 
> Okay. Again, my suspicion is that this is an artifact of the hasher,
> as normally python just retains a reference to an object, unless you
> do specific work to make a deep copy.

We can test again when we got rid of hasher ;-)

> > > >     def __init__(self):
> > > >         self.delete_all_rules()
> > > >     
> > > >     def delete_all_rules(self):
> > > >         self.rules = []
> > > 
> > > Why is delete_all_rules() necessary? __init__() should probably
> > > just
> > > do the initialization of self.rules directly, and we really
> > > shouldn't
> > > be re-using an object to encapsulate a different set of rules.
> > 
> > I'm using it from aa.py when writing the profile in non-clean mode,
> > and that code needs to clear the rules it already wrote. I could
> > also use> 
> >     aa[profile][hat]['capability'] = capability_rules()
> > 
> > there. However, using
> > 
> >     aa[profile][hat]['capability'].delete_all_rules()
> > 
> > has a big advantage: We can replace 'capability' with a variable and
> > easily loop over all rule types. You can't do that with
> > 
> >     ... = capability_rules() ;-)
> 
> Mmm, I'm still not entirely convinced, but it feels like it again
> might be an artifact of the hasher and/or a lack of use of classes
> for profiles.

Imagine aa.py will have something like this one day (for non-clean 
mode):

for default_write_order as rule_type:
    data += aa[profile][hat][rule_type].get_raw()
    aa[profile][hat][rule_type].delete_all_rules()

If you tell me another way to implement the last line, I'll use it ;-)

> > > And realistically, if you just declare 'rules = []' as a field in
> > > the
> > > class, then neither function is necessary (though it wouldn't
> > > surprise me if we needed __init__() for other reasons).
> > 
> > __init() is needed - try running the testsuite without it and you'll
> > know the reason ;-)
> > 
> > (and yes, I was also surprised that I explicitely need self.rules =
> > [] in __init__() - what python does without the __init__() isn't
> > what I'd expect)
> 
> Right, this was me being dumb, and not remembering that fields
> declared at the class level are Class Attributes and thus modifying
> what the attribute references in a method changes it for *all*
> instances of that class (e.g. if the attribute is a list, items
> appended to the list will appear for all instances of the class).

Does this interesting[tm] behaviour have a real-world usecase? 
("confuse developers" doesn't count ;-)

> (The python tutorial section at
> https://docs.python.org/3/tutorial/classes.html#class-and-instance-va
> riables is a pretty good overview).

Ah, that at least explains the behaviour which is totally different from 
what PHP (and every other language?) does. In PHP, everything declared 
inside a class is local to the instance.

> Generally we should probably avoid Class Attributes unless they are
> something immutable across the class.

OK, that means all variables (except can_glob and can_glob_ext) should 
be initialized in __init__.

> > BTW: is ..._rule vs ..._rules ok or should we use a name that
> > differs
> > more? (If yes, any idea?)
> 
> So, first off, stylistically, it's a common style to use CamelCase
> for python classes, to make them easier to distinguish from functions
> or methods. Second, I agree that failing to distinguish between _rule
> and _rules (or Rule and Rules) is probably too easy; how about Rule
> and Ruleset?

Good idea, changed.

> > > >     def delete_duplicates(self, inccaps):
> > > >         '''Delete duplicate rules.
> > > 
> > > What is the inccaps argument here?
> > 
> > The capabilities of an include file
> > (include[file][file]['capability'] in aa.py IIRC). I should
> > probably rename inccaps to include_rules to make the name fit for
> > all rule types.
> 
> Yes please.

Done.

> > > >     def get_glob_ext(self, path_or_rule):
> > > >         '''returns the next possible glob with extension (for
> > > >         file
> > > >         rules only).
> > > 
> > > When we come around to dbus and some of the other rule types, it's
> > > likely they will also have meaningful results. What's the return
> > > type/value for if there's no meaningful glob? Would 'capability,'
> > > (*shudder*) be a valid glob for capabilities?
> > 
> > At the moment, my plan is to return a string (basically a rawrule),
> > but that's something we can change if you have a better idea.
> > 
> > And yes, "capability," would be a valid glob - and also something I
> > wouldn't want to have in my profiles ;-)
> > 
> > BTW: get_glob_ext() is for "Glob with (E)xt" - for capabilities,
> > only
> > get_glob() is available.
> 
> Well, what I was more proposing/thinking-incoherently-about was a
> generic return-a-set-of-proposed-globs method, where (g)lob and glob
> with (e)xt are elements of that set, but eh.

Let's come back to this when we have some code that needs it ;-)

> > > > utils/apparmor/rule/capability.py:
> > > > 
> > > > class capability_rule(base_rule):
> > > >     '''Class to handle and store a single capability rule'''
> > > >     
> > > >     def __init__(self):
> > > >         self.capability = []
> > > 
> > > Why is it a list, for a single capability? (And again, just
> > > declare
> > > the field with the initial value.)
> > 
> > Because someone had the idea to allow
> > 
> >     capability chown chgrp audit_write,
> > 
> > in the profiles ;-)
> > 
> > Until now, the utils just parsed this as
> > 
> >     "capability"     "chown chgrp audit_write"
> > 
> > but that makes finding duplicates quite hard ;-)
> 
> Mmm, right. A suggestion would be to make it a set() instead of a
> list, to force elimination of duplicates.

Might be an idea, but then we need to ensure not to accidently create 
keys in it (see hasher ;-)

Added as comment for now.

> At sort of a meta level, I know in lots of instances we try to avoid
> making unnecessary changes to policy, yet I think there is value in
> normalizing policy as well, which you could consider separating out a
> multiple capability rule into multiple one-capability-per-rule rules
> as being a type of normalization.

We have get_clean() and get_raw() to explicitely choose between original 
rules and "cleaned up" rules, so that wouldn't be a problem. 

The only problem would be the data type of the *Rule.get_clean() - 
currently it returns a string, so for multiple rules we could return a 
multi-line string (breaks sorting) or an array of lines (which would 
mean a little behaviour change).

AFAIK capability rules are the only rule type with this problem, so I'd 
like to avoid having a solution that makes our life harder for all other 
rule types just to cover a corner case.

I'll add a comment so that it doesn't get lost, but won't implement it 
for now.

> > === added directory 'utils/apparmor/rule'
> > === added file 'utils/apparmor/rule/__init__.py'
> > --- utils/apparmor/rule/__init__.py	1970-01-01 00:00:00 +0000
> > +++ utils/apparmor/rule/__init__.py	2014-11-15 21:51:26 +0000
> > @@ -0,0 +1,217 @@
> > +#
> > -------------------------------------------------------------------
> > --- +#    Copyright (C) 2013 Kshitij Gupta <kgupta8592 at gmail.com> +#
> >    Copyright (C) 2014 Christian Boltz <apparmor at cboltz.de> +#
> > +#    This program is free software; you can redistribute it and/or
> > +#    modify it under the terms of version 2 of the GNU General
> > Public +#    License as published by the Free Software Foundation.
> > +#
> > +#    This program is distributed in the hope that it will be
> > useful,
> > +#    but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +#    GNU General Public License for more details.
> > +#
> > +#
> > -------------------------------------------------------------------
> > --- +
> > +from apparmor.common import AppArmorBug
> > +
> > +# setup module translations
> > +from apparmor.translations import init_translation
> > +_ = init_translation()
> > +
> > +# The key for representing bare rules such as "capability," or
> > "file," +ALL = '\0ALL'
> 
> I'm not so keen on having a magic dict key to represent the bare
> keyword rules; one way I'd thought about handling this was to have
> a keyword rule be represented by a subclass of the rule type; e.g.
> CapabilityAllRule as a subclass of CapabilityRule, and trying to
> encapsulate most of the differences within that subclass.

As discussed on IRC, I'm not too keen on adding another class just to 
avoid two "if ALL in self.capabilitiy" branches.

I changed it to use self.all_caps to avoid having a magic key, even if 
that makes some places slightly more interesting than they would be with 
the ALL magic key.

> > +class base_rule(object):
> As mentioned above, it would fit stylistically if this was BaseRule,
> rather than base_rule, etc.

Changed (thanks, sed ;-)

> > +    '''Base class to handle and store a single rule'''
> > +
> > +    audit = False
> > +    deny = False
> > +    allow_keyword = False
> > +
> > +    inlinecomment = ''
> > +
> > +    rawrule = ''
> > +    logmode = ''  # only set by set_log()
> 
> As these are Class Attributes, I'd rather see them get set in an
> __init__() method, which alas means that the __init__() functions of
> subclasses will need to call super(SUBCLASS_rule, self).__init__()
> in their __init__() method, if they need one.

I changed that, and also added init_vars() that is called by __init__ 
(contains just "pass" in BaseRule/BaseRuleset) so that the child classes 
don't need to re-implement or call the parent __init__ when they need to 
add some variables.

> > +    def get_raw(self, depth):
> > +        '''return raw rule (with original formatting, and leading
> > whitespace in the depth parameter)''' +        if self.rawrule:
> > +            return '%s%s' % ('  '*depth, self.rawrule)
> 
> pep8 style nit: please put spaces around the multiplication operator.

;-)

> > +        else:
> > +            return self.get_clean(depth)
> 
> BaseRule doesn't implement or declare a get_clean() method. Should it
> implement a default one? Or just let it back trace if a Rule subclass
> author mistakenly doesn't implement one? Or maybe raise AppArmorBug?
> 
> Reading up on python's Abstract Base Classes
> (https://docs.python.org/3/library/abc.html), we could do
> annotatations to require methods to be implemented, but it involves
> metaclasses, and the syntax for specifying that changed between
> python 2.7 and 3.x. So I'm leery of going in that direction, until
> we're ready to wash our hands completely of python 2.

Sounds like metaclasses are not an option ;-)

I can (and probably will) add all functions to __init__.py and let them 
    raise AppArmorBug('needs to be implemented')
but I wanted to wait with this until we have the final syntax in place 
to avoid needing to change them at two places.

> > +    def audit_allow_str(self):
> > +        '''return the allow/deny and audit keyword as string,
> > including whitespace'''
> Maybe more generically "modifiers_str", as these are rule modifiers.

Changed.

> > +        if self.audit:
> > +            auditstr = 'audit '
> > +        else:
> > +            auditstr = ''
> > +
> > +        if self.deny:
> > +            allowstr = 'deny '
> > +        elif self.allow_keyword:
> > +            allowstr = 'allow '
> > +        else:
> > +            allowstr = ''
> > +
> > +        return '%s%s' % (auditstr, allowstr)
> > +
> > +    def parse_audit_allow(self, matches):
> > +        '''returns audit, deny, allow_keyword and comment from the
> > matches object +           - audit, deny and allow_keyword are
> > True/False
> > +           - comment is the comment with a leading space'''
> > +        audit = False
> > +        if matches.group('audit'):
> > +            audit = True
> > +
> > +        deny = False
> > +        allow_keyword = False
> > +
> > +        allowstr = matches.group('allow')
> > +        if allowstr:
> > +            if allowstr.strip() == 'allow':
> > +                allow_keyword = True
> > +            elif allowstr.strip() == 'deny':
> > +                deny = True
> > +            else:
> > +                raise AppArmorBug("Invalid allow/deny keyword %s" %
> > allowstr) +
> > +        comment = ''
> > +        if matches.group('comment'):
> > +            # include a space so that we don't need to add it
> > everywhere when writing the rule +            comment = ' %s' %
> > matches.group('comment')
> > +
> > +        return (audit, deny, allow_keyword, comment)
> > +
> > +
> > +class base_rules(object):
> > +    '''Base class to handle and store a collection of rules'''
> > +
> > +    rules = []
> 
> Same issue with Class Attributes here.

Deleted (it was already initialized in __init__, so having it here was 
superfluous)

> > +    # decides if the (G)lob and Glob w/ (E)xt options are displayed
> > +    can_glob = True
> > +    can_glob_ext = False
> 
> These seem like legit Class Attributes, and as such are okay, or am
> I wrong? I would expect the subclass definition to override the value
> if necessary.

Exactly.

> > +    def __init__(self):
> > +        self.delete_all_rules()
> 
> I'd rather just see:
>           self.rules = []
> here.

I need delete_all_rules() (see the aa.py patch for details), so I prefer 
to have it there. If we can get rid of delete_all_rules() one day, I'll 
happily move it to __init__ ;-)

> > +    def add_raw(self, rawrule):
> > +        '''parse rawrule (from profile file) and store it in a
> > structured way''' +
> > +        newrule = self.new_rule()
> 
> Oh, I see how you're using new_rule(). Okay, this makes sense.

Told you so ;-)

> > +        newrule.set_raw(rawrule)
> 
> Generally, idiomatic python avoids having getter and setter functions
> and lets you access the field directly (and then you can do magic
> things with attributes if it turns out you need to do something more
> complex than simply set or get a value, leaving the interface the
> same). But I'm not strongly opposed to handling things this way for
> now.

We have different input types, so having setter functions  (set_raw() 
and set_log()) makes sense IMHO.

> > +    def get_clean(self, depth):
> > +        '''return all rules (in clean/default formatting)
> > +           Returns an array of lines, with depth * leading
> > whitespace''' +
> > +        data = { 'allow': [], 'deny': [] }
> 
> Is there a reason to use a dict here, rather than two variables (allow
> and deny)?

That's probably caused by the first version of the code (which you 
didn't see) when CapabilityRule had allow/deny as string, so I did 
something like
    data[rule.allow_str].append(...)

Now that this changed, having two separate variables sounds better. 
Changed.

> > +    def covered_obj(self, rule_obj, check_allow_deny = True, 
check_audit = False):
> pep8 wants the equals signs in default arguments to not have spaces
> (e.g. check_allow_deny=True).

Fixed.

> > +        '''return True if rule_obj is covered by existing rules,
> > otherwise False''' +
> > +        for rule in self.rules:
> > +            if rule.covered(rule_obj, check_allow_deny,
> > check_audit): +                return True
> > +
> > +        return False
> > +
> > +    def covered_log(self, parsed_log_event, check_allow_deny =
> > True, check_audit = False): +        '''return True if
> > parsed_log_event is covered by existing rules, otherwise False''' +
> > +        rule_obj = self.new_rule()
> > +        rule_obj.set_log(parsed_log_event)
> > +
> > +        return self.covered_obj(rule_obj, check_allow_deny,
> > check_audit) +
> > +    def covered_raw(self, rawrule, check_allow_deny = True,
> > check_audit = False): +        '''return True if rawrule is covered
> > by existing rules, otherwise False''' +
> > +        rule_obj = self.new_rule()
> > +        rule_obj.set_raw(rawrule)
> > +
> > +        return self.covered_obj(rule_obj, check_allow_deny,
> > check_audit)
> Maybe is_obj_covered, is_log_covered, and is_raw_covered to indicate
> that they are boolean functions? Similarly, is_covered in the Rule
> class?

Makes sense, changed.

> > +    def delete_obj(self, rule_obj):
> > +        '''Delete rule_obj from rules'''
> > +
> > +        rule_to_delete = None
> > +        i = 0
> > +        for rule in self.rules:
> > +            i = i + 1
> > +            if rule.covered(rule_obj, True, True):
> > +                rule_to_delete = i
> > +                break
> > +
> > +        if rule_to_delete:
> > +            self.rules.pop(i-1)
> > +        else:
> > +            raise AppArmorBug('Attemp to delete non-existing rule
> > %s' % rule_obj.get_raw(0))
> So if I have two rules, "capability," and "capability
> cap_dac_override,", and I then ask to delete cap_dac_override, both
> rules will get deleted?

Well, probably only one of them (see "break"), but it could be the wrong 
one :-/  - nice catch!

> I suspect we will want a rule equality test method.

Yes. I added is_equal() (to BaseRule) which calls is_equal_localvars() 
in CapabilityRule.

The question is how strict is_equal should be, for example
capability chown   vs.   allow capability chown
capability chown,   vs   capability chown, # comment

Technically they are equal, so I disabled the checks for everything that 
doesn't change the meaning of a rule ("allow" keyword and comment - and 
also rawrule because it includes those)

> > +    def delete_duplicates(self, inccaps):
> > +        '''Delete duplicate rules.
> > +           inccaps must be a *_rules object'''
> > +        deleted = []
> > +        if inccaps:  # avoid breakage until we have a propoer
> > function to ensure all profiles contain all *_rules objects
> s/propoer/proper/

;-)

> > +            for cap_obj in self.rules:
> > +                if inccaps.covered_obj(cap_obj, True, True):
> > +                    self.delete_obj(cap_obj)
> > +                    deleted.append(cap_obj)
> > +
> > +        return len(deleted)
> 
> Would it make sense from a UI perspective to return the deleted list,
> which could then be displayed if requested?

Good question - aa-logprof displays "xx rules deleted" when adding an 
include that covers some of the existing rules, and at the end, we have 
(V)iew changes which compares the old and new profile.

The only place where knowing the deleted list would make sense is 
aa-mergeprof in 3-way mode, but we don't have 3-way yet. Besides that, 
we'll need a "real diff" for 3-way merge and not delete_duplicates.

So I'd say lets keep len(deleted) for now and change it when we have a 
real usecase.

> > === added file 'utils/apparmor/rule/capability.py'
> > --- utils/apparmor/rule/capability.py	1970-01-01 00:00:00 +0000
> > +++ utils/apparmor/rule/capability.py	2014-11-15 22:01:34 +0000
...
> > +class capability_rule(base_rule):
> > +    '''Class to handle and store a single capability rule'''
> > +
> > +    capability = []
> 
> This is not needed, with the assignment in __init__()

Right, deleted.

> > +
> > +    def __init__(self):
> > +        self.capability = []
> > +
> > +    def get_clean(self, depth):
> > +        '''return rule (in clean/default formatting)'''
> > +
> > +        space = '  '*depth
> 
> Again, space around the multiplication symbol please.

Done.

> > +    def set_log(self, parsed_log_event):
> > +        '''parse and store log event'''
> > +
> > +        if parsed_log_event['operation'] != 'capable':
> > +            raise AppArmorBug('Invalid capability log event - not a
> > capability event') +
> > +        if not parsed_log_event['name']:
> > +            raise AppArmorBug('Invalid capability log event - name
> > is empty') +
> > +        self.capability = [ parsed_log_event['name'] ]
> > +
> > +        self.logmode = parsed_log_event['aamode']
> 
> What is logmode saved for?

aa-logprof uses it, and the default answer (allow or deny) depends on 
complain vs. enforce mode. Oh, and it prints a nice headline like 
"Changes in compline mode" ;-)

That's exactly how the perl tools worked. The question is if we want to 
keep this behaviour, or want to change it.

The updated patch (v2) is attached.

Regards,

Christian Boltz
-- 
legacy code:
    code you didn't write (this morning)
[https://twitter.com/pcreux/status/481154970364825600]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 1-add-base-and-capability-rule-class.diff
Type: text/x-patch
Size: 13250 bytes
Desc: not available
URL: <https://lists.ubuntu.com/archives/apparmor/attachments/20141122/455a281b/attachment-0001.bin>