[apparmor] dbus/pair address rule encoding

Wed May 8 21:43:59 UTC 2013

One of the decisions made last week, while several of us met at a sprint,
was to change the dbus prototype syntax slightly to make it follow the same
general format as the proposed network/ipc rules.

Currently this is just a syntactic change, no new functionality is being
added at this time.

The change is to separate the local and remote addresses of a connection or
message.

  dbus name=foo.com acquire,

will acquire/bind a name for a local address. The only change here is the
use of the more appropriate word 'name' instead of 'dest'.

For sending messages the syntax becomes
  dbus -> name=foo.com send,

The change here is that '->' indicates a remote/peer address is going to be
specified, and again 'name' is used instead of 'dest'.

This syntax allows for rules (though unsupported atm) that specify both,
  dbus name=foo.com -> name=bar.com  send,

which allows tying a send to a specific source and to a specific destination.
This doesn't make a lot of sense when just addresses are used but it may
when labeling is supported in the future.

Now while this syntactic change can be encoded using the current rule
encoding, it is problematic for potential future expansion in that it overlays
both the address being acquired and the address being sent to.

So we need to change to a pair encoding, which is similar to link and mount
rules and to what is planned for the extended networking.

The general scheme is to encode first the address that is looked up first or
most commonly. I would assume this would be the local address, as used by
the acquire rule, since this means lookups for the acquire perm would not
require both addresses (and having both addresses does not make sense for
acquire).

the second address encoding follows the first as a pairing. This allows us
to restrict access to the second address based on the first if so desired.

Eg. for link rules we have
  /path/to/file\0/path/to/link
              ^              ^
            perm1          perm2

The permissions are encoded out of band (that is, each state could have its
own set of permissions), and we use one special perm to indicate there is
more match data/perms to follow.

When making a permission request you only need to proceed with matching until
your requested permissions are matched (knowing the encoding you can short
circuit).

In the case of dbus I see

  <local address set encoding><remote address set encoding>
                             ^                            ^
                           perm1                        perm2

The AA_ACQUIRE perm gets encoded into perm1. If there is a remote address
for the rule, perm1 also gets the AA_CONT perm, which means there is more
matching that can happen. AA_CONT isn't actually required; it just allows
bailing out early if no more matches are possible (it would also be
filtered out from the set of permissions returned).
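
To make the perm1/perm2 idea concrete, here is a rough Python sketch of a
lookup that short circuits. It is only an illustration: the bit values are
made up, and a plain dict keyed by the encoded string stands in for the
real state machine.

```python
# Hypothetical bit values, chosen only for this sketch.
AA_ACQUIRE = 0x1
AA_SEND = 0x2
AA_RECEIVE = 0x4
AA_CONT = 0x8  # "more match data/perms may follow"


def lookup(table, local, remote, requested):
    """Return the perms granted for a request, stopping early when possible.

    `table` maps an encoded match string to the perms found at that point;
    it is a stand-in for walking the real dfa.
    """
    perm1 = table.get(local, 0)
    granted = perm1 & ~AA_CONT  # AA_CONT is never returned to the caller
    if granted & requested == requested:
        return granted  # perm1 already satisfies the request
    if not (perm1 & AA_CONT) or remote is None:
        return granted  # nothing further can match, bail out early
    perm2 = table.get(local + remote, 0)
    return granted | (perm2 & ~AA_CONT)


table = {"L": AA_ACQUIRE | AA_CONT, "LR": AA_SEND}
acquire_result = lookup(table, "L", "R", AA_ACQUIRE)
send_result = lookup(table, "L", "R", AA_SEND)
```

An acquire request stops at perm1, while a send request continues to perm2
only because AA_CONT was set on perm1.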

technically we could even encode send/receive in perm1 if it is followed
by a match that allows all remote addresses.

With the way the backend of the compiler is set up currently, this means we
need to generate 2 rules when a remote address needs to be specified:
  <local address> <perm1>
  <local address> <separator> <remote address> <perm2>

perm1 may be just the AA_CONT perm. While the AA_CONT perm isn't required,
having it present makes sure short circuiting can be done in the future,
either from user space using a non-naive query (see below) or if dbus
moves into the kernel.
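
As a sketch of the two-rule expansion described above (the function name,
the bit values and the default separator are assumptions for illustration,
not what the compiler does today):

```python
# Hypothetical bit values, chosen only for this sketch.
AA_ACQUIRE = 0x1
AA_SEND = 0x2
AA_CONT = 0x8


def expand_rule(local, remote=None, perms=0, separator="\0"):
    """Expand one dbus rule into the (match string, perms) pairs
    that would be handed to the backend."""
    if remote is None:
        # Local-only rule (e.g. acquire): a single entry carrying perm1.
        return [(local, perms)]
    # Rule with a remote address: perm1 is just AA_CONT, and the real
    # permissions are attached to the end of the pair as perm2.
    return [
        (local, AA_CONT),
        (local + separator + remote, perms),
    ]


acquire_rules = expand_rule("foo.com", perms=AA_ACQUIRE)
send_rules = expand_rule("foo.com", "bar.com", AA_SEND)
```

Here a rule with only a local address yields one entry, and a rule with a
remote address yields the perm1 and perm2 entries.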

The backend will take care of collapsing/recombining states where possible.
I would like to get to a point where this isn't needed, but we aren't there
yet.

The separator is anything that separates the two addresses in the encoding.
In link pairs, it's a null transition (\0 character). We use this already
in the address encoding to separate the various possible strings (path,
name, iface, member).

We can either put in a second \0, so at the separator point we have \0\0
where the first is the separator for the member encoding, or we can just
rely on the fact that we know we have a fixed number of parts to the
source address, and that no additional separator is needed.

For the permissions I think we should encode them on the \0 separator,
not the last character of the string. This will allow us to specify a
don't care/any source for the query string of
  \0\0\0\0
which will find us the source permission if rules are encoded such that
the parts of the address that can match anything are encoded using
the regex [^\0]*\0
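
A small Python sketch of the don't care query follows. It assumes the
wildcard for a part is "any run of non-NUL characters followed by a NUL",
and it assumes the parts are encoded in the order path, name, iface,
member; the helper names are made up for the example.

```python
import re

# Order of the address parts is an assumption taken from the list above.
PARTS = ("path", "name", "iface", "member")


def rule_regex(**fixed):
    """Build a regex for one address; unspecified parts match anything."""
    pieces = []
    for part in PARTS:
        if part in fixed:
            pieces.append(re.escape(fixed[part]) + r"\x00")
        else:
            pieces.append(r"[^\x00]*\x00")  # wildcard part plus separator
    return re.compile("".join(pieces))


def query(**known):
    """Build a query string; unknown parts are left empty."""
    return "".join(known.get(part, "") + "\0" for part in PARTS)


any_source = query()  # the \0\0\0\0 don't care query
match_all_rule = rule_regex()
foo_rule = rule_regex(name="foo.com")

wildcard_hit = match_all_rule.fullmatch(any_source) is not None
specific_hit = foo_rule.fullmatch(any_source) is not None
```

The don't care query only matches rules whose address parts are all
wildcards, so a rule tied to name=foo.com is not matched by it.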

To query this encoding,

acquire: just needs to specify the local address and return the perms found
         there.

send/receive: need to specify both the source and dest addresses in their
              query string. This allows us to allow these permissions
              conditionally based on what source they come from.

              I know that we aren't doing this currently, but it leaves the
              possibility open for the future.

              I would also assume the source address would be the well
              known address as long as the sender owns it.

              Again \0\0\0\0 could be used if there is no good source
              address, which would only be allowed when rules don't care
              which source address is used.

naive query:
A naive query, one that specifies neither the requested permissions nor that
it is supplying a multipart query, could be done just like we do now: if
the query string matches we would reach the end and find the final
permissions.

So the acquire query would just specify enough to match the source address.

A request for send/receive could just specify the local address if it's
asking permission to send from it to any address (assuming we allow
merging send/receive perms into the perm1 set).

query with requested perms/parts info

A query that contained the requested perms and a value representing where
the split from source to dest address happens could match up to perm1,
check the permission, and then, if that is not enough and AA_CONT was set,
continue on to perm2; otherwise it could bail out early.

How this would look is something like

query_string_len part1_len requested_perms query_string

where part1_len would be <= query_string_len
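
For illustration, the layout above could be packed and unpacked like this.
The field widths and byte order (three little-endian 32 bit values) are
assumptions for the sketch; nothing above fixes them.

```python
import struct

# Hypothetical wire layout: query_string_len, part1_len, requested_perms.
HEADER = struct.Struct("<III")


def pack_query(local: bytes, remote: bytes, requested: int) -> bytes:
    """Serialize a two-part query; part1_len marks the local/remote split."""
    query_string = local + remote
    return HEADER.pack(len(query_string), len(local), requested) + query_string


def unpack_query(buf: bytes):
    """Return (local, remote, requested), validating the length fields."""
    query_len, part1_len, requested = HEADER.unpack_from(buf)
    if part1_len > query_len:
        raise ValueError("part1_len must be <= query_string_len")
    query_string = buf[HEADER.size:HEADER.size + query_len]
    if len(query_string) != query_len:
        raise ValueError("truncated query string")
    return query_string[:part1_len], query_string[part1_len:], requested


packed = pack_query(b"foo.com\0", b"bar.com\0", 0x2)
local, remote, requested = unpack_query(packed)
```

A matcher that knows part1_len can stop at perm1 and only walk the
remainder of the string when AA_CONT is set.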

I don't think this is required for a userspace query (the naive query
should always work) but it is something I am taking advantage of in the
kernel.

Encoding of subject/object conditionals

The subject and object conditionals get encoded as part of the perm, and
are just a natural extension of the above encoding.

The details are that the state doesn't encode the permission directly. It
encodes an index into a permission table. That index can be to a set of
conditions. What is needed for those conditions to be resolved is the
ability to specify the subject and object (if needed).

I think we can extend the existing query to supply subject and object
entries. So in addition to profile/label we add the query commands spid,
ssock, opid, osock or something similar, which will let us look up the
subject and object. For pid based lookups we might want to add the ability
to specify a time stamp.

For queries that don't specify one or the other, or just use the profile/label
query command we can still do a lookup but some perms are ambiguous/conditional.
So they would not get specified in the allow mask.
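
A rough sketch of how the permission table indirection and the ambiguous
case could behave. Everything here (the PermEntry layout, resolve(), the
bit values, representing a condition as a Python callable) is hypothetical
and only meant to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical bit values, chosen only for this sketch.
AA_SEND = 0x2
AA_RECEIVE = 0x4


@dataclass
class PermEntry:
    allow: int  # perms granted unconditionally
    cond_allow: int = 0  # perms granted only if the condition holds
    condition: Optional[Callable[[str, str], bool]] = None


def resolve(table, index, subject=None, obj=None):
    """Turn the index stored in a state into an allow mask."""
    entry = table[index]
    if entry.condition is None:
        return entry.allow
    if subject is None or obj is None:
        # Query did not supply both sides: the conditional perms are
        # ambiguous, so they are left out of the allow mask.
        return entry.allow
    if entry.condition(subject, obj):
        return entry.allow | entry.cond_allow
    return entry.allow


table = [
    PermEntry(allow=AA_RECEIVE),
    PermEntry(allow=AA_RECEIVE, cond_allow=AA_SEND,
              condition=lambda subject, obj: subject == obj),
]
ambiguous = resolve(table, 1)
resolved = resolve(table, 1, subject="labelA", obj="labelA")
```

With only the profile/label supplied, the conditional send perm is left
out; once both subject and object are given it can be resolved.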