Signing snapshots

Tue Jun 21 19:14:23 BST 2005

Aaron Bentley wrote:

> John A Meinel wrote:
>
> >Aaron Bentley wrote:
>
> >>This means that
> >>creating a sha-160 signature for a revision requires adding sha-160s to
> >>every ancestor revision.  I think this makes merge horizons impossible.
> >>
> >As soon as you modify the text, you invalidate the signature. Since we
> >might actually be signing the gzipped form, you need the actual bytes,
> >not something that is newly generated (because you might use a slightly
> >different compression level, or the algorithm has been tweaked between
> >versions).
>
>
> My understanding is that we're signing the tree and revision state, not
> the storage of the tree or revision.  The point of signing is so that
> you can verify that a given revision is or is not a true copy of
> revision $foo.  That way, it doesn't matter how a given revision was
> produced, and you can produce it whatever way is most effective for the
> context.

Sure. You are signing the *output* format. Meaning that you can get the
working tree into the state that I put it in, not the actual archive files.
However, in the end, it comes down to signing a bunch of bytes. And at
least the last proposal I read was to sign the bytes that make up the
revision file. Which means that modifying that at all means you need to
re-sign.
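
To make the "bunch of bytes" point concrete, here is a minimal sketch. It uses zlib from the Python standard library as a stand-in for whatever compression the store really uses, and the revision text is made up for illustration. The same logical content stored at two compression levels hashes differently, while the decompressed output hashes the same:

```python
import hashlib
import zlib

# Made-up revision text, only for illustration.
revision_text = b'<revision committer="foo" revision_id="example-1"/>\n'


def sha1_hex(data):
    """Return the hex SHA-1 digest of a byte string."""
    return hashlib.sha1(data).hexdigest()


# Two stored forms of the same content, differing only in compression level.
stored_fast = zlib.compress(revision_text, 1)
stored_best = zlib.compress(revision_text, 9)

# A signature over the stored bytes would not survive recompression...
stored_hashes_differ = sha1_hex(stored_fast) != sha1_hex(stored_best)

# ...while a signature over the output (decompressed) bytes would.
output_hashes_match = (
    sha1_hex(zlib.decompress(stored_fast)) == sha1_hex(zlib.decompress(stored_best))
)
```

So which bytes get signed decides whether the store can ever be rewritten without re-signing.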

>
> >For new revisions you can switch from using "revision_sha1" to
> >"revision_sha160". You still validate older revisions using sha1
> >*because you have to* that is what was signed.
>
>
> I think that's not great, because it allows someone to stuff bogus
> recent revisions into a branch, and get away with it.  I think designing
> for re-signing is better.

But what happens if *I* upgrade to sha-160, but I'm branching off of
*you* and you haven't upgraded? I can't re-sign your signatures, so I
still have to trust sha1 for the first 600 revisions.

>
> >Your new 160 hash does not have a sha1 signature.
>
> >You don't have to update the old revisions if you don't want to, in
> >fact, once we start doing signatures, updating old revisions would
> >require all new signatures.
>
>
> When you use SHA-160 to produce a signature on an SHA-1 hash, you've got
> SHA-1-level security.  It's certainly possible to rewrite a revision and
> add an SHA-160 signature.

Sure, but only if you are the owner. And that doesn't change the
signatures in the 500+ copies of your tree that someone else already copied.

>
> >>Also, I think signing snapshots makes sense because not every snapshot
> >>is a revision.  (Or is it?)  Requiring people to commit in order to
> >>produce changesets seems onerous.
>
>
> >What are you considering to be a snapshot?
>
>
> I mean the state of all versioned files in the tree.
>
> >Are you thinking that a
> >changeset can be produced by comparing to the local working tree? I
> >agree with that, but I don't think it needs a permanently assigned
> >signature, if it doesn't have a permanently assigned revision id.
> >If you are wanting to send it as an email, just sign the changeset as it
> >goes into the email. bzr diff | gpg --clearsign
>
>
> It's a nice and convenient way of verifying that the changeset came from
> a trusted source, though.  I think unified handling would be good here,
> instead of requiring external validation.
>
> But it would be possible to create a temporary revision before sending
> the changeset, I guess.
>
>
> >In my changeset plugin, I don't support working tree yet, as there are
> >fields I would like to get from the revision info.
>
>
> Oh.  Err, what kind of fields?

Right now I'm using committer, precursor_sha1, timestamp, timezone, etc.
They are all things that are generated at commit time, which I certainly
could generate at "bzr changeset" time. I just didn't want to deal with
that yet.

Plus I was trying to work out what the notation would be if you wanted
to handle rollup changesets ("give me a changeset for the last 10
revisions"). Mostly it was an interface issue: how to translate what
the user gave you into an appropriate set of revision ids. Some
examples: for "bzr changeset -r 10:", should it give you a rollup from
10->working tree, or should it be 10->lastrev? For a bare "bzr changeset",
is that the last committed revision, or the change relative to the
working tree? I was just having trouble with when None meant no user
input, and when None meant use last/first. So I punted for now.
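
One way out of the overloaded None would be a separate sentinel for "the user typed nothing". This is only a sketch: UNSET, WORKING_TREE, parse_range and resolve_range are hypothetical names, not anything in bzr or the changeset plugin, and the policy picked here (bare command means lastrev->working tree, an open end means lastrev, a single number n means n-1->n) is an arbitrary choice for illustration:

```python
# Hypothetical sentinels; neither exists in bzr.
UNSET = object()                # the user gave no -r option at all
WORKING_TREE = "working-tree"   # marker for the uncommitted tree state


def parse_range(spec):
    """Turn a -r argument into (start, end).

    UNSET means no input at all; None means the user left that end open.
    """
    if spec is None:
        return UNSET, UNSET
    if ":" not in spec:
        revno = int(spec)
        return revno - 1, revno       # assumed: the change that produced revno
    left, right = spec.split(":", 1)
    start = int(left) if left else None
    end = int(right) if right else None
    return start, end


def resolve_range(spec, last_revno):
    """Apply one possible policy to get concrete endpoints."""
    start, end = parse_range(spec)
    if start is UNSET:
        return last_revno, WORKING_TREE   # bare command: uncommitted changes
    if start is None:
        start = 0                         # open start: from the empty tree
    if end is None:
        end = last_revno                  # open end: up to the last commit
    return start, end
```

With that split, "no input" and "open-ended" never collide, whatever policy ends up being chosen.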

>
> >>>On the other hand this approach flakes out of the more
> >>>important problem of evaluating whether the code is signed by a
> >>>meaningful key.
> >>
> >>
> >>Sorry, didn't parse that.
>
>
> >The idea with a detached signature is that you don't actually have to
> >parse a semantic meaning to the bytes. Just read in some bytes and
> >compute the sha hash, then check the signature. If you have a
> >--clearsign style signature, you have to at least read the file and look
> >for where the ---BEGIN and ---END lines exist.
>
>
> What I don't parse is "this approach flakes out of the more important
> problem of evaluating whether the code is signed by a meaningful key".
> It sounds like he's talking about *not* signing the hash, to me.

The issue, I think, is that after checking that the signature is valid,
you want to go back and make sure the person who signed the signature is
the person doing the committing. (So that <revision committer="foo">
matches the name on the gpg key).
The problem is that you still have to parse the <revision> xml in order
to be able to match to the gpg key. Which means that someone who is
being malicious (but has a valid key), can still require you to parse
bogus data.

I don't think there is any way to get around needing to try and
understand the data they sent, in order to make sure it is valid. You
could restrict which keys you are willing to look at more closely, though.
So you check the signature, make sure it is valid, *and* that you trust
it enough to read the revision contents to make sure that the
committer="" tag matches the gpg key.
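
The ordering I mean could be sketched like this. SignatureResult and committer_matches are hypothetical names, and I am assuming the gpg check has already been run elsewhere and boiled down to a validity flag, a fingerprint and a user id. The point is only that the XML parser never sees data from a key you don't already trust (the standard-library parser is not hardened against hostile input, which is exactly why it is gated):

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class SignatureResult:
    """Hypothetical summary of an already-performed gpg verification."""
    valid: bool        # the signature matched the signed bytes
    fingerprint: str   # fingerprint of the signing key
    uid: str           # user id recorded on the signing key


def committer_matches(sig, trusted_fingerprints, revision_xml):
    """Parse the revision only after the signature and key checks pass."""
    if not sig.valid:
        return False
    if sig.fingerprint not in trusted_fingerprints:
        return False                       # never parse data from unknown keys
    try:
        root = ET.fromstring(revision_xml)
    except ET.ParseError:
        return False
    return root.tag == "revision" and root.get("committer") == sig.uid
```

A trusted key can still feed you garbage, but at least you know whose garbage it is.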

>
> >>I wonder whether there's a useful difference between trusted and
> >>authoritative?  E.g., I will trust John Meinel's signature to prove that
> >>data is not malicious, but I will only trust your signature to prove
> >>that the revision produced is actually
> >>mbp at sourcefrog.net-20050620052204-c4253c3feb664088.
>
>
> >gpg makes the difference between an "UNTRUSTED GOOD" signature, trusted
> >good and bad. Which means that if you set your gpg keyring to trust me,
> >you will get trusted good on my signatures, if you leave out mpool, then
> >you will get untrusted good on his signatures.
> >Are you asking for more than this?
>
>
> Yes, I'm considering the possibility of two layers of signing and two
> levels of trust.  One signature on the binary output format, to prove
> that the data is not malicious, and one signature on the revision/tree,
> to prove that the output is a true copy of a given revision.
>
> That way, you can make a revfile version of the bzr codebase, and I can
> trust that revfile version.
>
It depends how you want to trust revfiles. Because it is certainly
possible that chunks are added as you go, and some of them may not be
used. Removing them does not invalidate the revfile.

> Note however, that since we don't want to download the entire revfile,
> we can't quickly validate them against a signature, and worse, their
> hash will change with every commit.  I guess we'll have to sign the
> logical chunks contained within revfiles.

The current method is to sign the revision-store file, with the idea
being that if you started at the beginning, you could validate the
revfile. Validating a full-text is easy, then you patch it, and can
validate that against the next inventory sha, which is validated from
the revision entry.
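
As a rough sketch of that walk: start from a full text, check its sha, apply each patch, and check the result against the next expected sha. The delta format here (start, end, replacement bytes) is invented for the example and is not the real revfile format, and I am assuming the expected sha1 values come from revision entries whose signatures were already checked:

```python
import hashlib


def apply_delta(text, delta):
    """Apply a toy delta: replace text[start:end] with the given bytes."""
    start, end, replacement = delta
    return text[:start] + replacement + text[end:]


def validate_chain(full_text, deltas, expected_sha1s):
    """Check the full text, then every patched text, against its sha1.

    expected_sha1s[0] covers full_text; expected_sha1s[i + 1] covers the
    text obtained after applying deltas[i].
    """
    if len(expected_sha1s) != len(deltas) + 1:
        raise ValueError("need exactly one expected sha1 per text in the chain")
    text = full_text
    if hashlib.sha1(text).hexdigest() != expected_sha1s[0]:
        return False
    for delta, expected in zip(deltas, expected_sha1s[1:]):
        text = apply_delta(text, delta)
        if hashlib.sha1(text).hexdigest() != expected:
            return False
    return True
```

Note that this validates the texts you can reconstruct, not the revfile bytes themselves, so unused chunks go unnoticed either way.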

>
> >It is arguable that the trust levels should be built into bzr rather
> >than using gpg-keyrings (I know that was a complaint against baz's trust
> >model). But gpg has done all of the hard work of implementing trust
> >networks, why not use it?
>
>
> The issue is that I would like to be able to associate a given branch
> with a given key, and not accept anyone else's signature on that branch,
> even if they are trusted.

Sure. You can do this with custom keyrings. So that you say "trust
things on this branch using this keyring."
I'm not really sure how to specify what keys can commit to what
branches. Are you thinking to add something into the .bzr/ directory
such as "x-allowed-keys"?

Is it something that is controlled at the repository level, and/or is
versioned? Or is it something that each person drops into their local
branch?
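
If it were a plain file, reading it could be as simple as the sketch below. The format (one key fingerprint per line, with "#" comments) is purely my assumption, as are the function names; nothing like this exists in bzr today:

```python
def parse_allowed_keys(text):
    """Read fingerprints, one per line, ignoring '#' comments and blanks."""
    allowed = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0]                 # drop trailing comments
        fingerprint = "".join(line.split()).upper()  # normalise spacing and case
        if fingerprint:
            allowed.add(fingerprint)
    return allowed


def key_allowed(fingerprint, allowed):
    """Return True if the signing key may sign revisions on this branch."""
    return "".join(fingerprint.split()).upper() in allowed
```

That still leaves open who is allowed to edit the list, which is really the versioned-or-local question above.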

>
> >>I'd suggest that we retain that authentication data as well, so that we
> >>can determine later that data is signed by a compromised key.
> >>
> >Retain it in what form? Are you saying that when I pull your patch, I
> >should retain your signature?
>
>
> Yes, that was the idea.
>
> >Interesting idea. I'm wondering if the effort to do a proper chrooted
> >bzr would be better spent on just validating user input properly.
>
>
> It seems to me that chrooting is the more paranoid option.  If you focus
> on validating data, you're saying: "This code has few bugs, but if they
> are exploited, you can get in trouble".  The chroot approach says "This
> code may or may not have bugs, but the chance for damage if they are
> exploited is minimal".
>
> Of course, the importance of being able to say the second is a value
> judgement.

The problem is that a chroot is inherently of limited functionality, so
you have to make sure that everything you need is in that chroot. For
instance, if you are checking signatures you need gpg in there. Do you
need access to the Python standard libraries? Is it possible to exit the
chroot()? I forget the specific steps, but I thought there were some
ways to do it. And is the "StreamTree" class free from bugs, such that you
can't exploit the remote conversation to cause the local bzr to do bad
things?

I agree, limiting the damage is nice, but it might be a considerable
effort, which might be better spent finding & eliminating bugs.
If the effort is not very large, I'm certainly for adding the limitation.

>
> Aaron

John
=:->
