[MERGE] Implement Tree.plan_merge, deprecate RevisionTree.get_weave

Thu Jul 19 15:22:46 BST 2007

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> On Thu, 2007-07-19 at 08:40 -0400, Aaron Bentley wrote:
>> Robert Collins wrote:
>>> It seems prone to losing a
>>> users data e.g. during a power-off or kill -9 situation as well as
>>> requiring more work at the tree layer to discard and recreate the cached
>>> parent data.
>> Yeah, the question is whether we care enough in these situations:
>>
>> 1. remerge is rarely used
>> 2. user files will also be a mess in this situation, so it's only
>>    incrementaly worse
>> 3. if we do lose the parent data, users can just do "revert" and "merge"
>>    instead.
> 
> I think it would be very surpising to a user, as they won't have any
> reason to think this could have happened - they may happily finish their
> merge and commit, but without the parent data - introducing conflicts.

Really?  They would hard-kill a remerge and then commit?  With all their
files out of whack, I really think the parent flag is the least of their
concerns.  They will still get bad results no matter what if they commit
after a killed remerge.  Files may be missing.

> On consideration I'd really rather see this controlled through a flag to
> the function or something.

What bugs me about this is that it really is accurate information-- if
you remerge, you set the file contents to their old values.  And so the
the old parent flags are the correct parent flags for those contents.

>>> with our current API
>>>
>>> +    def _get_ancestors(self, default_revision):
>>> +        ancestors = set([default_revision])
>>> +        for parent_id in self.get_parent_ids():
>>> +            ancestors.update(self.branch.repository.get_ancestry(
>>> +                             parent_id, topo_sorted=False))
>>> +        return ancestors
>>>
>>> will be faster as
>>> +        ancestors = set([default_revision])
>>> +
>>> ancestors.update(self.branch.repository.get_revision_graph(self.get_parent_ids()))
>>>
>>> +        return ancestors
>>>
>>> because that will do just one index traversal.
>> True, but it's exceptional to perform a merge while there is more than
>> one ancestor, and it seems clearer to use get_ancestry.
> 
> well both ancestors and get_revision_graph do full graph traversal, its
> just that ancestors flattens the graph

The fact that the graph representation can substitute for a set of
ancestors is pure happenstance, and not particularly obvious.

> - personally I don't see using
> get_ancestors in a loop as being clearer. But its definately much
> slower.

Please describe a use case in which it will be slower, then.  The only
one I can think of is merge --force --weave.

> On this point I'll leave it up to you - but ask that if you leave it as
> get_ancestry with a loop that you have a XXX with the faster version and
> a note (e.g. # XXX Using get_ancestry for clarity; if this becomes a
> performance sticking point consider
> self.branch.repository.get_revision_graph(self.get_parent_ids()))

>> Not really necessary.  We really just need graph difference.
> 
> Do you mean 'we only need the revisions from the tips to the full set of
> lcas ?'

Yes.  For each uncommon line, we need to find out whether it was
introduced by an uncommon ancestor or a common one, in order to decide
whether it's new in A or killed in B.  We know that if the line was not
introduced by an uncommon ancestor, it was introduced by a common one.
So all we need, actually, is the set of uncommon ancestors.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGn3O20F+nu1YWqI0RAlAeAJsGFoisXTgM275Cm1/6u2CgvzPzEQCcD61+
pOg054zEERfGEe7y6DQTDpk=
=GXR/
-----END PGP SIGNATURE-----