[MERGE][bug #172657] Graph.find_differences() and status after merge

Fri May 2 21:07:24 BST 2008

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ian Clatworthy wrote:
| John Arbash Meinel wrote:
|> Ian Clatworthy wrote:
|> | John Arbash Meinel wrote:
|> |> Attached is an updated form of my previous submission, which I now
|> |> consider at least ready for review.
|>
|> Attached is the "final form" of the patch. With the updates to
|> find_differences,
|> and a new find_unique_ancestors function (which status uses). I believe
|> I've
|> addressed all of your concerns here. There is one test I'm still working
|> on, to
|> trigger a particular race condition I fixed. But the code itself should be
|> correct, so I would rather submit it, and get it reviewed/merged rather
|> than
|> waiting on that to happen.
|
| I'm fine with that. I'm still digesting the algorithms in graph.py but I
| wasn't planning to critique those in this review anyhow. FWIW, I'm
| confident that the new APIs are clean, the code implementation is good
| quality and that the tests show things are working as documented. Must
| say though that reading parts of graph.py makes my head hurt. :-)
|
| Here are some performance figures (best of two runs) after merging this
| into bzr.dev:
|
| * time ../bzr.dev/bzr status = 4.10 sec
| * time ../bzr.dev/bzr status --no-pending 0.42 sec
| * time ./bzr status = 2.59 sec
| * time ./bzr status --no-pending 0.42 sec
|
| So it's much better but we're still paying a high cost for finding,
| sorting and presenting all of the detailed revision information we do,
| even with your improved algorithm. Maybe we ought to cache the full
| pending merge details in the working tree so we only calculate it once?
| Or perhaps we should just present the merge tips and only show the rest
| in verbose mode? Regardless, let's land this improvement.

Robert has had some discussion about the algorithm, and I think I have an
improvement sitting here that might cut the graph traversal time in half again.

So instead of 2.59s you might see 1.5s or so.

|
| bb:tweak
|
|> +    :return: An the iterator from MergeSorter.iter_topo_order()
|
| s/An the//
|
|
|>  def show_pending_merges(new, to_file, short=False):
|>      """Write out a display of pending merges in a working tree."""
|> +    # we need one extra space for terminals that wrap on last char
|> +    term_width = osutils.terminal_width() - 1
|> +    if short:
|> +        first_prefix = 'P   '
|> +        sub_prefix = 'P.   '
|> +    else:
|> +        first_prefix = '  '
|> +        sub_prefix = '    '
|> +
|>      parents = new.get_parent_ids()
|>      if len(parents) < 2:
|>          return
|
| These last 3 lines (check for the len(parents)) ought to be at the very
| top of the routine.
|
|> +    ignore = set([None, last_revision, _mod_revision.NULL_REVISION])
|> +    graph = branch.repository.get_graph()
|> +    other_revisions = [last_revision]
|
| Do you still need ignore now that you've introduced other_revisions?
| ignore is still checked during the inner loop test but I couldn't see it
| , unlike other_revisions, being updated anywhere.
|

It isn't updated anywhere. I left it in just to make sure we are ignoring
NULL_REVISION. Though I should be handling that elsewhere.

I can probably remove it without harm.

|> +complex_shortcut2 = {'a':[NULL_REVISION], 'b':['a'], 'c':['b'], 'd':['c'],
|> +                    'e':['d'], 'f':['e'],
|> +                    'g':['f'], 'h':['d'], 'k':['h', 'i'], 'j':['h'],
|> +                    'i':['g'], 'l':['k'], 'm':['l'],
|> +                    'n':['m'], 't':['i', 's'], 'u':['s', 'j'],
|> +                    'o':['n'], 'p':['o'], 'q':['p'],
|>                      'r':['q'], 's':['r'],
|>                      }
|
| For slightly improved clarity, these should be listed in alphabetical
| order like all the other graph definitions.

Can do. I think I listed it in order of the ancestry lines. Otherwise I don't
really know why they would be out of order.

|
|> @@ -264,6 +410,10 @@
|>          self.calls.extend(nodes)
|>          return self._real_parents_provider.get_parent_map(nodes)
|>
|> +    def get_parent_map(self, nodes):
|> +        self.calls.extend(nodes)
|> +        return self._real_parents_provider.get_parent_map(nodes)
|> +
|
| This duplication was mentioned in my earlier review I think.

I'm pretty sure it was in a different place, because I did remove one duplication.

|
|> +    def test__remove_simple_descendants(self):
|> +        graph = self.make_graph(ancestry_1)
|> +        self.assertRemoveDescendants(set(['rev1']), graph,
|> +            set(['rev1', 'rev2a', 'rev2b', 'rev3', 'rev4']))
|> +
|> +    def test__remove_simple_descendants_disjoint(self):
|> +        graph = self.make_graph(ancestry_1)
|> +        self.assertRemoveDescendants(set(['rev1', 'rev3']), graph,
|> +            set(['rev1', 'rev3']))
|> +
|> +    def test__remove_simple_descendants_chain(self):
|> +        graph = self.make_graph(ancestry_1)
|> +        self.assertRemoveDescendants(set(['rev1']), graph,
|> +            set(['rev1', 'rev2a', 'rev3']))
|> +
|> +    def test__remove_simple_descendants_siblings(self):
|> +        graph = self.make_graph(ancestry_1)
|> +        self.assertRemoveDescendants(set(['rev2a', 'rev2b']), graph,
|> +            set(['rev2a', 'rev2b', 'rev3']))
|> +
|
| Is there any reason for the double underscore in these method names?

The function itself has a single underscore. Thus adding the prefix "test_"
creates 2 underscores. It is something Robert did a while ago. With the idea
that if there is ever a public function with the same name, the test for it
shouldn't have the extra underscore.

Or in other words:

def foo():
~  # do some setup
~  return _foo()

def _foo():
~  # do something without setup

Then you would have:

def test_foo():
~  ...

def test__foo():
~  ...

I thought it was project policy to do so, if it isn't, then I can switch.

|
| BTW, I really like the new show_pending_merges() code and the better
| display of ghosts it provides. The new tests are great as well. Thanks.
|
| Ian C.
|
|

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkgbdHwACgkQJdeBCYSNAAN6cgCdHwWtYhmryux3QqaaHbls8r/b
VKQAoJRlYECnHso4mqQ7MdRhbi3awACg
=3eJ1
-----END PGP SIGNATURE-----