[MERGE][bug #172657] Graph.find_differences() and status after merge
John Arbash Meinel
john at arbash-meinel.com
Fri May 2 21:07:24 BST 2008
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Ian Clatworthy wrote:
| John Arbash Meinel wrote:
|> Ian Clatworthy wrote:
|> | John Arbash Meinel wrote:
|> |> Attached is an updated form of my previous submission, which I now
|> |> consider at least ready for review.
|>
|> Attached is the "final form" of the patch. With the updates to
|> find_differences,
|> and a new find_unique_ancestors function (which status uses). I believe
|> I've
|> addressed all of your concerns here. There is one test I'm still working
|> on, to
|> trigger a particular race condition I fixed. But the code itself should be
|> correct, so I would rather submit it, and get it reviewed/merged rather
|> than
|> waiting on that to happen.
|
| I'm fine with that. I'm still digesting the algorithms in graph.py but I
| wasn't planning to critique those in this review anyhow. FWIW, I'm
| confident that the new APIs are clean, the code implementation is good
| quality and that the tests show things are working as documented. Must
| say though that reading parts of graph.py makes my head hurt. :-)
|
| Here are some performance figures (best of two runs) after merging this
| into bzr.dev:
|
| * time ../bzr.dev/bzr status = 4.10 sec
| * time ../bzr.dev/bzr status --no-pending 0.42 sec
| * time ./bzr status = 2.59 sec
| * time ./bzr status --no-pending 0.42 sec
|
| So it's much better but we're still paying a high cost for finding,
| sorting and presenting all of the detailed revision information we do,
| even with your improved algorithm. Maybe we ought to cache the full
| pending merge details in the working tree so we only calculate it once?
| Or perhaps we should just present the merge tips and only show the rest
| in verbose mode? Regardless, let's land this improvement.
Robert has had some discussion about the algorithm, and I think I have an
improvement sitting here that might cut the graph traversal time in half again.
So instead of 2.59s you might see 1.5s or so.
|
| bb:tweak
|
|> + :return: An the iterator from MergeSorter.iter_topo_order()
|
| s/An the//
|
|
|> def show_pending_merges(new, to_file, short=False):
|> """Write out a display of pending merges in a working tree."""
|> + # we need one extra space for terminals that wrap on last char
|> + term_width = osutils.terminal_width() - 1
|> + if short:
|> + first_prefix = 'P '
|> + sub_prefix = 'P. '
|> + else:
|> + first_prefix = ' '
|> + sub_prefix = ' '
|> +
|> parents = new.get_parent_ids()
|> if len(parents) < 2:
|> return
|
| These last 3 lines (check for the len(parents)) ought to be at the very
| top of the routine.
|
|> + ignore = set([None, last_revision, _mod_revision.NULL_REVISION])
|> + graph = branch.repository.get_graph()
|> + other_revisions = [last_revision]
|
| Do you still need ignore now that you've introduced other_revisions?
| ignore is still checked during the inner loop test but I couldn't see it
| , unlike other_revisions, being updated anywhere.
|
It isn't updated anywhere. I left it in just to make sure we are ignoring
NULL_REVISION. Though I should be handling that elsewhere.
I can probably remove it without harm.
|> +complex_shortcut2 = {'a':[NULL_REVISION], 'b':['a'], 'c':['b'], 'd':['c'],
|> + 'e':['d'], 'f':['e'],
|> + 'g':['f'], 'h':['d'], 'k':['h', 'i'], 'j':['h'],
|> + 'i':['g'], 'l':['k'], 'm':['l'],
|> + 'n':['m'], 't':['i', 's'], 'u':['s', 'j'],
|> + 'o':['n'], 'p':['o'], 'q':['p'],
|> 'r':['q'], 's':['r'],
|> }
|
| For slightly improved clarity, these should be listed in alphabetical
| order like all the other graph definitions.
Can do. I think I listed it in order of the ancestry lines. Otherwise I don't
really know why they would be out of order.
|
|> @@ -264,6 +410,10 @@
|> self.calls.extend(nodes)
|> return self._real_parents_provider.get_parent_map(nodes)
|>
|> + def get_parent_map(self, nodes):
|> + self.calls.extend(nodes)
|> + return self._real_parents_provider.get_parent_map(nodes)
|> +
|
| This duplication was mentioned in my earlier review I think.
I'm pretty sure it was in a different place, because I did remove one duplication.
|
|> + def test__remove_simple_descendants(self):
|> + graph = self.make_graph(ancestry_1)
|> + self.assertRemoveDescendants(set(['rev1']), graph,
|> + set(['rev1', 'rev2a', 'rev2b', 'rev3', 'rev4']))
|> +
|> + def test__remove_simple_descendants_disjoint(self):
|> + graph = self.make_graph(ancestry_1)
|> + self.assertRemoveDescendants(set(['rev1', 'rev3']), graph,
|> + set(['rev1', 'rev3']))
|> +
|> + def test__remove_simple_descendants_chain(self):
|> + graph = self.make_graph(ancestry_1)
|> + self.assertRemoveDescendants(set(['rev1']), graph,
|> + set(['rev1', 'rev2a', 'rev3']))
|> +
|> + def test__remove_simple_descendants_siblings(self):
|> + graph = self.make_graph(ancestry_1)
|> + self.assertRemoveDescendants(set(['rev2a', 'rev2b']), graph,
|> + set(['rev2a', 'rev2b', 'rev3']))
|> +
|
| Is there any reason for the double underscore in these method names?
The function itself has a single underscore. Thus adding the prefix "test_"
creates 2 underscores. It is something Robert did a while ago. With the idea
that if there is ever a public function with the same name, the test for it
shouldn't have the extra underscore.
Or in other words:
def foo():
~ # do some setup
~ return _foo()
def _foo():
~ # do something without setup
Then you would have:
def test_foo():
~ ...
def test__foo():
~ ...
I thought it was project policy to do so, if it isn't, then I can switch.
|
| BTW, I really like the new show_pending_merges() code and the better
| display of ghosts it provides. The new tests are great as well. Thanks.
|
| Ian C.
|
|
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAkgbdHwACgkQJdeBCYSNAAN6cgCdHwWtYhmryux3QqaaHbls8r/b
VKQAoJRlYECnHso4mqQ7MdRhbi3awACg
=3eJ1
-----END PGP SIGNATURE-----
More information about the bazaar
mailing list