Rev 3386: Small tweaks to _do_query in http://bzr.arbash-meinel.com/branches/bzr/1.4-dev/find_differences

John Arbash Meinel john at arbash-meinel.com
Tue Apr 22 22:12:21 BST 2008


At http://bzr.arbash-meinel.com/branches/bzr/1.4-dev/find_differences

------------------------------------------------------------
revno: 3386
revision-id: john at arbash-meinel.com-20080422210318-z0mc5q4hdxsur9qm
parent: john at arbash-meinel.com-20080422204514-bm0v3g592dapbx2s
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: find_differences
timestamp: Tue 2008-04-22 16:03:18 -0500
message:
  Small tweaks to _do_query
  Don't call set.update() with a generator: it is fairly slow, especially
  when the generator is likely to be empty.
  Only 1 in 4 calls had data. lsprof reported set.update() going from
  4s => 0.3s, and the overhead in _do_query beyond the get_parent_map()
  cost going from 7.3 => 1.13.
modified:
  bzrlib/graph.py                graph_walker.py-20070525030359-y852guab65d4wtn0-1
-------------- next part --------------
=== modified file 'bzrlib/graph.py'
--- a/bzrlib/graph.py	2008-04-22 20:30:26 +0000
+++ b/bzrlib/graph.py	2008-04-22 21:03:18 +0000
@@ -833,17 +833,21 @@
         :return: A tuple: (set(found_revisions), set(ghost_revisions),
            set(parents_of_found_revisions), dict(found_revisions:parents)).
         """
-        found_parents = set()
+        found_revisions = set()
         parents_of_found = set()
         # revisions may contain nodes that point to other nodes in revisions:
         # we want to filter them out.
         self.seen.update(revisions)
         parent_map = self._parents_provider.get_parent_map(revisions)
+        found_revisions.update(parent_map)
         for rev_id, parents in parent_map.iteritems():
-            found_parents.add(rev_id)
-            parents_of_found.update(p for p in parents if p not in self.seen)
-        ghost_parents = revisions - found_parents
-        return found_parents, ghost_parents, parents_of_found, parent_map
+            new_found_parents = [p for p in parents if p not in self.seen]
+            if new_found_parents:
+                # Calling set.update() with an empty generator is actually
+                # rather expensive.
+                parents_of_found.update(new_found_parents)
+        ghost_revisions = revisions - found_revisions
+        return found_revisions, ghost_revisions, parents_of_found, parent_map
 
     def __iter__(self):
         return self
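
The pattern in the patch can be tried outside bzrlib with a minimal standalone sketch. The function name `do_query`, the plain-dict `graph` standing in for the parents provider, and the explicit `seen` argument are all assumptions made for illustration and are not part of bzrlib. It uses `.items()` rather than `.iteritems()` so it runs on current Python.

```python
def do_query(revisions, seen, graph):
    """Illustrative stand-in for the patched _do_query (not bzrlib code).

    revisions: set of revision ids to look up
    seen:      set of revision ids already visited (mutated in place)
    graph:     dict mapping revision id -> tuple of parent ids; ids
               missing from the dict are treated as ghosts
    """
    seen.update(revisions)
    # Stand-in for get_parent_map(): only known revisions are returned.
    parent_map = dict((r, graph[r]) for r in revisions if r in graph)
    found_revisions = set(parent_map)
    parents_of_found = set()
    for rev_id, parents in parent_map.items():
        # Build a concrete list first, then skip the update() call
        # entirely when there is nothing new to add.
        new_found_parents = [p for p in parents if p not in seen]
        if new_found_parents:
            parents_of_found.update(new_found_parents)
    ghost_revisions = revisions - found_revisions
    return found_revisions, ghost_revisions, parents_of_found, parent_map


# Hypothetical three-revision history: c -> b -> a, with 'x' unknown.
graph = {'c': ('b',), 'b': ('a',), 'a': ()}
found, ghosts, parents, parent_map = do_query(set(['c', 'b', 'x']), set(), graph)
```

In this example 'b' is filtered out of the parents because it is itself one of the queried revisions, so only 'a' is reported as a new parent, and 'x' comes back as a ghost. Whether the guard pays off depends on how often the filtered list is empty; the commit message reports 1 in 4 having data for the author's workload.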


