Rev 2636: Remove some unneeded index iteration by checking if we have found all keys, and grammar improvements from Aaron's review. in http://people.ubuntu.com/~robertc/baz2.0/index

Sun Jul 15 05:53:55 BST 2007

At http://people.ubuntu.com/~robertc/baz2.0/index

------------------------------------------------------------
revno: 2636
revision-id: robertc at robertcollins.net-20070715045353-27opxm5h91ez0fjs
parent: robertc at robertcollins.net-20070715044519-140kzz00uzldgt7z
committer: Robert Collins <robertc at robertcollins.net>
branch nick: index
timestamp: Sun 2007-07-15 14:53:53 +1000
message:
  Remove some unneeded index iteration by checking if we have found all keys, and grammar improvements from Aaron's review.
modified:
  bzrlib/index.py                index.py-20070712131115-lolkarso50vjr64s-1
  doc/developers/indices.txt     indices.txt-20070713142939-m5cdnp31u8ape0td-1
  doc/developers/repository.txt  repository.txt-20070709152006-xkhlek456eclha4u-1
=== modified file 'bzrlib/index.py'
--- a/bzrlib/index.py	2007-07-15 04:45:19 +0000
+++ b/bzrlib/index.py	2007-07-15 04:53:53 +0000
@@ -67,7 +67,7 @@
     def add_node(self, key, references, value):
         """Add a node to the index.
 
-        :param key: The key. keys must be whitespace free utf8.
+        :param key: The key. keys must be whitespace-free utf8.
         :param references: An iterable of iterables of keys. Each is a
             reference to another key.
         :param value: The value to associate with the key. It may be any
@@ -180,9 +180,9 @@
 
     It is presumed that the index will not be mutated - it is static data.
 
-    Currently successive iter_entries/iter_all_entries calls will read the
-    entire index each time. Additionally iter_entries calls will read the
-    entire index always. XXX: This must be fixed before the index is 
+    Successive iter_all_entries calls will read the entire index each time.
+    Additionally, iter_entries calls will read the index linearly until the
+    desired keys are found. XXX: This must be fixed before the index is
     suitable for production use. :XXX
     """
 
@@ -259,9 +259,14 @@
             efficient order for the index.
         """
         keys = set(keys)
+        if not keys:
+            return
         for node in self.iter_all_entries():
+            if not keys:
+                return
             if node[0] in keys:
                 yield node
+                keys.remove(node[0])
 
     def _signature(self):
         """The file signature for this index type."""
@@ -299,6 +304,9 @@
     def iter_all_entries(self):
         """Iterate over all keys within the index
 
+        Duplicate keys across child indices are presumed to have the same
+        value and are only reported once.
+
         :return: An iterable of (key, reference_lists, value). There is no
             defined order for the result iteration - it will be in the most
             efficient order for the index.
@@ -313,6 +321,9 @@
     def iter_entries(self, keys):
         """Iterate over keys within the index.
 
+        Duplicate keys across child indices are presumed to have the same
+        value and are only reported once.
+
         :param keys: An iterable providing the keys to be retrieved.
         :return: An iterable of (key, reference_lists, value). There is no
             defined order for the result iteration - it will be in the most
@@ -320,6 +331,8 @@
         """
         keys = set(keys)
         for index in self._indices:
+            if not keys:
+                return
             for node in index.iter_entries(keys):
                 keys.remove(node[0])
                 yield node

=== modified file 'doc/developers/indices.txt'
--- a/doc/developers/indices.txt	2007-07-13 15:05:36 +0000
+++ b/doc/developers/indices.txt	2007-07-15 04:53:53 +0000
@@ -34,7 +34,7 @@
 ========
 
 bzr is moving to a write-once model for repository storage in order to
-achieve lock-free repositories eventually. In order to support this we are
+achieve lock-free repositories eventually. In order to support this, we are
 making our new index classes **immutable**. That is, one creates a new
 index in a single operation, and after that it is read only. To combine
 two indices a ``Combined*`` index may be used, or an **index merge** may

=== modified file 'doc/developers/repository.txt'
--- a/doc/developers/repository.txt	2007-07-13 15:05:36 +0000
+++ b/doc/developers/repository.txt	2007-07-15 04:53:53 +0000
@@ -262,7 +262,7 @@
 Discovery of files
 ~~~~~~~~~~~~~~~~~~
 
-With non listable transports how should the collection of pack/index files
+With non-listable transports how should the collection of pack/index files
 be found ? Initially record a list of all the pack/index files from
 write actions. (Require writable transports to be listable). We can then
 use a heuristic to statically combine pack/index files later.
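
A note on the iter_entries change in bzrlib/index.py above: the following is a minimal standalone sketch of the early-exit pattern, not the bzrlib code itself. The `counting` helper and the sample `nodes` list are hypothetical, added only to make the number of entries read observable, and `iter_entries` here takes the entry iterable as an argument instead of calling `self.iter_all_entries()`.

```python
def counting(entries, counter):
    """Hypothetical helper: count how many entries are actually read."""
    for entry in entries:
        counter[0] += 1
        yield entry


def iter_entries(all_entries, keys):
    """Yield (key, reference_lists, value) nodes for the requested keys.

    Mirrors the patched loop: return immediately when no keys are
    requested, and stop scanning once every requested key has been seen.
    """
    keys = set(keys)
    if not keys:
        return
    for node in all_entries:
        if not keys:
            # All requested keys were found; skip the rest of the index.
            return
        if node[0] in keys:
            yield node
            keys.remove(node[0])


# Sample data, assumed for illustration only.
nodes = [("a", (), "1"), ("b", (), "2"), ("c", (), "3"), ("d", (), "4")]

reads = [0]
found = list(iter_entries(counting(nodes, reads), ["a", "b"]))
print(found, reads[0])
```

Because the emptiness check sits at the top of the loop body, one entry past the last match is still fetched before the generator returns. The scan is still linear in the worst case, which is why the XXX note in the class docstring remains.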