Rev 2926: (robertc) Add a last-lookup cache to DirState reducing the cost of serial access to paths. (Robert Collins) in file:///home/pqm/archives/thelove/bzr/%2Btrunk/
Canonical.com Patch Queue Manager
pqm at pqm.ubuntu.com
Mon Oct 22 21:45:30 BST 2007
At file:///home/pqm/archives/thelove/bzr/%2Btrunk/
------------------------------------------------------------
revno: 2926
revision-id: pqm at pqm.ubuntu.com-20071022204528-m4i3ievs46d19324
parent: pqm at pqm.ubuntu.com-20071022183547-ylcflvri4vahgtmg
parent: robertc at robertcollins.net-20071022200512-ev362dzjjoj9v7r1
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Mon 2007-10-22 21:45:28 +0100
message:
(robertc) Add a last-lookup cache to DirState reducing the cost of serial access to paths. (Robert Collins)
modified:
bzrlib/dirstate.py dirstate.py-20060728012006-d6mvoihjb3je9peu-1
------------------------------------------------------------
revno: 2921.1.4
merged: robertc at robertcollins.net-20071022200512-ev362dzjjoj9v7r1
parent: robertc at robertcollins.net-20071022195941-06a7tdtkd8muueg9
committer: Robert Collins <robertc at robertcollins.net>
branch nick: dirstate.cache
timestamp: Tue 2007-10-23 06:05:12 +1000
message:
Documentation to aid our failing memory in the future.
------------------------------------------------------------
revno: 2921.1.3
merged: robertc at robertcollins.net-20071022195941-06a7tdtkd8muueg9
parent: robertc at robertcollins.net-20071022193257-4sa1111h0d0b01zg
committer: Robert Collins <robertc at robertcollins.net>
branch nick: dirstate.cache
timestamp: Tue 2007-10-23 05:59:41 +1000
message:
Review feedback.
------------------------------------------------------------
revno: 2921.1.2
merged: robertc at robertcollins.net-20071022193257-4sa1111h0d0b01zg
parent: robertc at robertcollins.net-20071022043300-c8sdvyizlgoy3i97
committer: Robert Collins <robertc at robertcollins.net>
branch nick: dirstate.cache
timestamp: Tue 2007-10-23 05:32:57 +1000
message:
Reset the dirstate cache pointer for the entry on lookup of a new block, suggested by John.
------------------------------------------------------------
revno: 2921.1.1
merged: robertc at robertcollins.net-20071022043300-c8sdvyizlgoy3i97
parent: pqm at pqm.ubuntu.com-20071019201226-6z006xotgfe7zmu8
committer: Robert Collins <robertc at robertcollins.net>
branch nick: dirstate.cache
timestamp: Mon 2007-10-22 14:33:00 +1000
message:
Add a cache for dirstate entry lookups.
=== modified file 'bzrlib/dirstate.py'
--- a/bzrlib/dirstate.py 2007-10-10 00:04:46 +0000
+++ b/bzrlib/dirstate.py 2007-10-22 20:05:12 +0000
@@ -345,6 +345,12 @@
self._sha1_file = self._sha1_file_and_mutter
else:
self._sha1_file = osutils.sha_file_by_name
+ # These two attributes provide a simple cache for lookups into the
+ # dirstate in-memory vectors. By probing respectively for the last
+ # block, and for the next entry, we save nearly 2 bisections per path
+ # during commit.
+ self._last_block_index = None
+ self._last_entry_index = None
def __repr__(self):
return "%s(%r)" % \
@@ -1042,6 +1048,12 @@
"""
if key[0:2] == ('', ''):
return 0, True
+ try:
+ if (self._last_block_index is not None and
+ self._dirblocks[self._last_block_index][0] == key[0]):
+ return self._last_block_index, True
+ except IndexError:
+ pass
block_index = bisect_dirblock(self._dirblocks, key[0], 1,
cache=self._split_path_cache)
# _right returns one-past-where-key is so we have to subtract
@@ -1052,6 +1064,9 @@
# simple and correct:
present = (block_index < len(self._dirblocks) and
self._dirblocks[block_index][0] == key[0])
+ self._last_block_index = block_index
+ # Reset the entry index cache to the beginning of the block.
+ self._last_entry_index = -1
return block_index, present
def _find_entry_index(self, key, block):
@@ -1059,9 +1074,24 @@
:return: The entry index, True if the entry for the key is present.
"""
+ len_block = len(block)
+ try:
+ if self._last_entry_index is not None:
+ # mini-bisect here.
+ entry_index = self._last_entry_index + 1
+ # A hit is when the key is after the last slot, and before or
+ # equal to the next slot.
+ if ((entry_index > 0 and block[entry_index - 1][0] < key) and
+ key <= block[entry_index][0]):
+ self._last_entry_index = entry_index
+ present = (block[entry_index][0] == key)
+ return entry_index, present
+ except IndexError:
+ pass
entry_index = bisect.bisect_left(block, (key, []))
- present = (entry_index < len(block) and
+ present = (entry_index < len_block and
block[entry_index][0] == key)
+ self._last_entry_index = entry_index
return entry_index, present
@staticmethod
@@ -1353,7 +1383,7 @@
return block_index, 0, False, False
block = self._dirblocks[block_index][1] # access the entries only
entry_index, present = self._find_entry_index(key, block)
- # linear search through present entries at this path to find the one
+ # linear search through entries at this path to find the one
# requested.
while entry_index < len(block) and block[entry_index][0][1] == basename:
if block[entry_index][1][tree_index][0] not in \
@@ -1380,7 +1410,8 @@
"""
self._read_dirblocks_if_needed()
if path_utf8 is not None:
- assert path_utf8.__class__ == str, 'path_utf8 is not a str: %s %s' % (type(path_utf8), path_utf8)
+ assert path_utf8.__class__ == str, ('path_utf8 is not a str: %s %s'
+ % (type(path_utf8), path_utf8))
# path lookups are faster
dirname, basename = osutils.split(path_utf8)
block_index, entry_index, dir_present, file_present = \
More information about the bazaar-commits
mailing list