Rev 4735: CHKInventory using __slots__ has a huge impact in my testing script. in http://bazaar.launchpad.net/~jameinel/bzr/2.1-chk-memory
John Arbash Meinel
john at arbash-meinel.com
Thu Oct 8 22:26:50 BST 2009
At http://bazaar.launchpad.net/~jameinel/bzr/2.1-chk-memory
------------------------------------------------------------
revno: 4735
revision-id: john at arbash-meinel.com-20091008212628-q5oh7rdg7ikvy7jo
parent: john at arbash-meinel.com-20091008194850-nigahumk4tj2uhy8
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: 2.1-chk-memory
timestamp: Thu 2009-10-08 16:26:28 -0500
message:
CHKInventory using __slots__ has a huge impact in my testing script.
Specifically, we have >= 6 attributes on CHKInventory, which causes
self.__dict__ to expand to 524 bytes resident.
So for 25k items @ 524 bytes each, that is on the order of 10MB. Switching it
to use __slots__ changes the overhead to 8*4=32 bytes, saving 492 bytes per
object.
This may not translate into real-world savings, as we may not hold many
CHKInventories in memory at once. I think the log code has a cap of 200
revision trees at once, which would save only about 100kB.
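
The arithmetic in the message can be checked with a short sketch. The constants come straight from the figures quoted above (32-bit build); the reading of "8*4" as eight 4-byte slot pointers, presumably the six slots on CHKInventory plus two on its base class, is an assumption, and the helper name `total_saved` is hypothetical. The 25k-object total comes out slightly above the rounded 10MB figure.

```python
# Figures quoted in the commit message (32-bit build).
DICT_BYTES = 524      # resident cost of a per-instance __dict__
SLOT_BYTES = 8 * 4    # assumed: eight slot pointers of 4 bytes each
SAVED_PER_OBJECT = DICT_BYTES - SLOT_BYTES


def total_saved(num_objects):
    """Bytes saved when num_objects instances use slots instead of a dict."""
    return num_objects * SAVED_PER_OBJECT


print(SAVED_PER_OBJECT)      # 492 bytes per object
print(total_saved(25000))    # 12300000 bytes, about 12MB
print(total_saved(200))      # 98400 bytes, about 100kB
```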
-------------- next part --------------
=== modified file 'bzrlib/inventory.py'
--- a/bzrlib/inventory.py 2009-10-08 19:48:50 +0000
+++ b/bzrlib/inventory.py 2009-10-08 21:26:28 +0000
@@ -732,6 +732,8 @@
     inserted, other than through the Inventory API.
     """

+    __slots__ = ('root', 'revision_id')
+
     def __contains__(self, file_id):
         """True if this entry contains a file with given id.

@@ -1492,6 +1494,18 @@
     want to reuse.
     """

+    # An attribute dict that holds between 6 and 22 entries (inclusive) costs
+    # 524 bytes of memory (32-bit). Using slots for 6 entries costs 24 bytes of
+    # memory, and 88 bytes of memory for 22 entries.
+    # For <6 entries, it costs 140 bytes, but 5 slots == 20 bytes.
+    # Switching CHKInventory to using __slots__ saves 10MB when loading all
+    # bzr.dev's chk inventories, and 30MB when loading all of launchpad.
+    # I don't know the specific effect in real-world operations, because we may
+    # never grab all CHKInventory objects at once.
+    __slots__ = ('_fileid_to_entry_cache', '_path_to_fileid_cache',
+                 '_search_key_name', 'root_id',
+                 'id_to_entry', 'parent_id_basename_to_file_id')
+
     def __init__(self, search_key_name):
         CommonInventory.__init__(self)
         # Note: if just loading all CHKInventory objects, these two empty
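
For readers who want to see the effect themselves, here is a minimal sketch comparing a dict-backed class with a slotted one. It is not bzrlib code: `DictBacked`, `SlotBacked` and `instance_size` are hypothetical names, only the attribute names are borrowed from the patch, and it is written for Python 3. The byte counts it prints depend on the interpreter version and pointer width, so they will not match the 32-bit figures quoted in the patch comment.

```python
import sys

# Attribute names borrowed from the patch; the classes below are stand-ins.
ATTRS = ('_fileid_to_entry_cache', '_path_to_fileid_cache',
         '_search_key_name', 'root_id',
         'id_to_entry', 'parent_id_basename_to_file_id')


class DictBacked(object):
    """Hypothetical class whose attributes live in a per-instance __dict__."""

    def __init__(self):
        for name in ATTRS:
            setattr(self, name, None)


class SlotBacked(object):
    """Hypothetical class declaring the same attributes as slots."""

    __slots__ = ATTRS

    def __init__(self):
        for name in ATTRS:
            setattr(self, name, None)


def instance_size(obj):
    """Shallow size of obj, plus its attribute dict if it has one."""
    size = sys.getsizeof(obj)
    if hasattr(obj, '__dict__'):
        size += sys.getsizeof(obj.__dict__)
    return size


dict_obj = DictBacked()
slot_obj = SlotBacked()
print('dict-backed:', instance_size(dict_obj), 'bytes')
print('slot-backed:', instance_size(slot_obj), 'bytes')
```

Note that a slotted class only avoids the per-instance dict if every class in its inheritance chain also defines `__slots__`, which is presumably why the patch adds a `__slots__` declaration in the first hunk as well.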