Rev 104: Do some ugly hacks to keep memory low during 'compute_referrers'. in http://bazaar.launchpad.net/~meliae-dev/meliae/trunk
John Arbash Meinel
john at arbash-meinel.com
Thu Oct 22 23:02:50 BST 2009
At http://bazaar.launchpad.net/~meliae-dev/meliae/trunk
------------------------------------------------------------
revno: 104
revision-id: john at arbash-meinel.com-20091022220241-x902omy964q6pk22
parent: john at arbash-meinel.com-20091022214742-8w54cvqz9r1vvte5
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: trunk
timestamp: Thu 2009-10-22 17:02:41 -0500
message:
Do some ugly hacks to keep memory low during 'compute_referrers'.
1) If there is only 1 referrer, store it as a bare integer instead of a list
2) If there are fewer than 10 referrers, use a tuple, and create new ones as necessary
3) If there are 10 or more referrers, use a list as normal
This should decrease memory a bit when dealing with really big datasets.
Quite a few objects have only 1 referrer, and for those this drops memory
consumption down to a dict entry / pointer (because the address is
already a unique PyInt that we are holding anyway).
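
The three storage tiers described above can be sketched as a pair of small helpers. This is only an illustration, not the code from the commit: the names add_referrer, get_referrers and max_tuple are hypothetical. It follows the same promotion rule as the diff below, where a tuple is converted to a list only once it already holds 10 entries.

```python
def add_referrer(referrers, ref, address, max_tuple=10):
    """Record that `address` refers to `ref`, using the smallest container.

    Hypothetical helper; `max_tuple` mirrors the hard-coded 10 in the diff.
    """
    refs = referrers.get(ref)
    t = type(refs)
    if refs is None:
        refs = address                # first referrer: store the bare int
    elif t is int:
        refs = (refs, address)        # second referrer: promote to a tuple
    elif t is tuple:
        if len(refs) >= max_tuple:
            refs = list(refs)         # tuple is full: switch to a list
            refs.append(address)
        else:
            refs = refs + (address,)  # build a new, one-longer tuple
    elif t is list:
        refs.append(address)          # already a list: append in place
    else:
        raise TypeError('unknown refs type: %s' % (t,))
    referrers[ref] = refs


def get_referrers(referrers, address):
    """Return the referrers of `address` as a sequence, never a bare int."""
    refs = referrers.get(address, ())
    if type(refs) is int:
        return (refs,)
    return refs
```

Callers that only go through get_referrers always see a tuple or a list, so the bare-integer case stays an internal detail of the mapping.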
-------------- next part --------------
=== modified file 'meliae/loader.py'
--- a/meliae/loader.py 2009-10-22 21:29:53 +0000
+++ b/meliae/loader.py 2009-10-22 22:02:41 +0000
@@ -202,18 +202,45 @@
         """For each object, figure out who is referencing it."""
         referrers = {} # From address => [referred from]
         id_cache = {}
+        unique_address = id_cache.setdefault
         total = len(self.objs)
         for idx, obj in enumerate(self.objs.itervalues()):
             if self.show_progress and idx & 0x1ff == 0:
                 sys.stderr.write('compute referrers %8d / %8d \r'
                                  % (idx, total))
             address = obj.address
-            address = id_cache.setdefault(address, address)
+            address = unique_address(address, address)
             for ref in obj.ref_list:
-                ref = id_cache.setdefault(ref, ref)
-                referrers.setdefault(ref, []).append(address)
+                ref = unique_address(ref, ref)
+                refs = referrers.get(ref, None)
+                t = type(refs)
+                if refs is None:
+                    refs = address
+                elif t is int:
+                    refs = (refs, address)
+                elif t is tuple:
+                    if len(refs) >= 10:
+                        refs = list(refs)
+                        refs.append(address)
+                    else:
+                        refs = refs + (address,)
+                elif t is list:
+                    refs.append(address)
+                else:
+                    raise TypeError('unknown refs type: %s\n'
+                                    % (t,))
+                referrers[ref] = refs
+        del id_cache
         for obj in self.objs.itervalues():
-            obj.referrers = referrers.get(obj.address, ())
+            try:
+                refs = referrers.pop(obj.address)
+            except KeyError:
+                obj.referrers = ()
+            else:
+                if type(refs) is int:
+                    obj.referrers = (refs,)
+                else:
+                    obj.referrers = refs
         if self.show_progress:
             sys.stderr.write('compute referrers %8d / %8d \n'
                              % (idx, total))
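
The id_cache.setdefault(address, address) idiom in the diff above acts as a simple interning table: every equal address value maps back to the first int object seen with that value, so later duplicates can be released. A minimal stand-alone sketch, assuming nothing beyond a plain dict, is shown below. Only id_cache and unique_address are names from the diff; intern_address and the sample value are hypothetical.

```python
# Hypothetical stand-alone illustration; only id_cache and unique_address
# are names taken from the diff.
id_cache = {}
unique_address = id_cache.setdefault   # bound method, looked up once


def intern_address(value):
    """Return the first int object seen with this value."""
    return unique_address(value, value)


# Two equal int objects, each built at run time from a string.
first = int("140737488355328")
second = int("140737488355328")

shared_first = intern_address(first)
shared_second = intern_address(second)

# Both lookups hand back the same object, so the duplicate can be freed.
same_object = shared_first is shared_second
```

Binding id_cache.setdefault to a local name, as the diff does, avoids repeating the attribute lookup on every iteration of a loop that may run over millions of objects.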