MySQL in chk-inventory
John Arbash Meinel
john at arbash-meinel.com
Wed Dec 10 16:59:44 GMT 2008
I finally managed to get a full conversion of my MySQL repository into a
split-inventory repository. The results are pretty much what we've been
expecting, but I figure it is nice to have actual numbers:
Commits: 63546
                   Raw            %  Compressed            %   Objects
Revisions:      127574 KiB   0%       50804 KiB   5%     63546
Inventories:   1012790 KiB   4%      567565 KiB  59%   1263035
Texts:        20555561 KiB  94%      331055 KiB  34%    259395
Signatures:          0 KiB   0%           0 KiB   0%         0
Total:        21695927 KiB 100%      949425 KiB 100%   1585976
Extra Info:            count    total  avg  stddev  min  max
internal node refs    858037  8011413    9     8.3    2   29
internal p_id refs     60426   414721    6     8.0    2   29
inv depth             269751  1757736    6     2.8    1   17
leaf node items       269751  1734108    6     4.6    1   18
leaf p_id items        11275   113518   10     8.6    1   38
p_id depth             11275   120390   10     5.1    1   23
The average depth of the inventory is 6, but the average depth of the
parent_id,basename => file_id map is 10, with max depths of 17 and 23,
respectively.
We end up with an average of 19.7 inventory nodes per revision, which is
8.9kB per revision in compressed form (approx 450B per node) and 15.9kB
uncompressed (approx 800B per node). This is pretty far off the 4kB we
were originally thinking for each node.
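As a rough cross-check, the per-revision and per-node figures can be
re-derived from the Inventories row of the table. This is only a sketch:
it assumes the Objects column counts inventory nodes and that sizes are in
KiB (1024 bytes), and the variable and function names are made up for
illustration. Computed this way the node count comes out nearer 19.9 per
revision than the 19.7 quoted, so the quoted figure was probably derived
from a slightly different count.

```python
# Figures copied from the table above (Inventories row and Commits line).
KIB = 1024  # assumption: the table's KiB are 1024-byte units

commits = 63546
inv_objects = 1263035        # assumption: one object per inventory node
inv_raw_kib = 1012790
inv_compressed_kib = 567565


def per_revision_and_node(total_kib, objects, revisions):
    """Return (KiB per revision, bytes per node) for a total size in KiB."""
    kib_per_rev = total_kib / revisions
    bytes_per_node = total_kib * KIB / objects
    return kib_per_rev, bytes_per_node


nodes_per_rev = inv_objects / commits
raw_per_rev, raw_per_node = per_revision_and_node(
    inv_raw_kib, inv_objects, commits)
comp_per_rev, comp_per_node = per_revision_and_node(
    inv_compressed_kib, inv_objects, commits)

print(f"nodes per revision:      {nodes_per_rev:.1f}")
print(f"compressed per revision: {comp_per_rev:.1f} KiB "
      f"({comp_per_node:.0f} B per node)")
print(f"raw per revision:        {raw_per_rev:.1f} KiB "
      f"({raw_per_node:.0f} B per node)")
```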
For the file_id=>inventory_entry map, we have 269k leaf nodes versus 858k
internal nodes, or about 3:1 internal versus leaf.
For the parent_id,basename=>file_id map, we have 11.2k leaf versus 60k
internal nodes, or about 5:1, which is comparable to the average depth of
10 versus 6.
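The internal-to-leaf ratios and the depth comparison can be checked the
same way. Again a sketch only: the counts are taken from the "count"
column of the Extra Info table, the depths are the rounded averages from
that table, and the dictionary layout and names are just for illustration.

```python
# Node counts and average depths copied from the Extra Info table.
stats = {
    "inventory": {"internal": 858037, "leaf": 269751, "avg_depth": 6},
    "p_id":      {"internal": 60426,  "leaf": 11275,  "avg_depth": 10},
}


def internal_to_leaf(entry):
    """Ratio of internal nodes to leaf nodes for one map."""
    return entry["internal"] / entry["leaf"]


ratios = {name: internal_to_leaf(entry) for name, entry in stats.items()}

# How much "taller" the p_id map is, measured two ways.
ratio_of_ratios = ratios["p_id"] / ratios["inventory"]
ratio_of_depths = stats["p_id"]["avg_depth"] / stats["inventory"]["avg_depth"]

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} internal nodes per leaf node")
print(f"p_id vs inventory, internal:leaf ratio: {ratio_of_ratios:.2f}x")
print(f"p_id vs inventory, average depth:       {ratio_of_depths:.2f}x")
```

Both comparisons land at roughly 1.7x, which is the sense in which the
5:1 versus 3:1 ratios line up with the depths of 10 versus 6.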
It is interesting to me to see how infrequently the tree-shape map
changes versus the inventory content map: a total of 113k leaf items
versus 1.7M.
Anyway, time for me to get back to actually improving things. For those
watching, I used a fairly hacked-up version of bzr to cache inventory
objects during extraction, etc., and it still took approx 15 hours to
convert everything (4 hours for the first 30k revs and 11 hours for the
remainder, though I think my machine started hitting swap, as it had a
peak memory consumption of around 1GB).
John
=:->