Initial results using 'hash trie'
John Arbash Meinel
john at arbash-meinel.com
Tue Dec 23 20:44:52 GMT 2008
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
I hacked together some work to use the zlib crc32 instead of the raw
file-id as the lookup key for split inventories. I got some interesting
results.
1) You can't use the raw crc32, because it will occasionally introduce a
'\n' character, which KnitVersionedFiles really doesn't like. But you
can map that to some other character as long as you do it consistently.
2) I used both "struct.pack('>i', crc).replace('\n', '_')" and "'%08X' %
crc". The former gives a 255-way fan out, the latter gives 16-way fan out.
The results for the first 1034 revisions of mysql are as follows:
original (no hash)
Commits: 1043
Raw % Compressed % Objects
Revisions: 3990 KiB 0% 1228 KiB 5% 1043
Inventories: 17905 KiB 1% 9503 KiB 43% 20970
Texts: 882272 KiB 97% 11174 KiB 51% 7226
Signatures: 0 KiB 0% 0 KiB 0% 0
Total: 904168 KiB 100% 21906 KiB 100% 29239
Extra Info: count total avg stddev min max
internal node refs 11864 114461 9 8.1 2 27
internal p_id refs 1256 6840 5 5.7 2 24
inv depth 6005 35566 5 3.5 1 9
leaf node items 6005 34646 5 5.2 1 16
leaf p_id items 802 6566 8 9.1 1 33
p_id depth 802 7050 8 4.2 1 14
16-way fan out
Commits: 1043
Raw % Compressed % Objects
Revisions: 3990 KiB 0% 1228 KiB 6% 1043
Inventories: 11982 KiB 1% 7882 KiB 38% 19390
Texts: 882272 KiB 98% 11174 KiB 55% 7226
Signatures: 0 KiB 0% 0 KiB 0% 0
Total: 898245 KiB 100% 20285 KiB 100% 27659
Extra Info: count total avg stddev min max
internal node refs 9906 127314 12 5.3 7 16
internal p_id refs 621 5373 8 4.6 2 16
inv depth 7232 28832 3 2.4 1 4
leaf node items 7232 15841 2 1.7 1 14
leaf p_id items 588 7550 12 11.3 1 52
p_id depth 588 2505 4 1.7 1 6
255-way fan out
Commits: 1043
Raw % Compressed % Objects
Revisions: 3990 KiB 0% 1228 KiB 4% 1043
Inventories: 30982 KiB 3% 16000 KiB 56% 11993
Texts: 882272 KiB 96% 11174 KiB 39% 7226
Signatures: 0 KiB 0% 0 KiB 0% 0
Total: 917245 KiB 100% 28403 KiB 100% 20262
Extra Info: count total avg stddev min max
internal node refs 1932 279512 144 120.2 12 255
internal p_id refs 343 26541 77 46.8 2 169
inv depth 7029 15989 2 1.0 1 3
leaf node items 7029 54469 7 5.5 1 14
leaf p_id items 1646 5342 3 6.8 1 52
p_id depth 1646 5458 3 1.4 1 4
Interesting bits:
1) Going from flat to 16-way fan out dropped the raw inventory size from
18MB => 12MB, but going to 255-way increased it to 31MB. And the
compressed size is pretty much the same.
The big reason is that with a hash prefix, we are probably very
successful in creating a fully dense root node, which changes on every
commit.
If each entry is approx 50 bytes, then you have 255*50 =~ 13kB for the
root node. So for 1k revisions, just the root takes 13k*1k = 12.7MB. You
also have 2 of those maps (because you have the p_id as well as the
file_id map), but the p_id map doesn't change with every commit.
Note that this doesn't use any delta compression.
2) With 1k revisions, we are averaging about 7 leaf nodes per commit. At
33k revs, it seems to go down to 4.8 leaf nodes per commit.
3) It is interesting to note that shrinking that top-level node has a
very good effect on total size. It at least makes me wonder about using
a variable fan-out at each level.
I think we can work out an explicit equation for how much data we expect
to encounter. Consider a tree with 8000 entries, with 2 changing on
every revision, and a 16-way fan out at each level. Assume that the
layout is uniformly distributed, and an internal line takes 50 bytes.
The top level node will be a fully packed internal node with 16 records.
That is 16*50=800 bytes. It will change on every commit.
At the next level, we have 16 internal nodes each with 16 records. They
change with an average frequency of 1 in 16 (because of the node above).
On average, we will have 2 of these pages change. (I think the average
is actually lower than 2, because of the chance that both changes fall
into the same place. I think the chance is 1/16, but I'm not positive.)
So 2 nodes with 16 entries is 2*16*50 = 1600 bytes.
We can still only resolve 16*16*20=5120 entries, and we have a tree of
8000, so we would expect one more level.
At the 3rd level, we have again 2 nodes change (the probability they
collide is probably 1/256). So you have 2*16*50 = 1600 bytes
At the leaf level, we have 8000 / 16^3 = 2 entries per leaf node. Leaf
node entries are probably around 300 bytes. 2*300*2 = 1200 bytes.
Which is 5.2kB per commit.
I think the average tree depth is going to be:
ciel(log(tree_size / max_node_per_leaf) / log(entries_per_internal))+1
For example:
ciel(log16(8000 / 20)) = ciel(2.2) = 3 + 1 = 4
(3 internal nodes and 1 leaf node)
The average number of bytes that get written will be:
bytes_per_int_entry * entries_per_internal
* (1 + num_changed * (int_depth-1))
+ bytes_per_leaf_entry * entries_per_leaf * num_changed
So we have a portion which is "entries_per_internal * int_depth" where
int_depth is ~1/log(entries_per_internal)
And entries_per_internal / log(entries_per_internal) for realistic
values is pretty much just ~entries_per_internal.
So while we increase round trips, we get smaller data by decreasing the
number of entries per internal node.
I guess another way to look at it, is that by decreasing the number of
entries per node, you only increase the depth logarithmically, but you
shrink the node size linearly. (Increasing from 16=>255 only increased
the depth by 1, but increased the size of the root node by 16.)
4) So by the result from (3), shrinking the number of entries per
internal node will always shrink the total size of the inventory. So the
question becomes what is the final effect on other metrics. Latency for
copying, time for comparison, etc.
I'll go ahead and try a test using '%011o'. The interesting problem,
though, is that '%o' doesn't have a full 8-way fan out in the first
byte. It seems to only contain the values 0, 1, 2. It may be interesting
to see what happens if you shrink the top-most node to that small.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAklRTcQACgkQJdeBCYSNAAPS6ACfWKE6rToPGZ0ji+YwsArhq5ch
mUgAoIBv2EJuqVR9GVRMYu8r7hpdoMHG
=GXc/
-----END PGP SIGNATURE-----
More information about the bazaar
mailing list