Compressing Revision Texts

John Arbash Meinel john at arbash-meinel.com
Wed Aug 27 21:07:32 BST 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I've been trying to investigate the regression for fetch in 1.6. It
fundamentally boils down to no longer having "get_data_stream_for_search":
without it we end up having to probe the indexes remotely, rather than just
getting a stream of data handed to us.

But I also found some interesting things from bzr-1.5. Doing a fetch of
bzr.dev, I found that out of 77MB in my pack file, 38.8MB was inventories,
10.4MB was revision texts, and 3.5MB was signatures.

That's >50% in inventories, 14% in revisions and 5% in signatures.
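
As a quick check of that arithmetic, here is a minimal Python sketch. The
sizes are the ones quoted above; the names PACK_TOTAL_MB, SECTION_MB and
shares() are made up for illustration and are not part of bzr.

```python
# Sizes in MB as reported above for a bzr.dev fetch with bzr-1.5.
PACK_TOTAL_MB = 77.0
SECTION_MB = {
    "inventories": 38.8,
    "revisions": 10.4,
    "signatures": 3.5,
}


def shares(total_mb, sections):
    """Return each section's share of the pack file as a percentage."""
    return {name: 100.0 * size / total_mb for name, size in sections.items()}


if __name__ == "__main__":
    for name, pct in shares(PACK_TOTAL_MB, SECTION_MB).items():
        print("%-12s %5.1f%%" % (name, pct))
```

The rounded results match the figures above: inventories come out just over
half of the pack, revisions round to 14% and signatures to 5%.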

The inventories number makes me think we need "bzr pack" to compact inventory
texts. At least until Robert's reworking of the inventory layer happens.

But it also makes me wonder if we wouldn't want to try to compress revision
texts a bit better. I know we've avoided it because it doesn't seem like they
will compress terribly well, but at 14% of the total data, there is certainly
room for improvement.
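
To illustrate why small, similar texts might still have something to gain,
here is a rough sketch comparing zlib on each text separately against zlib on
all of them concatenated. Everything in it is an assumption for illustration:
the record layout in make_revision_text() is a simplified stand-in rather
than the real bzr revision format, the data is synthetic, and concatenating
before compressing is only a crude proxy, not what groupcompress does.

```python
import zlib


def make_revision_text(n):
    """Build a hypothetical revision-like record (not the real bzr format)."""
    return (
        "committer: Jane Developer <jane@example.com>\n"
        "timestamp: %d\n"
        "timezone: 0\n"
        "parent: jane@example.com-%08d\n"
        "message: fix issue number %d in the fetch code\n"
        % (1219867652 + n, n, n)
    ).encode("utf-8")


def individual_size(texts):
    """Total bytes when every text is compressed on its own."""
    return sum(len(zlib.compress(t)) for t in texts)


def grouped_size(texts):
    """Total bytes when all texts are compressed as one stream."""
    return len(zlib.compress(b"".join(texts)))


if __name__ == "__main__":
    texts = [make_revision_text(n) for n in range(200)]
    print("raw:        %d bytes" % sum(len(t) for t in texts))
    print("individual: %d bytes" % individual_size(texts))
    print("grouped:    %d bytes" % grouped_size(texts))
```

On synthetic data like this the grouped stream is smaller, because the
repeated field names and committer strings are only paid for once. How much
of that carries over to real revision texts would need to be measured.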

Anyway, probably just something to keep in mind for "groupcompress" when it
comes out.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFItbQEJdeBCYSNAAMRAo+EAKCrL5JxA9HdbV6mULv4riOFGBgNzACgxYOP
A5Ff4ek2i9VN+Hq1eDlrBqs=
=mT22
-----END PGP SIGNATURE-----