Binary file support
Aaron Bentley
aaron.bentley at utoronto.ca
Thu Oct 13 14:39:44 BST 2005
Martin Pool wrote:
> On 13/10/05, John Arbash Meinel <john at arbash-meinel.com> wrote:
>
>>I know Aaron mentioned a patch in the past, to add a binary flag to
>>files, so that we can more properly handle diff and merge.
>
>
> I'd rather have bzr just notice that the file is binary and therefore
> shouldn't be run through a text diff or merge.
Well, it depends on how and where you intend to detect binaries.
diff's heuristic for 'binary' is reported to be 'contains NUL in the
first 1k'. For text diffing, another useful test is 'contains VT-102
control characters'.
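As a rough illustration of the two tests above, here is a minimal Python sketch. The function names are made up for this example, and the set of "terminal-unsafe" bytes is an assumption (C0 control bytes other than tab, LF and CR, plus DEL), since the exact set of VT-102 control characters is not spelled out here.

```python
# Hypothetical helpers; names and the unsafe-byte set are assumptions.

def looks_binary(data: bytes, window: int = 1024) -> bool:
    """Return True if a NUL byte appears in the first `window` bytes."""
    return b"\x00" in data[:window]


# Control bytes likely to upset a terminal: everything below 0x20
# except tab, line feed and carriage return, plus DEL (0x7F).
_TERMINAL_UNSAFE = (frozenset(range(0x20)) - {0x09, 0x0A, 0x0D}) | {0x7F}


def has_terminal_control_bytes(data: bytes) -> bool:
    """Return True if any byte in `data` is in the assumed unsafe set."""
    return any(b in _TERMINAL_UNSAFE for b in data)
```

Note that the first test only looks at the head of the file, so a NUL further in goes unnoticed, while the second scans everything and is therefore stricter and slower.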
The problems with diffing binaries are:
1. The diffs don't (usually) tell humans anything useful.
2. They mess up terminals.
I would prefer to not run diff on binary files, rather than running it,
then having it fail halfway through. It's especially ugly if you're
iterating through the diff, because you may have printed some of it
before you realize you're diffing a binary.
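To make the "check first, then diff" ordering concrete, here is a small sketch using Python's standard difflib. It is not bzr's diff code; `safe_unified_diff` is a hypothetical name, and the NUL-in-first-1k test stands in for whatever detection is actually chosen.

```python
import difflib


def safe_unified_diff(old: bytes, new: bytes, name: str) -> list:
    """Decide binary-ness before emitting any diff output."""
    # Check both sides up front, so nothing is printed before we know.
    if b"\x00" in old[:1024] or b"\x00" in new[:1024]:
        if old == new:
            return []
        return ["Binary files %s differ\n" % name]
    old_lines = old.decode("utf-8", "replace").splitlines(keepends=True)
    new_lines = new.decode("utf-8", "replace").splitlines(keepends=True)
    return list(difflib.unified_diff(old_lines, new_lines,
                                     "a/" + name, "b/" + name))
```

Because the decision is made before the first line is produced, a caller iterating over the result never ends up with half a text diff followed by a failure.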
Similarly, I would rather not attempt a text merge on binaries, instead
of trying and failing. Having it as a property would make it cheap to
do detection in advance. It would also allow the user to force a file
to be treated as text or binary when the heuristic was wrong.
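A sketch of how a stored property could combine with the heuristic: an explicit value wins, and the heuristic is only a fallback. The three-state flag (True, False, or None for "not set") and the function name are assumptions for illustration, not an existing inventory field.

```python
from typing import Optional


def treat_as_binary(flag: Optional[bool], data: bytes) -> bool:
    """Decide text vs. binary, letting an explicit flag override the guess.

    `flag` is a hypothetical per-file property: True or False when the
    user (or an earlier detection run) has recorded an answer, None when
    nothing has been recorded.
    """
    if flag is not None:
        return flag
    # Fallback heuristic: NUL byte in the first 1k.
    return b"\x00" in data[:1024]
```

With this shape, a text file that happens to contain a NUL can be forced to text with `flag=False`, and a binary with no early NUL can be forced to binary with `flag=True`.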
I suppose another option would be to have a 'binary-test-cache', indexed
by sha-1 sum, but it just seemed simpler to ship the results of the test
around in the inventory.
> I think a fast weave-like format needs to allow storing full copies
> from time to time (like arch cacherevs), so that you don't need to
> traverse all of history.
Also, it allows you to truncate history.
> For binary files (or some binary files) we
> could just store a full copy every time, so avoiding calculating
> useless diffs but still using just a single format.
While some binaries will change dramatically with every revision (e.g.
compression formats), others like executables or tarfiles will be
largely the same. So I'd be inclined to allow all binaries to be weaved.
Aaron