OpenOffice
John Arbash Meinel
john at arbash-meinel.com
Fri Jun 2 23:37:01 BST 2006
Urs Lerch wrote:
> Hi there,
>
> Just installed Bazaar-NG and played a little around. It's just what I was
> looking for: as small and easy as possible.
>
> For my research project I'll work a lot with OpenOffice and would like to
> put the files in a version control system. (Of course I could do the
> versioning in OpenOffice itself, but that's only one part of the game.)
> Now the problem is that the original files OpenOffice writes are binary
> and it would be better to version the unzipped content. I haven't done
> tests so far with other possible formats like DocBook and don't know if
> they would work for me.
>
> So my general question is: In your opinion, would it make sense to do the
> transformation in Bazaar-NG or should it be done outside (in OpenOffice
> itself or by a script)?
>
> Thanks,
> Urs
From a delta compression perspective, extracting them into texts would
probably be better. I could see a plugin which would check at commit
time, and would extract the right files, so that bzr would see the
individual files, rather than a zip file.
So your repository size (size of .bzr) would be smaller.
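The extraction step on its own is simple. Here is a minimal sketch in
Python, assuming the document is an ordinary zip archive (which is what
"unzipped content" above implies). The function name unpack_document is
hypothetical, and nothing here hooks into bzr: a real plugin would still
need to call something like this at commit time, and how that hook would
look is not shown.

```python
import zipfile
from pathlib import Path


def unpack_document(doc_path, dest_dir):
    """Extract every member of a zip-based document into dest_dir.

    Hypothetical helper for illustration only; it is not part of bzr.
    Returns the sorted list of member names that were extracted.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(doc_path) as archive:
        # After this call the version control tool sees individual
        # files instead of one opaque zip file.
        archive.extractall(dest)
        return sorted(archive.namelist())
```

You would then version the contents of dest_dir rather than the
original file, and zip them up again before opening the document.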
However, any sort of merge operation could probably easily yield invalid
texts. I don't know what sort of XML OOo uses, but there is no guarantee
that inserting a line in the middle of other lines will result in a
valid text.
So if you wanted bzr to handle merging OOo documents, you would need
more than just extracting the text, you would also need an intelligent
merge, which could yield valid OOo, even when there are conflicts. Heck,
you could have an intelligent operation which would insert colored
conflict regions, but how do you handle conflicting <B></B> tags?
Or what about a more concrete example which fails badly. You have 4 lines:
A
B
C
D
Version A bolds 2 of the lines:
A
<b>
B
C
</b>
D
Version B italicizes 2 lines:
A
B
<i>
C
D
</i>
After merging these with a simple line-based diff + merge, you would
have a merge which did not conflict, and yielded:
A
<b>
B
<i>
C
</b>
D
</i>
Which is now invalid XML, and may not be possible to load in OOo to fix
the problems.
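The example above can be reproduced with a short Python sketch. This is
a toy, not bzr's merge algorithm: merge_inserts only handles the case
where both versions purely insert lines into the base, and it only
reports a conflict when both sides insert different text at the same
spot. All names are hypothetical, and the <doc> wrapper element is
added just so the standard XML parser has a single root to check.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher

BASE = ["A", "B", "C", "D"]
VERSION_A = ["A", "<b>", "B", "C", "</b>", "D"]
VERSION_B = ["A", "B", "<i>", "C", "D", "</i>"]


def insertions(base, other):
    """Map each gap index in base to the lines other inserted there."""
    gaps = {}
    matcher = SequenceMatcher(None, base, other, autojunk=False)
    for tag, i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            gaps[i1] = other[j1:j2]
        elif tag != "equal":
            raise ValueError("this sketch only handles insert-only edits")
    return gaps


def merge_inserts(base, a, b):
    """Toy line-based three-way merge for insert-only changes."""
    gaps_a, gaps_b = insertions(base, a), insertions(base, b)
    merged = []
    for i in range(len(base) + 1):
        if i in gaps_a and i in gaps_b and gaps_a[i] != gaps_b[i]:
            raise ValueError("conflict at gap %d" % i)
        merged.extend(gaps_a.get(i) or gaps_b.get(i) or [])
        if i < len(base):
            merged.append(base[i])
    return merged


def is_well_formed(lines):
    """Return True if the lines parse as XML inside a wrapper element."""
    try:
        ET.fromstring("<doc>\n" + "\n".join(lines) + "\n</doc>")
    except ET.ParseError:
        return False
    return True


merged = merge_inserts(BASE, VERSION_A, VERSION_B)
print("\n".join(merged))
print("A ok:", is_well_formed(VERSION_A))
print("B ok:", is_well_formed(VERSION_B))
print("merged ok:", is_well_formed(merged))
```

Each version parses on its own, and the merge raises no conflict
because no two insertions land in the same gap, but the merged text is
rejected by the parser because <b> and <i> overlap.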
So basically, without a merge operator, you could do plain diffs just to
get smaller storage. But merging would likely corrupt everything.
If they are treated as binary files, bzr should not try to merge them,
and just fail saying "these files are different, you handle merging
them". Which, at this point, is the right thing to do.
So for now, I don't think you will be helped by breaking up an OOo file
into its components.
John
=:->