[RFC] Tracking file copies

John Arbash Meinel john at arbash-meinel.com
Mon Jun 19 16:46:31 BST 2006


Jelmer Vernooij wrote:
> On Sun, 2006-06-18 at 17:51 -0500, John Arbash Meinel wrote:

...

>>> I'd like to start working on a spec for this, if nobody objects to
>>> supporting tracking file copies.
>> Both Mercurial and Subversion support copies, and as a result neither
>> one of them supports merge after rename.
> Ah, I knew there was a catch somewhere. I hadn't considered how
> this would work with merge.

Well, since more than 1 group is doing it, I don't think you're the only
one. :)

...

> Yeah, Subversion has the same issue. Subversion files can be added,
> removed, copied or "replaced" - that's the way they handle rename +
> readd, which is a bit ugly imho, and very weird if you don't know what
> problem it's trying to solve.

I hadn't heard of 'replaced'. That's the problem with kludges, though.
You tend to need to keep layering them on top of each other. Not that bzr
doesn't have some, but hopefully we can keep our count as small as
possible. That, and I think we are willing to revisit our basic model if
we find holes in it.

> 
>> I think a better thing to spend your time on is figuring out what you
>> actually want to be able to do with copies. And work out how to make
>> that happen. I could see having a file copy over its annotations from
>> another knit. So that you still see the documentation about why/who
>> changed a line.
> I think it would make sense to be able to record that one file was
> originally copied from another - not necessarily say the two are the
> same. That way, only one file (the original one) is candidate for the
> merge.
> 
> I am mainly interested in this information as metadata - so it can be
> imported from and exported to other version control systems. It might
> also help in terms of storage for some storage backends (in the case of
> weaves, keep both files in one weave?).

Doing it as metadata is not really a problem. The question is what kind
of performance you want from this. E.g., are you willing to wait for a
complete search through history to find the point where the file_id was
marked as a copy of the other id? Do we only allow creating a copy
record when the file is created, or can you do it at any time?
Do you want to create a new separate file
(.bzr/repository/file-id-copies.knit), or just a revision property,
or just a record in the knit index, or ...
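
To make the performance question concrete, here is a minimal sketch in
plain Python. It is not bzrlib code: the history model (a dict of
revisions with a single parent each) and every function name are
hypothetical, chosen only to contrast a full walk through history with
a lookup in a separately stored index.

```python
# Hypothetical model, not bzrlib's API. Each revision is a dict holding its
# parent id (linear history, for simplicity) and the copy records it added.

def make_history(n, copy_at, new_id, source_id):
    """Build a linear history of n revisions; revision copy_at records a copy."""
    history = {}
    parent = None
    for i in range(n):
        rev_id = "rev-%d" % i
        copies = {new_id: source_id} if i == copy_at else {}
        history[rev_id] = {"parent": parent, "copies": copies}
        parent = rev_id
    return history, parent  # parent now names the tip revision


def find_copy_source_by_walk(history, tip, file_id):
    """Walk back from tip until some revision records file_id as a copy.

    Returns (source_file_id, revisions_visited); source is None if not found.
    """
    visited = 0
    rev_id = tip
    while rev_id is not None:
        visited += 1
        rev = history[rev_id]
        if file_id in rev["copies"]:
            return rev["copies"][file_id], visited
        rev_id = rev["parent"]
    return None, visited


def build_copy_index(history):
    """One pass over all revisions, producing file_id -> source_file_id."""
    index = {}
    for rev in history.values():
        index.update(rev["copies"])
    return index
```

With a copy recorded early in a long history, the walk visits almost
every revision before it finds the record, while the index answers with
a single dict lookup. The cost of the index is that something has to
store it and keep it up to date.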

If you do it at the right layer, I think we would be happy to support
it. But some layers have more/less impact on performance.

> 
> The reason I'm looking at this for Subversion is that it might help me
> to get rid of the revision cache I'm currently keeping. I need to
> traverse history in order to find the proper file id, even for files in
> the latest revision. File id aliases might also be of help here,
> instead of copies.
> 
>> I think we might be able to do some neat things with a 'file_id X,
>> copied from file_id Y'. But adding the ability to copy files can
>> *really* complicate the model.
> Just tracking the information wouldn't really hurt, though? It would
> come in handy for roundtripping.

Just recording the information is fine. But if you want it for
'roundtripping' you need a way of accessing it. And different recording
locations have different performance implications. If you are willing to
settle for any performance as long as the info is recorded, I'm sure we
can find a place to record it. Right now you could record it as a
Revision Property. Not really the right place, and requires a bunch of
indirections to find it, but you could make it work without any changes
to our current storage model.
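
As a rough illustration of the revision property route, the sketch below
packs copy records into a single string value and unpacks them again. It
assumes only that a revision's properties behave like a mapping from
string keys to string values; the property name "file-copies", the
tab/newline encoding, and all function names are hypothetical and not
part of bzr.

```python
# Hypothetical encoding; the property name and format are illustrative only.
COPY_PROPERTY = "file-copies"  # assumed name, not an existing bzr property


def encode_copies(copies):
    """Serialize {new_file_id: source_file_id} into one property string."""
    lines = []
    for new_id, source_id in sorted(copies.items()):
        for file_id in (new_id, source_id):
            if "\t" in file_id or "\n" in file_id:
                raise ValueError("file ids must not contain tabs or newlines")
        lines.append("%s\t%s" % (new_id, source_id))
    return "\n".join(lines)


def decode_copies(value):
    """Parse the property string back into {new_file_id: source_file_id}."""
    copies = {}
    for line in value.split("\n"):
        if not line:
            continue  # an empty property value yields no records
        new_id, source_id = line.split("\t")
        copies[new_id] = source_id
    return copies


def copy_source(revision_properties, file_id):
    """Return the copy source recorded for file_id in one revision, or None."""
    value = revision_properties.get(COPY_PROPERTY, "")
    return decode_copies(value).get(file_id)
```

This only covers one revision's properties. A reader still has to work
out which revision carries the record for a given file_id, load that
revision, and decode the value, which is where the indirections come in.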

> 
>> I'm not specifically trying to scare you off. I would be very interested
>> in reading a potential spec. But I do want to warn you about some
>> potential pitfalls.
> Thanks for the reply. I'm just considering what the best solution would
> be to improve performance of the Subversion plugin. Caching the revision
> history doesn't take long (about a minute for 16000 revisions here, and
> it never expires), but it's not really neat to have to keep a cache.
> 
> Cheers,
> 
> Jelmer
> 

As much as is reasonable, I would like to support round-tripping to
other VCSs. So if you can find some reasonable ways to do it, I think we
would be willing to support it.

John
=:->
