[Extension] Dirty hack of 'shelve' and 'unshelve' command

Tue May 31 15:34:57 BST 2005

Aaron Bentley wrote:
> John Arbash Meinel wrote:
> 
> 
>>>If I am not mistaken, does this mean you are adding the changes into the
>>>revision history, but just as a snapshot/shelf revision?
> 
> 
> I'd say "adding the changes into the revision stores", rather than
> "revision history", but yes, this was one thing I proposed.
> 

Sure, not history, just the store.

> 
>>>My concern with
>>>that, is that it is not 'nuclear codes' safe in the case of the revfile
>>>format.
> 
> 
> So some other options I suggested included using temp stores for
> unique-to-snapshot items, and just using plain full-tree copies.
> 
> 
>>>Because revfiles are append only. Although you probably could
>>>'compact' revfiles, which would use the index file to remove anything
>>>that was not explicitly referenced anywhere. This also makes sense for
>>>the "undo the last commit" fix.
> 
> 
> You could also just wipe the portions that contained the nuclear launch
> codes and remove the launch codes from the indices.
> 

This is probably the easiest thing to do, with something like a
'nuke-selected-revision-id' command which would destroy a specific
section. The only problem is if the codes were added but the indexes
were not updated yet, so they are sitting in the file without anyone
knowing about it. Hence 'bzr compact', to remove everything that
isn't referenced.

> 
>>>All you really have to do is go through all of the steps for a commit,
>>>but when you are done, you don't add the revision id into the
>>>.bzr/revision-history file. Instead you store it in something like
>>>.bzr/x-shelf, or .bzr/x-snapshot
>>>
>>>Is this reasonable, or are you bloating the revision/text store?
> 
> 
> Unless you change the file storage, you'll still bloat the stores.  But
> I'm not sure whether the amount of bloat would be significant.
> 

I agree, this doesn't bloat it much. Actually, if you were super
special, you might notice in a later unshelve that the changes already
exist, and just reference them. But that is probably tricky to get
right. And it might cause non-increasing references, i.e.:

commit - 1
shelf - 2
commit - 3
unshelve and commit -> 2

I don't know if the internal code would care, it seems like revision
history just points to a revision id, which just looks at an inventory
id, which just references all of the appropriate text ids. Nobody really
would care what order they are in the files. The bigger difficulty would
be figuring out that the current commit matches a previous entry in the
revfile.
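
One way to picture the "notice that the changes already exist" step is
a store keyed by content hash. This is only a sketch under that
assumption: 'ToyTextStore' and its methods are made up for
illustration, and are not how bzr's text store or revfiles actually
assign ids.

```python
import hashlib


class ToyTextStore:
    """Hypothetical content-addressed store (not bzr's real storage code)."""

    def __init__(self):
        self._texts = {}   # text id (SHA-1 hex digest) -> content bytes
        self.order = []    # text ids in the order they were first written

    def add(self, content: bytes) -> str:
        """Store content if it is new; return its id either way."""
        text_id = hashlib.sha1(content).hexdigest()
        if text_id not in self._texts:
            self._texts[text_id] = content
            self.order.append(text_id)
        return text_id

    def position(self, text_id: str) -> int:
        """Return where in the store this text was first written."""
        return self.order.index(text_id)


if __name__ == "__main__":
    store = ToyTextStore()
    history = []                                  # like .bzr/revision-history
    history.append(store.add(b"commit one"))      # commit  - entry 0
    shelved = store.add(b"shelved change")        # shelf   - entry 1
    history.append(store.add(b"commit three"))    # commit  - entry 2
    history.append(store.add(b"shelved change"))  # unshelve and commit
    # The last history entry points back at entry 1, so the positions
    # referenced by the history are no longer increasing.
    print([store.position(text_id) for text_id in history])
```

With ids derived from content, the lookup is cheap; the hard part
described above only shows up when ids are not derived from content.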

> 
>>>I certainly think it would be useful to have a 'bzr compact' command.
>>>Basically lock the branch, iterate through all of the inventories, and
>>>remove all files that are not explicitly referenced.
>>>
>>>In the case of hardlinked revfiles, you might have problems, as it might
>>>have been referenced in another tree. But you certainly could make the
>>>statement that "bzr compact" breaks hard-links,
> 
> 
> From what I understand, revfiles are already updated in a hard-link-safe
> way.

Actually, earlier discussions indicated that a revfile can be updated in
place, since having unknown contents is fine: you use the index to find
everything. And having indexes updated in place is also fine, since you
only ever use the revision history to look up what you need. Other than
locking issues, it seems like you could have all of your revisions in
one giant pool, and revfiles would still work.

Some of this is hypothetical, though, as revfiles are not the currently
used storage mechanism.
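
To make the index-versus-data argument concrete, here is a toy model
of an append-only data file with a separate index. The layout, the
class name 'ToyRevfile' and the 'nuke' and 'compact' methods are all
assumptions for illustration, not the real revfile format. A real
format that stores deltas between entries would need more care.

```python
class ToyRevfile:
    """Hypothetical append-only data area plus an id -> (offset, length) index."""

    def __init__(self):
        self.data = bytearray()
        self.index = {}

    def append(self, key, payload: bytes, indexed: bool = True) -> int:
        """Append payload; indexed=False models data written but never indexed."""
        offset = len(self.data)
        self.data += payload
        if indexed:
            self.index[key] = (offset, len(payload))
        return offset

    def read(self, key) -> bytes:
        """Readers only ever go through the index."""
        offset, length = self.index[key]
        return bytes(self.data[offset:offset + length])

    def nuke(self, key) -> None:
        """Overwrite one entry's bytes with zeros and drop it from the index."""
        offset, length = self.index.pop(key)
        self.data[offset:offset + length] = bytes(length)

    def compact(self) -> None:
        """Rebuild the data area from indexed entries only."""
        new_data = bytearray()
        new_index = {}
        for key, (offset, length) in self.index.items():
            new_index[key] = (len(new_data), length)
            new_data += self.data[offset:offset + length]
        self.data = new_data
        self.index = new_index
```

'nuke' handles the case where you know which entry holds the launch
codes; 'compact' handles the case where they were written but never
made it into the index.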

> 
>>>and probably could
> 
>>>(optionally) warn about broken hard-links so that you could track down
>>>your special files. (I don't know that you can say what other file
>>>hardlinks to this one, but something like that would be nice.)
> 
> 
> Yes, if you've got nuclear launch codes, you'll want to get rid of all
> copies.
> 

Is the only way to do this to stat() all files in a given filesystem,
and look for ones with the same inode numbers?
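
As far as I know there is no portable way to ask the filesystem for
the other names of an inode, so scanning is what is left. A minimal
sketch of that scan follows; the function name 'find_hardlink_groups'
is made up, and it only sees names under the directory you give it.

```python
import os
import stat
from collections import defaultdict


def find_hardlink_groups(root):
    """Group regular files under root that share a (device, inode) pair."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            # st_nlink > 1 means at least one other name exists somewhere
            # on the same filesystem, possibly outside root.
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                groups[(st.st_dev, st.st_ino)].append(path)
    return {key: sorted(paths) for key, paths in groups.items()}


if __name__ == "__main__":
    import sys

    for (dev, ino), paths in find_hardlink_groups(sys.argv[1]).items():
        print(dev, ino, paths)
```

A group with a single path means the other links live outside the
scanned directory, which is exactly the case a warning would need to
report.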

> 
>>>So I guess this is 2 threads, bzr shelf is very nice, and could probably
>>>be implemented as a real-life commit (possibly partial commit), with the
>>>final revision id ending up in some alternative file.
>>>
>>>Thread 2, bzr compact could clean up all files under .bzr/ such that
>>>unreferenced files are removed. This might be simply bzr branch to a
>>>temp dir, and then replace the original, if abentley's comment about his
>>>bzr branch compacting the tree is true.
> 
> 
> One of the limitations of remote file access is that you can't list what
> items are present in a store.  So I can only copy items that are
> referenced in the revision history or inventory of a revision.
> 

Well, it also depends how you handle revfiles in the future. When you
branch from remote, you will have to know the revfile and revindex
names. So you *could* just copy the complete files across. Or you could
recreate the revfile and revindex based on the revision history and
inventory.
Right now, because the text store is just compressed whole files, it
doesn't matter.

> 
>>>Though I'm thinking it is more
>>>efficient to just scan for files and fix them one by one.
> 
> 
> Yes, delete is usually faster than copy.  It's just a bit trickier that
> way, but no biggie.
> 
> Aaron

I think it would be nice to have a command to get the cruft out of your
revision store. It seems like this should be separate from branch, and
should probably be aware of hardlinked files. (In the current storage
mechanism, you can just unlink the files; with revfiles you would need
to rewrite indexes.)
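
For the current one-file-per-text layout, the sweep could look roughly
like this. It is a sketch under assumptions: 'compact_store' is a
made-up name, the store is taken to be a flat directory whose file
names are the ids with an optional '.gz' suffix, and collecting
'referenced_ids' from the revision history and inventories is left to
the caller.

```python
import os


def compact_store(store_dir, referenced_ids, dry_run=True):
    """Unlink store files whose id is not in referenced_ids.

    Returns the names that were (or, with dry_run, would be) removed.
    """
    removed = []
    for name in sorted(os.listdir(store_dir)):
        path = os.path.join(store_dir, name)
        if not os.path.isfile(path):
            continue
        # Assumed naming: "<id>" or "<id>.gz". Ids may contain dots,
        # so only a trailing ".gz" is stripped.
        file_id = name[:-3] if name.endswith(".gz") else name
        if file_id not in referenced_ids:
            if not dry_run:
                os.unlink(path)
            removed.append(name)
    return removed
```

Note that unlinking only removes this tree's name for the file; any
hard-linked copy in another tree keeps the contents alive.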

John
=:->