problems with encodings for signed commits

John A Meinel john at arbash-meinel.com
Thu Dec 29 16:46:24 GMT 2005


Aaron Bentley wrote:
> Dafydd Harries wrote:
> | While importing a baz branch into bzr recently, I discovered that the
> | testament code fails if a commit message contains non-ascii characters.
> 
> | There is code that checks that the method doesn't return a unicode object,
> | but it's guarded by an "if __debug__", which I consider to be a bit odd.
> |
> | The unicode object in question originates in Commit.commit:
> 
> No, it doesn't.  It originates in baz_import.iter_import_version:
> 
> ~        commitobj.commit(branch, log_message.decode('ascii', 'replace'),
> ~                         verbose=False, committer=log_creator,
> ~                         timestamp=timestamp, timezone=0, rev_id=rev_id)
> 
> |
> |         if isinstance(message, str):
> |             message = message.decode(bzrlib.user_encoding)
> 
> This is bogus.  If Commit.commit gets a bytestring, it should treat it
> as ascii-- there's no defined encoding.  Assuming that this bytestring
> is in the user encoding is not right.  This should be done in
> cmd_commit, where we know that the bytestring came from the user, and
> therefore the user's encoding applies.

I agree. This goes back to the encoding issues. Code inside bzrlib
should not be encoding/decoding. That should only be done at the
interface layer. (So reading/writing to a file, handling command line
arguments, etc).
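
To make that split concrete, here is a rough sketch of what I mean. It is
not the real bzrlib code: the function names are made up for illustration,
and it is written in Python 3 terms, where str is the unicode object and
bytes is the plain bytestring discussed above.

```python
import locale


def decode_user_input(raw, encoding=None):
    """Interface-layer helper (hypothetical): turn bytes from the user into text."""
    if encoding is None:
        # Assumption: the user's preferred locale encoding stands in for
        # what the thread calls the user encoding.
        encoding = locale.getpreferredencoding(False)
    return raw.decode(encoding)


def commit(message):
    """Library-level stand-in: accepts text only and never guesses an encoding."""
    if not isinstance(message, str):
        raise TypeError("commit() expects text; decode at the interface layer")
    return message


def cmd_commit(raw_message, user_encoding=None):
    """Command-level stand-in: the one place where we know the bytes came from the user."""
    return commit(decode_user_input(raw_message, user_encoding))
```

With this shape, a caller such as an importer has to decide the encoding
itself before calling commit(), instead of having the library guess.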

> 
> | http://muse.19inch.net/~daf/bzr/bzr/devel/
> 
> I don't think this is right.  The testament should be built assuming its
> contents are unicode, or else all fields should be automatically
> converted to utf-8.  No conditionals.
> 
> Aaron

I think everything should be converted to utf-8, since we are computing
the sha sum, which requires a byte stream. And since ascii is a subset
of utf-8, we only need to translate unicode strings, and possibly just
assert that plain strings are truly just ascii.
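
Something along these lines, as a sketch only. This is not the actual
testament code; to_utf8 and sha1_of_fields are hypothetical names, the
newline-separated field layout is an assumption for illustration, and it is
written in Python 3 terms (str is the unicode object, bytes the plain
string).

```python
import hashlib


def to_utf8(value):
    """Return value as utf-8 bytes, insisting that byte strings are pure ascii."""
    if isinstance(value, str):
        return value.encode("utf-8")
    try:
        # ascii is a subset of utf-8, so a pure-ascii byte string can be
        # used unchanged once we have checked it.
        value.decode("ascii")
    except UnicodeDecodeError:
        raise ValueError("byte string is not pure ascii") from None
    return value


def sha1_of_fields(fields):
    """Hash the fields as utf-8 bytes, one per line (layout is illustrative)."""
    digest = hashlib.sha1()
    for field in fields:
        digest.update(to_utf8(field))
        digest.update(b"\n")
    return digest.hexdigest()
```

The point is that the same text gives the same sha whether it arrived as a
unicode string or as an ascii byte string, and anything else fails loudly
instead of being hashed in an unknown encoding.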

John
=:->


