[MERGE] Readable and properly encoded diff headers

Adeodato Simó dato at net.com.org.es
Tue Aug 15 16:55:53 BST 2006


* Aaron Bentley [Tue, 15 Aug 2006 10:31:40 -0400]:

> Adeodato Simó wrote:
> > The attached patch changes %r to '%s' in the headers, and also encodes
> > the filenames in the proper encoding, not hardcoded 'utf8'.

> I don't think we should assume that the destination is a terminal. 

Well, I was initially using bzrlib.user_encoding, and John recommended
terminal_encoding() instead. In any case, this is not about the output
being a terminal or not, but about having the headers of diff's output
in the proper encoding.

> For example, I believe this functionality is invoked by bzr-gtk's
> gdiff, and so changing it to something other than utf-8 might be a
> regression.

Or the opposite. Do bzr-gtk or gdiff work with non-ASCII filenames in
non-UTF-8 locales? I'd assume they don't, just as current bzr doesn't.

And as for being a regression, current show_diff_trees() always returns
headers in *ASCII*, never in 8bit (because of %r). So between always
shoving 8bit down applications in UTF-8 and shoving it in the user's
encoding, I think the latter is preferable, unless somebody can explain
to me how this would break stuff.
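
To make the difference concrete, here is a rough sketch. It is written
for present-day Python 3, not taken from bzrlib; both function names and
the header layout are made up for illustration, and ascii() stands in
for what %r did to unicode objects at the time.

```python
def header_escaped(path):
    # Hypothetical helper: ascii() backslash-escapes every non-ASCII
    # character, so the resulting header is always pure ASCII.
    return ("=== modified file %s\n" % ascii(path)).encode("ascii")


def header_encoded(path, encoding):
    # Hypothetical helper: quote the path with '%s' and encode the header
    # in the encoding the caller asks for. Characters the encoding cannot
    # represent are replaced rather than raising an error.
    return ("=== modified file '%s'\n" % path).encode(encoding, "replace")


path = "Sim\u00f3.txt"
print(header_escaped(path))             # escaped, ASCII only
print(header_encoded(path, "latin-1"))  # one 8bit byte for the accent
print(header_encoded(path, "utf-8"))    # two bytes for the accent
```

The first form is readable only to somebody who knows the escape; the
other two differ only in which encoding the reader has to be using.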

* John Arbash Meinel [Tue, 15 Aug 2006 09:37:41 -0500]:

> I think what we need is for 'show_diff_trees' to take the path_encoding
> parameter, and pass it down the line. Rather than have it detected in
> _show_diff_trees.

> And then 'cmd_diff' can set path_encoding = osutils.terminal_encoding().

I don't fancy this much, having the encoding travelling all over the
place.

> 'self.outf' for cmd_diff is set up *without* encoding, because we don't
> want to silently transcode the contents of the file. What I would really
> like is to have a 'do not accept unicode' file-like object when
> encoding_type='exact'.

> The to_file should definitely be self.outf. And it would be possible for
> self.outf to have an encoding wrapper, and to write out the paths in
> unicode. But I feel it leaves us open to a bug when writing something
> else as part of 'diff'.

Wouldn't it be possible to have a file-like object that does: "if I
receive unicode, I encode it to $user_encoding; if I get strings, I pass
them unmodified"? Would it make sense?
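
Something along these lines is what I have in mind. This is only a
sketch in present-day Python 3 terms (str where we say unicode, bytes
where we say strings); the class name is invented, and nothing like it
exists in bzrlib as far as I know.

```python
import io


class EncodingPassthroughWriter:
    """Hypothetical wrapper around a binary stream.

    Text is encoded to the given encoding before being written;
    byte strings are written through unmodified.
    """

    def __init__(self, stream, encoding):
        self._stream = stream
        self._encoding = encoding

    def write(self, data):
        if isinstance(data, str):
            # Text (e.g. a path in a diff header): encode it.
            data = data.encode(self._encoding, "replace")
        # Bytes (e.g. file contents) reach the stream untouched.
        self._stream.write(data)


buf = io.BytesIO()
out = EncodingPassthroughWriter(buf, "latin-1")
out.write("=== modified file 'Sim\u00f3.txt'\n")  # text, gets encoded
out.write(b"\xc3\xb3 raw file contents\n")        # bytes, passed as-is
```

The file contents are never transcoded this way, which I understand is
the main worry with putting an encoding wrapper on self.outf.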

Cheers,

-- 
Adeodato Simó                                     dato at net.com.org.es
Debian Developer                                  adeodato at debian.org
 
                                 Listening to: Jacques Brel - Les Biches




