version-info --include-history UnicodeDecodeError (518609)

Fri Apr 9 06:32:39 BST 2010

Robert Collins writes:

 > As already said in this thread, RIO isn't 'text', its a defined series
 > of bytes.

I know.  So what about RIO?  We are talking about the version-info
command, are we not?  Its help says

  You can use this command to add information about version into
  source code of an application. The output can be in one of the
  supported formats or in a custom format based on a template.

That's text, OK?  The source language gets to choose, NOT bzr.  E.g., if
it's Python source, it's ASCII by default in 2.x (since PEP 263) and
UTF-8 in Python 3, but it can be anything supported by Python (see PEP
263).  If bzr is outputting RIO from that command, that's a bug.

Of course respecting the top-level locale is badly broken if the
project uses PEP 263 coding cookies.  But I don't see what else you
can do.
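
To make the mismatch concrete, here is a minimal sketch (Python 3
standard library only) of how a PEP 263 cookie, not the locale, decides
which bytes a generated string must be written as.  The helper names
source_encoding and encode_for_source are hypothetical, and the sketch
assumes the caller already has the bytes of the target source file,
which version-info itself does not know.  It is not what bzr does.

```python
import io
import tokenize


def source_encoding(source_bytes):
    """Return the encoding declared by a PEP 263 cookie (or BOM).

    Falls back to Python 3's default source encoding when the file
    declares nothing.
    """
    readline = io.BytesIO(source_bytes).readline
    encoding, _consumed_lines = tokenize.detect_encoding(readline)
    return encoding


def encode_for_source(text, source_bytes):
    """Encode generated text the way the target source file expects.

    Hypothetical helper: the locale is deliberately not consulted.
    """
    return text.encode(source_encoding(source_bytes))


# A file with a coding cookie, and one without.
cookie_source = b"# -*- coding: latin-1 -*-\nauthor = ''\n"
plain_source = b"author = ''\n"

latin1_bytes = encode_for_source("\u00e9", cookie_source)
utf8_bytes = encode_for_source("\u00e9", plain_source)
```

The same character comes out as different bytes for the two files,
whatever the user's locale happens to be.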

 > > stdout and stderr are nondeterministic, but the heuristic
 > > "convert to top-level locale encoding if a TTY, else to UTF-8"
 > > should work well in practice.  (Except maybe on Windows, where
 > > the widechar Unicode API might be more appropriate for non-TTYs.)
 > 
 > That is the heuristic I was talking about.

Yes.  I just wanted to emphasize that Windows may prefer a Unicode
encoding other than UTF-8 (UTF-16, which the widechar API uses).
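
For reference, the heuristic quoted above can be sketched roughly as
follows in Python 3.  The function names choose_output_encoding and
write_text are hypothetical, and this is only an illustration of the
rule "locale encoding if a TTY, else UTF-8", not bzr's code.  The
Windows widechar case is not handled at all.

```python
import io
import locale


def choose_output_encoding(stream, fallback="utf-8"):
    """Pick an encoding for text written to stream.

    TTY: the locale's preferred encoding.  Anything else (pipe, file):
    UTF-8.
    """
    isatty = getattr(stream, "isatty", None)
    if isatty is not None and isatty():
        return locale.getpreferredencoding(False) or fallback
    return fallback


def write_text(stream, text):
    """Encode text for stream and write the resulting bytes."""
    data = text.encode(choose_output_encoding(stream), errors="replace")
    # Text streams such as sys.stdout expose the byte layer as .buffer;
    # plain byte streams are written to directly.
    byte_stream = getattr(stream, "buffer", stream)
    byte_stream.write(data)


# A BytesIO stands in for a pipe or file here: it is not a TTY.
sink = io.BytesIO()
write_text(sink, "caf\u00e9\n")
```

Using errors="replace" is an assumption made for the sketch, so that a
character missing from the terminal's encoding degrades to "?" rather
than raising.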