[MERGE] test and fix for commit with bad non-ascii messages (non-ascii chars that cannot be decoded with current user encoding)

Martin Pool mbp at canonical.com
Thu Oct 4 05:44:11 BST 2007


Martin Pool has voted tweak.
Status is now: Conditionally approved
Comment:

+def probe_bad_non_ascii_in_user_encoding():
+    """Try to find [bad] character with code [128..255]
+    that cannot be decoded to unicode in user_encoding.
+    Return None if all non-ascii characters is valid
+    for current user_encoding.
+    """
+    for i in xrange(128, 256):
+        char = chr(i)
+        try:
+            char.decode(bzrlib.user_encoding)
+        except UnicodeDecodeError:
+            return char
+    return None

I wonder if you should change this to take the encoding as a parameter:

   probe_bad_non_ascii(bzrlib.user_encoding)

Then people may be able to use it in other places, and it's not much
longer to call.
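
Something along these lines, perhaps. This is only a sketch, written
for Python 3, so it uses bytes([i]) where the patch uses chr(i); the
name probe_bad_non_ascii is just the one suggested above, not an
existing bzrlib function.

```python
def probe_bad_non_ascii(encoding):
    """Return a single byte in the range 128..255 that cannot be
    decoded to unicode in `encoding`, or None if every such byte
    is valid for that encoding.
    """
    for i in range(128, 256):
        char = bytes([i])  # Python 3 equivalent of chr(i) in the patch
        try:
            char.decode(encoding)
        except UnicodeDecodeError:
            return char
    return None
```

The caller passes bzrlib.user_encoding explicitly, so the same helper
can be used with any other encoding name.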

+            'BZR_EDITOR': None, # test_msgeditor manipulate with this variable

That should be just 'manipulates', not 'manipulate with'.


+        # LANG env variable has no effect on Windows
+        # but some characters anyway cannot be represented
+        # in default user encoding
+        char = '\xff'
+        if sys.platform == 'win32':
+            char = probe_bad_non_ascii_in_user_encoding()
+            if char is None:
+                raise TestSkipped('cannot find suitable character')
+        out,err = self.run_bzr_subprocess('commit -m "%s"' % char,
+                                          retcode=1,
+                                          env_changes={'LANG': 'C'})

I think you should call that method on all platforms, to reduce
divergence.
Also, maybe the TestSkipped message should say "in $user_encoding".
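
Roughly what I have in mind, as a sketch only: the helper name
bad_char_or_skip is hypothetical, unittest.SkipTest stands in for
bzrlib's TestSkipped, and the code is written for Python 3 (so the
probe returns a bytes object rather than a str).

```python
import unittest


def probe_bad_non_ascii(encoding):
    """Return a byte in 128..255 that `encoding` cannot decode, else None."""
    for i in range(128, 256):
        char = bytes([i])
        try:
            char.decode(encoding)
        except UnicodeDecodeError:
            return char
    return None


def bad_char_or_skip(encoding):
    """Probe on every platform; skip the test, naming the encoding,
    if no undecodable character exists.
    """
    char = probe_bad_non_ascii(encoding)
    if char is None:
        # unittest.SkipTest is an assumption standing in for TestSkipped
        raise unittest.SkipTest(
            'cannot find suitable non-ascii character in %s' % encoding)
    return char
```

The test would then call bad_char_or_skip(bzrlib.user_encoding) without
the sys.platform check and pass the result to the commit command.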

Otherwise looks good, thank you.


For details, see: 
http://bundlebuggy.aaronbentley.com/request/%3C46F3AA12.2050606%40ukr.net%3E


