[Bug 218741] Re: scp cuts UTF8 filenames by bytes instead of characters

Colin Watson cjwatson at canonical.com
Sun Jun 9 21:46:50 UTC 2019


I did some experimentation today with different "stty cols" settings,
and I'm pretty sure this is fixed.  I believe that the commit that fixed
it was probably
https://anongit.mindrot.org/openssh.git/commit/?id=0e059cdf5fd86297546c63fa8607c24059118832
(note in particular "take character display widths into account for the
progressmeter"), in which case this has been fixed since OpenSSH 7.3p1,
so since Ubuntu 16.10.
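
For illustration only, here is a rough Python sketch of what "taking
character display widths into account" can look like. It is not the
OpenSSH code; the names display_width and fit_to_columns are made up for
this example, and the width rule (East Asian wide/fullwidth = 2 columns,
combining marks = 0, everything else = 1) is a simplifying assumption.

```python
import unicodedata


def display_width(ch: str) -> int:
    """Approximate number of terminal columns used by one character."""
    if unicodedata.combining(ch):
        return 0  # combining marks are drawn on top of the previous character
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2  # wide and fullwidth characters take two columns
    return 1


def fit_to_columns(text: str, columns: int) -> str:
    """Cut text at a character boundary so it fits in `columns`, then pad."""
    kept = []
    used = 0
    for ch in text:
        width = display_width(ch)
        if used + width > columns:
            break
        kept.append(ch)
        used += width
    # Pad with spaces so that the columns following the name stay aligned.
    return "".join(kept) + " " * (columns - used)


print(repr(fit_to_columns("S\u0103teasca", 5)))
print(repr(fit_to_columns("\u65e5\u672c\u8a9e", 5)))
```

Because the cut happens between whole characters and the padding is
computed from columns rather than bytes, the name is never split inside
a multi-byte sequence and the trailing fields line up.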

** Changed in: openssh (Ubuntu)
       Status: Triaged => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to openssh in Ubuntu.
https://bugs.launchpad.net/bugs/218741

Title:
  scp cuts UTF8 filenames by bytes instead of characters

Status in openssh package in Ubuntu:
  Fix Released

Bug description:
  Binary package hint: openssh-client

  This is for up-to-date Ubuntu Hardy.

  I left some files for copying today using scp (from the openssh-client package). I happened to look at the output and noticed some “bad character” symbols on the terminal, as pasted below. (These were copy-pasted from the console, in a completely UTF8-based environment. Note that the weird characters may confuse your browser; make sure it has detected the correct encoding.)
      
  21 Lied pentru voce și pian „Regen”.flac 100% 9938KB   1.9MB/s   00:05    
  03 Lieduri pentru tenor și pian, op. 15 „S 100% 2524KB   2.5MB/s   00:01 
  18 Lied pentru voce și pian „Frauenberuf�� 100%   11MB   1.3MB/s   00:09    
  06 Lieduri pentru tenor și pian, op. 15 „S 100% 8961KB   2.2MB/s   00:04    
  09 Lieduri pentru bas și pian, op. 4 „Troi 100%   11MB   1.4MB/s   00:08   
  [after resizing the window]
  10 Suita nr. 3 pentru orchestră, op. 27 „Săteasca”_ „Pârâu sub lun� 100%   13MB   2.6MB/s   00:05    

  As it happens, the next character on each filename of the two weird
  lines was, respectively, ” and ă, both of which are of course
  displayed correctly in other places on the same output. Based on this
  and the misalignment of the last columns, I think scp cuts too-long
  names by counting bytes rather than characters. This is obviously
  wrong in UTF8, since some characters can contain several bytes, in
  which case the lines would be cut too early, and occasionally in the
  “middle” of a character, thus displaying garbage.
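
The effect described above can be reproduced with a small Python sketch.
This is only an illustration of byte-based versus character-based
truncation, not scp's actual code; the helper names and the 45-byte
limit are assumptions chosen so that the cut lands inside the closing
quotation mark of one of the file names shown above.

```python
# "18 Lied pentru voce și pian „Frauenberuf”.flac", written with escapes
# so that the byte layout is unambiguous.
name = "18 Lied pentru voce \u0219i pian \u201eFrauenberuf\u201d.flac"


def cut_by_bytes(text: str, limit: int) -> str:
    """Truncate the UTF-8 encoding to `limit` bytes, as a byte-counting tool would."""
    raw = text.encode("utf-8")[:limit]
    # A terminal shows a replacement symbol for an incomplete sequence;
    # errors="replace" mimics that when decoding.
    return raw.decode("utf-8", errors="replace")


def cut_by_chars(text: str, limit: int) -> str:
    """Truncate after `limit` characters (code points)."""
    return text[:limit]


byte_cut = cut_by_bytes(name, 45)
char_cut = cut_by_chars(name, 45)

print(len(name), "characters,", len(name.encode("utf-8")), "bytes")
print(byte_cut)
print(char_cut)
```

The byte-based cut stops two bytes into the three-byte closing quotation
mark, so the name ends in a replacement symbol and is shorter on screen
than intended. The character-based cut keeps every character whole.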

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/openssh/+bug/218741/+subscriptions


