[Bug 1947451] [NEW] unzip, manpage doesn't describe -I -O switches
psl
1947451 at bugs.launchpad.net
Sat Oct 16 12:41:20 UTC 2021
Public bug reported:
unzip 6.0-21ubuntu1.1
Linux Mint 19.3 (Ubuntu 18.04)
ZIP files don't store information about the character set used to encode
filenames in the archive. If a user tries to extract a ZIP file created
on Windows with unzip on Linux, filenames can be corrupted if they
contain characters from the extended ASCII table...
unzip can handle this: it has the switches "-I" and "-O" to specify the
encoding of filenames in the archive.
$ unzip -h | grep CHARSET
-O CHARSET specify a character encoding for DOS, Windows and OS/2 archives
-I CHARSET specify a character encoding for UNIX and other archives
Information about these switches is missing from the manual page (man unzip).
This is how to list the files in a ZIP file created on a Czech edition of
Windows when working on Ubuntu:
$ unzip -l -O CP852 archive-win.zip
File archive-win.zip is in format 2.0
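To illustrate why the character set has to be given explicitly, here is a minimal Python sketch of the underlying effect. It is not how unzip is implemented; the sample filename and the choice of CP437 as the "wrong guess" are assumptions made only for this illustration. The same bytes decode without any error under both code pages, so nothing in the data itself reveals which one is right.

```python
# Illustrative filename (an assumption, not taken from the bug report).
name = "žluťoučký kůň.txt"

# Bytes as a tool using the CP852 code page would store them.
raw = name.encode("cp852")

# Wrong guess: CP437 maps every byte to some character, so decoding
# succeeds silently but produces a different, garbled name.
garbled = raw.decode("cp437")

# Right guess: decoding with CP852 restores the original name.
restored = raw.decode("cp852")

print("garbled differs from original:", garbled != name)
print("restored equals original:", restored == name)
```

This is what "-O CP852" supplies in the listing above: the one piece of information that the archive does not carry.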
Another problem is that it is not possible to instruct zip to create a
ZIP archive with a specific encoding, so on Linux I cannot create a ZIP
file whose file names are encoded in the CP852 code page; such an archive
could be opened on Czech Windows without issue...
A further problem is that it is not possible to create a ZIP file in an
older format, like 2.0 (zip creates archives in format 3.0). It seems this
could be a problem for Windows users because such a file uses UTF-8
characters. I am not sure what the problem is here, but there is a
problem... I need a computer with Windows to check the details.
** Affects: unzip (Ubuntu)
Importance: Undecided
Status: New
https://bugs.launchpad.net/bugs/1947451