[tex-live] Problems with non-7bit characters in filename

Ulrike Fischer news3 at nililand.de
Sat Jul 5 13:31:58 CEST 2014

On Sat, 5 Jul 2014 08:24:06 +0200, Reinhard Kotucha wrote:
>  > >> But the main question is why lualatex and xelatex in TeXLive can't
>  > >> handle (probably only non-utf8) file names with non-ascii chars *on
>  > >> the terminal*. I can reproduce his problem on Win7:
>  > >>
>  > > The program has to use a system call to find the filesystem encoding
>  > > and convert the filename from the filesystem encoding to the program's
>  > > internal encoding or vice versa. I am not sure whether it can be done
>  > > in lua but definitely not on macro level.
>  >
>  > Sure. But we are not on the macro level here but on the "system
>  > call" level. Why can't luatex in TeXlive handle the system call
>  > for file names correctly?
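
To illustrate the conversion step discussed above, here is a minimal Python sketch. It is not what LuaTeX or XeTeX do internally; it only shows the idea of asking the runtime for the filesystem encoding and converting a raw byte filename to the program's internal Unicode form and back. The helper names `to_internal` and `to_filesystem` and the sample byte strings are assumptions made up for this example.

```python
import sys

def to_internal(raw_name: bytes, fs_encoding: str) -> str:
    """Decode a raw filename from the given filesystem encoding to Unicode."""
    return raw_name.decode(fs_encoding)

def to_filesystem(name: str, fs_encoding: str) -> bytes:
    """Encode a Unicode filename back to the given filesystem encoding."""
    return name.encode(fs_encoding)

# The encoding Python itself uses for filenames on this system.
print("filesystem encoding:", sys.getfilesystemencoding())

# The same name (äöü.tex) arrives as different bytes depending on the encoding.
from_cp1252 = to_internal(b"\xe4\xf6\xfc.tex", "cp1252")
from_utf8 = to_internal(b"\xc3\xa4\xc3\xb6\xc3\xbc.tex", "utf-8")
print(from_cp1252 == from_utf8)
```

Once both byte strings are decoded with the correct encoding, they compare equal inside the program, which is the point of the conversion.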

> On Unix everything works as expected, as Markus already confirmed.

Well, Klaus is on Unix ... so it works on Unix only with some
settings.

> The problem is the user interface.  A German Windows is using CP1252
> and, even worse, CP850 on the command line.

Sure, but MiKTeX handles this interface without problems.
I tried this test in various combinations:
lualatex \def\test{äöü}\input{test-utf8}
Regardless of whether I'm in the plain command line, in a command
line with chcp 65001, or in an msys bash (with \\ there): MiKTeX
works, TeX Live does not.
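
The code page mismatch mentioned above (CP1252 for the GUI, CP850 on the console, UTF-8 after chcp 65001) can be illustrated with a short Python sketch. It only shows how the same three characters map to different bytes; it says nothing about how MiKTeX or TeX Live handle them. The variable names are made up for this example.

```python
# The three characters from the test command, written with escapes
# so that the example does not depend on the source file encoding.
text = "\u00e4\u00f6\u00fc"  # äöü

as_cp1252 = text.encode("cp1252")
as_cp850 = text.encode("cp850")
as_utf8 = text.encode("utf-8")

print(as_cp1252.hex())  # e4f6fc
print(as_cp850.hex())   # 849481
print(as_utf8.hex())    # c3a4c3b6c3bc

# Bytes written as CP1252 but read as CP850 no longer give the same text.
misread = as_cp1252.decode("cp850")
print(misread == text)  # False
```

A program that receives these bytes on its command line therefore has to know which code page produced them before it can look up the file.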

> Problems occur when user interfaces are involved which are not aware
> of Unicode.  Sure, if you know which encoding is used you can convert
> any filenames to Unicode.  But what about \openout?  You can convert any
> 8-bit character encoding to Unicode but not vice versa.
>
> What do you expect to happen if you create a file
>
>   "Äöü-Русский язык-日本語"

You don't need to convince me of the benefits of UTF-8. I would be
happy if there was a button to switch to it in Windows.
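
The asymmetry Reinhard describes (any 8-bit encoding can be converted to Unicode, but not the other way round) can be checked with a small Python sketch. The helper `can_encode` is a hypothetical name introduced for this example, and the filename is the one from the quoted text.

```python
name = "Äöü-Русский язык-日本語"

def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

# Neither Windows code page can represent the mixed-script name; UTF-8 can.
for enc in ("cp1252", "cp850", "utf-8"):
    print(enc, can_encode(name, enc))

# The other direction is lossless: every CP850 byte maps to a Unicode
# character and back again.
all_bytes = bytes(range(256))
print(all_bytes.decode("cp850").encode("cp850") == all_bytes)
```

This is why an \openout with such a name cannot be mapped to a single 8-bit code page, whatever conversion is applied.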

> I also don't hesitate to use non-ASCII characters in file names.

Also in file names that you want to \input in a pdflatex file? I
doubt this ;-).

I don't hesitate to use non-ASCII characters in names of files that
are handled only by me and one application. As soon as such files go
to other people and are processed by more than one application
(e.g. latex + biber), I avoid it.

--
Ulrike Fischer
http://www.troubleshooting-tex.de/