# [tex-live] Problems with non-7bit characters in filename

Reinhard Kotucha reinhard.kotucha at web.de
Sun Jul 6 02:50:12 CEST 2014

On 2014-07-05 at 13:31:58 +0200, Ulrike Fischer wrote:

> Regardless if I'm in the command line, in a command line with
> chcp 65001, in a msys bash (with \\ there):  miktex works, texlive
> not.

Does MiKTeX allow you to create a file (\openout) whose name contains
characters which are not supported by the 8-bit codepage currently
in use, e.g. '日本語.foo'?  If this works then MiKTeX accesses the
filesystem directly and doesn't consult any codepages.
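
The same kind of test can be sketched outside of TeX.  The Python
snippet below is only an illustration and says nothing about MiKTeX
or TeX Live themselves; the helper name can_create is made up for
this example.  It tries to create a file with the given name in a
temporary directory and reports whether the name shows up in the
directory listing afterwards.  As far as I know, Python 3 passes
filenames to the wide-character API on Windows, so the result should
not depend on the active codepage there, but please treat that as an
assumption.

```python
import os
import tempfile


def can_create(filename):
    """Return True if a file with this name can be created and listed again."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, filename)
        try:
            # Creating the file is the critical step, the content is irrelevant.
            with open(path, "w", encoding="utf-8") as handle:
                handle.write("test\n")
        except (OSError, UnicodeError):
            # The name could not be passed to, or was rejected by, the filesystem.
            return False
        # Check that the name survived unchanged.
        return filename in os.listdir(tmpdir)
```

Calling can_create("日本語.foo") on the system in question shows
whether such a name can be created at all, independently of TeX.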

I tried chcp 65001 some time ago on Windows 7 but it had no effect.
Subversion stores comments in UTF-8.  So when I ran

  svn log

on the command line, non-ASCII characters were not displayed properly.
I expected that chcp 65001 would solve the problem but it didn't.
Does anybody know more?

> > Problems occur when user interfaces are involved which are not
> > aware of Unicode.  Sure, if you know which encoding is used you
> > can convert any filenames to Unicode.  But what about \openout?
> > You can convert any 8-bit character encoding to Unicode but not
> > vice versa.
> >
> > What do you expect to happen if you create a file
> >
> >   "Äöü-Русский язык-日本語"
>
> You don't need to convince me of the benefits of utf8. I would be
> happy if there was a button to switch to it in windows.

Sure, but what I meant is that you cannot convert arbitrary Unicode
text to ISO-8859.  This is not a problem when reading a file.  The
problem occurs when you create a file.  Hence it's not sufficient to
consider reading only.
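
A minimal Python sketch of this asymmetry, using the file name from
above.  The helper name encodable is made up for this example, and
ISO-8859-1 and ISO-8859-5 are only picked as two representative
8-bit encodings.  Every byte string in ISO-8859-1 can be decoded to
Unicode and encoded back unchanged, but the mixed-script name cannot
be encoded in either 8-bit encoding.

```python
# File name mixing Latin, Cyrillic and Japanese characters.
name = "Äöü-Русский язык-日本語"


def encodable(text, codec):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# 8-bit -> Unicode: all 256 byte values of ISO-8859-1 decode without error
# and can be converted back to the identical bytes.
latin1_bytes = bytes(range(256))
as_unicode = latin1_bytes.decode("iso-8859-1")
round_trip = as_unicode.encode("iso-8859-1") == latin1_bytes

# Unicode -> 8-bit: works only if all characters happen to be in the codec.
print(round_trip)                       # lossless round trip
print(encodable("Äöü", "iso-8859-1"))   # Latin-1 covers the umlauts
print(encodable(name, "iso-8859-1"))    # no Cyrillic or Japanese in Latin-1
print(encodable(name, "iso-8859-5"))    # Cyrillic is there, umlauts are not
print(encodable(name, "utf-8"))         # UTF-8 covers everything
```

So a program which has to map the name to an 8-bit codepage before
creating the file has no correct choice for a name like this one.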

BTW, what I prefer is to be able to use UTF-8 on Windows *without* the
need to press a button.  There is no such button on Linux and I don't
miss it.  It's nice that everything works out-of-the-box.

> > I also don't hesitate to use non-ASCII characters in file names.
>
> Also in file names that you want to \input in a pdflatex file? I
> doubt this ;-).

Well, I'm using pdftex for existing files only or for testing TeX
macros which don't require Lua or OpenType fonts.  (pdftex is a bit
faster).

> I don't hesitate to use non-ASCII in files that are handled only
> by me and one application. As soon as such files should go to other
> persons and be processed by more than one application (e.g. latex +
> biber) I avoid it.

Yes, when I give files away I'm careful too, especially because I
don't know anything about the recipient's environment.  And I expect
problems on Windows.  But I don't worry about names of files which I
download from the internet.  It's simply too much work to rename them
all.

Regards,
Reinhard

--
------------------------------------------------------------------
Reinhard Kotucha                            Phone: +49-511-3373112
Marschnerstr. 25
D-30167 Hannover                    mailto:reinhard.kotucha at web.de
------------------------------------------------------------------