[luatex] [lltx] [tex-live] Location of recorder file

Reinhard Kotucha reinhard.kotucha at web.de
Sat May 14 01:42:33 CEST 2011


On 2011-05-13 at 13:41:32 +0200, Philipp Stephani wrote:

 > Am 13.05.2011 um 13:15 schrieb Heiko Oberdiek:
 > 
 > > I think, the file name interfaces should be transparent
 > > in the sense, that all characters are supported and
 > > the user is only hit by the restrictions of the operating
 > > system or the file system, but not by artificial restrictions
 > > by the software in between.
 > 
 > I agree, but this will never happen: try to use a Unicode file name
 > on Windows in any engine ... even though they are perfectly legal
 > as far as the operating system is concerned, no TeX processor ever
 > will accept them because fopen doesn't accept Unicode file names on
 > Windows. Either all engines switch to a nonstandard C runtime that
 > interprets file names as UTF-8, or all engines are rewritten to
 > avoid fopen on Windows. Both are extremely unlikely to ever happen.

I must admit that I don't understand.  First of all, when talking
about character encodings, I don't know what "the operating system"
means.  AFAIK, filenames are stored as UTF-8 in NTFS (don't know
whether FAT supports UTF-8).  

My question is how and where this is implemented.

The user interfaces use different encodings: on a German Windows, the
Exploder uses CP1252 and cmd.exe uses CP850.  I would expect that they
translate filenames to UTF-8 internally.
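
To make the mismatch concrete, here is a small Python sketch.  The file
name "Übung.tex" and the helper encoded_forms() are made up for
illustration only; the point is merely that the same name is a
different byte sequence under each of the encodings mentioned above,
so some translation has to happen somewhere.

```python
# Hypothetical file name with one non-ASCII character (illustration only).
name = "Übung.tex"


def encoded_forms(text):
    """Return the bytes of `text` under the encodings discussed above."""
    return {enc: text.encode(enc) for enc in ("cp1252", "cp850", "utf-8")}


forms = encoded_forms(name)
for enc, raw in forms.items():
    # Show each encoding's bytes as space-separated hex.
    print(f"{enc:8} {raw.hex(' ')}")
```

Only the first character differs between the three lines of output;
the ASCII part of the name is identical in all of them.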

When you say

 > even though they [Unicode file names] are perfectly legal
 > as far as the operating system is concerned [...]

I suppose that you have the C API in mind, and I suppose that the
fopen() you mention is that from MSVCRT.

Which character encoding does fopen() expect?

Does the Exploder use fopen() from MSVCRT?  I ask because I've seen so
many differences between the Exploder and cmd.exe, especially
regarding file permissions and UNC paths.

Is it possible to open a file and avoid MSVCRT?  If yes, with which
versions of Windows is it compatible?

I ask because you said

 > Either all engines switch to a nonstandard C runtime that
 > interprets file names as UTF-8, [...]

and I'm wondering whether it's *our* mistake to rely on MSVCRT (which
actually supports MS-DOG only), even though current versions of
Windows provide system calls which support UTF-8 natively.  What do
you mean by "nonstandard"?  Not shipped with recent versions of
Windows or has to be installed explicitly?

There is a system call execvpe() in MSVCRT, but some people mentioned
CreateProcess().  Where does the latter come from?  Obviously not from
MSVCRT.  

Is there another runtime lib beyond MSVCRT?  If yes, is it still
appropriate to rely on the old stuff?

Regards,
  Reinhard

-- 
----------------------------------------------------------------------------
Reinhard Kotucha                                      Phone: +49-511-3373112
Marschnerstr. 25
D-30167 Hannover                              mailto:reinhard.kotucha at web.de
----------------------------------------------------------------------------
Microsoft isn't the answer. Microsoft is the question, and the answer is NO.
----------------------------------------------------------------------------

