[luatex] Accented characters on Windows / lfs

Reinhard Kotucha reinhard.kotucha at web.de
Fri Sep 28 21:25:59 CEST 2018


On 2018-09-28 at 14:17:20 -0400, maxwell wrote:

 > On 2018-09-28 07:07, Harald Hanche-Olsen wrote:
 > > From: Hans Hagen <j.hagen at xs4all.nl>
 > > Date: 28 September 2018 at 12:07:03
 > > 
 > > afaik windows has no utf filenames, so when i save a file with that
 > > name i get
 > > 
 > > cöw.txt 
 > > 
 > > (internally i think names become unicode16 and display depends on
 > > the  code page)  ...  (But I am not a windows user myself, nor do
 > > I know much about windows, so I have nothing to contribute other
 > > than this reference. Sorry if it is off the mark or irrelevant.)
 > 
 > I think this is fundamentally correct, but just in case: Windows
 > supports Unicode UTF-16 in file names in NTFS-based file systems
 > (but not in the earlier FATxx file systems).  NTFS was introduced
 > in Windows NT in 1993, and became a part of consumer-based Windows
 > systems with Windows 2000: https://en.wikipedia.org/wiki/NTFS If
 > you're getting weird characters (like in the line quoted above),
 > it's likely that you're viewing them in a non-UTF16 application.
 > So yes, in such applications the display depends on the code
 > page--although code pages themselves are largely deprecated in
 > modern versions of Windows, in favor of Unicode:
 >      
 > https://en.wikipedia.org/wiki/Windows_code_page#Problems_arising_from_the_use_of_code_pages

It's not sufficient to declare code pages deprecated as long as they
are unavoidable.  The default code page of the CLI is CP850 in Western
Europe.  According to Phil Taylor it's possible to switch to UTF-8
with

  chcp 65001

but this only works if the font used in the terminal window is "Lucida
Console".  I can't imagine why it depends on a particular font, but I
tried it and it does work.

The font change is permanent.  When you start a new terminal window
you get the same font until you change it again.

In order to make CP65001 the default you have to edit the registry.
See

  https://superuser.com/questions/269818/change-default-code-page-of-windows-console-to-utf-8

Maybe a few problems can be avoided if the CLI is configured to use
UTF-8 by default.  Admittedly, many problems can't be resolved this way.
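
To illustrate why the code page matters, here is a minimal Python
sketch.  It is not from this thread, and the helper name show_as is
made up for illustration.  It assumes the garbled name quoted above
came from UTF-8 bytes being interpreted as CP1252, and it shows how
the same bytes decode under CP850 and UTF-8 as well.

```python
# Sketch: the same UTF-8 bytes, decoded under different code pages.
# Assumption: the file name was written as UTF-8 and then displayed
# by a program that interpreted the bytes in a legacy code page.

name = "c\u00f6w.txt"        # "cöw.txt", o with diaeresis is U+00F6
raw = name.encode("utf-8")   # the o-umlaut becomes the two bytes C3 B6


def show_as(data: bytes, codepage: str) -> str:
    """Decode raw bytes as a console using `codepage` would (hypothetical helper)."""
    return data.decode(codepage, errors="replace")


if __name__ == "__main__":
    # utf-8 corresponds to "chcp 65001", cp850 to the Western European
    # CLI default, cp1252 to the Western European GUI ("ANSI") code page.
    for cp in ("utf-8", "cp1252", "cp850"):
        print(f"{cp:>7}: {show_as(raw, cp)}")
```

Only the utf-8 line reproduces the original name.  The cp1252 line
gives the "cöw.txt" form quoted above, and cp850 gives yet another
garbled variant.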

Regards,
  Reinhard

-- 
------------------------------------------------------------------
Reinhard Kotucha                            Phone: +49-511-3373112
Marschnerstr. 25
D-30167 Hannover                    mailto:reinhard.kotucha at web.de
------------------------------------------------------------------


