[luatex] Input/output and encoding.

Paul Isambert zappathustra at free.fr
Sun Aug 1 18:50:33 CEST 2010

Hello again,

Please consider this document:

 function convert(buf)
   -- Re-encode each latin1 byte as the utf-8 sequence for the same code point.
   return (string.gsub(buf, "(.)",
                       function (ch) return unicode.utf8.char(string.byte(ch)) end))
 end


\input \jobname.xxx

The convert function (adapted from Elie Roux's luainputenc) converts input
characters from latin1 to utf-8 (because I'm lost without my old TeXnicCenter,
which does latin1 - windows cp1252 actually, but anyway). It works very well.
However, LuaTeX still writes in utf-8, so the .xxx file is written in utf-8, and
it makes sense, at least to me, to remove the convert function in the callback
before \input'ing that file: it is written in utf-8, so no conversion is needed,
quite the contrary.
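
To make the "quite the contrary" concrete, here is a minimal sketch in plain Lua of what happens when a latin1-to-utf-8 conversion is applied to text that is already utf-8. The names latin1_to_utf8 and hex are hypothetical, and the function is only a stand-in for convert that does the byte arithmetic by hand, so that it runs outside LuaTeX without unicode.utf8.char. It shows the double encoding and nothing more; it does not explain the problem described in this message.

```lua
-- Stand-in for convert: treat every byte as a latin1 code point and
-- emit its utf-8 encoding (one byte below 128, two bytes otherwise).
local function latin1_to_utf8(buf)
  return (buf:gsub(".", function (ch)
    local b = ch:byte()
    if b < 128 then return ch end
    -- Two-byte form: 110000xx 10xxxxxx
    return string.char(192 + math.floor(b / 64), 128 + b % 64)
  end))
end

-- Helper to print a string as hexadecimal bytes.
local function hex(s)
  return (s:gsub(".", function (ch) return string.format("%02X ", ch:byte()) end))
end

local latin1 = "H\233h\233"            -- "Héhé" in latin1 (é is byte 233)
local once   = latin1_to_utf8(latin1)  -- proper utf-8
local twice  = latin1_to_utf8(once)    -- utf-8 converted a second time

print(hex(once))   --> 48 C3 A9 68 C3 A9
print(hex(twice))  --> 48 C3 83 C2 A9 68 C3 83 C2 A9
```

After the second pass each "é" has become two characters (U+00C3 and U+00A9), which is why a file that LuaTeX itself wrote should be read without the callback.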

However, this doesn't return "Héhé", as I expected. Instead, the "e"'s with
acute accent are ignored, which means LuaTeX hasn't read the proper character.
Of course, if you don't delete the callback beforehand, it doesn't work either,
but that I understand.

I know I could also make LuaTeX write in latin1, which would make everything
simpler. However, I have no heart to produce such code right now, and anyway
that wouldn't make me understand what's going on here, and that annoys me. So if
somebody could enlighten me...?


