[luatex] Input/output and encoding.

Paul Isambert zappathustra at free.fr
Sun Aug 1 18:50:33 CEST 2010


Hello again,

Please consider this document:

\directlua{%
 function convert(buf)
   return string.gsub(buf, "(.)", function (ch)
     return unicode.utf8.char(string.byte(ch))
   end)
 end
 callback.register('process_input_buffer', convert)
 }

\immediate\openout0=\jobname.xxx
\immediate\write0{Héhé.}
\immediate\closeout0

\directlua{callback.register('process_input_buffer',nil)}
\input \jobname.xxx
\bye

The convert function (adapted from Elie Roux's luainputenc) converts input
characters from latin1 to utf-8 (because I'm lost without my old TeXnicCenter,
which does latin1 - windows cp1252 actually, but anyway). It works very well.
However, LuaTeX still writes in utf-8, so the .xxx file is written in utf-8, and
it makes sense, at least to me, to remove the convert function from the callback
before \input'ing that file: it is already in utf-8, so no conversion is needed,
quite the contrary.

However, this doesn't typeset "Héhé", as I expected. Instead, the "e"s with
acute accents are missing from the output, which means LuaTeX hasn't read the
proper character. Of course, if you don't remove the callback beforehand, it
doesn't work either, but that I understand.
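
To make the two cases above concrete, here is a minimal sketch in plain Lua (no
LuaTeX needed) of what a latin1-to-utf-8 conversion does to the bytes. The
helper name latin1_to_utf8 is made up for illustration, and it builds the
two-byte sequences by hand instead of calling unicode.utf8.char, so that it runs
in a stock Lua interpreter; it is not the luainputenc code, only an
approximation of the same idea.

```lua
-- Hypothetical helper (not from luainputenc): re-encode a latin1 byte
-- string as utf-8 using only the standard string and math libraries.
local function latin1_to_utf8(buf)
  -- The outer parentheses drop gsub's second return value (the match count).
  return (buf:gsub(".", function (ch)
    local b = ch:byte()
    if b < 0x80 then
      return ch  -- ASCII bytes are identical in both encodings
    end
    -- U+0080..U+00FF become two bytes: 110000xx 10xxxxxx
    return string.char(0xC0 + math.floor(b / 64), 0x80 + b % 64)
  end))
end

local latin1     = "H\233h\233."              -- "Héhé." in latin1, 5 bytes
local utf8_once  = latin1_to_utf8(latin1)     -- 7 bytes: 233 -> 195 169
local utf8_twice = latin1_to_utf8(utf8_once)  -- 11 bytes: converted twice

print(#latin1, #utf8_once, #utf8_twice)
```

Converting once turns the single byte 233 into the pair 195 169, which is the
utf-8 form of "é". Converting an already utf-8 string a second time treats 195
and 169 as two separate latin1 characters and yields four bytes per "é", which
is the kind of garbling to expect if the callback stays registered while the
.xxx file is read.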

I know I could also make LuaTeX write in latin1, which would make everything
simpler. However, I don't have the heart to write such code right now, and
anyway that wouldn't help me understand what's going on here, and that annoys
me. So if somebody could enlighten me...?

Best,
Paul

