[luatex] Input/output and encoding.
Khaled Hosny
khaledhosny at eglug.org
Sun Aug 1 19:17:56 CEST 2010
On Sun, Aug 01, 2010 at 06:50:33PM +0200, Paul Isambert wrote:
> Hello again,
>
> Please consider this document:
>
> \directlua{%
>   function convert(buf)
>     return string.gsub(buf,"(.)",
>       function (ch) return unicode.utf8.char(string.byte(ch))
>       end)
>   end
>   callback.register('process_input_buffer',convert)
> }
>
> \immediate\openout0=\jobname.xxx
> \immediate\write0{Héhé.}
> \immediate\closeout0
>
> \directlua{callback.register('process_input_buffer',nil)}
> \input \jobname.xxx
> \bye
>
> The convert function (adapted from Elie Roux's luainputenc) converts input
> characters from latin1 to utf-8 (because I'm lost without my old TeXnicCenter,
> which does latin1 - windows cp1252 actually, but anyway). It works very well.
> However, LuaTeX still writes in utf-8, so the .xxx file is written in utf-8, and
> it makes sense, at least to me, to remove the convert function in the callback
> before \input'ing that file: it is written in utf-8, so no conversion is needed,
> quite the contrary.
>
> However, this doesn't return "Héhé", as I expected. Instead, the "e"'s with
> acute accent are ignored, which means LuaTeX hasn't read the proper character.
> Of course, if you don't delete the callback beforehand, it doesn't work either,
> but that I understand.
>
> I know I could also make LuaTeX write in latin1, which would make everything
> simpler. However, I have no heart to produce such code right now, and anyway
> that wouldn't make me understand what's going on here, and that annoys me. So if
> somebody could enlighten me...?

There is a process_output_buffer callback that was added just for that: it
lets you convert each line LuaTeX writes (to latin1, for instance) before it
reaches the file.
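
A minimal sketch of how that could look, assuming the unicode.utf8 functions
from the slnunicode library bundled with LuaTeX (the same library the quoted
convert function relies on). The name to_latin1 is made up for illustration.
Characters outside latin1 are replaced with "?", and the extra cp1252
characters are not handled:

```tex
\directlua{
  % Hypothetical helper: re-encode a line that is about to be written
  % from utf-8 to latin1. Only TeX comments are used here, because the
  % line ends inside \directlua become spaces before Lua sees the code.
  function to_latin1(buf)
    % The outer parentheses drop the match count that gsub also returns.
    return (unicode.utf8.gsub(buf, "(.)", function (ch)
      local cp = unicode.utf8.byte(ch)
      % Code points below 256 correspond one-to-one to latin1 bytes.
      if cp < 256 then return string.char(cp) end
      return "?"
    end))
  end
  callback.register('process_output_buffer', to_latin1)
}
```

If this is registered before the \write, the .xxx file should come out in
latin1, so the process_input_buffer callback would stay in place for the
\input instead of being removed.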
Regards,
Khaled
--
Khaled Hosny
Arabic localiser and member of Arabeyes.org team
Free font developer