# [luatex] BOM

Taco Hoekwater taco at elvenkind.com
Thu May 14 19:22:07 CEST 2009

Yannis Haralambous wrote:
> This has probably already been brought up, but please take care of the
> BOM character: it must
> be ignored by the LuaTeX engine.
>
> Here is why: BOM is useful when writing in UCS-2 (or UTF-16) to find
> out whether the file is written in
> big-endian or little-endian order. In UTF-8 it makes no sense because UTF-8
> is written byte-wise, in logical order.
>
> Nevertheless software like M\$ Notepad (under Vista) will systematically
> insert a BOM at the beginning of the file (and I didn't find any way to prevent it).
>
> Other text editors, such as Ultra-Edit (Win) or BBEdit (Mac) will let
> the user choose, but by default they will still insert a BOM.
>
> LaTeX then sees a character at the beginning of the file which is not a backslash or a
> comment, and stops because there should
> be no text character before \begin{document}.
>
> If one could, once and for all, decide to ignore that character, that
> would be best. Using Lua code for that would be a waste of time and
> energy....

I could set

\catcode "FEFF = 9

as part of the initex initialization code. That would do the trick, yes?
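
A minimal sketch of how the same setting could be tried from the user side, assuming a LuaLaTeX run. The file names `wrapper.tex` and `main.tex` are hypothetical: `main.tex` stands for a document saved with a leading BOM, and `wrapper.tex` is compiled in its place.

```tex
% wrapper.tex (hypothetical file name): compile this instead of main.tex
\catcode"FEFF=9   % category code 9 = ignored character
\input{main}      % main.tex (hypothetical) may begin with U+FEFF
```

Note that category code 9 makes U+FEFF ignored wherever it appears in the input, not only at the start of a file.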

Best wishes,
Taco


