[luatex] BOM

Reinhard Kotucha reinhard.kotucha at web.de
Sat May 16 00:16:33 CEST 2009

On 15 May 2009 luigi scarso wrote:

 > Taco's solution is good -- eventually a conscious user can modify
 >  \catcode "FEFF = 9
 > in something else.

Looks fine at first glance.

On the other hand, a BOM in a UTF-8 file is simply wrong.  It's not
only unnecessary, it's wrong.

Unix shells expect #! at the beginning of a script, not a BOM.  Does
anybody expect that all Unix system distributors adapt their shells
only because there are a few broken editors?

I'm not convinced that it's a good idea to silently ignore such bugs.
It's not luatex's job to repair broken files.  And it's even
counterproductive because developers of broken software do not get the
feedback they need.

I prefer \catcode "FEFF = 13.  Knuth already provided the primitives
\errmessage and \errhelp.  
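
A minimal sketch of how that could look, assuming LuaTeX's ^^^^feff
notation for the character U+FEFF.  The message and help texts are
made up for illustration and are not part of any existing format:

```tex
% Sketch only: message and help texts are hypothetical.
\catcode"FEFF=13 % make U+FEFF (the BOM) an active character
\def^^^^feff{%
  \errhelp{The input file starts with a byte order mark (U+FEFF).
    Save the file as UTF-8 without a BOM and run again.}%
  \errmessage{Byte order mark found in UTF-8 input}%
}
```

With this in place a BOM in the input stops the run with an error and
the help text is available at the error prompt, instead of the
character being dropped silently as with \catcode "FEFF = 9.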

In my opinion programs should *not* provide workarounds for bugs in
other programs.  Of course, such workarounds are sometimes
unavoidable.

In this case, however, they *are* avoidable.  There are a few
broken editors, but there are many other ones which support UTF-8
perfectly.  Is this really a luatex issue?

I'm convinced that a BOM in a UTF-8 file is a severe bug:

  The only reason UTF-8 exists at all is because it's the only
  encoding system which supports more than 256 characters without
  breaking existing ASCII files.  A BOM breaks them, see the example

A few stupid questions:

  1. Why do you all think that luatex should provide workarounds for
     bugs in other programs?  That's absolutely strange.

  2. Why, for heaven's sake, does everyone prefer \catcode 9 ?  If the
     input file is buggy, I expect an \errmessage.  Why should severe
     bugs in 3rd party software be silently ignored?  

  3. Taco, are you willing to provide workarounds for everything
     people can do wrong?   Good luck, then.  :)

I think that the only reasonable solution is \errmessage.


Reinhard Kotucha			              Phone: +49-511-3373112
Marschnerstr. 25
D-30167 Hannover	                      mailto:reinhard.kotucha at web.de
Microsoft isn't the answer. Microsoft is the question, and the answer is NO.
