[texhax] Blank first page problem (how to remove?)

Philipp Stephani st_philipp at yahoo.de
Sun Jun 5 22:57:05 CEST 2011


On 05.06.2011 at 22:47, Reinhard Kotucha wrote:

> On 2011-06-05 at 22:08:02 +0200, Philipp Stephani wrote:
> 
>> 
>> On 05.06.2011 at 20:54, Thomas Schneider wrote:
>> 
>>>>> There are three bogus bytes at the very beginning of the file: 
>>> ...
>>> 
>>>> So it seems Notepad in Windows has done some formatting of the
>>>> file which I didn't notice.
>>> 
>>> That's yet another reason to add to the pile to avoid Windows.
>> 
>> This has nothing at all to do with Windows. Unicode BOMs are
>> defined by the Unicode standard, and every Unicode-capable
>> application may choose to use them.
> 
> Well, the BOM is not needed in UTF-8.  It actually has no meaning
> there.  I've never seen that Emacs inserted a BOM when saving a file
> in UTF-8.  One of the major design goals of UTF-8 was compatibility
> with ASCII.  Has this been dropped by the Unicode Consortium?

No, and if ASCII compatibility is desired, then BOMs should be avoided [1]. A BOM in UTF-8 text is not needed to specify the byte ordering, but can be used to tag the text as UTF-8.
(BTW, Emacs can be made to insert a BOM, too: C-x <RET> f utf-8-with-signature <RET>.)

[1] http://unicode.org/faq/utf_bom.html#bom10
“Some byte oriented protocols expect ASCII characters at the beginning of a file. If UTF-8 is used with these protocols, use of the BOM as encoding form signature should be avoided.”
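
For anyone who wants to check a file for those three leading bytes, here is a minimal Python sketch. It is only an illustration and not something posted in this thread; the helper names has_utf8_bom and strip_utf8_bom are made up for this example. The UTF-8 BOM is the byte sequence EF BB BF, which Python exposes as codecs.BOM_UTF8.

```python
import codecs

# The UTF-8 encoding of U+FEFF: the three bytes EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with a UTF-8 BOM."""
    return data.startswith(UTF8_BOM)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the byte string without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(UTF8_BOM):]
    return data


if __name__ == "__main__":
    # Sample input made up for illustration: a TeX file saved with a BOM.
    sample = UTF8_BOM + b"\\documentclass{article}\n"
    print(has_utf8_bom(sample))    # the file is tagged with a BOM
    print(strip_utf8_bom(sample))  # the same bytes without the tag
```

Python's "utf-8-sig" codec should do the same thing at the text level: it drops a leading BOM when decoding and writes one when encoding.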

