[texhax] Blank first page problem (how to remove?)
Philipp Stephani
st_philipp at yahoo.de
Sun Jun 5 22:57:05 CEST 2011
On 05.06.2011 at 22:47, Reinhard Kotucha wrote:
> On 2011-06-05 at 22:08:02 +0200, Philipp Stephani wrote:
>
>>
>> On 05.06.2011 at 20:54, Thomas Schneider wrote:
>>
>>>>> There are three bogus bytes at the very beginning of the file:
>>> ...
>>>
>>>> So it seems Notepad in Windows did some formatting of the
>>>> file which I didn't notice.
>>>
>>> That's yet another reason to add to the pile to avoid Windows.
>>
>> This has nothing at all to do with Windows. Unicode BOMs are
>> defined by the Unicode standard, and every Unicode-capable
>> application may choose to use them.
>
> Well, the BOM is not needed in UTF-8. It actually has no meaning
> there. I've never seen that Emacs inserted a BOM when saving a file
> in UTF-8. One of the major design goals of UTF-8 was compatibility
> with ASCII. Has this been dropped by the Unicode Consortium?
No, and if ASCII compatibility is desired, then BOMs should be avoided [1]. A BOM in UTF-8 text is not needed to specify the byte ordering, but can be used to tag the text as UTF-8.
(BTW, Emacs can be made to insert a BOM, too: C-x <RET> f utf-8-with-signature <RET>.)
[1] http://unicode.org/faq/utf_bom.html#bom10
“Some byte oriented protocols expect ASCII characters at the beginning of a file. If UTF-8 is used with these protocols, use of the BOM as encoding form signature should be avoided.”
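
For illustration only, here is a minimal Python sketch of how the three leading bytes (EF BB BF, the UTF-8 encoding of U+FEFF) could be detected and removed before a file is handed to a tool that expects plain ASCII at the start. It is not taken from the thread; the function name strip_utf8_bom and the sample content are made up for this example.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical file content as an editor might save it "with signature".
sample = codecs.BOM_UTF8 + b"\\documentclass{article}\n"

cleaned = strip_utf8_bom(sample)

# Input without a BOM is returned unchanged.
unchanged = strip_utf8_bom(b"\\documentclass{article}\n")

# When decoding to text instead, the standard "utf-8-sig" codec
# skips a leading BOM if present.
text = sample.decode("utf-8-sig")
```

Only a BOM at the very start of the data is removed; U+FEFF anywhere else in the text is left untouched.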