[XeTeX] xetex doesn't recognize/replace all invalid utf8 bytes

Wed Dec 30 16:15:14 CET 2009

On 30 Dec 2009, at 14:56, Peter Dyballa wrote:

> 
> On 30.12.2009, at 14:17, Herbert Schulz wrote:
> 
>> Is that encoding anywhere in the list?
> 
> 
> WindowsCentralEurRoman could be CP1252...

No, it couldn't; it's CP1250. See (for example) http://en.wikipedia.org/wiki/Windows_code_page#List

The likeliest candidate is what TeXShop calls "IsoLatin"; this might mean ISO-8859-1, or it might actually refer to CP1252, since the two labels are sometimes used rather carelessly. IIRC, the encodings are identical except that CP1252 puts some printable characters (e.g., the curly quotes, em- and en-dashes, and the Euro sign) in the upper control code region (0x80-0x9F), where ISO-8859-1 has only the C1 control codes.
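
For anyone who wants to check the ISO-8859-1 vs. CP1252 difference rather than trust my memory, here is a minimal Python sketch. It assumes that Python's built-in "latin-1" and "cp1252" codecs are faithful stand-ins for ISO-8859-1 and CP1252; the names SAMPLE and differing_bytes are just illustrative. It says nothing about what TeXShop's "IsoLatin" label actually selects.

```python
# A few bytes that CP1252 maps to printable characters:
# Euro sign, curly double quotes, en dash, em dash.
SAMPLE = bytes([0x80, 0x93, 0x94, 0x96, 0x97])


def differing_bytes():
    """Return the byte values that latin-1 and cp1252 decode differently."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        latin1 = raw.decode("latin-1")
        # Python's cp1252 codec leaves five bytes undefined
        # (0x81, 0x8D, 0x8F, 0x90, 0x9D); errors="replace" turns them
        # into U+FFFD, so they are counted as differing as well.
        cp1252 = raw.decode("cp1252", errors="replace")
        if latin1 != cp1252:
            diffs.append(value)
    return diffs


diffs = differing_bytes()
print(hex(min(diffs)), hex(max(diffs)), len(diffs))

# The same bytes read as CP1252 (printable) and as ISO-8859-1 (C1 controls).
print(SAMPLE.decode("cp1252"))
print([hex(ord(ch)) for ch in SAMPLE.decode("latin-1")])
```

If the two codecs behave as assumed, every differing byte falls in 0x80-0x9F and everything outside that range decodes identically.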

JK