[XeTeX] xetex doesn't recognize/replace all invalid utf8 bytes
Herbert Schulz
herbs at wideopenwest.com
Wed Dec 30 14:17:20 CET 2009
On Dec 30, 2009, at 4:07 AM, Peter Dyballa wrote:
>
> On 30.12.2009 at 01:54, Herbert Schulz wrote:
>
>> Is there a ``common'' name for that encoding?
>
>
> It could be "CP1252"; Mac OS X ships, for example, these files:
>
> /usr/share/locale/ru_RU.CP1251
> /Applications/Adobe Reader.app/Contents/MacOS/Resource/TypeSupport/Unicode/Mappings/win/CP874.TXT
> /Applications/Adobe Reader.app/Contents/MacOS/Resource/TypeSupport/Unicode/Mappings/win/CP932.TXT
> /Applications/Adobe Reader.app/Contents/MacOS/Resource/TypeSupport/Unicode/Mappings/win/CP936.TXT
> /Applications/Adobe Reader.app/Contents/MacOS/Resource/TypeSupport/Unicode/Mappings/win/CP949.TXT
> /Applications/Adobe Reader.app/Contents/MacOS/Resource/TypeSupport/Unicode/Mappings/win/CP950.TXT
> /Applications/Adobe Reader.app/Contents/MacOS/Resource/TypeSupport/Unicode/Mappings/win/CP1250.TXT
> /Applications/Adobe Reader.app/Contents/MacOS/Resource/TypeSupport/Unicode/Mappings/win/CP1251.TXT
> /Applications/Adobe Reader.app/Contents/MacOS/Resource/TypeSupport/Unicode/Mappings/win/CP1252.TXT
> /Applications/Adobe Reader.app/Contents/MacOS/Resource/TypeSupport/Unicode/Mappings/win/CP1253.TXT
> /Applications/Adobe Reader.app/Contents/MacOS/Resource/TypeSupport/Unicode/Mappings/win/CP1254.TXT
> /Applications/Adobe Reader.app/Contents/MacOS/Resource/TypeSupport/Unicode/Mappings/win/CP1255.TXT
> /Applications/Adobe Reader.app/Contents/MacOS/Resource/TypeSupport/Unicode/Mappings/win/CP1256.TXT
> /Applications/Adobe Reader.app/Contents/MacOS/Resource/TypeSupport/Unicode/Mappings/win/CP1257.TXT
> /Applications/Adobe Reader.app/Contents/MacOS/Resource/TypeSupport/Unicode/Mappings/win/CP1258.TXT
> /Developer/SDKs/MacOSX10.5.sdk/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/rexml/encodings/CP-1252.rb
>
> You could also invoke on the command line:
>
> iconv -l | grep CP
>
> The iconv utility is meant to convert file contents between many encodings. It lists, among others:
>
> CP1252 MS-ANSI WINDOWS-1252
>
> --
> Greetings
>
> Pete
>
> Almost anything is easier to get into than out of.
> – Allen's Law
>
Howdy,
Well, here's the complete list of encodings from the TeXShop Help Panel:
• MacOSRoman
• IsoLatin
• IsoLatin2
• IsoLatin5
• IsoLatin9
• IsoLatinGreek
• Mac Central European Roman
• MacJapanese
• DOSJapanese
• SJIS_X0213
• EUC_JP
• JISJapanese
• MacKorean
• UTF-8 Unicode
• Standard Unicode
• Mac Cyrillic
• DOS Cyrillic
• DOS Russian
• WindowsCentralEurRoman
• Windows Cyrillic
• KOI8_R
• Mac Chinese Traditional
• Mac Chinese Simplified
• DOS Chinese Traditional
• DOS Chinese Simplified
• GBK
• GB 2312
• GB 18030
Is that encoding anywhere in the list? If not, ask Dick Koch (TeXShop's author) to add it.
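In the meantime, if the file really is CP1252 (an assumption; nothing in this thread confirms it), it could be converted to UTF-8 before XeTeX sees it. Here is a minimal Python sketch of the kind of conversion iconv performs. The function name to_utf8 and the default fallback encoding are made up for illustration and are not part of TeXShop, XeTeX, or iconv.

```python
def to_utf8(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Return raw unchanged if it is already valid UTF-8.

    Otherwise assume the bytes are in the fallback encoding
    (CP1252 here, which is only a guess) and transcode them.
    """
    try:
        raw.decode("utf-8")      # strict decoding rejects invalid byte sequences
        return raw
    except UnicodeDecodeError:
        # Python's cp1252 codec leaves a few bytes undefined (0x81, 0x8D,
        # 0x8F, 0x90, 0x9D); those raise UnicodeDecodeError here as well.
        return raw.decode(fallback).encode("utf-8")


# 0xE9 on its own is not valid UTF-8; in CP1252 it is e-acute.
converted = to_utf8(b"caf\xe9")
# Bytes that are already valid UTF-8 pass through untouched.
unchanged = to_utf8("caf\u00e9".encode("utf-8"))
```

A file that happens to be valid UTF-8 is never touched, so running this over a mixed set of sources should be harmless. It cannot tell CP1252 apart from other 8-bit encodings such as IsoLatin or MacOSRoman, so the fallback has to be chosen by whoever knows where the file came from.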
Good Luck,
Herb Schulz
(herbs at wideopenwest dot com)