[XeTeX] traditional to simplified Chinese character conversion utility or data base

Daniel Greenhoe dgreenhoe at gmail.com
Wed Oct 19 00:50:38 CEST 2011


Hi Zdenek, Thank you for your suggestions.

On Tue, Oct 18, 2011 at 2:53 PM, Zdenek Wagner <zdenek.wagner at gmail.com> wrote:
> you can just use tr, ... if you supply the map.

I don't know what "tr" is, but this brings me back to one of my original
problems: I don't have a map. Does anyone know of a publicly available
one? Such a map very likely exists. For example, Google Translate can
convert traditional to simplified. But even if they use a map for this
service, it may be proprietary.

> If you wish to do it on the fly in XeTeX, you can write a TECkit map.
> Having the TECkit map you can also run txtconv from the command line.

I like these solutions. However, again, I would still need a map. SIL
has a collection of maps available here:
  http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&cat_id=ConversionMaps
But I didn't see a Chinese traditional-to-simplified character map there.
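
To make the goal concrete, here is a minimal sketch in Python of the 1:1
conversion I have in mind, assuming a map were available. The one-entry
table holds only the horse example from my original message; the names
T2S_SAMPLE and to_simplified are purely illustrative, and a real table
would have to come from a source I have not yet found.

```python
# Illustrative one-entry map: traditional horse (U+99AC) -> simplified horse (U+9A6C).
# This is NOT a real conversion table; it only holds the example pair from this thread.
T2S_SAMPLE = {0x99AC: 0x9A6C}


def to_simplified(text, table=T2S_SAMPLE):
    """Replace each code point found in the table; leave all others unchanged."""
    # str.translate looks up every character's code point in the table and
    # passes through any character that has no entry.
    return text.translate(table)


if __name__ == "__main__":
    sample = "A\u99ac"  # Latin "A" followed by the traditional horse character
    converted = to_simplified(sample)
    # Show the resulting code points so the change is visible without a CJK font.
    print([f"U+{ord(ch):04X}" for ch in converted])  # ['U+0041', 'U+9A6C']
```

Code points missing from the table pass through untouched, which is the
behavior I want for the Latin and IPA ranges.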

Dan




On Tue, Oct 18, 2011 at 2:53 PM, Zdenek Wagner <zdenek.wagner at gmail.com> wrote:
> 2011/10/17 Daniel Greenhoe <dgreenhoe at gmail.com>:
>> I know that this is not really the right mailing list for this
>> question, but I have so far not found the answer by any other means
>> ...
>>
>> I would like to find or write a utility that would take a
>> Unicode-encoded file and map Chinese traditional characters to
>> simplified ones, while leaving all other code points (such as those in
>> the Latin and IPA code spaces) untouched. For example, the traditional
>> character for horse (馬) is at Unicode U+99AC, the simplified one (马)
>> is at Unicode U+9A6C, and the Latin character "A" is at U+0041. So
>> I want a utility that would change the 99AC to 9A6C, but leave the
>> 0041 unchanged.
>>
> If it really is a simple 1:1 mapping, you can just use tr; it does
> exactly that if you supply the map. If you wish to do it on the fly in
> XeTeX, you can write a TECkit map. Having the TECkit map you can also
> run txtconv from the command line.
>
>> Does anyone know of such a utility? Does anyone know of any database
>> with a traditional-to-simplified character mapping, so that I could
>> maybe write the utility myself?
>>
>> Many thanks in advance,
>> Dan
>>
>>
>>
>> --------------------------------------------------
>> Subscriptions, Archive, and List information, etc.:
>>  http://tug.org/mailman/listinfo/xetex
>>
>
>
>
> --
> Zdeněk Wagner
> http://hroch486.icpf.cas.cz/wagner/
> http://icebearsoft.euweb.cz


