[XeTeX] traditional to simplified Chinese character conversion utility or data base

Tue Oct 18 08:53:22 CEST 2011

2011/10/17 Daniel Greenhoe <dgreenhoe at gmail.com>:
> I know that this is not really the right mailing list for this
> question, but I have so far not found the answer by any other means
> ...
>
> I would like to find or write a utility that would take a
> Unicode-encoded file and map traditional Chinese characters to
> simplified ones, while leaving all other code points (such as those in the
> Latin and IPA code spaces) untouched. For example, the traditional
> character for horse (馬) is at unicode U+99AC, the simplified one (马)
> is at unicode U+9A6C, and the Latin character for "A" is at U+0041. So
> I want a utility that would change the 99AC to 9A6C, but leave the
> 0041 unchanged.
>
If it really is such a simple 1:1 mapping, you can just use tr; it does
exactly that if you supply the map, provided your tr implementation
handles multibyte UTF-8 characters (GNU tr works on bytes and does not).
If you wish to do it on the fly in XeTeX, you can write a TECkit map.
Once you have the TECkit map, you can also run txtconv from the command
line.
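
As a rough illustration of the 1:1 approach, here is a minimal Python
sketch. It is only a sketch under stated assumptions: the table holds
just the single pair from your example (U+99AC to U+9A6C), and the names
TRAD_TO_SIMP, to_simplified and convert_file are made up for this
example. A real table would have to be filled from a character database,
and any traditional character that maps to more than one simplified form
would need extra handling.

```python
# Hypothetical one-entry table containing only the pair from this thread.
# Keys and values are Unicode code points; a real table would be loaded
# from a character database rather than typed in by hand.
TRAD_TO_SIMP = {
    0x99AC: 0x9A6C,  # 馬 -> 马
}


def to_simplified(text: str) -> str:
    """Replace every mapped code point; leave all others untouched."""
    return text.translate(TRAD_TO_SIMP)


def convert_file(src_path: str, dst_path: str) -> None:
    """Convert a UTF-8 file line by line and write the result as UTF-8."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(to_simplified(line))
```

Calling to_simplified("馬A") changes only the first character, because
U+0041 has no entry in the table and str.translate passes unmapped code
points through unchanged.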

> Does anyone know of such a utility? Does anyone know of any data base
> with a traditional to simplified character mapping such that I could
> maybe write the utility myself?
>
> Many thanks in advance,
> Dan


-- 
Zdeněk Wagner
http://hroch486.icpf.cas.cz/wagner/
http://icebearsoft.euweb.cz