[XeTeX] traditional to simplified Chinese character conversion utility or data base

Daniel Greenhoe dgreenhoe at gmail.com
Wed Oct 19 05:57:36 CEST 2011


On Wed, Oct 19, 2011 at 10:05 AM, Andy Lin <kiryen at gmail.com> wrote:
> You can try digging in the source for Tong Wen Tang ... Or email its developers.

That's a great idea --- thanks!

Dan


On Wed, Oct 19, 2011 at 10:05 AM, Andy Lin <kiryen at gmail.com> wrote:
> You can try digging in the source for Tong Wen Tang (a Firefox
> extension). Or email its developers. They should have a map and
> additional notes on the conversion.
>
> On Tue, Oct 18, 2011 at 18:50, Daniel Greenhoe <dgreenhoe at gmail.com> wrote:
>> Hi Zdenek, Thank you for your suggestions.
>>
>> On Tue, Oct 18, 2011 at 2:53 PM, Zdenek Wagner <zdenek.wagner at gmail.com> wrote:
>>> you can just use tr, ... if you supply the map.
>>
>> I don't know what "tr" is, but this comes back to one of my original
>> problems; and that is, I don't have a map. Does anyone know of a
>> publicly available map? Such a map very likely exists. For example,
>> Google Translate can translate from traditional to simplified. But
>> even if they use a map for this service, that map may be proprietary.
>>
>>> If you wish to do it on the fly in XeTeX, you can write a TECkit map.
>>> Having the TECkit map you can also run txtconv from the command line.
>>
>> I like these solutions. However, again, I would still need a map. SIL
>> has a collection of maps available here:
>>  http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&cat_id=ConversionMaps
>> But I didn't see a Chinese traditional-->simplified character map.
>>
>> Dan
>>
>>
>>
>>
>> On Tue, Oct 18, 2011 at 2:53 PM, Zdenek Wagner <zdenek.wagner at gmail.com> wrote:
>>> 2011/10/17 Daniel Greenhoe <dgreenhoe at gmail.com>:
>>>> I know that this is not really the right mailing list for this
>>>> question, but I have so far not found the answer by any other means
>>>> ...
>>>>
>>>> I would like to find or write a utility that would take a
>>>> Unicode-encoded file and map Chinese traditional characters to
>>>> simplified, while leaving all other code points (such as those in the
>>>> Latin and IPA code spaces) untouched. For example, the traditional
>>>> character for horse (馬) is at Unicode U+99AC, the simplified one (马)
>>>> is at Unicode U+9A6C, and the Latin character for "A" is at U+0041. So
>>>> I want a utility that would change the 99AC to 9A6C, but leave the
>>>> 0041 unchanged.
>>>>
>>> If it is really such a simple 1:1 mapping, you can just use tr, it does
>>> exactly that if you supply the map. If you wish to do it on the fly in
>>> XeTeX, you can write a TECkit map. Having the TECkit map you can also
>>> run txtconv from the command line.
>>>
>>>> Does anyone know of such a utility? Does anyone know of any database
>>>> with a traditional to simplified character mapping such that I could
>>>> maybe write the utility myself?
>>>>
>>>> Many thanks in advance,
>>>> Dan
>>>>
>>>>
>>>>
>>>> --------------------------------------------------
>>>> Subscriptions, Archive, and List information, etc.:
>>>>  http://tug.org/mailman/listinfo/xetex
>>>>
>>>
>>>
>>>
>>> --
>>> Zdeněk Wagner
>>> http://hroch486.icpf.cas.cz/wagner/
>>> http://icebearsoft.euweb.cz
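The code-point mapping described in the original question (U+99AC to
U+9A6C, with U+0041 left alone) can be sketched in Python roughly as
follows. This is only a sketch, not a finished utility: the table holds
just the one pair quoted in the thread, the names TRAD_TO_SIMP,
to_simplified and convert_file are made up for illustration, and a real
table would have to be filled in from whatever map is eventually found.
It also assumes the conversion really is a 1:1 character mapping, as the
reply about tr conditions on.

```python
# Illustrative table only: the single pair comes from the example in the
# thread. A usable table would need to be loaded from a real map.
TRAD_TO_SIMP = {
    0x99AC: 0x9A6C,  # 馬 (traditional "horse") -> 马 (simplified)
}


def to_simplified(text, table=None):
    """Replace every code point found in the table; leave the rest as-is."""
    if table is None:
        table = TRAD_TO_SIMP
    # str.translate looks up each character's code point in the table and
    # passes through any character that has no entry (Latin, IPA, ...).
    return text.translate(table)


def convert_file(src_path, dst_path, table=None):
    """Read a UTF-8 file, convert it, and write the result as UTF-8."""
    with open(src_path, encoding="utf-8") as src:
        converted = to_simplified(src.read(), table)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(converted)
```

Called as to_simplified("A\u99ac"), this changes only the second
character and leaves the Latin "A" untouched. Because the table is keyed
by code point, extending it is just a matter of adding more pairs.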


