[XeTeX] How to Convert Devanagari (Sanskrit) text to Telugu Text (A u)

Shrisha Rao shrao at nyx.net
Thu Oct 27 09:56:51 CEST 2011


On Oct 27, 2011, at 12:36 p.m., Andrew Moschou wrote:

> On 27 October 2011 16:52, Shrisha Rao <shrao at nyx.net> wrote:
> Being able to use UTF-8 codings in such scripts to produce outputs in other scripts would require n × n mappings, as against 1 × n if the input is only in ITRANS.
> 
> Actually, 1×n is all that is required, as long as the mappings are bijective, but this requires two passes: first to convert to ITRANS, then to the desired script. I can identify one instance where the mapping is not bijective: ব in Bengali stands for both ब and व in Devanagari. I believe this has already been mentioned here, but even so, a set of n×n mappings doesn't help in this situation.

The problem is much worse with Tamil, which does not have separate symbols/sounds for क, ख, ग, घ, or for प, फ, ब, भ, etc. (which is also one reason Tamil speakers are known to mispronounce Sanskrit words/names containing these consonants).  My point was that 1 × n is preferable when Sanskrit texts (the subject of this thread) are to be rendered in multiple scripts that are fully Sanskrit-compatible.  I don't know about Bengali, but I believe Tamil has special extended notations, something like க் with subscript 1 for क, with subscript 2 for ख, etc.  These are non-classical typographical notations and have no Unicode formulation, so there is no way to handle the situation with either 1 × n or n × n.  The xetex-itrans package offers a way to express Tamil using ITRANS input, and Sanskrit in Kannada/Roman/Telugu/Devanagari script using ITRANS input, but not a way to code Sanskrit in Tamil script using ITRANS input.  As far as I know, there is no straightforward way to code Sanskrit in Tamil script using Unicode.
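
To make the pivot idea concrete, here is a rough Python sketch (not how xetex-itrans works internally).  The table names, the convert and is_invertible helpers, and the tiny consonant-only tables are all made up for illustration; vowel signs, viramas, and conjuncts are ignored.  Each script needs only a table to and from the ITRANS-like pivot, and the Bengali and Tamil tables show where the pivot-to-script direction stops being one-to-one.

```python
# Toy tables: script -> ITRANS-like pivot key. Only a few consonants are
# covered, and all names here are illustrative assumptions.
TO_PIVOT = {
    "devanagari": {"क": "ka", "ख": "kha", "ग": "ga", "घ": "gha",
                   "ब": "ba", "व": "va"},
    "telugu": {"క": "ka", "ఖ": "kha", "గ": "ga", "ఘ": "gha",
               "బ": "ba", "వ": "va"},
}

# For one-to-one scripts the reverse table is just the inverted forward table.
FROM_PIVOT = {
    script: {key: ch for ch, key in table.items()}
    for script, table in TO_PIVOT.items()
}

# Lossy targets: several pivot keys share one letter, so these tables
# cannot be inverted.
FROM_PIVOT["bengali"] = {"ka": "ক", "kha": "খ", "ga": "গ", "gha": "ঘ",
                         "ba": "ব", "va": "ব"}
FROM_PIVOT["tamil"] = {"ka": "க", "kha": "க", "ga": "க", "gha": "க"}


def convert(text, src, dst):
    """Two-pass conversion: src script -> pivot key -> dst script.

    Characters missing from either table are passed through unchanged.
    """
    out = []
    for ch in text:
        key = TO_PIVOT[src].get(ch)
        out.append(ch if key is None else FROM_PIVOT[dst].get(key, ch))
    return "".join(out)


def is_invertible(table):
    """True if no two pivot keys map to the same output letter."""
    return len(set(table.values())) == len(table)


if __name__ == "__main__":
    print(convert("कखगघ", "devanagari", "telugu"))
    print(convert("बव", "devanagari", "bengali"))
    print(is_invertible(FROM_PIVOT["telugu"]), is_invertible(FROM_PIVOT["tamil"]))
```

With n scripts this needs on the order of 2n tables (n if every table is invertible), against n(n-1) direct script-to-script tables.  Neither arrangement recovers the distinctions that the Bengali or Tamil tables collapse.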

Regards,

Shrisha Rao

> Andrew



