[XeTeX] Devanagari ASCII to Unicode mapping

Mike Maxwell maxwell at umiacs.umd.edu
Sat Feb 17 17:57:59 CET 2018


On 2/17/2018 11:08 AM, Daniel Greenhoe wrote:
> Does anyone know where I can find an ASCII to Unicode mapping for Devanagari?
> 
> For example, it seems that the Devanagari  glyph "ब" is encoded as
> 0x61 (hex) in ASCII (lower case 'a' for the Latin alphabet), but is
> 0x092C in the Unicode standard:
>    http://www.unicode.org/charts/PDF/U0900.pdf
> 
> So what I am asking for is a map (or table) that maps 0x00-0x7F in
> Devanagari ASCII to 0x0900-0x097F in Unicode.

In addition to the ASCII-to-Devanagari transcription system that Philip 
Taylor mentioned, you may be interested in the ISCII encoding for 
Brahmi-derived writing systems, including Devanagari:
 
https://en.wikipedia.org/wiki/Indian_Script_Code_for_Information_Interchange

This is _not_ an ASCII-to-Devanagari encoding; rather, it leaves the 
ASCII range intact and encodes Devanagari (etc.) in the range 128 
(actually, 161)-255.  It was afaik never widely used, but there were 
(and probably still are) fonts for it.  I don't imagine those fonts 
would be terribly high quality by today's standards, e.g. I'd be 
surprised if they handled conjunct characters.
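
To make that byte layout concrete, here is a minimal Python sketch.  It 
only sorts bytes into the ranges described above; it does not contain an 
ISCII-to-Unicode table, and the names classify_byte and split_runs are 
made up for this illustration.

```python
def classify_byte(b: int) -> str:
    """Classify one byte value according to the layout described above."""
    if not 0 <= b <= 255:
        raise ValueError("not a byte value")
    if b <= 127:
        return "ascii"   # left intact by ISCII
    if b >= 161:
        return "iscii"   # Devanagari (etc.) characters live here
    return "other"       # 128-160: outside the 161-255 range described above


def split_runs(data: bytes):
    """Group consecutive bytes of the same class into (class, bytes) runs."""
    runs = []
    for b in data:
        kind = classify_byte(b)
        if runs and runs[-1][0] == kind:
            # Same class as the previous byte: extend the current run.
            runs[-1] = (kind, runs[-1][1] + bytes([b]))
        else:
            runs.append((kind, bytes([b])))
    return runs
```

A converter would still need a real table for the "iscii" runs; the 
"ascii" runs could be passed through unchanged.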

FWIW, there was a similar encoding called TSCII for Tamil.

iconv can be used to map TSCII to other encodings, but for some reason 
it doesn't seem to have ISCII in its repertoire (it does include VISCII, 
but that's a legacy Vietnamese encoding).
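
If you want to check what a particular iconv installation supports, 
something like the following Python sketch may help.  It assumes an iconv 
binary on the PATH that accepts -l to list its encodings; the output 
format differs between implementations, so the parsing is deliberately 
loose, and the helper names are invented here.  Note that it compares 
whole names: a plain substring search for "ISCII" would also hit VISCII 
and TSCII.

```python
import re
import shutil
import subprocess


def encoding_names(listing: str) -> set:
    """Parse the text printed by `iconv -l` into a set of upper-case names."""
    names = set()
    # Names may be separated by whitespace or commas, and some
    # implementations append "//" to each name.
    for token in re.split(r"[\s,]+", listing):
        token = token.strip("/")
        if token:
            names.add(token.upper())
    return names


def iconv_supports(name: str):
    """Return True or False, or None if no iconv binary is on the PATH."""
    exe = shutil.which("iconv")
    if exe is None:
        return None
    out = subprocess.run([exe, "-l"], capture_output=True, text=True).stdout
    return name.upper() in encoding_names(out)
```

For example, iconv_supports("TSCII") and iconv_supports("ISCII") report 
what the local installation lists; the results will depend on which iconv 
you have.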
-- 
    Mike Maxwell
    "My definition of an interesting universe is
    one that has the capacity to study itself."
          --Stephen Eastmond