[XeTeX] Devanagari ASCII to Unicode mapping
Mike Maxwell
maxwell at umiacs.umd.edu
Sat Feb 17 17:57:59 CET 2018
On 2/17/2018 11:08 AM, Daniel Greenhoe wrote:
> Does anyone know where I can find an ASCII to Unicode mapping for Devanagari?
>
> For example, it seems that the Devanagari glyph "ब" is encoded as
> 0x61 (hex) in ASCII (lower case 'a' for the Latin alphabet), but is
> 0x092C in the Unicode standard:
> http://www.unicode.org/charts/PDF/U0900.pdf
>
> So what I am asking for is a map (or table) that maps 0x00-0x7F in
> Devanagari ASCII to 0x0900-0x097F in Unicode.
In addition to the ASCII-to-Devanagari transcription system that Philip
Taylor mentioned, you may be interested in the ISCII encoding for
Brahmi-derived writing systems, including Devanagari:
https://en.wikipedia.org/wiki/Indian_Script_Code_for_Information_Interchange
This is _not_ an ASCII-to-Devanagari encoding; rather, it leaves the
ASCII range intact and encodes Devanagari (etc.) in the range 128-255
(actually, 161-255). It was afaik never widely used, but there were
(and probably still are) fonts for it. I don't imagine those fonts
would be terribly high quality by today's standards, e.g. I'd be
surprised if they handled conjunct characters.
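
To make that layout concrete, here is a rough Python sketch of a decoder
for an ISCII-style 8-bit encoding: bytes below 128 pass through as ASCII,
and anything above is looked up in a table. The function name and the
two-entry SAMPLE_TABLE are my own illustration, and the two byte values
are from my reading of the ISCII-91 chart, so check them against the
standard before relying on them. A real converter would need the full
table and would also have to handle ISCII's multi-byte sequences, which
this ignores.

```python
def decode_high_half(data: bytes, table: dict) -> str:
    """Decode bytes from an ISCII-style 8-bit encoding.

    Bytes below 0x80 are passed through unchanged as ASCII. Bytes from
    0x80 up are looked up in `table` (byte value -> Unicode string);
    anything not in the table becomes U+FFFD REPLACEMENT CHARACTER.
    """
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))                   # ASCII range left intact
        else:
            out.append(table.get(b, "\ufffd"))   # upper half: table lookup
    return "".join(out)


# Partial, illustrative table only (assumed values, please verify).
SAMPLE_TABLE = {
    0xA4: "\u0905",  # DEVANAGARI LETTER A
    0xB3: "\u0915",  # DEVANAGARI LETTER KA
}

decoded = decode_high_half(b"ka = \xb3", SAMPLE_TABLE)
```
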
FWIW, there was a similar encoding called TSCII for Tamil.
iconv can be used to map TSCII to other encodings, but for some reason
it doesn't seem to have ISCII in its repertoire (it does include VISCII,
but that's a legacy Vietnamese encoding).
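
If what you have is really a font-specific layout (as the 0x61 example
in your question suggests), iconv won't help and a hand-built table is
probably the fallback. A minimal Python sketch follows, assuming a
simple one-to-one mapping. The only entry is the pair from your message;
the names LEGACY_TO_UNICODE and legacy_to_unicode are hypothetical, and
the rest of the table would have to come from the font itself.

```python
# Hypothetical font-specific table: legacy code -> Unicode code point.
# Only the pair mentioned in the question is filled in; every other
# entry has to be read off the actual font.
LEGACY_TO_UNICODE = {
    0x61: 0x092C,  # Latin 'a' slot -> DEVANAGARI LETTER BA
}


def legacy_to_unicode(text: str) -> str:
    """Remap characters typed in the legacy font to Unicode.

    Characters with no table entry are left unchanged.
    """
    return text.translate(LEGACY_TO_UNICODE)
```

Legacy fonts of this kind typically store text in visual order, so a
plain table like this is only a starting point; some reordering (for
example of vowel signs written to the left of the consonant) is likely
to be needed on top of it.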
--
Mike Maxwell
"My definition of an interesting universe is
one that has the capacity to study itself."
--Stephen Eastmond