[XeTeX] Mixed Roman and Indian alphabets for Sanskrit

Zdenek Wagner zdenek.wagner at gmail.com
Fri Feb 17 17:15:09 CET 2017

Hi all,

the situation is quite complex. I will start with the Latin script. I have
a set of commercial fonts from a Czech vendor. The fonts can be used for
Slovak and English, but not for French, Danish or Icelandic, because not all
the required glyphs are covered. Such a font can thus be characterized as
\czechfont or \slovakfont, but it cannot serve as a font for any language
using the Latin script. Similarly, \russianfont need not contain the glyphs
used in Serbian. But let's return to French. French typography places tiny
spaces in front of a colon, semicolon, question mark and exclamation mark,
but this is not done in other languages. OpenType implements it by a
combination of two tags: you will have to specify Script=Latin,
Language=French.
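
With fontspec under XeLaTeX, the two tags could be selected roughly as
follows. This is only a sketch: FreeSerif stands in as a placeholder font
name, and whether the output changes at all depends on the font actually
providing French-specific lookups. If the font has no French language
system, fontspec is expected to warn and fall back to the default.

```latex
% Compile with xelatex. The font name is an assumption, not a recommendation.
\documentclass{article}
\usepackage{fontspec}
% Select the OpenType script and language system together.
\setmainfont[Script=Latin,Language=French]{FreeSerif}
\begin{document}
Qu'est-ce que c'est ? Voici : un exemple ; très bien !
\end{document}
```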

Devanagari is even more difficult, because the same (but not quite the
same) script is used for several languages. For instance, Hindi makes use
of a few characters with nukta, but they are not used in Marathi and
Sanskrit (I am not sure about Nepali, but probably they are not used there
either). If you look at Hindi Wikipedia or Hindi newspapers, the surname of
the actress Priyanka Chopra is written as चोपड़ा, but in Marathi newspapers it
is written as चोप्रा, because ड़ is not used in Marathi. A few years ago
FreeSans (with the Devanagari block derived from Gargi) did not contain
half-ZA. In addition, Sanskrit requires the kta ligature, while nowadays
Hindi prefers half-ka+ta. Fonts intended for Sanskrit contain such
ligatures; fonts intended for Hindi do not contain them. So if you define
\devanagarifont, which language do you have in mind? The only font with
very good support is most probably FreeSerif, because it also understands
the language tag and switches features. Try to typeset शक्ति (shakti) with
Language=Hindi and Language=Sanskrit. You will see the difference. The
language tag is also honoured in new versions of web browsers if the font
supports it. I have such an example on my web, so that if your browser
supports such a feature, you will see the difference.
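
A minimal test file for this comparison might look like the sketch below.
It assumes that FreeSerif is installed and visible to XeTeX; the family
names \hindifont and \sanskritfont are defined here only for illustration.

```latex
% Compile with xelatex. Same font, same script, two language systems.
\documentclass{article}
\usepackage{fontspec}
\newfontfamily\hindifont[Script=Devanagari,Language=Hindi]{FreeSerif}
\newfontfamily\sanskritfont[Script=Devanagari,Language=Sanskrit]{FreeSerif}
\begin{document}
% The same word, shaped once with each language tag.
{\hindifont शक्ति} \quad {\sanskritfont शक्ति}
\end{document}
```

If the font distinguishes the two language systems, the k+ta conjunct
should come out differently in the two halves of the line.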

What the language packages (Babel, Polyglossia) should do is to instruct
users that they should assign a script to a language. I may wish to write a
[...] of Hindi in Czech, so I will define \czechfont and \hindifont. If I
wished to typeset a text containing both Russian and Serbian, I would
select a font covering all glyphs for both languages and define
\cyrillicfont. The package will then be instructed to change the language
(in the TeX sense, the hyphenation), and if needed the script will also be
changed.
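
For the Czech and Hindi case, a Polyglossia preamble along these lines is
what I have in mind. Take it as a sketch: the font names are placeholders,
and it relies on my understanding that Polyglossia picks up a family named
\<language>font (or \<script>font) when the language is switched.

```latex
% Compile with xelatex. Font choices are assumptions for illustration only.
\documentclass{article}
\usepackage{polyglossia}   % loads fontspec
\setmainlanguage{czech}
\setotherlanguage{hindi}
% One font per language, each with the matching script and language tags.
\newfontfamily\czechfont[Script=Latin,Language=Czech]{Latin Modern Roman}
\newfontfamily\hindifont[Script=Devanagari,Language=Hindi]{FreeSerif}
\begin{document}
Český text a hindské slovo \texthindi{शक्ति}.
\end{document}
```

For the Russian and Serbian case, a single \newfontfamily\cyrillicfont
declaration with a font covering both languages would take the place of
the two per-language families.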

The problem with Sanskrit is that it can be written in several scripts
(even in Tibetan). I do not know what the best solution is in this case.

Zdeněk Wagner

2017-02-17 15:47 GMT+01:00 Philip Taylor <P.Taylor at rhul.ac.uk>:

> Dominik Wujastyk wrote:
> > I'm not sure what more to say, Phil. My comments arise out of my
> > orientation to end-users (including myself), not the internals of the
> > OT language or the "you can do anything" strengths of TeX. I'm
> > interested in transparent terminology that makes it obvious to a
> > user, for example, which hyphenation table is active at any
> > particular moment in a document.
>
> OK, this I understand and accept.  But if an open standard such as the OTF
> specification uses terms such as "language" and "script" with specific and
> well-defined meanings, is it helpful to end-users to then re-define those
> terms within an adjunct package such as Polyglossia or Babel ?  Just as
> with Unicode, or the TEI, is it not better to stick with well-established
> and standardised usage rather than invent a (La)TeX-specific usage that can
> (IMHO) only lead to even worse confusion ?
> ** Phil.