[XeTeX] OpenType: script & language

Jonathan Kew jonathan_kew at sil.org
Tue Mar 1 14:52:07 CET 2005


On 1 Mar 2005, at 1:26 pm, Will Robertson wrote:

> On 1 Mar 2005, at 9:50 PM, Simon Spiegel wrote:
>
>> Sorry for my ignorant question, but what would this actually mean? 
>> Would this mean that I could select different fonts for different 
>> languages and that a \setlanguage would automatically change the 
>> font? Or is language just OT internal? What does setting the language 
>> actually do?
>
> I don't have any experience with this feature, but yes, it's OT 
> internal and it specifies which set of features you wish to have 
> available to use. So a unicode font might have features for Hebrew and 
> other features for Arabic and you use the language option to specify 
> which set you want. (Does this example make sense?)

Those would be different *scripts* supported by an OpenType font.

OpenType features are organized primarily by script; and each script 
(for which "shaping engine" support is available) has a default set of 
features that will be applied. Within a script there may be multiple 
language systems, which may provide differing sets of features or 
different implementations of certain features. So, for example, within 
the Latin script, there may be different 'liga' lookups used for 
Turkish than for other languages (because of the need to preserve the 
i/dotlessi distinction).
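
To make the hierarchy concrete, here is a rough sketch in Python of how a
font's features hang off script and then language system. It is a toy
model, not how XeTeX or any shaping library stores this: TOY_FONT,
resolve_features and every lookup name are invented for illustration.
Only the tags ('latn', 'arab', 'TRK ', 'liga', and so on) are real
OpenType tags.

```python
# Toy model of an OpenType layout table: script -> language system -> features.
# The lookup names on the right-hand side are hypothetical placeholders.
TOY_FONT = {
    "latn": {
        "dflt": {"liga": "liga_with_fi", "kern": "kern_latin"},
        # Turkish gets its own 'liga' lookup so that i and dotless i stay distinct.
        "TRK ": {"liga": "liga_without_fi", "kern": "kern_latin"},
    },
    "arab": {
        "dflt": {"init": "arab_init", "medi": "arab_medi", "fina": "arab_fina"},
    },
}


def resolve_features(font, script, language="dflt"):
    """Return the feature -> lookup map for a script/language pair."""
    language_systems = font.get(script)
    if language_systems is None:
        return {}  # the font does not cover this script at all
    tag = language.ljust(4)  # OpenType tags are four characters, space-padded
    # Languages without their own entry fall back to the script's default.
    return language_systems.get(tag, language_systems.get("dflt", {}))
```

With this model, resolve_features(TOY_FONT, "latn", "TRK") picks the
Turkish 'liga' lookup, while any Latin-script language without its own
entry falls back to the default language system.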

In the case of XeTeX, at present only the "default" (Latin and similar, 
or unknown script) shaping engine supports explicit specification of 
features by their OpenType tags. For Arabic or Devanagari or other 
"complex" scripts, the shaping engine will simply use the features that 
it "knows" are right for that script, and additional specifications 
will be ignored. (For now; this may change.)
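
As a rough illustration of that behaviour (again a Python sketch, not
XeTeX's actual code): the feature lists below are only examples of the
kind of tags a script engine might apply, and ENGINE_FEATURES,
DEFAULT_FEATURES and features_to_apply are hypothetical names.

```python
# Scripts that have a dedicated shaping engine, each with the features the
# engine applies on its own. These lists are illustrative, not exhaustive.
ENGINE_FEATURES = {
    "arab": ["ccmp", "isol", "init", "medi", "fina", "rlig", "liga"],
    "deva": ["nukt", "akhn", "rphf", "half", "abvs", "blws", "psts"],
}

# Assumed defaults for the generic engine (Latin and similar, or unknown script).
DEFAULT_FEATURES = ["ccmp", "liga", "kern"]


def features_to_apply(script, requested=()):
    """Return the feature tags that end up applied for a script."""
    if script in ENGINE_FEATURES:
        # A script-specific engine uses what it "knows"; requests are ignored.
        return list(ENGINE_FEATURES[script])
    # The default engine honours explicitly requested tags on top of its defaults.
    extra = [tag for tag in requested if tag not in DEFAULT_FEATURES]
    return DEFAULT_FEATURES + extra
```

Here a request for 'smcp' is kept when the script is 'latn' and silently
dropped when the script is 'arab'.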

The recent Tibetan example behaves as it does because XeTeX does *not* 
have a specific Tibetan shaping engine; therefore, it's using the 
default OpenType engine; and therefore the additional features listed 
get applied. This only works because (at least for the fonts and texts 
we've tried so far) it seems to be acceptable to apply all those 
features across all the glyphs. This is not true in the more general 
case, which is why there are script-specific shaping engines. (E.g., if 
this were Arabic, the engine would need to decide, character by 
character, whether to apply the 'init', 'medi', or 'fina' feature; this 
can't be done globally in the font definition.)
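
The per-character decision can be sketched like this in Python. It is a
heavily simplified version of Arabic joining, assuming only a handful of
letters, with no handling of marks, ZWJ/ZWNJ or other scripts; JOINING
and positional_features are names made up for the example. The
joining types given for these letters follow the Unicode data.

```python
# Joining type per letter: "D" joins on both sides, "R" joins only to the
# preceding letter. Anything not listed is treated as non-joining.
JOINING = {
    "\u0627": "R",  # alef
    "\u062F": "R",  # dal
    "\u0628": "D",  # beh
    "\u062A": "D",  # teh
    "\u0643": "D",  # kaf
    "\u0644": "D",  # lam
    "\u0645": "D",  # meem
}


def positional_features(text):
    """Return one feature tag (or None) per character, in logical order."""
    types = [JOINING.get(ch) for ch in text]
    forms = []
    for i, t in enumerate(types):
        if t is None:
            forms.append(None)  # non-joining character: no positional feature
            continue
        # Joins backwards only if the previous letter can join forwards.
        joins_prev = i > 0 and types[i - 1] == "D"
        # Joins forwards only if this letter is dual-joining and a joining
        # letter follows.
        joins_next = t == "D" and i + 1 < len(types) and types[i + 1] is not None
        if joins_prev and joins_next:
            forms.append("medi")
        elif joins_prev:
            forms.append("fina")
        elif joins_next:
            forms.append("init")
        else:
            forms.append("isol")
    return forms
```

For the word kaf-teh-alef-beh ("\u0643\u062A\u0627\u0628") this gives
init, medi, fina, isol: the alef cannot join forwards, so the final beh
stands alone. The same letter needs a different feature depending on its
neighbours, which is why a single global feature list cannot express it.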

If, at some point in the future, we do get support for 'script=tibt', 
then it won't be necessary to list the individual features; simply 
calling for that shaping engine will result in the right features being 
applied. 
Additional features (as for Latin) would only be specified for optional 
typographic refinements, not for basic script behavior.

Hope this is helpful,

JK


