[XeTeX] Looking for a better font selection method

Jonathan Kew jonathan_kew at sil.org
Fri May 19 11:25:08 CEST 2006

On 19 May 2006, at 9:59 am, Pektiong Tân wrote:

>> Some kind of "multiple current fonts" scheme, associated with
>> different scripts, is a request that comes up from time to time, and
>> could make things much simpler. It's not as straightforward as it
>> sounds, though, because not all characters in Unicode are
>> unambiguously identified with a single script; punctuation, in
>> particular, is generally shared across scripts. Getting it "right" in
>> all the edge cases is tricky.
> That is why I suggest Unicode blocks or Unicode code point ranges,
> which are different from the concept of the "script".

Blocks or ranges are not that simple either, as real text will often
need to use characters from a variety of blocks. For example, the
Indic "full stop" character is encoded at U+0964, in the Devanagari
block; but this character is also used in languages such as Bengali.
So in a Bengali text, U+0964 should be rendered from the Bengali font;
but in a Hindi text, from the Devanagari font. And in a bilingual
text, the appropriate font for the punctuation character will depend
on the script of the preceding word.

Simply associating a font with a codepoint (or range or block),  
although it could help in many cases, is not a sufficiently powerful  
model for the general problem.
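
To make the U+0964 case concrete, here is a rough sketch in Python (not
XeTeX code, and not something XeTeX implements). It compares a pure
block lookup with a lookup that lets shared punctuation inherit the
font of the preceding letter. The font names, the two-entry block
table, and the SHARED set are assumptions made up for the illustration;
a real solution would need to cover far more characters and edge cases.

```python
import unicodedata

# Hypothetical font names and a deliberately tiny block table.
FONT_BY_BLOCK = [
    (0x0900, 0x097F, "DevanagariFont"),
    (0x0980, 0x09FF, "BengaliFont"),
]
DEFAULT_FONT = "LatinFont"

# Assumption for this sketch: only danda and double danda are treated
# as shared between scripts, although both sit in the Devanagari block.
SHARED = {0x0964, 0x0965}


def font_by_block(ch):
    """Pure range lookup: the font depends on the code point alone."""
    cp = ord(ch)
    for lo, hi, font in FONT_BY_BLOCK:
        if lo <= cp <= hi:
            return font
    return DEFAULT_FONT


def fonts_by_context(text):
    """Range lookup, except that shared characters take the font of
    the nearest preceding letter or combining mark."""
    fonts = []
    last_letter_font = DEFAULT_FONT
    for ch in text:
        if ord(ch) in SHARED:
            fonts.append(last_letter_font)
            continue
        font = font_by_block(ch)
        # Only letters (L*) and marks (M*) update the running context;
        # spaces and other punctuation leave it unchanged.
        if unicodedata.category(ch).startswith(("L", "M")):
            last_letter_font = font
        fonts.append(font)
    return fonts


bengali = "\u09ac\u09be\u0982\u09b2\u09be\u0964"      # Bengali word + U+0964
hindi = "\u0939\u093f\u0928\u094d\u0926\u0940\u0964"  # Hindi word + U+0964

print(font_by_block("\u0964"))        # block lookup ignores the context
print(fonts_by_context(bengali)[-1])  # font chosen for the final U+0964
print(fonts_by_context(hindi)[-1])
```

The block lookup always sends U+0964 to the Devanagari font, whereas
the context-aware version follows the preceding word. Even this only
handles one shared character in the simplest situation, which is
roughly why a plain codepoint-to-font table falls short.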

> Well, I am not an expert and I might be wrong. Assigning different
> fonts to different code points might still have problems which I
> can't foresee.
> On the other hand, I doubt there will be a single font which provides
> enough OT support for all scripts in Unicode.

Definitely not; but at present, the answer to this is that the source
text (or whatever macros etc. handle the formatting) is responsible
for selecting appropriate fonts for the various runs of text. (I
realize that in highly-mixed-script documents, this is a nuisance, and
it would be nice to provide something more automatic.)

Note that in general, mixed-script/mixed-language documents require  
some kind of markup indicating language changes anyway, if things  
like hyphenation are to work correctly.

Anyhow, for now... I understand the desire, and you're not alone in  
wishing for something like this!

