[XeTeX] Syriac abbreviations, and issues with polyglossia, fontspec and bidi

Ross Moore ross at ics.mq.edu.au
Tue Dec 22 20:01:39 CET 2009

Hi Jonathan,

Thanks for the clarification.
It helps a lot in understanding what this is about.

On 23/12/2009, at 5:16 AM, Jonathan Kew wrote:

> I don't have a handy way to produce the correct rendering, but I  
> can describe it: there should be a line over the top of the word,  
> from the place where the Syriac Abbreviation Mark (SAM) occurs  
> until the end of the word.

Does the line extend to the right- or the left-hand edge of the word,
given that the individual characters run right-to-left?

> In other words, SAM shouldn't be rendered as an individual glyph at  
> all; it's more like markup that applies to the following string of  
> characters.

> Implementing this requires a pretty advanced font system, or else  
> special-case code in the rendering engine. I guess Uniscribe  
> provides that. XeTeX doesn't, at least currently. So what you're  
> seeing in the examples posted here is simply the "placeholder"  
> glyph for SAM, which should disappear and be replaced by the  
> overline on the following letters.

OK; this seems similar in principle to how TeX uses ^ or _
in math-mode to alter the way the following character
(or character group) is presented.

So the SAM character could be made active ...

> One approach would be to implement this in AAT or OT fonts by using  
> glyph substitution: the SAM glyph would be deleted, but would  
> trigger a contextual replacement of the following glyphs with  
> overlined versions. However, I haven't seen a font that actually  
> does this; the assumption seems to be that text engines will handle  
> this with special-case code.
>
> An issue with supporting SAM is that (last time I checked) it's  
> defined as applying "until the end of the word", but this begs the  
> question of how exactly one defines the "end of a word" in Syriac  
> script. Obviously spaces, etc., would act as terminators; but there  
> are likely to be some edge cases (what about invisible control  
> characters such as join controls? directional controls?  
> discretionary breaks? arbitrary diacritics? etc) that I have not  
> seen clearly defined anywhere.

  ... invoking code to determine the end of the word, subject to the
caveats you describe here. The macro would then construct the line
(with dots), set it in a box over the remaining letters, and place
that box so as to finish the word.

But there would remain issues of how this box joins or overlaps the
earlier part of the word, and of what happens at linebreaks or with
hyphenation. (And the discretionaries and diacritics, as you say.)

Looks like a job for macros, perhaps with some internal coding
to help determine the word ending.
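
As a rough illustration of the active-character idea, here is a minimal
XeLaTeX sketch. It rests on several assumptions: the names \samoverline
and \sambox are hypothetical, only a space (or a line end in the source)
counts as the end of the word, the line is a plain rule rather than a
dotted one, and the overlined part is set in an unbreakable box. Nothing
is done about text direction, diacritics, join controls or
discretionaries, and Latin letters stand in for the Syriac text.

```latex
% Compile with xelatex; the ^^^^070f notation and \catcode"070F need XeTeX.
\documentclass{article}

% Hypothetical helper: set #1 in a box and draw a rule across its top.
\newsavebox\sambox
\newcommand\samoverline[1]{%
  \leavevmode                 % start a paragraph if we are not in one
  \sbox\sambox{#1}%           % measure the rest of the word
  \vbox{\hrule height 0.4pt   % rule as wide as the boxed text
        \kern 1pt             % small gap between rule and letters
        \box\sambox}%         % the letters themselves
}

% Make U+070F (Syriac Abbreviation Mark) an active character.
\catcode"070F=\active
% The mark grabs everything up to the next space as its argument,
% overlines it, and puts the space back. The mark itself prints nothing.
\def^^^^070f#1 {\samoverline{#1} }

\begin{document}
Test ^^^^070fabc def.   % "abc" is overlined; "Test" and "def." are not
\end{document}
```

Because the argument is delimited by a space, a marked word that ends a
paragraph with no following space raises a runaway-argument error; a
real implementation would need the more careful end-of-word detection
discussed above.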

> JK



Ross Moore                                       ross at maths.mq.edu.au
Mathematics Department                           office: E7A-419
Macquarie University                             tel: +61 (0)2 9850 8955
Sydney, Australia  2109                          fax: +61 (0)2 9850 8114
