[XeTeX] Syriac abbreviations, and issues with polyglossia, fontspec and bidi
garzohugo at gmail.com
Wed Dec 23 17:47:24 CET 2009
As Jonathan said, SAM should produce an overline stretching over the
last few letters of a word (to the left edge of the word), and
hyphenation isn't an issue. SAM is supposed to be entered where the
overline begins (its right-hand edge), which may well be mid-word.
Sometimes the overline is broken with a dot at each end and one in the
middle, but that's a matter of style. Numerals, which are traditionally
written with letters, often get this mark too (in which case the
overline extends over the entire numeral), to stop people trying to
read them as words.
Fr. Michael Gilmary wrote:
> FWIW --- although I'm no Syriac scholar --- I tried your sample text
> with the SAM (as you say) in Mellel ... and all the fonts come out
> the same: no extension of the abbreviation marker.
It's interesting to know that Mellel can't handle SAM either. Thanks for
Jonathan Kew wrote:
> One approach would be to implement this in AAT or OT fonts by using
> glyph substitution: the SAM glyph would be deleted, but would trigger
> a contextual replacement of the following glyphs with overlined
> versions. However, I haven't seen a font that actually does this; the
> assumption seems to be that text engines will handle this with
> special-case code.
> An issue with supporting SAM is that (last time I checked) it's
> defined as applying "until the end of the word", but this begs the
> question of how exactly one defines the "end of a word" in Syriac
> script. Obviously spaces, etc., would act as terminators; but there
> are likely to be some edge cases (what about invisible control
> characters such as join controls? directional controls? discretionary
> breaks? arbitrary diacritics? etc) that I have not seen clearly
> defined anywhere.
The font Serto Jerusalem does have a combining overline that is used for
this purpose. The overline always ends at a space or punctuation; I
don't think other cases apply. I would imagine one could make the SAM
character active in XeTeX and set it to overline the following text
until it meets a space or punctuation.
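
To make the rule concrete, here is a rough sketch in Python (not XeTeX
macro code) of the span I have in mind: everything after SAM (U+070F)
up to the next space or punctuation. The names sam_spans and
is_terminator are my own, and treating every Unicode whitespace or
punctuation character as a terminator is an assumption; it deliberately
ignores the edge cases Jonathan lists (join controls, directional
controls and so on).

```python
import unicodedata

SAM = "\u070F"  # SYRIAC ABBREVIATION MARK


def is_terminator(ch):
    """Assumption: whitespace or any Unicode punctuation ends the overline."""
    return ch.isspace() or unicodedata.category(ch).startswith("P")


def sam_spans(text):
    """Return (start, end) index pairs for the text each SAM should overline.

    The SAM itself is excluded; `end` is exclusive and stops at the first
    terminator or at the end of the string.
    """
    spans = []
    i = 0
    while i < len(text):
        if text[i] == SAM:
            start = i + 1
            end = start
            # Extend the span until a space or punctuation mark is met.
            while end < len(text) and not is_terminator(text[end]):
                end += 1
            spans.append((start, end))
            i = end
        else:
            i += 1
    return spans


# One letter, SAM, three letters, a space, one more letter.
sample = "\u0712" + SAM + "\u072A\u071D\u071F" + " " + "\u0710"
print(sam_spans(sample))
```

A macro-level solution would still have to turn each span into an
overline without breaking the joining of the letters inside it.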
Jonathan Kew wrote:
> This might be possible, though I'm not sure whether the SAM
> necessarily occurs at a location where the cursive joining of the
> Syriac letters is interrupted; if not, there'd be a problem (in
> xetex, at least) of how to get correct shaping/joining across the box
> edge (and other commands, etc), as OpenType shaping only applies
> within a contiguous sequence of characters.
> But I don't know enough about Syriac to really judge whether this is
> feasible purely at the macro level.
Ah, yes. My example put SAM at a required cursive break, but it could
easily occur between two characters that should join. The problem is
that there is plenty of leeway in writing abbreviations in Syriac: I
could always try to start the overline at a cursive break, but there
won't always be one, and it might look a little odd.
Department of Eastern Christianity
+44 (0)1865 615331