[XeTeX] Discretionary line-breaks in Tamil

Zdenek Wagner zdenek.wagner at gmail.com
Sun Sep 29 11:02:25 CEST 2019


Hi

On Sun, 29 Sep 2019 at 7:29, Suki Venkat <suki.venkat at gmail.com> wrote:
>
> Hi,
>
> There is a Unicode character, U+200B (zero-width space), that is used for discretionary line breaks (DLBs), much in the spirit of discretionary hyphens. In Tamil, words broken at the end of a line are not hyphenated (as it is an agglutinative language, not an isolating language like English).
> Some editors like Emacs and InDesign do not allow the cursor to move freely between characters, so I worked out a solution by putting these DLBs after every half consonant, which seems to be a nice solution to the hyphenation problem as well (but this may not be sufficient).
>
> Wondering if XeTeX cares about DLBs (they are useful for breaking long URLs and the like).
>
TeX has a \discretionary primitive with three parameters: pre-break,
post-break, and no-break. They can contain only material of fixed
width (characters, boxes, rules, kerns); they cannot contain
variable-width material such as glue, i.e. a stretchable space. For
instance, the German word Zucker is hyphenated in the traditional
orthography as Zuk-ker, which can be encoded in TeX as
Zu\discretionary{k-}{k}{ck}er. If I understand correctly, what you
need is just \discretionary{}{}{}. You can pick any character, make
it active, and define it to expand to that discretionary. A better
solution would be to set the \hyphenchar of the font to an invisible
character of zero width. I am not sure whether the zero-width joiner
or zero-width non-joiner can be used, because they have a special
meaning in the interpretation of Indic scripts.
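
A minimal sketch of the active-character approach for plain XeTeX,
assuming that the break marker in the input is U+200B as in the
question. The narrow \hsize and the test words are only there for
demonstration; a real Tamil document would also need a suitable font
to be loaded, which is not shown here.

```tex
% Plain XeTeX sketch; compile with: xetex example.tex
% Assumption: the input marks break points with U+200B (ZERO WIDTH SPACE).

% Make U+200B an active character.
\catcode"200B=\active

% Define the active U+200B as an empty discretionary. The \lccode/\lowercase
% idiom turns the active ~ into an active U+200B, so that no invisible
% character has to be typed in the definition itself.
\begingroup
  \lccode`\~="200B
  \lowercase{\endgroup\def~}{\discretionary{}{}{}}

% Demonstration with a narrow column; ^^^^200b is the input notation
% for U+200B.
\hsize=3cm
\noindent aaaaaaaaaaaa^^^^200bxxxxxxxxxxxx^^^^200byyyyyyyyyyyy
\bye
```

Because the pre-break text is empty, TeX charges \exhyphenpenalty
rather than \hyphenpenalty for a break at such a point, so that is the
parameter to adjust if the breaks should be made more or less likely.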

> Suki


Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz


