[tex-hyphen] hyphenation for Hebrew

Yannis Haralambous yannis1962 at gmail.com
Sun Feb 28 11:14:18 CET 2021


Dear Jonathan,

In 1995 I organized a conference at the Technion in Haifa, where Uzzi Ornan, a famous linguist
and member of the Hebrew Academy (he is still alive and well, at the age of 97!),
gave a presentation on the hyphenation of Hebrew.

When I saw your message I uploaded the video of the presentation to YouTube:

https://youtu.be/VmEHaS_f_eE

Basically, the problem with Hebrew is that sometimes it is used in abjad mode and sometimes
in phonographic mode. In abjad mode, the eye has to see the whole word to analyze it
morphologically, and breaking it can hinder reading a lot. In phonographic mode (for
example, for foreign words), hyphenation is possible without slowing down reading too much.

The problem, of course, is that hyphenation patterns apply to every word indiscriminately, and
it is hard to distinguish abjad from phonographic words. It is not impossible, but it is difficult
and requires a large hyphenated corpus.

Are you ready to do some work? For example, if I supply you with the list of Hebrew Wiktionary
entries (there are about 22 thousand of them), are you willing to mark those that are abjad with an
asterisk and give the hyphenation for those that are phonographic?

Examples:

פּוֹלִיפוֹנִיָה   is phonographic, it is a Greek word and it can be hyphenated as     פּוֹ-לִי-פוֹ-נִיָה
סֵפֶר           is abjad, it should not be hyphenated
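
To make the proposed annotation concrete, here is a minimal Python sketch of how such a
marked-up list might be read back later. The file format (one entry per line, a trailing
asterisk for abjad words, ASCII hyphens at the allowed break points) and the names
parse_entry and hyphenate are assumptions for illustration only, not an agreed convention.

```python
ABJAD_MARK = "*"  # assumed marker: entry must not be hyphenated
BREAK_MARK = "-"  # assumed marker: ASCII hyphen at each allowed break


def parse_entry(line):
    """Split one annotated line into (word, break_positions).

    break_positions are code-point offsets into the bare word, in logical
    (storage) order; the list is empty for abjad entries.
    """
    entry = line.strip()
    if entry.endswith(ABJAD_MARK):
        return entry.rstrip(ABJAD_MARK).strip(), []
    parts = entry.split(BREAK_MARK)
    positions, offset = [], 0
    for part in parts[:-1]:
        offset += len(part)
        positions.append(offset)
    return "".join(parts), positions


def hyphenate(word, positions, mark=BREAK_MARK):
    """Reinsert `mark` at the given offsets (inverse of parse_entry)."""
    pieces, start = [], 0
    for pos in positions:
        pieces.append(word[start:pos])
        start = pos
    pieces.append(word[start:])
    return mark.join(pieces)


if __name__ == "__main__":
    # The two examples from this message, annotated as proposed.
    for line in ["פּוֹ-לִי-פוֹ-נִיָה", "סֵפֶר *"]:
        word, breaks = parse_entry(line)
        print(word, breaks)
```

Note that in this sketch an entry with neither an asterisk nor a hyphen parses exactly like
an abjad entry, so a real tool would probably want to flag such lines as not yet annotated.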

Let me know what you are planning or willing to do.

Cheers

Yannis

> On 28 Feb 2021, at 10:52, Yonatan Zilpa <yz11235 at gmail.com> wrote:
> 
> Dear Arthur,
> Thanks a lot for your willingness to help. Hebrew is an RTL (right-to-left) language, so the default English hyphenation patterns do not work for Hebrew. First, I would like to adjust the lines in such a way that long words are split between lines automatically. Second, I would like to write a hyphenation pattern file for the Hebrew language to hyphenate well-known hyphenated words.
> 
> Kind regards,
>            Jonathan Zilpa
> 
> On Saturday, 27 Feb 2021 at 23:09, Arthur Rosendahl <arthur.reutenauer at normalesup.org> wrote:
>         Dear Jonatan Zilpa,
> 
> On Sun, Jan 31, 2021 at 07:08:48PM +0200, Yonatan Zilpa wrote:
> > Dear Mojca Miklavec / Arthur Reutenauer,
> > I would like to write a hyphenation pattern for Hebrew for XeLaTeX /
> > Polyglossia.
> > Could you please help me by providing guidance on how to do this?
> 
>   We can help you with that, but note that Mojca and I are only
> responsible for distributing the patterns; we may not be the most
> knowledgeable about hyphenation in any particular language.  I do,
> however, know that Hebrew is not normally hyphenated, so can you
> give a few more details on what you’d like to do?
> 
>         Best,
> 
>                 Arthur Rosendahl (né Reutenauer)

Yannis HARALAMBOUS
Professor
Computer Science Department
UMR CNRS 6285 Lab-STICC
Technopôle Brest-Iroise CS 83818
29238 Brest Cedex 3, France
http://perso.telecom-bretagne.eu/yannisharalambous/
A school of IMT <http://www.imt.fr/>
We're going always. — 'We're going always.' — Totally. —
That's not actually a sentence. — Well it's got a verb in it.     (Doctor Who)


