[tex-hyphen] weighting hyphenation points (was: hyphenation (what else ; -))

Stephan Hennig mailing_list at arcor.de
Mon May 17 13:01:00 CEST 2010

On 17.05.2010 00:55, Mojca Miklavec wrote:

>>  From a readability point of view 'lava-bo' is better for me since one can
>> guess the rest of the word (whereas you can't guess the rest of la-)
> <not-to-be-taken-seriously>
> Oh, and yes ... I was already wondering when somebody will come up
> with the idea to extend TeX with tolerances for preferable breaking
> points in addition to the allowed ones :) :) :)
> </not-to-be-taken-seriously>

Incidentally, I had an e-mail conversation about this with Taco and 
Werner a couple of weeks ago.  The good news is that, I think, Taco has 
this on his list.  Here's a sketch of the approach as I understand it 
(ignoring libhnj for now).

Hyphenation points can be weighted by applying multiple pattern sets in 
parallel, each with its own weight attached.  That is, if a match 
exists in, e.g., a compound word pattern set, then that hyphenation 
point will be weighted higher than a regular hyphenation point.  If 
several pattern sets find a match at the same position, the highest 
weight wins.

Consider these pattern sets

   * regular pattern set with an attached weight of 10:

       n1n a1d

   * compound word pattern set with an attached weight of 20:


and the compound word "Tannennadel" (fir needle).  The regular pattern 
set has matches

   Tan-nen-na-del

weighting each hyphenation point equally (10 in this example).  The 
compound word patterns find the match

   Tannen-nadel

weighting that match 20.  Finally, during paragraph breaking, 
hyphenation weights will be

      10  20 10

Therefore, breaking the word at the compound boundary Tannen-nadel will 
be (slightly) preferred.
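
The weighting scheme can be tried out with a minimal Python sketch.  It 
is only an illustration under stated assumptions, not the planned 
implementation: patterns are simplified to plain substrings with a 
single "1" marking the break (no inhibiting digits, no word-boundary 
dots as in full Liang patterns), the function names are made up, and 
the compound pattern "nen1nadel" is hypothetical.

```python
def parse_pattern(pattern):
    """Split a simplified pattern like 'n1n' into (letters, break offset).

    Assumption: exactly one digit '1' per pattern; full Liang patterns
    with competing odd/even digits are not handled here.
    """
    offset = pattern.index("1")
    letters = pattern.replace("1", "")
    return letters, offset


def weighted_points(word, pattern_sets):
    """Return {position: weight}; position i means a break before word[i]."""
    word = word.lower()
    points = {}
    for patterns, weight in pattern_sets:
        for pattern in patterns:
            letters, offset = parse_pattern(pattern)
            start = word.find(letters)
            while start != -1:
                pos = start + offset
                # Several pattern sets may hit the same position:
                # the highest weight wins.
                points[pos] = max(points.get(pos, 0), weight)
                start = word.find(letters, start + 1)
    return points


def show(word, points):
    """Render the word with each break marked as -[weight]."""
    return "".join(
        (f"-[{points[i]}]" if i in points else "") + ch
        for i, ch in enumerate(word)
    )


# Regular pattern set from the example above, weight 10.
REGULAR = (["n1n", "a1d"], 10)
# Hypothetical compound word pattern, weight 20 (an assumption made
# for this sketch only).
COMPOUND = (["nen1nadel"], 20)

points = weighted_points("Tannennadel", [REGULAR, COMPOUND])
print(show("Tannennadel", points))
```

The position before the second "n" of "nn" in "Tannen-nadel" is matched 
by both sets, so it keeps the higher weight of 20, while the other two 
points stay at 10.  How the paragraph breaker would translate such 
weights into penalties is left open here.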

Best regards,
Stephan Hennig
