[tex-hyphen] weighting hyphenation points (was: hyphenation (what else ; -))
Stephan Hennig
mailing_list at arcor.de
Mon May 17 13:01:00 CEST 2010
On 17.05.2010 00:55, Mojca Miklavec wrote:
>> From a readability point of view 'lava-bo' is better for me since one can
>> guess the rest of the word (whereas you can't guess the rest of la-)
>
> <not-to-be-taken-seriously>
> Oh, and yes ... I was already wondering when somebody will come up
> with the idea to extend TeX with tolerances for preferable breaking
> points in addition to the allowed ones :) :) :)
> </not-to-be-taken-seriously>
Incidentally, I had an e-mail conversation about this with Taco and
Werner a couple of weeks ago. The good news is that, I think, Taco has
this on his list. Here's a sketch of the approach as I understand it
(ignoring libhnj for now).
Hyphenation points can be weighted by applying several pattern sets in
parallel, each with its own weight attached. That is, if a match
exists in, e.g., a compound word pattern set, then that hyphenation
point is weighted higher than a regular hyphenation point. If
competing pattern sets match at the same position, the highest weight
wins.
Consider these pattern sets
* regular pattern set with an attached weight of 10:
n1n a1d
* compound word pattern set with an attached weight of 20:
en1nad
and the compound word "Tannennadel" (fir needle). The regular pattern
set has matches
Tan-nen-na-del
weighting each hyphenation point equally (10 or whatever). Compound
word patterns find the match
Tannen-nadel
weighting that match 20. Finally, during paragraph breaking,
hyphenation weights will be
Tan-nen-na-del
  10  20 10
Therefore, breaking the word at the compound boundary Tannen-nadel will
be (slightly) preferred.
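To make the example concrete, here is a rough Python sketch of the
"highest weight wins" rule. It is only an illustration, not how TeX,
LuaTeX or libhnj implement anything: the function names and the
(weight, patterns) layout are made up, and the matcher is simplified so
that any digit in a pattern marks an allowed break (Liang's odd/even
priority levels are ignored).

```python
def parse_pattern(pattern):
    """Split a pattern such as 'en1nad' into its letters and break offsets.

    Simplification (assumption): any digit marks an allowed break;
    Liang's odd/even priority levels are ignored.
    """
    letters = []
    breaks = []
    for ch in pattern:
        if ch.isdigit():
            breaks.append(len(letters))
        else:
            letters.append(ch)
    return "".join(letters), breaks


def weighted_breaks(word, pattern_sets):
    """Map each break position in `word` to the highest matching weight.

    `pattern_sets` is a list of (weight, patterns) pairs; this layout is
    hypothetical and chosen only for the example.
    """
    lower = word.lower()
    weights = {}
    for weight, patterns in pattern_sets:
        for pattern in patterns:
            letters, offsets = parse_pattern(pattern)
            start = lower.find(letters)
            while start != -1:
                for offset in offsets:
                    pos = start + offset
                    # Ignore breaks at the very start or end of the word.
                    if 0 < pos < len(word):
                        weights[pos] = max(weights.get(pos, 0), weight)
                start = lower.find(letters, start + 1)
    return weights


pattern_sets = [
    (10, ["n1n", "a1d"]),  # regular patterns
    (20, ["en1nad"]),      # compound word patterns
]
result = weighted_breaks("Tannennadel", pattern_sets)
for pos in sorted(result):
    print(pos, result[pos])
```

The positions count letters before the break, so 3, 6 and 8 correspond
to Tan-, Tannen- and Tannenna-. Position 6 is matched by both sets and
keeps the larger weight of 20.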
Best regards,
Stephan Hennig