[luatex] No hyphenation when \lccodes zero with explicit hyphens

Hans Hagen pragma at wxs.nl
Sun May 15 13:20:01 CEST 2016


On 5/14/2016 6:07 PM, Ulrike Fischer wrote:
> Am Sat, 14 May 2016 16:35:32 +0200 schrieb Hans Hagen:
>
>>>
>>> Your version is too old. But I still see the issue in the newest
>>> luatex from TL2016 pretest. So I would say it has not been fixed.
>>
>> luatex has \hjcode for this (it initializes from lccodes but from then
>> on works with hjcodes thereby untangling character casing from hyphenation)
>
> Changing the \hjcode enables the hyphenation (like changing the
> \lccode) but this doesn't answer the question if it is deliberate
> that there is no global hyphenation point after the hyphen.
>
> Do I have to set all \hjcodes to be able to get a break after a
> hyphen?

There are several things happening:

(1) The hyphen char is only looked at when there is a valid character
(in terms of subtype, language and hjcode) to its left. This follows
from how luatex splits the work into stages and carries language and
hyphenation properties with the glyphs (language-specific hyphens
etc.). This is how luatex works.

(2) The word start is determined by checks against node types.

(3) The same is true for word ends.

So, this is why

     \parindent0pt \hsize=1.1cm
     % digits still have hjcode 0 here (initialized from their lccode)
     12-34-56 \par
     12-34-\hbox{56} \par
     12-34-\vrule width 1em height 1.5ex \par
     12-\hbox{34}-56 \par
     12-\vrule width 1em height 1.5ex-56 \par
     % from here on the digits 1-4 have a nonzero hjcode
     \hjcode`\1=`\1 \hjcode`\2=`\2 \hjcode`\3=`\3 \hjcode`\4=`\4
     \vskip.5cm
     12-34-56 \par
     12-34-\hbox{56} \par
     12-34-\vrule width 1em height 1.5ex \par
     12-\hbox{34}-56 \par
     12-\vrule width 1em height 1.5ex-56 \par

gives different results than in pdftex.

Now, (1) is what luatex does, so that's unlikely to change (there are 
some more subtle differences anyway and we don't aim at complete 
compatibility).
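
As a minimal sketch of what (1) means in practice, assuming plain
luatex: to get a break after a hyphen that follows any digit, every
digit needs a nonzero hjcode, not only the four set in the test above.
The counter name \hjdigit is made up for this illustration.

```tex
% \hjdigit is a hypothetical scratch counter, not part of luatex
\newcount\hjdigit
\hjdigit=`\0
\loop
  \hjcode\hjdigit=\hjdigit  % let this digit map to itself, as in the test
  \ifnum\hjdigit<`\9
    \advance\hjdigit by 1
\repeat
```

With this in place, input like 78-90 should behave the same way as
12-34 does in the second half of the test. This has not been checked
against a specific luatex version.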

Conditions (2) and (3) are currently a bit strict and can be relaxed so
that e.g. boxes and rules also count as word boundaries. I played with
it and the easiest solution is just an option: by default be more like
pdftex in that respect, but provide a flag to be strict (1=start of
word, 2=end of word, 3=both). This seems to work ok (at least at my
end, but I didn't check extensively).

But as we are in the code freeze stage, this will not happen in 0.95 (I
need to think of a proper name for the variable anyway).

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
       tel: 038 477 53 69 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------

