[XeTeX] \XeTeXdashbreakstate=1

Zdenek Wagner zdenek.wagner at gmail.com
Wed Apr 11 01:04:48 CEST 2012


2012/4/11 Karl Berry <karl at freefriends.org>:
> Barring major objections arising, I plan to have TL set
> \XeTeXdashbreakstate=1 in xe(la)tex.ini this year.  For those who don't
> know about this obscure parameter -- it allows line breaks after
> em/en-dashes.
>
> Pro: this has always been the behavior of traditional TeX.  It is also
> the behavior of LuaTeX.  So it is more compatible for XeTeX to operate
> in the same way.
>
> Con: it is not the way XeTeX has operated to date.  So existing XeTeX
> documents may see their line breaks change out from under them.  Of
> course, they can set \XeTeXdashbreakstate=0 to restore previous behavior.
>
This may be a typographical problem: such line breaks are undesirable in
Czech and Slovak typography. On the other hand, \XeTeXdashbreakstate=0
could be added to polyglossia for those languages, so that problems
should not occur.
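
Until a language package handles this, a document can restore the old
behavior itself. The following is only a minimal sketch: it assumes an
e-TeX-capable engine (so that \ifdefined exists) and sets the parameter
at document level. It is not the polyglossia change itself.

```tex
% Sketch only: keep the earlier XeTeX behavior (no break after
% en/em-dashes), whatever the format file sets.
% The guard skips the assignment under engines that do not define
% \XeTeXdashbreakstate, so the same source can be shared.
\ifdefined\XeTeXdashbreakstate
  \XeTeXdashbreakstate=0
\fi
```

The assignment is an ordinary local integer assignment, so it is best
placed in the preamble, outside any group.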

> Jonathan did not know of any specific reason why he had not always set
> it, from the start.
>
> Below is mail from Jonathan and Khaled with technical details.
>
> Best,
> Karl
>
>
> Date: Sun, 08 Apr 2012 12:27:05 +0100
> From: Jonathan Kew <jfkthame at googlemail.com>
> To: Karl Berry <karl at freefriends.org>
> CC: khaledhosny at eglug.org
> Subject: Re: losing breakpoint after em-dash
>
> On 8/4/12 01:39, Karl Berry wrote:
>> [...] I discovered a strange
>> line-breaking discrepancy between XeTeX and LuaTeX: XeTeX disallowed a
>> line break after an em-dash, while LuaTeX allowed it.  Traditional
>> behavior has always had a breakpoint there (a \discretionary).
>>
>> Khaled kindly looked into it for me, and observed that XeTeX does not
>> reconstitute the hyphenation point after the dash.  See his email below,
>> plus input and log files.
>>
>> [...] figured you could shed
>> light on whether this behavior was intentional for some reason, or if
>> it's "just" a bug.  On the face of it, the difference in behavior (and
>> break with the past, no pun intended :) seems undesirable.
>>
>> Thanks,
>> Karl
>>
>
> If you want automatic insertion of a discretionary break after en- and
> em-dash (like explicit hyphen), set \XeTeXdashbreakstate=1.
>
> (The traditional behavior arises because of the use of a ligature of
> hyphens to create the dashes, so the "post-dash" break is really a
> post-hyphen break. When using the literal Unicode dash characters, this
> no longer happens implicitly as a side-effect of the representation of
> the dash, so \XeTeXdashbreakstate lets you extend the hyphen behavior
> explicitly to the dashes.)
>
> JK
>
>
> Date: Sat, 31 Mar 2012 10:17:52 +0200
> From: Khaled Hosny <khaledhosny at eglug.org>
> To: Karl Berry <karl at freefriends.org>
> Subject: Re: --- allowed line break per engine
>
> [...]
> the difference seems to be that, for OpenType processing, XeTeX converts
> each word into a special whatsit node (called native word) that is then
> processed by the layout engine and passed back to TeX, and here the
> em-dash is considered part of the word, so "variants---regular" is a
> single native word node, and it seems XeTeX does not make the dash a
> hyphenation point (XeTeX takes care of inserting hyphenation points
> inside native word nodes).
>
> None of this happens with LuaTeX as the OpenType processing is all done
> directly on TeX nodes by lua code.
>
> Regards,
>  Khaled
>
> -----------------------------------------------------------------------------
> \input ifluatex.sty
> \input ifxetex.sty
> \ifluatex
>  \input luaotfload.sty
>  \font\lmr="Latin Modern Roman:+tlig" at 10pt
> \else
>  \font\lmr="Latin Modern Roman:mapping=tex-text" at 10pt
> \fi
> \output{\shipout\box255}
> \hsize = 12cm
> \tracingall
>
> The basic text family is LucidaBrightOT, with the usual four
> variants---regular, italic, bold, and bold italic; small
> \end
> -----------------------------------------------------------------------------
> This is XeTeX, Version 3.1415926-2.3-0.9997.6 (TeX Live 2012/dev) (format=xetex 2012.3.20)  31 MAR 2012 10:10
> entering extended mode
>  restricted \write18 enabled.
>  %&-line parsing enabled.
> **hh
> (./hh.tex
> (/media/sda8/tex/texlive/2011/texmf-dist/tex/generic/oberdiek/ifluatex.sty
> Package: ifluatex 2010/03/01 v1.3 Provides the ifluatex switch (HO)
> Package ifluatex Info: LuaTeX not detected.
> )
> (/media/sda8/tex/texlive/2011/texmf-dist/tex/generic/ifxetex/ifxetex.sty)
> {vertical mode: \tracingstats}
> {\tracingpages}
> {\tracingoutput}
> {\tracinglostchars}
> {\tracingmacros}
> {\tracingparagraphs}
> {\tracingrestores}
> {\showboxbreadth}
> {\showboxdepth}
> {\errorstopmode}
>
> {\tracinggroups}
> {\tracingifs}
> {\tracingscantokens}
> {\tracingnesting}
> {\tracingassigns}
> {into \tracingassigns=2}
> {\par}
> {\hsize}
> {changing \hsize=469.75499pt}
> {into \hsize=341.43306pt}
> {select font "Latin Modern Roman 10 Regular:mapping=tex-text"}
> {changing current font=\tenrm}
> {into current font=\lmr}
> {the letter T}
> {horizontal mode: the letter T}
> {blank space  }
> {the letter b}
> {blank space  }
> {the letter t}
> {blank space  }
> {the letter f}
> {blank space  }
> {the letter i}
> {blank space  }
> {the letter L}
> {blank space  }
> {the letter w}
> {blank space  }
> {the letter t}
> {blank space  }
> {the letter u}
> {blank space  }
> {the letter f}
> {blank space  }
> {the letter v}
> {blank space  }
> {the letter i}
> {blank space  }
> {the letter b}
> {blank space  }
> {the letter a}
> {blank space  }
> {the letter b}
> {blank space  }
> {the letter i}
> {blank space  }
> {the letter s}
> {blank space  }
> {\end}
> {\par}
> @firstpass
> @secondpass
> []\lmr The ba-sic text fam-ily is Lu-cidaBrightOT, with the usual four vari-ant
> s—regular,
> @ via @@0 b=* p=0 d=*
> @@1: line 1.3 t=0 -> @@0
> italic, bold, and bold italic; small
> @\par via @@1 b=0 p=-10000 d=*
> @@2: line 2.2- t=0 -> @@1
>
>
> Overfull \hbox (18.12695pt too wide) in paragraph at lines 13--15
> []\lmr The basic text family is LucidaBrightOT, with the usual four variants—
> regular,|
>
> \hbox(7.05+2.05998)x341.43306, glue set - 1.0
> .\hbox(0.0+0.0)x20.0
> .\lmr The
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr basic
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr text
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr family
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr is
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr LucidaBrightOT,
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr with
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr the
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr usual
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr four
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr variants—regular,
> .\glue(\rightskip) 0.0
> .\rule(*+*)x5.0
>
> %% goal height=643.20255, max depth=4.0
> % t=10.0 g=643.20255 b=10000 p=300 c=100000#
> {vertical mode: \end}
> % t=23.92998 g=643.20255 b=10000 p=0 c=100000#
> % t=23.92998 plus 1.0fill g=643.20255 b=0 p=-1073741824 c=-1073741824#
> {globally changing \outputpenalty=0}
> {into \outputpenalty=-1073741824}
> \output->{\shipout \box 255}
> {entering output group (level 1) at line 15}
> {internal vertical mode: \shipout}
>
> Completed box being shipped out [1]
> \vbox(643.20255+0.0)x341.43306, glue set 619.27257fill
> .\glue(\topskip) 2.95
> .\hbox(7.05+2.05998)x341.43306, glue set - 1.0
> .\hbox(0.0+0.0)x20.0
> .\lmr The
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr basic
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr text
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr family
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr is
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr LucidaBrightOT,
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr with
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr the
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr usual
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr four
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr variants—regular,
> .\glue(\rightskip) 0.0
> .\rule(*+*)x5.0
> .\penalty 300
> .\glue(\baselineskip) 3.00002
> .\hbox(6.94+1.92998)x341.43306, glue set 195.79306fil
> .\lmr italic,
> .\glue 3.33 plus 2.08124 minus 0.888
> .\lmr bold,
> .\glue 3.33 plus 2.08124 minus 0.888
> .\lmr and
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr bold
> .\glue 3.33 plus 1.665 minus 1.11
> .\lmr italic;
> .\glue 3.33 plus 2.49748 minus 0.73999
> .\lmr small
> .\penalty 10000
> .\glue(\parfillskip) 0.0 plus 1.0fil
> .\glue(\rightskip) 0.0
> .\hbox(0.0+0.0)x341.43306
> .\glue 0.0 plus 1.0fill
>
> Memory usage before: 594&10194; after: 327&10194; still untouched: 2988772
> {end-group character }}
> {leaving output group (level 1) entered at line 15}
> {vertical mode: \end}
>  )
> Here is how much of TeX's memory you used:
>  33 strings out of 495895
>  729 string characters out of 3167218
>  11228 words of memory out of 3000000
>  1503 multiletter control sequences out of 15000+200000
>  4754 words of font info for 17 fonts, out of 3000000 for 9000
>  1130 hyphenation exceptions out of 8191
>  6i,0n,4p,108b,12s stack positions out of 5000i,500n,10000p,200000b,50000s
>
> Output written on hh.pdf (1 page).
>
>
>
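
For anyone who wants to compare the two settings, here is a cut-down
variant of the quoted test file. It is a sketch for plain XeTeX only:
the LuaTeX branch is dropped, and the \tracingparagraphs and
\tracingonline lines are my additions to make the feasible breakpoints
visible. From Jonathan's description, a breakpoint after the em-dash
should become available with the value 1, but no output is included
here.

```tex
% Cut-down variant of the quoted test file, plain XeTeX only.
\font\lmr="Latin Modern Roman:mapping=tex-text" at 10pt
\hsize = 12cm
\XeTeXdashbreakstate=1  % 1: allow a break after en/em-dashes; 0: do not
\tracingparagraphs=1    % log the breakpoints considered for the paragraph
\tracingonline=1        % also show that trace on the terminal
\lmr
The basic text family is LucidaBrightOT, with the usual four
variants---regular, italic, bold, and bold italic; small
\bye
```

Running it once with 1 and once with 0 and comparing the two logs
should show whether the Overfull \hbox from the quoted log goes away.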



-- 
Zdeněk Wagner
http://hroch486.icpf.cas.cz/wagner/
http://icebearsoft.euweb.cz


