# [XeTeX] strange results of measuring boxes

Jonathan Kew jfkthame at googlemail.com
Tue Jul 28 17:45:07 CEST 2009

Hi Marcin,

You have run into one of the situations where XeTeX "cheats" slightly
in its effort to merge OpenType font layout with TeX algorithms!

Here's what is happening: in the font you're using, there is a kern
defined between the characters "GT" in the OpenType data. XeTeX
automatically recognizes and uses this. However, when there is a
discretionary break (such as \-) between the characters, this breaks
the text sequence that is presented to the OpenType layout system, and
so the kern is not recognized.

When you put the text into an \hbox to measure it, discretionaries are
discarded (because the hbox won't be subject to line-breaking), and so
the kern takes effect. But during normal paragraphing, the
discretionary node is present in the hlist that is built, and so the
kern is not seen. After line-breaking, XeTeX removes all the unused
discretionaries, and reconstructs the words across those (former)
boundaries, so that ligatures, kerns, or other OpenType effects are
applied properly.
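
To get a feel for how large the discrepancy is for a particular word, one
can compare the width \hbox reports with the width obtained when the kern
is blocked. The following is only a minimal plain XeTeX sketch: the font
file name, the word ZIG\-TAG and the use of scratch registers are
illustrative assumptions, and it relies on an empty \hbox{} preventing the
kern between the two letters. The difference will be zero if the font
defines no kern for the pair.

```tex
% Plain XeTeX sketch; font file name and test word are assumptions.
\font\test="[texgyreheros-regular.otf]" at 10pt
\setbox0=\hbox{\test ZIG\-TAG}        % width as measured in an \hbox
\setbox2=\hbox{\test ZIG\hbox{}TAG}   % empty box keeps G and T apart, so no kern
\dimen0=\wd2 \advance\dimen0 by -\wd0 % difference between the two measurements
\message{kerned: \the\wd0, unkerned: \the\wd2, difference: \the\dimen0}
\bye
```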

So line breaks are chosen on the basis of a slightly "false" set of
measurements, in the case where discretionaries occur at positions
where OpenType layout effects should also occur. Normally, this is
unimportant, as the difference in metrics is very slight and the error
is absorbed into justification/packing of the line boxes. But in your
case, where you are trying to fit a sequence of characters with no
flexibility into a precisely-fitted width, it's a problem; the line
(as measured in the paragraph's hlist) does not fit correctly into the
(true, final) width, so the potential breakpoint has infinite badness,
TeX's secondpass and emergencypass come into effect, and in the end
the hyphenated solution (with an underfull box) is chosen in
preference to the apparently-overfull line (which would actually have
been a perfect fit after removal of the unused breakpoint). :(

This issue has always been present in XeTeX, though you seem to be the
first user to run into a real-life problem as a result -- sorry. (Or
at least, the first to report it!) As a workaround, I would suggest
adding a small "fudge factor" to the measured width of the word; a
little trial and error may be needed to determine how much is needed.
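
A minimal plain XeTeX sketch of this workaround, under stated
assumptions: the font file name, the word ZIG\-TAG, the register names
\wordwidth and \fudge, and the value 0.5pt are all hypothetical and would
need adjusting for the real document.

```tex
% Plain XeTeX sketch of the "fudge factor" workaround (all names illustrative).
\font\test="[texgyreheros-regular.otf]" at 10pt
\newdimen\wordwidth
\newdimen\fudge
\fudge=0.5pt                      % assumed value; find a suitable one by trial and error
\setbox0=\hbox{\test ZIG\-TAG}    % measure the word
\wordwidth=\wd0
\advance\wordwidth by \fudge      % widen the target slightly
\vbox{\hsize=\wordwidth \noindent\test ZIG\-TAG\par}
\bye
```

The fudge presumably needs to be at least as large as the kern that is
missing while line breaks are being chosen.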

(Another "fix" would be to insert something like \hbox{} at the place
where the discretionary break occurs, to disable the kern there
regardless of whether the break is chosen or not. But of course
disabling kerns is not a nice solution, as the result will be inferior
letter spacing.)
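
A sketch of that alternative, again with a hypothetical font file name
and test word: the empty box is placed directly after the discretionary,
so the two letters are separated both when the word is measured and when
the paragraph is built.

```tex
% Plain XeTeX sketch: block the kern at the break position (names illustrative).
\font\test="[texgyreheros-regular.otf]" at 10pt
\def\testword{ZIG\-\hbox{}TAG}    % \hbox{} separates G and T in every context
\setbox0=\hbox{\test\testword}    % measured without the kern
\vbox{\hsize=\wd0 \noindent\test\testword\par}
\bye
```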

I'd like to fix this properly sometime, but it requires some care to
handle the interaction between the TeX and font layers of processing.

JK

On 24 Jul 2009, at 17:13, Marcin Woliński wrote:

> Dear XeTeX Gurus!
>
> In an application I'm trying to set the width of a ‘p’ column in a
> LaTeX tabular to the minimal width that will accommodate a certain
> word.  On the TeX level that means I'm measuring the width of an \hbox
> containing the word, then using this value to set \hsize within a \vbox
> (see the attached file).  In pdfTeX this seems to work reliably.  In
> XeTeX the word sometimes gets broken.  As shown in the example, a
> rather large value of \emergencystretch is necessary (however, the
> default value used by multicol is enough to trigger the problem).  The
> word being measured needs to contain a hyphen, either explicit or
> discretionary.  And it seems that XeTeX has to be using OpenType
> fonts, but then the problem is not specific to TeX Gyre Heros used in
> the example.
>
> Can anyone help in understanding the problem?  Questions:
>
> 1. Do you get a similar result on the test file, that is, the test
> word fits on one line in the first \vbox but gets broken in the second?
> (I'm using svn XeTeX 821).
>
> 2. Why does (Xe)TeX find a solution during @secondpass in the first
> case, but have to resort to @emergencypass in the second?  There seems
> to be no overfull in the first \vbox, although the line is reported as
> ‘tight’ in the trace (why?).  So how come the value of
> \emergencystretch influences the second pass???
>
> 3. Do you see a way of overcoming this problem without setting
> \emergencystretch to 0pt?  Narrow columns in a table are the exact
> case where \emergencystretch comes in handy.
>
> 4. Do you see a reliable method of measuring the minimal width I need?
>
> With best
> Marcin
>
> <testmeasurement.pdf><testmeasurement.log><testmeasurement.tex>