TeX line-breaking anomalies (was Re: [texhax] Isthis a bug in LaTex?)

Ian Collier imc at comlab.ox.ac.uk
Sat Sep 20 00:43:07 CEST 2003

(Warning - esoteric discussion of TeX's internals follows)

sojka at informatics.muni.cz (Petr Sojka) writes:
>On Thu, Sep 18, 2003 at 08:37:08PM +0100, Ian Collier wrote:
>| Earlier I wrote:
>| >I question (a) why b=10000 on that penultimate break, given that
>| >\rightskip has infinite stretchability, and (b) why the break has
>| >to be via @@82 rather than @@81 when on the face of it @@81 would
>| >seem to give fewer demerits.

>| Answering my own question, I suppose (b) is because once we have hit
>| b=10000 the demerits are no longer relevant.  However, (a) is still
>| a mystery.

>try to uncomment the line with \adjdemerits in plain tex file below,
>and read TeXbook exercise 14.11:

>\hsize 360pt
>%\adjdemerits=5083 %plain sets 10000 
>\font\f = cmr10 scaled 1095 \f
>\rightskip 0pt plus 1fil
>I claim that a graph G can not have a poor subgraph with only one
>or two vertices but it can have a poor subgraph
>with only three vertices. If G is a graph with exactly
>three vertices and it has a poor subgraph then I claim G is

Both the example and the exercise seem to be irrelevant to the points
above.  However, the example is curious because it illustrates the
point made at section 836 in `TeX: The Program'.  Note that \adjdemerits
shouldn't make any difference in theory, since when \rightskip has the
above value *all* lines are classed as either `decent' or `tight'.
However, what actually happens is that when \adjdemerits is small
enough TeX saves time by not computing any `tight' lines, so when
it comes to the anomalous \break\break sequence there is no `wrong'
breakpoint for TeX to select.
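
The remark that every line is `decent' or `tight' rests on a line with
infinite stretch having badness zero. A minimal plain TeX sketch of that
fact, using the \badness register after setting a box by hand; the box
contents and the 360pt width are illustrative assumptions, and explicit
box construction is not the same code path as the line breaker:

```tex
% Illustrative only: explicit boxes, not the paragraph builder.
\setbox0=\hbox to 360pt{spam\hskip 0pt plus 1fil}
\message{with fil glue: badness=\the\badness}  % infinite stretch, so 0
\setbox0=\hbox to 360pt{spam}
\message{without glue: badness=\the\badness}   % underfull, no stretch: 10000
\bye
```

The second box also draws an Underfull \hbox warning, which is expected.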

>So clearly it is a tex [line breaking algorithm's] feature, not a bug, 
>and has nothing to do with latex. 

I didn't claim that it has anything to do with LaTeX (though, as Robin
says, this all comes about because of LaTeX's naïve definition of
\raggedright), but I do still claim that it is an anomaly, if not
an outright bug.

For more weirdness, try this:

spam\break\break\break\break\hfil\break\break\break spam\par
\rightskip=0pt plus1fil
spam\break\break\break\break\hfil\break\break\break spam\par
\rightskip=0pt plus.5\hsize
spam\break\break\break\break\hfil\break\break\break spam\par

Ignoring the first \break in each case (as there isn't enough text to
fill the line) and also the last \par (since \parfillskip fills the line),
we have three \breaks before the \hfil and three after.

Case 1: the first three breaks have b=0 and the next three have b=10000
Case 2: the first three breaks have b=10000 and the next three have b=0
Case 3: the first three breaks have b=0 and the next three have b=800.
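
The figure of 800 in case 3 is at least consistent with TeX's usual
badness estimate, roughly 100 times the cube of the stretch ratio. This
is a back-of-envelope check, not an explanation of why the \hfil matters.
It assumes that the empty line between two consecutive \breaks has
natural width zero and that the only stretch available is the .5\hsize
from \rightskip; h stands for \hsize:

```latex
\[
  b \approx 100\left(\frac{\mathrm{shortfall}}{\mathrm{stretch}}\right)^{3}
    = 100\left(\frac{h - 0}{0.5\,h}\right)^{3}
    = 100 \cdot 2^{3} = 800 .
\]
```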

Something fishy is going on, and I suspect it has something to do with
glue being discarded after linebreaks.  But surely the glue should only
affect the line it's on and not the two subsequent lines.

Another curiosity: in case 2 we have a line of @firstpass output before
the @secondpass begins.  But even though the same first break is chosen
with zero badness, the second pass has d=* instead of d=0.
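
The b= and d= values quoted above come from TeX's paragraph tracing.
A minimal plain TeX sketch for reproducing them, assuming plain TeX
defaults for everything not set explicitly and taking \break as the
forced break discussed above:

```tex
% Sketch of a complete test file; run it with plain TeX.
\tracingparagraphs=1  % record the feasible breakpoints (the @ and @@ lines)
\tracingonline=1      % copy the trace to the terminal as well as the log
% Case 1: default \rightskip
spam\break\break\break\break\hfil\break\break\break spam\par
% Case 2: infinitely stretchable \rightskip
\rightskip=0pt plus1fil
spam\break\break\break\break\hfil\break\break\break spam\par
% Case 3: finitely stretchable \rightskip
\rightskip=0pt plus.5\hsize
spam\break\break\break\break\hfil\break\break\break spam\par
\bye
```

Without \tracingonline=1 the trace goes only to the log file. The exact
numbers may differ with other TeX versions or formats.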

[The above is all with TeX version 3.14159]
---- Ian Collier : imc at comlab.ox.ac.uk : WWW page below
------ http://users.comlab.ox.ac.uk/ian.collier/imc.shtml
