Not a bug in original TeX

This page lists a few reports which could plausibly be considered notable bugs in the original TeX software written and maintained by Donald Knuth, but have been deemed not something to be fixed, either by Knuth or his vetters.

Many other reports have been declined that are not listed here (Knuth's tune-up reports mention some: 2021, 2014, 2008). The original reports and answers have been edited or paraphrased for presentation here.

The page and section numbers are merely an initial hint about a relevant location; typically, more than one place in the code and/or documentation is involved. The initial letter (A, B, …) refers to the Computers & Typesetting volume.

A list of accepted bugs is also available. These are not expected to be reviewed by Knuth until the next tune-up.

For any discussion about these issues, or further reports, please use the contact information on the main TeX bugs page here.

Contents: A005: missing \null - A009: primitive operations - A213: \csname and active characters - A214: \endinput behavior - A275: catcode types - A233: \frac and dangerous bends - A271: "FFIL invalid - A308: \copyright tie - A335: \headline offset - A374: \asts behavior - A415: \ninebig delimiters

B016: first line not logged - B028: extra blank line logged - B032: use of word “procedure” - B035: newlines inconsistently written to terminal - B040,D036: can fix can fix - B133: max_param_stack comment - B214: input file name flushed - B274: line number ranges - B350: \finalhyphendemerits reference unclear - B442: \spaceskip not logged - B506: bogus dimen display/other overflow - B546: output routine braces - B550: index single-character primitives

C032: gftodvi modes - C172: epsilon not x - D040: print_scaled behavior - D326: infinite macro expansion - D350: unused variable m declared - E037: doubled and missing kern pairs - E468: alignment of \in

dvitype: unnecessary loop condition - gftodvi: unused char_italic text and macro - gftopk: unused text_file declaration (and more)


A005, et al.: missing \null

From various people: as a first example, in exercise 2.4, to get the right spacefactor, “OK,” should be “OK\null,”. The same issue occurs dozens of times throughout the books and WEB sources, after all kinds of punctuation. “\TeX.” and “MF.” are particularly prevalent.

Response from DEK: My practice has been to insert \null only when I notice something amiss in proofreading. Similarly with lots of other refinements.


A009: primitive control sequences vs. operations

From Ulrich Diez, 2022-12-09. The TeXbook says:

About 300 of TeX's control sequences are called primitive; these are the low-level atomic operations that are not decomposable into simpler functions. All other control sequences are defined, ultimately, in terms of the primitive ones.

but there are many primitive operations, such as typesetting a character, which are not primitive control sequences, so the second sentence is false.

Response: no argument, but currently the term “primitive” is rather consistently used to mean “primitive control sequence”. The repertoire of primitive actions not invoked by a control sequence is never given a label, as a group. So no simple correction is evident.

The sentence could be reworded to be pedantically correct via something like (thanks to Paul Vojta for the basic suggestion):

All other control sequences ultimately expand to a token list in which the only control sequences are the primitive ones.

or in various other ways. However, this passage comes extremely early in The TeXbook, and wording like this would only confuse new readers at that point. For example, the concept of tokens is not mentioned until 30 pages further on.

Knuth's disclaimer in the preface of “deliberate lying” would seem to apply here.


A213: \csname and active characters

From Hu Yajie, 2021-03-01: The \csname...\endcsname entry on page 213 of The TeXbook says that “only character tokens should remain” after the tokens between \csname and \endcsname are fully expanded. But the truth is that only non-active character tokens should remain, and active characters will cause errors.

Response: There seems to be no technical reason why TeX doesn't allow unexpandable active characters within \csname. That would have made sense but it's not behavior that will change now. When the input is something that “should” not be done, an error is a reasonable result.


A214: \endinput behavior

The TeXbook defines the behavior of \endinput with:

The next time TeX gets to the end of an \input line, it will stop reading from the file containing that line.

and that is exactly how it behaves. N.B. It does not say “stop reading from the file containing the \endinput” (let alone “stop reading immediately”). Thus, when material is placed on an input line after \endinput, there are counter-intuitive effects (report 1) and/or wrong/imprecise error messages (report 2).

Also, TeX's error message “File ended while ...” is technically inaccurate even in the simple case of any text at all following \endinput, in that file reading did not reach EOF.
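
The quoted rule can be illustrated with a small Python model. This is only a sketch under assumptions: the names read_file and force_eof_demo, the word-by-word tokenization, and the file contents are hypothetical, and the model assumes a single pending end-of-file flag that is checked at the end of a line of whichever file is current at that moment.

```python
def force_eof_demo(files, main):
    """Return the ordinary words processed, in order, under the modeled rule."""
    out = []
    force_eof = False  # set by \endinput, consumed at the next end of line

    def read_file(name):
        nonlocal force_eof
        for line in files[name]:
            words = iter(line.split())
            for word in words:
                if word == r"\endinput":
                    force_eof = True
                elif word == r"\input":
                    read_file(next(words))  # nested file starts mid-line
                else:
                    out.append(word)
            if force_eof:
                # The flag stops whichever file's line has just ended.
                force_eof = False
                return

    read_file(main)
    return out


# Material after \endinput on the same line is still processed.
simple = {"main": [r"A \endinput B", "C"]}
print(force_eof_demo(simple, "main"))

# An \input after \endinput: in this model the flag fires in the nested file.
nested = {"main": [r"A \endinput \input sub B", "C"], "sub": ["s1", "s2"]}
print(force_eof_demo(nested, "main"))
```

In this model the second case stops reading sub after its first line while main carries on to its second line, which is the kind of counter-intuitive effect described above. Whether real TeX matches the model in every detail is not claimed here.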

Response: In general, Knuth has said that extreme cases of TeX input deserve whatever they get. Furthermore, he knows error messages are not always optimally worded. But he has consistently declined to tinker with changes only for the sake of small incremental improvements at this point.


A233: \frac and dangerous bends

From Hanson Char, 2024-02-16: Exercise 22.1 of The TeXbook [about typesetting a recipe as a table] is not marked with a dangerous bend, but the solution provided utilized the \frac macro from exercise 11.6, which is dangerous bend material. As someone who considers themselves a beginner, I found this discrepancy a bit challenging to navigate.

Response: Agreed this is not optimal. Our best idea to improve it would be to add something like “(Use \frac from exercise 11.6.)” into the question, since the main point of this exercise is to typeset the table, not explain the correct typesetting of recipe fractions. Unfortunately, there is no room on page 233 to add that, or anything else, and Knuth has stated he does not want to change page breaks at this late date.

By the way, \frac was not present in the first printings; it was added when Knuth himself learned about typesetting fractions in recipes, which perhaps partly explains why it is not so well-integrated into the rest of the book as it might be.


A275: catcode types 1 and 2

From Hu Yajie 胡亚捷, 2024-08-08: A275, line -2 speaks of “explicit character tokens whose category codes are respectively of types 1 and 2”. But nowhere else in the book are category codes further grouped into types.

Response: Although this singular use of “types” is perhaps unfortunate, it's not being used in a technical way. Thus what's written is not optimal, but it's not an error, and that's all Knuth wants to hear about at this late date.


A271: "FFIL is invalid

From Udo Wermuth, 2023-11-21: the syntax rules on pages 271, 270, and 269 imply that "FFIL should be a valid <fil dimen>, with a <factor> of "F (hexadecimal F), and <fil unit> of FIL. But an assignment like \skip255=0pt plus "FFIL gets an error.

Response: Page 270 states an overriding condition:

An <integer constant> must not be immediately followed by a <digit>; in other words, if several digits appear consecutively, they are all considered to be part of the same <integer constant>. A similar remark applies to the quantities <octal constant> and <hexadecimal constant>.

In other words, TeX reads the hex digits (which can only be uppercase) greedily, so we get the 255 interpretation of "FF.
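
The greedy reading can be sketched in a few lines of Python. The function name scan_hex_constant and its error messages are hypothetical; the sketch models only the "consume every following hex digit" rule quoted above, not the rest of TeX's number scanner.

```python
HEX_DIGITS = "0123456789ABCDEF"  # only uppercase A-F count as hex digits here


def scan_hex_constant(text):
    """Split text starting with '"' into (value, remaining text), greedily."""
    if not text.startswith('"'):
        raise ValueError("expected a hexadecimal constant")
    i = 1
    while i < len(text) and text[i] in HEX_DIGITS:
        i += 1  # keep consuming as long as the next character is a hex digit
    if i == 1:
        raise ValueError("missing number")
    return int(text[1:i], 16), text[i:]


value, rest = scan_hex_constant('"FFIL')
print(value, repr(rest))  # 255 'IL'
```

After the scan, what remains is IL, which is not a valid unit, hence the error. Presumably a space after the first F would end the constant early and leave FIL as the unit, but that variant is an assumption not tested in the report.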


A308: tie instead of space after \copyright

From Victor S: in the answer to exercise 7.9 (the question is on p.41), a non-breaking space should follow the \copyright command.

Response: While it is certainly true that no line break would be desirable after (or before) a \copyright symbol in a normal copyright statement, such a line break would not render the copyright statement invalid. In addition, this exercise is clearly hypothetical, since it's not good practice, or perhaps even legally meaningful, to generate the year in a copyright statement instead of writing it out; this is a much worse legal issue than an extraneous line break. Also, since the exercise is about token expansion, not proper form of copyright statements, the presence of a ~ could lead readers to wonder if that was relevant to the exercise. Knuth has consistently declined to spend time on improvements on side issues that are not related to the TeXnical topic at hand.


A335: \headline offset from where it should be

From Igor Liferenko, 2023-10-07. The TeXbook's solution for Exercise 23.2, generating a headline with the word “RÉSUMÉ”, places the headline lower than it should be, because the height of the accented E (9.36111pt) is larger than the height of the \vbox which \makeheadline uses (8.5pt).

Response: agreed; clearly the plain TeX definition of \makeheadline is deficient. However, we feel that Knuth's desire for stability is such that he would not consider changing the definition of \makeheadline at this late date, since documents might well be depending on its precise definition, e.g., working around the problem in some way already. Also, it seems too confusing to change the answer to the exercise to use a different definition without also changing plain.tex.

Your report gives an alternate definition that users can employ if they wish; another possibility (due to Donald Arseneau) would be along the lines of \vbox to {\vfil...} so ascenders could go higher than a strut, but a headline taller than some limit would generate an overfull box warning.


A374: \asts mode-dependent operation

From Bertram Scharpf, 2023-04-13. The TeXbook gives a solution for inserting \n asterisks:

\setbox0=\hbox{*}\cleaders\copy0\hskip\n\wd0

However, this does not work in every situation. In vertical mode, it produces an error and thus needs to be preceded by \leavevmode or similar. Worse, at the end of a paragraph, the leaders glue will be removed; a \null or similar is needed to avoid that.

A general solution would be to put an \hbox around the whole thing:

\hbox{\setbox0=\hbox{*}\cleaders\copy0\hskip\n\wd0}

Response: agreed on all counts, but since Appendix D is maximally-dangerous bend material, it seems reasonable for Knuth to leave out such bullet-proofing from the macro. The main point is the use of leaders to produce the variable number of asterisks.


A415: \ninebig delimiters axis height

From Hu Yajie 胡亚捷, 2020-07-29 and 2020-07-29 again and 2020-08-02:

The \ninebig macro in manmac.tex typesets \big delimiters in 9-point math by borrowing the 10-point ones in cmr10 and cmsy10, but it forgets to retain the 9-point axis height. Thus examples like \input manmac \ninepoint $\bigl(()\bigr)$\end are vertically asymmetrical. This asymmetry can be observed in the real books (page A245, line 20; page C298, line -1; etc.), and it can be fixed by changing ‘\hbox{...}’ to ‘\vcenter{\hbox{...}}’ to get vertical symmetry.

Response: Knuth accepts the analysis, but says: “Since I've been happy with that for nearly 40 years, I guess I'm still happy with it.”


B016: first line of input not logged

From Dominik Leininger (and many other people over the years), 2023-08-16. Give the ** prompt (or a command line) commands that ordinarily write to the log file. For instance, running Metafont like this:

**\ show tracingonline; show origin..right; end

gives the output:

>> 0
>> Path at line 0:
...

According to the ‘>> 0’, the path should not be displayed on the terminal, but it is. The second thing I noticed is that you should see the path in the transcript file, but mfput.log only contains the following [...]

[An analogous TeX first line: \showthe\tracingonline \showbox255].

Response: TeX and MF can't write to the log file until it's opened, and they don't open the log file until its proper name is known. That is, we want to let the user do stuff like \somemacros \input myfile and get myfile.log as the log file name. Thus, the programs wait until the (implicit) \input happens, or there isn't one by the end of processing the ** line, in which case we get texput.log or mfput.log.

This is explained, somewhat, on page B016 (section 25, “We need a special routine…”), and (even less specifically, but still implied), on page A023 (“TeX uses the name texput …”), and the analogous areas in mf.web and The Metafontbook.

Although one might imagine the programs buffering all would-be log output until the log file is opened, instead of writing to the terminal, it's questionable whether this would be an improvement. In any case, it's not a change Knuth would make at this late date.


B028: extra blank line logged

From Igor Liferenko, 2023-03-06. Run TeX on:

\setbox0=\hbox{A}\showbox0\end

and the log file is (notice two blank lines):

.\tenrm A


! OK.

Response: Granted this is not ideal, but presumably Knuth would not rework the basic I/O functions to avoid it, as the bug happens because TeX does not keep perfect track of which column it's at in the log file (or terminal output). He's declined to fix similar problems in the past, such as (your/Igor's) “newlines inconsistently written to terminal” report below.

The response from Tyge Tiessen explains the situation in more detail, including other examples where spurious blank lines can occur.


B032 (section 72): use of word “procedure”

From Martin Ruckert, 2021-03-14: The text says “The print_err procedure”, but print_err is a macro, not a procedure.

Rino Jose, 2018-12-18, made a similar report for Metafont, page D120 (section 266): “This procedure returns” instead of “This function returns”. [The phrase occurs in several other places in mf.web and tex.web.]

Response: As a matter of English, it is normal to use “procedure” interchangeably with other terms (macro, function, (sub)routine), and since it's not formatted in bold, it shouldn't be taken as implying a Pascal procedure.


B035 (section 35): newlines inconsistently written to terminal

From Igor Liferenko, 2021-07-06: Before exiting, TeX (and other WEB programs) sometimes use write and sometimes use write_ln, e.g., tex.web line 1036 (B035, section 35) vs. line 1085 (B036, section 37).

Response from DEK: This is not important enough to warrant any change. [However, he has noted that it's fine/expected for change files to do as they see fit in this regard.]

Further information from DRF: For the historical record. The Sail/Waits OS, where DEK spent his time back in the day, had strong knowledge of what text was where on your screen, as well as what was buffered up for (custom) keyboard input and (custom) screen output, and it was all tightly bound with the “shell” that was really integrated with the OS. The system handled whole lines of input, and when the user hit <return>, it put the cursor at start of the next line, and knew it; when the program put characters on the screen, the system knew exactly what row and column they were in; and when a program ended, any remaining output buffered up for the screen was flushed, including moving the cursor to the start of a new line if necessary.

Score/TOPS-20, the only port we directly provided, was the same but different (completely different OS and no special terminal hardware, but the shell was tightly integrated with the terminal IO, and the system knew where the cursor was at all times; see the SFPOS and RFPOS system calls).

The only question is whether I'm lying about “if necessary”, and that it was really “always”. Looking for further clues, note that Tangle and Weave never end with a newline to the terminal, while PLtoTF, TFtoPL and PoolType always do end with a newline to the terminal. This is evidence that either the OSes we dealt with added a newline at the end only as needed, or that they always added a newline and DEK didn't care that the PL/TF programs left an extra blank line. The latter is believable, as those programs were virtually never used, while Tangle and Weave were in constant use, especially by DEK. (Perhaps oddly, DVItype and GFtype don't report errors or progress to the terminal; all their output goes into the .TYP file, so they don't really provide any evidence.)

That leaves TeX and MF. In normal operation, they also do not end with a newline to the terminal, and they too were in constant use. However, in most exceptional cases they do end with a newline (minus l.1085). Looking back at version 0.97 on saildart.org/[TEX,DEK] it looks like it was all pretty much the same mix.

I'd say that the normal-operation TeX/MF, along with Tangle/Weave, is fairly strong evidence that care was being taken for intentionally ending without a newline; and that it is in fact a mistake in the (very) exceptional cases where it does, but nobody cared or noticed, since those exceptions pretty much never happened. So, I suppose I'd say that [in principle] lines 1036, 1328, 10164, 23810, 24289 should all be changed to not have a newline, so that everything is self-consistent; with the super advantage that it's then always ok for other ports to always add a newline at the end of the job, and that will never cause an extra blank line.

I don't think there's any mysterious reason for the odd-man-out case of 1085 where the code seems already “right”.


B040 (section 95) and D036 (section 90): can fix can fix

From Gregor Purdy, 2023-06-12 (and others in the past):

[the] procedure confusion which calls `help1` with an argument ending in the duplication "...who can fix can fix"

Response: This is one of Knuth's jokes: the program is so broken it issues a broken error message.


B133 (section 308): max_param_stack comment

From Wolfgang Helbig, 2021-07-23:

The comment

     { largest value of param_ptr, will be <= param_size + 9 }
at the declaration of max_param_stack seems misleading to me. I'd suggest instead:
     { largest value of param_ptr }
The param_ptr must not exceed param_size, which is ensured in section 390.

Response: That's true about param_ptr. What's misleading is that the second half of the comment, “will be <= param_size + 9”, applies to max_param_stack, not param_ptr. A semicolon instead of a comma would have made that clearer.

More from DRF about this: DEK is commenting on the fact that he had to make the type of max_param_stack be integer rather than 0..param_size+9, which is what it really ought to be—but Pascal doesn't let you use even a constant additive expression in the range definition (and WEB only lets you if it's from a numeric (=) macro so it can collapse the addition, but DEK wanted max_param_stack to be compile-time changeable in the const section, evidently). See all of the other max_* global variables for confirmation; they're all 0..<whatever>_size (and <whatever> is a Pascal const).

He could have detected the overflow before doing the addition, and thus been able to use max_param_stack:0..param_size and gotten rid of the comment entirely, but then the statistics report at the end of the TeX run would not have shown how much param_size needs to be increased for the job to run.

This is not the only terse comment that needs much thought / experience / analysis to figure out the motivation for.


B214 (section 537): input file name flushed prematurely

From Wolfgang Helbig, 2020-10-26:

TeX sometimes flushes the name of an input file, keeping only the base name without directory and extension. This causes an error if the full name of the file needs to be passed to the editor during error recovery, as suggested by Prof. Knuth. The same bug is in Metafont (section 793).

[...] change block[s] from my tex.ch [...]:

@x [tex 537] continued
if name=str_ptr-1 then {we can conserve string pool space now}
   begin flush_string; name:=cur_name;
   end;
@y
@^Editor@>
@z

Response: The assumption was that on any “reasonable” OS, you could easily ask the system for the full canonical file name of the appropriate open alpha_file when you need it, so there's no need for TeX to remember it. Since *nix is not able to do this in general, dealing with this in the changefile as shown seems reasonable. (There are various system-dependent ways to approximate this, but no reliable and portable method is possible.)

On the other hand, filesystems on popular TeX-able OSes of the day (TOPS-20, VAX/VMS, etc.) had both Logical Name and Version Number features, resulting in the need to ask the OS for the full name of the file that actually got opened.

This call to flush_string also causes a non-standard filename extension to be lost when calling the editor. Knuth recommends that implementors avoid this, either via Wolfgang's change that eliminates flushing the string or some other method.

These issues all fall under the rubric of “system wizardry” mentioned in the description of the E option (page B036, section 84).


B274 (section 663): line number ranges may come from different files

From Udo Wermuth, 2017-01-25.

In overfull/underfull box messages, the beginning and ending lines shown might come from different files, and thus be misleading. Suppose we have a file main.tex containing this line:

Main 1\par \input auxone \end

and a file auxone.tex with these two lines:

Aux1 1\par
Aux1 2 bug in underfull message?\break

then running tex main gives:

This is TeX...
(./main.tex (./auxone.tex)
Underfull \hbox (badness 10000) in paragraph at lines 2--1
...

A range of lines that begins after it ends does not make sense. Also, a user will connect both numbers to the file main.tex as it is the only active file; there is a ) after auxone.tex, so this file has been processed.

The situation can also occur in alignments and with overfull messages. It is shown in the trip test.

Response: Indeed, and because it is shown in the trip test, we can conclude that DEK was aware that if you started a paragraph in one source file and ended it in another, then line numbers in messages would be problematic. The attitude was that robustness in the face of this edge case (not a recommended best practice) wasn't worth the extra bytes of memory (both code and data), especially since the rest of the context is usually clear from the logging of the actual text of the paragraph.
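
The mechanism behind the 2--1 range can be modeled in Python. This is a sketch under the assumption that only a line number (not a file) is remembered when a paragraph begins, and that the current line number of whichever file is active is used when the paragraph ends; the event list and the name reported_ranges are hypothetical.

```python
def reported_ranges(events):
    """events: (file, line, action) triples in processing order."""
    ranges = []
    start_line = None
    for _file, line, action in events:
        if action == "begin_par":
            start_line = line  # the file name is not recorded, only the number
        elif action == "end_par" and start_line is not None:
            ranges.append(f"{start_line}--{line}")
            start_line = None
    return ranges


# main.tex line 1:   Main 1\par \input auxone \end
# auxone.tex line 1: Aux1 1\par
# auxone.tex line 2: Aux1 2 bug in underfull message?\break
events = [
    ("main.tex", 1, "begin_par"),
    ("main.tex", 1, "end_par"),
    ("auxone.tex", 1, "begin_par"),
    ("auxone.tex", 1, "end_par"),
    ("auxone.tex", 2, "begin_par"),  # paragraph starts in auxone.tex
    ("main.tex", 1, "end_par"),      # and is ended by \end back in main.tex
]
print(reported_ranges(events)[-1])  # 2--1
```

Recording the file along with the starting line would allow a clearer message, at the cost of the extra memory mentioned in the response.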


B350 (section 829): \finalhyphendemerits reference unclear

From Didier Verna, 2024-01-17: The comment in section 829 ends with this sentence: “The end of a paragraph is also regarded as ``hyphenated''; this case is distinguishable by the condition cur_p=null.” I couldn't see the relevance, or the possible motivation for this, and it took me a while to realize that it was related to section 859 (demerits computation). My suggestion would be to reformulate that sentence.

Response: Agreed that it's not obvious, and a reference to §859 (where the code can thus quickly decide whether to add double_hyphen_demerits or final_hyphen_demerits) would be helpful, especially since it's necessary to search for cur_p<>null as well as the stated cur_p=null. The word “regarded” might be better as “marked”. And so on. Knuth is no longer making incremental improvements like this, however.

DRF provides details for further study: the break_type parameter to try_break also gets squirreled away into the type field of a newly-created active node for linebreaking (down in <Insert a new active node from |best_place[fit_class]| to |cur_p|>). So, you might wonder if the special end-of-paragraph call to try_break with break_type=hyphenated will create an active node (it does) that has a type of hyphenated (yup) and if this matters anywhere else in the code (it doesn't; whew).

The only place that looks at the type of an active node in order to differentiate between hyphenated and unhyphenated (as opposed to worrying about delta nodes), is the conditional controlling whether we execute the code we've been looking at that chooses between double_hyphen_demerits and final_hyphen_demerits:

if (break_type=hyphenated)and(type(r)=hyphenated) then
  if cur_p<>null then d:=d+double_hyphen_demerits
  else d:=d+final_hyphen_demerits;

The second condition, type(r)=hyphenated, looks at the type of active node r, but r is a breakpoint that's immediately before the one currently being considered, so it can't possibly be the final one. Voila!

I'd been worrying that the notion that the magical break_type=hyphenated at the end of a paragraph “only matters in section 859, and that's the one and only thing it's for” might have been wrong, but now I'm confident it's right (especially since I also ran some tests to confirm).


B442 (section 1043): \spaceskip not logged when \spacefactor has no effect

From Igor Liferenko, 2022-06-21:

Spaceskip glue with zero stretchability and shrinkability is not marked as such in log when spacefactor is not 1000.

\spaceskip=1pt \setbox0=\hbox{I turn}\showbox0

Output:

.\glue 1.0
Spacefactor does not change the glue (due to zero stretchability and shrinkability), so output must be:
.\glue(\spaceskip) 1.0

Response: our feeling is that this is not a bug to be passed on, mainly because Knuth never explicitly says, either in The TeXbook or tex.web, exactly when the special glues such as \spaceskip are marked. So the only guide is the code, and it seems intentional that the marking is avoided when the spacefactor != 1000, regardless of whether the spacefactor affects the glue setting.

As suggested, there is certainly a reasonable argument that it would be nicer if \spaceskip was marked when it is the only source for the glue item. But the current behavior doesn't seem wrong, since there is no statement being contradicted. Also, the behavior is plausible, namely, only mark \spaceskip if the glue was not modified, even potentially, by the space factor.

Looking at the code (tex.web sections 1041–1044), it seems it would not be simple to change, since right now the code applies space factor modifications without needing to know the context, that is, if the stretch and shrink are actually going to be used (consider \unhbox). Putting extra stuff into this “almost inner loop” code merely for the sake of different logging is presumably not something Knuth would entertain.
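
As a rough illustration of the decision described above, here is a Python sketch. It reflects one reading of tex.web sections 1041–1044 and should be taken as an assumption rather than a transcription; the function name space_glue_label and its boolean parameters are hypothetical.

```python
def space_glue_label(space_factor, space_skip_set, xspace_skip_set=False):
    """Return the parameter name shown in \\showbox output for an interword
    space, or None when the glue is logged without a label."""
    if space_factor == 1000:
        # Normal space: the parameter glue itself is used, so it is labeled.
        return r"\spaceskip" if space_skip_set else None
    if space_factor >= 2000 and xspace_skip_set:
        return r"\xspaceskip"
    # Otherwise a copy of the glue is adjusted by the space factor and
    # appended as ordinary glue, even if the adjustment changes nothing.
    return None


# After the uppercase "I" in plain TeX the space factor is 999, not 1000.
print(space_glue_label(999, space_skip_set=True))   # None
print(space_glue_label(1000, space_skip_set=True))  # \spaceskip
```

The point of the sketch is that the label depends only on the space factor's value, not on whether the adjustment actually alters the glue.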


B506 (section 1238): bogus display of bogus dimen, and other overflow

From Bruno Le Floch, 2020-10-22.

Slightly incorrect display of 32768pt dimen: The following shows --32768.0pt with a double minus sign.

\dimen0=\maxdimen
\advance\dimen0\maxdimen
\advance\dimen0 2sp
\showthe\dimen0

Response: Any bug is the lack of an error message at \advance\dimen0\maxdimen, but the lack of overflow checking is pervasive in TeX, and is a deliberate choice by Knuth. The resulting display is a case of GIGO.
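
The double minus sign can be reproduced with a Python model. The sketch assumes 32-bit two's-complement wraparound on overflow (which depends on the implementation) and a print_scaled loop written from memory of tex.web; the helper names wrap32 and trunc_div are hypothetical.

```python
UNITY = 0x10000        # 2**16 scaled points per pt
MAX_DIMEN = 2**30 - 1  # \maxdimen in sp


def wrap32(n):
    """Reduce n to a signed 32-bit value (assumed wraparound behavior)."""
    return (n + 2**31) % 2**32 - 2**31


def trunc_div(a, b):
    """Integer division truncating toward zero, like Pascal's div."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def print_scaled(s):
    out = ""
    if s < 0:
        out += "-"
        s = wrap32(-s)  # negating -2**31 wraps back to -2**31
    whole = trunc_div(s, UNITY)
    out += str(whole) + "."  # a still-negative s prints a second minus sign
    s = 10 * (s - whole * UNITY) + 5
    delta = 10
    while True:
        if delta > UNITY:
            s += 0o100000 - 50000  # round the last digit
        out += str(s // UNITY)
        s = 10 * (s % UNITY)
        delta *= 10
        if s <= delta:
            break
    return out


dimen = MAX_DIMEN
dimen = wrap32(dimen + MAX_DIMEN)  # \advance\dimen0\maxdimen, no check
dimen = wrap32(dimen + 2)          # \advance\dimen0 2sp wraps to -2**31
print(print_scaled(dimen) + "pt")  # --32768.0pt
```

Under these assumptions the value wraps to the one 32-bit integer whose negation is itself, so the sign is printed once by print_scaled and once more with the integer part.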

Supplement: Overfull \hbox not reported: the lack of complete overflow checking induces strange behavior in other ways. For example, on 2021-05-16, Matteo Caoduro reported that an extremely long line does not generate an overfull box message:

xxx...6298 x's...xxx\end

With 6297 x's, and then starting at 24924 x's, there is an overfull box message, but not in between. TeX's errors and warnings are not intended to handle such pathological situations.

Patches to implement more complete overflow checking would be welcome (Knuth has said this would be ok). The performance hit is unlikely to matter nowadays.


B546 (section 1372): output routine braces are super special

From Bruno Le Floch, 2020-10-22.

The \output routine is surrounded by very peculiar braces, and by removing the closing one with \let\next=, one ends up in a black hole where TeX does not interpret any further token. My question and answer on tex.sx describe the strange behaviour. It is probably not a bug as there is an explicit comment “loops forever if reading from a file”. It would be interesting to have a rationale.

Response: The idea here is that “The error message [I can't handle that very well] told you you've made a mess, and if the error message isn't enough, then the help info warns you that error recovery is not likely, and if that's not enough, then you'd better look at Volume B.” Basically, “I can't handle that very well” includes the possibility “I can get in a loop.”

As for a rationale, it would be more or less impossible to generally recover into any sensible state, and certainly not one that would give any reasonable subsequent output. This is a case that normal users of macro packages and documents would never run into, and would need a deep expert's close study to fix, so it is not worth spending time or (precious, at the time) bytes of code on trying to let the job continue, most likely fruitlessly in the end.

For that matter, it's somewhat surprising that this case doesn't just bail out immediately, as with the few

fatal_error("(interwoven alignment preambles are not allowed)")

cases, or the favorite:

confusion("256 spans"); {this can happen, but won't}

both of which are more likely for a real user to run into directly.


B550 (section 1380): single-character primitives not indexed

From Thierry Laronde, 2021-09-14.

[…] the two one-character primitives “\ ” and “\/” are not in the index.

Response: These (also “\-”) are indexed under “Single-character primitives” (on page B575), as noted in section 267 (page 113). We agree that it would be better to index the single-character primitives as themselves, as is done in The TeXbook, or at least put a note at the beginning of the index, but these are the sorts of incremental improvements Knuth is no longer making.


C032: GFtoDVI and modes and max_h

From Igor Liferenko, 2021-11-16:

In The METAFONTbook, page C32 perhaps gives the impression that gftodvi can be used on any input GF file. With a regular font, the result of running gftodvi may have many warnings and/or a max_h of 0, according to dvitype.

Response: The default mode in Metafont is supposed to always be proof, so it's reasonable for The MFbook to keep it simple at this early point, and not ask the user to type in mode:=proof in their very first input.

The incorrectly zero max_h is unfortunate, but dvitype.web explicitly says “Since characters can legally be set outside of the page boundaries, it is not an error when |max_v| or |max_h| is exceeded.” So getting warnings on unusual input files is not unexpected and isn't something Knuth would fix now. (It's also unfortunate that dvitype has no way to output errors only, i.e., suppress the warnings. This could be added to a particular implementation.)


C172: text(epsilon) vs. text(x)

From Bertram Scharpf, 2022-06-30:

In The METAFONTbook, page C172, line 10 reads:

case text($\epsilon$) is omitted. [...]

but in line 3 ‘text’ is defined by

@for@ $x=\epsilon_1,\epsilon_2,\epsilon_3$: text($x$) @endfor@
Notice the ‘\epsilon’ instead of the ‘x’.

Response: Knuth is using the unsubscripted \epsilon here to mean any of the \epsilon_1, \epsilon_2, \epsilon_3 values given in the for statement. This is indicated by the text in the sentence on line 9 (before the quoted line 10): “The \epsilon's might also be empty, in which [case …]”.

The idea being that if a particular epsilon is empty, then the text() expression for that epsilon is omitted. Saying “text(x) could be omitted” might be misread as saying the entire for loop expands to nothing if any of the epsilons were empty, which is clearly not the case. It might have been clearer to say “A given \epsilon_i might also be empty, in which case text($\epsilon_i$) is omitted …”, but even if Knuth agreed in theory, he is no longer making that kind of micro-improvement to the exposition.


D040 (section 103): print_scaled mf vs. tex

From Patrick Varilly, 2024-02-06: The intended functional difference [of print_scaled] is that Metafont tries to print integer scaled values such as 42 as “42”, whereas TeX happily prints them as “42.0”. However, there is a second difference, where the statement that “rounds the last digit” is

s := s + '100000 - (delta div 2)
in METAFONT, but is
s := s + '100000 - 50000
in TeX. (More details in the posted email.)

Response: Agreed that improvements in both code and doc would be possible, but Knuth has previously declined to make changes unifying TeX and MF; and it seems impossible he would change anything about scaled arithmetic at this point.

From DRF: To rephrase what both routines are trying to do, and then give a bit of a motivational riff, we might have:

Within the round_decimals round-trip constraint, we always choose brevity over closest-rounding; for instance we'll print “0.1” rather than “0.10001” even though the latter is slightly closer to 6554/unity; and we'll print “0.9” rather than “0.89999” even though the latter is similarly slightly closer to 58982/unity. But when we do have to print all five decimal places in order to fulfill the round-tripping condition, the result is rounded conventionally. This seems to be the best compromise, given that fixed-point arithmetic can quickly accumulate errors in low-order bits, so the shorter representation is likely to be more appropriate anyway.

Also: MF's comment “{round the final digit}” is ever so slightly more informative, so maybe [in principle] should be adopted by TeX, and perhaps augmented a bit, too.

Also: I like [Patrick's suggestion of] using 50000.

Finally: let's note that the programs behave differently wrt whether or not at least one decimal place will always be printed—TeX: yes; MF: no.
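To make the comparison concrete, here is a minimal Python sketch of the two routines. Only the two rounding statements are quoted in the report above; the rest of the loop, the helper names (fraction_digits, tex_print_scaled, mf_print_scaled), and the round_decimals transcription are assumptions written from memory of tex.web and mf.web, so check the sources before relying on any detail. In this sketch the rounding branch is only reached when delta is 100000, so (delta div 2) and 50000 coincide, and the outputs differ only in whether a bare “.0” is printed.

```python
UNITY = 0o200000  # 2**16, the scaled representation of 1.0


def fraction_digits(frac, mf_rounding):
    """Shortest decimal digits for frac/UNITY (0 <= frac < UNITY) that round-trip.

    Assumed loop shape; only the rounding statement is quoted in the report.
    """
    digits = []
    s = 10 * frac + 5
    delta = 10  # amount of allowable inaccuracy
    while True:
        if delta > UNITY:
            # "round the last digit" (TeX) / "round the final digit" (MF)
            s += 0o100000 - (delta // 2 if mf_rounding else 50000)
        digits.append(str(s // UNITY))
        s = 10 * (s % UNITY)
        delta *= 10
        if s <= delta:
            return "".join(digits)


def tex_print_scaled(s):
    """TeX flavor: always prints at least one decimal place."""
    sign = "-" if s < 0 else ""
    s = abs(s)
    return f"{sign}{s // UNITY}." + fraction_digits(s % UNITY, mf_rounding=False)


def mf_print_scaled(s):
    """METAFONT flavor: integer values are printed without a decimal point."""
    sign = "-" if s < 0 else ""
    s = abs(s)
    out = f"{sign}{s // UNITY}"
    if s % UNITY != 0:
        out += "." + fraction_digits(s % UNITY, mf_rounding=True)
    return out


def round_decimals(digits):
    """Convert a string of decimal fraction digits back to a scaled value."""
    a = 0
    for d in reversed(digits):
        a = (a + int(d) * 2 * UNITY) // 10
    return (a + 1) // 2
```

For example, tex_print_scaled(42 * UNITY) gives “42.0” where mf_print_scaled gives “42”, and both print 6554 as “0.1”, which round_decimals("1") maps back to 6554.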


D326 (section 718): expansion runs out of memory

From Dominik Leininger, 2023-11-14:

I came across a segmentation fault when passing the following input to mf:

def?=scantokens""?enddef;?

where other tokens can be used instead of ‘?’ and the string token can contain tokens, commands, etc.

Since scantokens causes ? to be expanded before any tokens in the (empty) string, I would expect a "! METAFONT capacity exceeded, sorry."

Response (from Karl): in short, mf does not protect itself against infinite macro expansion, so the crash happens if the available stack space is less than mf's main_memory size.

You might wonder about TeX. The original TeX also does not protect itself against infinite macro expansion, but in Web2C we (long long ago) added a parameter expand_depth (10,000 by default) to catch these recursions. The parameter was not added to mf.

(in detail, from DRF): my unmodified MF also ends up in an infinite recursion and crashes with a stack overflow:

#0 0x102ddc8c8 in get_x_next 
#1 0x102e178d0 in expand 
#2 0x102ddc930 in get_x_next 
#3 0x102e25410 in scan_primary 
...

Setting a breakpoint at jump_out, the debugger reports that we're about 90,000 stack frames deep at this point. I re-linked MF with an 8MB stack, and voilà:

! METAFONT capacity exceeded, sorry [main memory size=65535].

For reference/posterity: I added the magic invocation -Wl,-stack_size,0x800000 to get the 8MB stack space.
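The kind of guard that the expand_depth parameter provides can be illustrated with a toy expander. This is a hypothetical Python model, not code from mf.web, tex.web, or Web2C: the token lists, the macros dictionary, and the CapacityExceeded exception are invented for illustration, and the limit is set far below the 10,000 default mentioned above so that the toy stays well inside Python's own recursion limit.

```python
class CapacityExceeded(Exception):
    """Raised by the toy expander instead of overflowing the host stack."""


def expand(tokens, macros, expand_depth=100, _depth=0):
    """Recursively replace macro names by their bodies (toy model).

    tokens: list of token strings; macros: dict mapping a name to its body.
    The depth counter plays the role of an expand_depth style limit.
    """
    if _depth > expand_depth:
        raise CapacityExceeded(f"expansion depth exceeded [{expand_depth}]")
    out = []
    for tok in tokens:
        if tok in macros:
            # Expanding a macro recurses, so a self-referential body
            # would otherwise never return.
            out.extend(expand(macros[tok], macros, expand_depth, _depth + 1))
        else:
            out.append(tok)
    return out
```

With macros = {"?": ["?"]}, which loosely mirrors a macro whose body invokes itself, expand(["?"], macros) raises CapacityExceeded rather than recursing until the stack is exhausted.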


D350 (section 788): unused variable m declared

From David Fuchs, 2021-01-24.

This line in mf.web should be removed:

@!m:integer; {the current month}

as the local variable k is used, as correctly commented, by open_log_file for indexing into months, and m is now an unused variable.

Response from DEK: In accordance with the wonderful Japanese tradition of wabi-sabi, I won't be changing that.


E037: missing and doubled kern pairs

From Bogusław Jackowski and Janusz Nowacki (2005, reported at EuroTeX 2005, where Knuth was present; article, slide), and Hans Hagen and Mikael Sundqvist (2023), and others: there are some repeated kern pairs in Computer Modern. For example, in roman.mf (E037), ka is defined with both -u# and -.5u#; in mathit.mf, N+slash and X+slash similarly are defined twice.

Response: There's no harm in this, apart from a few bytes of wasted space in the tfm files, and it won't be changed. All engines use the first value, as stated in The METAFONTbook, page C317.
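The “first value wins” behavior follows from scanning a character's kerning steps in order and stopping at the first match. The following Python sketch is only a model of that idea: the find_kern function and the list-of-pairs representation are hypothetical, not the tfm lig/kern format, and the amounts are given in multiples of u# purely for illustration.

```python
def find_kern(kern_program, next_char):
    """Return the kern for next_char from the first matching step, else 0.0."""
    for step_char, amount in kern_program:
        if step_char == next_char:
            return amount  # a later duplicate for the same pair is never reached
    return 0.0


# Illustrative program for "k": the pair "ka" appears twice, as in the report.
k_program = [("a", -1.0), ("a", -0.5)]
```

Here find_kern(k_program, "a") returns -1.0, so the second entry only costs space, and a pair that is absent from the program (such as an unkerned one) yields 0.0.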

There are other infelicities in the CM kerning tables, e.g., the second (smaller) value for ka in cmr10 is arguably better than the first, Av (among many other possible pairs) is not kerned at all, etc. Knuth has stated that no further tweaks will be made to CM metrics; they have not changed since the 1980s.


E468: alignment of $\in$ et al.

From “studying mathematics”, 2023-11-17, tex.stackexchange.com q&a. In displaystyle the off-centering (or centering?) of the \in symbol isn't very notic[e]able, however it is in scriptstyle.

Response: agreed that the alignment of \in can be considered suboptimal, but it's a feature of Computer Modern that the bottom bar of the \in symbol is below the baseline; the symbols \in, \owns, \subset, \supset, and others are all designed in the same way. The proof characters shown on page E468 make this clear.


dvitype.web (section 36): unnecessary loop condition

From Lucas Mirelman, 2021-07-10: In

@<Store character-width indices...@>=
if wp>0 then for k:=width_ptr to wp-1 do

the condition on wp is unnecessary.

Response from DRF: In the declarations:

var k:integer; {index for loops}
...
@!lh:integer; {length of the header data, in four-byte words}
...
@!nw:integer; {number of words in the width table}
...
@!wp:0..max_widths; {new value of |width_ptr| after successful input}

I think k should have been 0..max_widths like wp, and then the suspicious check would make sense. Instead, true to form, DEK saved a word of memory by using k for another loop with the range (0..lh+3) and when someone pointed out lh could be larger than max_widths, it was easier to make k an integer rather than clarify the code a little and use a different index variable for that loop.

For what it's worth, lh and nw should both have had type 0..65535 since they're set to lh:=b2*256+b3; and nw:=b0*256+b1; which are both guaranteed to be in that range. There's another case just a bit later in DVIType, exactly like the one Lucas pointed out. Anyway, if this were 40 years ago, I'd militate for the changes I suggest above; now it seems ok to leave it be.

Further down the rabbit hole: In quite a number of places in TeX and friends, there's code like this that does seem necessary in order to protect for-loops from having their “to” value be out-of-range for their index variable. I believe that Hedrick and/or Vax/VMS Pascal optionally enforced this when you turned on some runtime checks. But the old Pascal User Manual and Report from back in the day (as well as the more recent ISO/IEC Pascal standard documents) are pretty clear that first you check if the for-loop is going to happen at all, and then you check that the “from” and “to” values are in range. So, perhaps all the guards scattered about Knuth's code were not supposed to be needed, other than to satisfy a too-fussy compiler?
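The difference between the two orders of checking can be modeled in a few lines of Python. This is a hypothetical sketch, not a statement about any particular compiler: pascal_for, RangeCheckError, and the fussy flag are invented names, with fussy standing in for a runtime that range-checks the loop bounds before deciding whether the loop runs at all.

```python
class RangeCheckError(Exception):
    """Raised when a value falls outside the index variable's subrange."""


def pascal_for(first, last, lo, hi, fussy=False):
    """Return the values an index of type lo..hi takes in 'for k:=first to last'."""

    def check(value):
        if not lo <= value <= hi:
            raise RangeCheckError(f"{value} not in {lo}..{hi}")

    if fussy:
        # Too-fussy order: bounds are checked even if the loop would be empty.
        check(first)
        check(last)
    if first > last:
        return []  # empty loop: nothing is ever assigned to the index
    check(first)
    check(last)
    return list(range(first, last + 1))
```

With an index of type 0..10, pascal_for(0, -1, 0, 10) is simply an empty loop, whereas pascal_for(0, -1, 0, 10, fussy=True) raises RangeCheckError; an explicit guard such as “if wp>0 then” avoids the second outcome.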


gftodvi.web (section 36): unused char_italic text and macro

From Richard Sandberg, 2024-01-09: On line 1827 of gftodvi.web it reads:

The italic correction of a character will be denoted by
|char_italic(f)(q)|, so it is analogous to |char_width|.
But char_italic is never used in the program so the above paragraph seems misleading.

Response: True, and the @d char_italic below could be removed too. Clearly Knuth just copied this material from tex.web, and figured it wasn't worth editing out the char_italic references, or didn't think of it. Either way, he's consistently declined to remove such unused code/text (see next item for more instances).


gftopk.web (section 36): unused text_file declaration

From Igor Liferenko, 2022-04-13: The following code in gftopk.web serves no purpose:

@<Types...@>=
@!text_file=packed file of text_char;

Response: Certainly true, but Knuth has consistently declined to remove unused declarations. Other examples:

No doubt there are others.


For any discussion about these issues, or further reports to be listed here, please use the contact information on the main TeX bugs page here.


$Date: 2024/12/10 17:42:15 $