Descender frequency question

barbara beeton bnb at tug.org
Sun May 5 04:14:57 CEST 2024


On Sat, 4 May 2024, Doug McKenna wrote:

> All -
>
> Thanks for the info.
>
> The reason I asked is because I have written a format-checking macOS app that reads in PDF files (conference proceedings papers), built with either LaTeX or Word.  The app digitizes the page images and analyzes the 1pt x 1pt pixels, trying to understand where certain parts of the document are (e.g., title, author, margins, etc.).  Doing so requires analyzing how white space is distributed vertically.
>
> I'm writing an OCR-like algorithm that makes certain decisions based on a statistical analysis of descenders and their pixels in order to find the baselines of text.
>
> A related question I have is this: Does LaTeX always use a fixed amount of vertical space between the bottom line of a \title{} and the top of the first line of the \author{}?  Or does LaTeX stretch things out based on later (or any) text on the page?  If fixed, what is the amount of vertical space it uses?

Assuming a proceedings paper is created using LaTeX, the format of the
first page will almost certainly be defined in a document class.  In my
experience, the space between title and author will be fixed, neither
stretchable nor shrinkable, and if either title or author requires
multiple lines, the space will be measured between the baseline of the
last title line and the baseline of the first author line.  (In other
words, accents on an author name shouldn't make any difference.)  But
the actual measured distance may differ for different document classes.
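
As a rough illustration only, here is a minimal sketch of how a class
might hard-wire that separation.  The layout and the 18pt value are
hypothetical, not taken from any real proceedings class; the point is
simply that the skip carries no "plus" or "minus" component.

```latex
\documentclass{article}
\makeatletter
% Hypothetical title layout; 18pt is an illustrative value only.
\renewcommand{\@maketitle}{%
  \newpage
  \begingroup
    \centering
    \parskip=0pt % keep paragraph glue from adding any stretch here
    {\LARGE \@title \par}%
    \vskip 18pt % fixed glue: no "plus" or "minus", so it cannot stretch or shrink
    {\large \@author \par}%
  \endgroup
  \vskip 1.5em}
\makeatother
\title{A Sample Title}
\author{A. N. Author}
\begin{document}
\maketitle
Body text.
\end{document}
```

With a definition along these lines, the distance from the last title
baseline to the first author baseline should be the normal interline
glue for the author line plus the 18pt skip, whatever else is on the
page.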

Separation between components is usually fixed until the end of the
first text paragraph.  After that, if the paper is set with \flushbottom,
vertical space may be increased or decreased as specified in the class
file to force the baseline of the last line that fits to a specified
position on the page.  Such expansion or shrinkage can be applied only
in particular places; the most common are
  - before and sometimes after section headings, usually relatively small;
  - before and after math displays, potentially much more generous;
  - between ordinary text paragraphs, usually smaller than before section
    headings;
  - above, below, and between items in a list, usually comparable to the
    stretch or shrink applied between paragraphs;
  - above a footnote block, if present.
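
To make the notation concrete, here is a small sketch of how such
stretch and shrink are typically written: each amount is a natural size
followed by optional "plus" (stretch) and "minus" (shrink) components.
The values below are illustrative assumptions; a real class file
chooses its own.

```latex
\documentclass{article}
\makeatletter
% Section heading with stretchable space before and after it.
\renewcommand{\section}{\@startsection{section}{1}{0pt}%
  {-3.5ex plus -1ex minus -.2ex}% before heading (the sign only controls indentation)
  {2.3ex plus .2ex}% after heading
  {\normalfont\Large\bfseries}}
\makeatother
% Between ordinary paragraphs: natural size 0pt, may stretch by 1pt.
\parskip=0pt plus 1pt
% Above the footnote block: 9pt, may stretch by 4pt or shrink by 2pt.
\skip\footins=9pt plus 4pt minus 2pt
% Ask for the last baseline to land at the same position on every page.
\flushbottom
\begin{document}
\section{First}
Some text.\footnote{A footnote.}

Another paragraph.
\section{Second}
More text.
\end{document}
```

The spacing around math displays and list items is usually set inside
the font-size commands and list definitions of the class, so it is not
shown here.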

If a page break would leave a large chunk of empty space at the bottom
because the first object at the top of the next page won't fit in the
space still available at the bottom of the broken page, stretch will be
added proportionally as listed above.  There should never be any vertical
stretch between the lines of a paragraph, although some lines may be
forcibly spread apart by a fixed distance if something like a fraction or
other large math expression appears on a line within the paragraph.
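
The mechanism behind that forced spreading can be sketched as follows.
TeX normally places baselines \baselineskip apart; if that would leave
a gap smaller than \lineskiplimit between the bottom of one line and
the top of the next, it inserts \lineskip between the two lines
instead.  The document below is only a demonstration under the default
article settings, with filler text of my own.

```latex
\documentclass{article}
\begin{document}
% Print the three parameters that govern line spacing inside a paragraph.
baselineskip = \the\baselineskip,
lineskip = \the\lineskip,
lineskiplimit = \the\lineskiplimit.

% The tall inline formula makes its line too high for the normal
% baseline distance, so the neighboring baselines are pushed apart.
Filler text to produce several lines of an ordinary paragraph, so that
the regular baseline distance is visible above and below. Then comes a
tall inline formula
$\displaystyle\frac{\displaystyle\sum_{i=1}^{n} a_i}
                   {\displaystyle\prod_{j=1}^{m} b_j}$
in the middle of the paragraph, followed by more filler text so that
the paragraph continues for a few more lines after the formula and the
regular baseline distance resumes.
\end{document}
```

For a baseline-finding algorithm, this means an occasional irregular
line gap inside a paragraph need not indicate a paragraph break.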

If \raggedbottom is specified, all vertical spacing will be fixed at the
natural distance defined in the class file, and any leftover space will
collect at the bottom of the page.

Good luck.
 						-- bb

> Doug McKenna



More information about the texhax mailing list.