[texhax] Line Spacing Calculations

Nate Bellowe nathanb at windward.net
Fri Aug 26 20:16:35 CEST 2016


Hello! Sorry to intrude on your dev mailing list. This may seem a little off-topic, but I have been banging my head on this forever, and I know you guys are the experts at typography!

I was wondering if the developers who have worked on this before would be willing to talk about it. I'd really appreciate it!

It may seem a little off-topic at first glance, but I think you guys will be a great resource if anyone is willing!

Basically, I have some questions on how to calculate the spacing between lines when parsing and rendering a docx file.

My requirement is to exactly match Word, not necessarily the OOXML spec, in the spacing between lines in a simple paragraph when parsing and rendering a docx.

In order to try to do this, I have built a tool to analyze the differences between my layout and Word's layout. To do so it does the following:

- First, it generates one or more docx files.
- Next, it creates PDFs from the docx files: it uses Word to render each docx to "word.pdf" and my program to render it to "me.pdf".
- Then it analyzes the resulting PDFs for differences in layout.

So, for example, my tool would:

- Create a document "template.docx" with 1000 "a" characters in a single run of text with uniform properties.
- Make a "word.pdf" and a "me.pdf" from this docx.
- Calculate info from the PDFs, in particular the line spacing, measured as the leading between a line's ascent and the previous line's descent. (Our Ascent + Descent values are nearly identical, so all that differs is the whitespace between lines. I often think of it as the line's whitespace.)
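
As a rough sketch, the leading measurement in the last step could be computed like this. It assumes each line's baseline position, ascent, and descent have already been extracted from the PDF; the `Line` type, the `leadings` function, and the sample numbers are hypothetical and only illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class Line:
    baseline_y: float  # distance from the top of the page to the baseline, in points
    ascent: float      # height above the baseline, in points
    descent: float     # depth below the baseline, in points (positive value)


def leadings(lines):
    """Return the whitespace between each pair of consecutive lines."""
    gaps = []
    for prev, cur in zip(lines, lines[1:]):
        prev_bottom = prev.baseline_y + prev.descent  # bottom edge of the previous line
        cur_top = cur.baseline_y - cur.ascent         # top edge of the current line
        gaps.append(cur_top - prev_bottom)
    return gaps


# Made-up measurements: baselines 14 pt apart, ascent 9 pt, descent 3 pt.
sample = [Line(100.0, 9.0, 3.0), Line(114.0, 9.0, 3.0), Line(128.0, 9.0, 3.0)]
print(leadings(sample))  # [2.0, 2.0]
```

Running the same function over the lines of "word.pdf" and "me.pdf" gives two lists of gaps that can be compared directly.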

This tool showed me that the leading varies greatly from font to font.

To show this, I used the tool to make thousands of these comparisons, generating them:

- For each font on the system
- For "a", "y", and a mix of letters and spaces
- For different font sizes
- For different line spacing types (Single, One and a half, and Double)
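
A minimal sketch of how such a test matrix might be enumerated, assuming every combination of the four dimensions is generated. The font names, sizes, and the `build_cases` helper are placeholders, not the actual values used by the tool.

```python
from itertools import product


def build_cases(fonts, texts, sizes, spacings):
    """Return one test-case description per combination of inputs."""
    return [
        {"font": font, "text": text, "size_pt": size, "line_spacing": spacing}
        for font, text, size, spacing in product(fonts, texts, sizes, spacings)
    ]


# Placeholder inputs; a real run would enumerate the installed fonts instead.
fonts = ["Arial", "Times New Roman"]
texts = ["a" * 1000, "y" * 1000, "a mix of letters and spaces " * 40]
sizes = [10, 12]
spacings = [1.0, 1.5, 2.0]  # Single, One and a half, Double

cases = build_cases(fonts, texts, sizes, spacings)
print(len(cases))  # 36
```

Each case would then be turned into a docx and rendered twice, once by Word and once by the program under test.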

I was hoping to find groupings, such as "this type of font has 1.3 times my calculation of leading".
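
One way to look for such groupings is to bucket fonts by the ratio of Word's leading to mine. The sketch below assumes one pair of measurements per font; the `group_by_ratio` name and the sample numbers are invented for illustration and are not taken from the spreadsheet.

```python
from collections import defaultdict


def group_by_ratio(measurements, precision=1):
    """Group font names by round(word_leading / my_leading, precision).

    Fonts whose own leading is zero are collected under the key None,
    since the ratio is undefined for them.
    """
    groups = defaultdict(list)
    for font, (word_leading, my_leading) in measurements.items():
        if my_leading == 0:
            groups[None].append(font)
            continue
        groups[round(word_leading / my_leading, precision)].append(font)
    return dict(groups)


# Invented numbers: (leading measured in word.pdf, leading measured in me.pdf).
measurements = {
    "FontA": (2.6, 2.0),
    "FontB": (1.3, 1.0),
    "FontC": (2.0, 2.0),
}
print(group_by_ratio(measurements))  # {1.3: ['FontA', 'FontB'], 1.0: ['FontC']}
```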

I was able to conclude far less than I had hoped, and was wondering if you could help me further with the issue of calculating line spacing. I'm providing you with a file that is best downloaded and opened using the filters in the header row. Note that it's not totally complete; there are missing entries, but I doubt they will be a problem for anyone. I'm going to regenerate it soon, but it's pretty slow, so I'm finishing up some changes to it first.

Here is a comparison of our software's layout vs. Word's layout for every font installed on my system, etc. (attached and linked):
https://drive.google.com/file/d/0BzQpUdPjnJUUclRXVXFkaEh3Mms/view?usp=sharing

I'm not positive, but I believe the issue could be one of the following:

- Word is using a different process than we are to calculate the "leading" of a font. We don't parse the font files ourselves; instead we rely on libraries to get font sizing information. Perhaps in the "world of font files" I am missing something, and Word is parsing the fonts directly and differently.
- Word has some sort of lookup table that handles groups of fonts, or an algorithm, that scales a font's leading up or down based on some criteria I am unaware of.
- Word is using an additional criterion besides leading, ascent, and descent to determine line spacing.
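
To make the first possibility concrete: OpenType fonts can carry several sets of vertical metrics (the `hhea` table's ascender, descender, and line gap; the `OS/2` table's typographic ascender, descender, and line gap; and the `OS/2` Windows ascent and descent), and these need not agree. The sketch below only shows how the candidate line heights differ for one font; it makes no claim about which set, if any, Word uses. The `candidate_line_heights` function and the metric values are hypothetical.

```python
def candidate_line_heights(units_per_em, size_pt, hhea, typo, win):
    """Return the line height in points implied by each metric set.

    hhea and typo are (ascender, descender, line_gap) in font units, with the
    descender negative as stored in the font. win is (win_ascent, win_descent),
    both positive.
    """
    def to_points(font_units):
        return font_units * size_pt / units_per_em

    return {
        "hhea": to_points(hhea[0] - hhea[1] + hhea[2]),
        "typo": to_points(typo[0] - typo[1] + typo[2]),
        "win": to_points(win[0] + win[1]),
    }


# Made-up metrics for an imaginary font with 1000 units per em, set at 10 pt.
heights = candidate_line_heights(
    units_per_em=1000,
    size_pt=10,
    hhea=(900, -200, 100),
    typo=(750, -250, 100),
    win=(950, 250),
)
print(heights)  # {'hhea': 12.0, 'typo': 11.0, 'win': 12.0}
```

Comparing each candidate against the line pitch measured in "word.pdf" for the same font and size would show whether any one metric set consistently explains the difference.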

Please feel free to email me at nathanb at windward.net

Thank you so much for your time!!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: actual.xlsx
Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Size: 148563 bytes
Desc: actual.xlsx
URL: <http://tug.org/pipermail/texhax/attachments/20160826/fda82451/attachment-0001.bin>

