[XeTeX] font license

Douglas McKenna doug at mathemaesthetics.com
Sat Sep 6 20:09:29 CEST 2014

Lorna Evans wrote, about trying to get at the license information in the Nikosh Bengali font -

> ...it quits at this point.

FWIW, it appears (I'm not entirely sure, but someone may be able to confirm) that there's something wonky/corrupt in the Nikosh font (see <http://www.bpdb.gov.bd/bpdb/index.php?option=com_content&view=article&id=239>), in particular in its internal 'name' table, where the license and other information strings are kept in various forms, based on platform, encoding, language, etc.

If I install this font into my Mac's Fonts folder, the Mac's FontBook shows the English license information in a string called "Description", followed by lots of Asian characters in a different string called "License."  The long Asian text then lapses back into English text that repeats the start of the Creative Commons license, but then cuts it all short in mid-sentence.

According to Apple's documentation on the 'name' table, see
regardless of the platform or encoding, each text string is stored with a length field that always counts bytes, even if the string is stored as UTF-16.  The TrueType/OTF spec, at Section, says the same thing: it specifies a byte count, regardless of bytes per character.
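
To make the bytes-versus-characters distinction concrete, here is a minimal Python sketch. It is only an illustration, not anything taken from the font or from Apple's tools, and the helper name name_record_length is made up. It computes the value the length field should hold for the same string stored in a one-byte and in a two-byte encoding:

```python
def name_record_length(text: str, encoding: str = "utf-16-be") -> int:
    """Return the byte count that a 'name' record's length field should hold."""
    return len(text.encode(encoding))


sample = "License"

# Number of characters: what someone might mistakenly store.
char_count = len(sample)

# Number of bytes when stored as big-endian UTF-16 (platforms 0 and 3).
utf16_bytes = name_record_length(sample)

# Number of bytes when stored as one-byte MacRoman (platform 1).
mac_bytes = name_record_length(sample, "mac_roman")

print(char_count, utf16_bytes, mac_bytes)
```

For plain ASCII text like this, the UTF-16 byte count is exactly twice the character count, so a length field that holds the character count would cover only the first half of the string.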

Internally in TTF/OTF fonts, platform 0 is Unicode, platform 1 is the Mac, and platform 3 is Windows.  So the Mac's FontBook should be ignoring all the strings in the 'name' table for platforms 0 and 3 and using strings for platform 1.  Peculiarly, the Asian text the Mac is displaying is internally stored for the Windows platform only.

After parsing the internal data in the table with some of my own tools, I've found that the license string byte count for platform 0 (Unicode) is almost the same as the byte count for platform 1 (Mac), which is half the byte count for platform 3.  But text for the Unicode and Windows platforms is stored as two-byte UTF-16, whereas for the Mac it's one-byte ASCII/MacRoman/UTF-8/whatever.  Therefore it looks like the count for the Unicode platform License string is half what it should be (likely someone counted characters, not bytes), and it appears that only half of that string is actually stored in the data.  Either way, something is definitely out of whack.

The number of strings is different for each of the three platforms, which is also sloppy/weird/corrupt.
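
For anyone who wants to repeat this kind of comparison, below is a rough Python sketch using only the standard library. It is not the tooling mentioned above; the function names are hypothetical, it assumes the raw bytes of the 'name' table have already been extracted from the font, and it reads only the basic header and the 12-byte name records (nameID 13 being the license description):

```python
import struct
from collections import Counter

LICENSE_NAME_ID = 13  # nameID 13 = license description


def name_records(table: bytes):
    """Yield (platformID, nameID, declared_length) for each name record."""
    # Header: version/format, record count, offset to string storage.
    _version, count, _string_offset = struct.unpack_from(">HHH", table, 0)
    for i in range(count):
        # Each record: platformID, encodingID, languageID, nameID, length, offset.
        platform, _enc, _lang, name_id, length, _offset = struct.unpack_from(
            ">6H", table, 6 + 12 * i
        )
        yield platform, name_id, length


def summarize(table: bytes):
    """Return (record count per platform, license byte lengths per platform)."""
    per_platform = Counter()
    license_lengths = {}
    for platform, name_id, length in name_records(table):
        per_platform[platform] += 1
        if name_id == LICENSE_NAME_ID:
            license_lengths.setdefault(platform, []).append(length)
    return dict(per_platform), license_lengths
```

If the same license text is stored for every platform, one would expect the declared lengths for platforms 0 and 3 (UTF-16) to be roughly double the platform 1 length, and the record counts per platform to match.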

The font's other tables seem to parse okay, and the license flag bits in the 'OS/2' table show up as

    restricted license embedding
    preview and print embedding
    editable embedding
    no subsetting
    bitmap embedding only

which confirms an earlier reply. OTOH, perhaps there are other problems in this font's data.
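
For reference, the flag names listed above correspond to bits of the fsType field in the 'OS/2' table. A minimal Python sketch of decoding them follows. The function names are invented for illustration, and it assumes the raw 'OS/2' table bytes are already in hand (fsType is the 16-bit big-endian field at byte offset 8):

```python
import struct

# fsType bit masks and their meanings.
FS_TYPE_FLAGS = [
    (0x0002, "restricted license embedding"),
    (0x0004, "preview and print embedding"),
    (0x0008, "editable embedding"),
    (0x0100, "no subsetting"),
    (0x0200, "bitmap embedding only"),
]


def read_fs_type(os2_table: bytes) -> int:
    """Read the fsType field from raw 'OS/2' table bytes."""
    return struct.unpack_from(">H", os2_table, 8)[0]


def decode_fs_type(fs_type: int) -> list[str]:
    """Return the names of the flags set in fs_type (empty list if none)."""
    return [name for bit, name in FS_TYPE_FLAGS if fs_type & bit]
```

An empty result means none of these restriction bits is set.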

Doug McKenna
Mathemaesthetics, Inc.
