About boundary characters

Didier Verna didier at didierverna.net
Thu Sep 19 12:13:55 CEST 2019

Doug McKenna <doug at mathemaesthetics.com> wrote:

> it seems that a boundary character is used to prevent ligatures and
> kerns from occurring when two or more adjacent characters are in
> different fonts.

> There's no express metrics stored for a boundary character per se in
> the TFM, but if it's a legal character code (between 0 and 255 for
> TFM), then presumably that character in the font can have metrics,
> usually of zero width, but not precluded from having non-zero width.

  I found 24602 fonts in TeX Live 2019 with a non-zero width RBC. Here
  are a couple of examples.

Comfortaa-Bold-LGR (looks like it's a real character; it also has a height):
CODE                  = 0
DEPTH                 = 0
HEIGHT                = 339739/1048576
ITALIC-CORRECTION     = 27263/524288
WIDTH                 = 409993/1048576
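
For readers not used to these ratios: TFM dimensions are fix_words with
20 fractional bits, expressed in units of the design size, so the
denominators above are 2^20 (or a reduced form of it). A small sketch
that turns the printed ratios into decimals; the dictionary and the
function name are only for illustration:

```python
from fractions import Fraction

FIX_WORD_UNITY = 1 << 20  # 1048576: a TFM fix_word has 20 fractional bits


def design_units(ratio: str) -> float:
    """Convert a 'numerator/denominator' string, as printed above, to a float."""
    return float(Fraction(ratio))


# Values copied from the Comfortaa-Bold-LGR dump above.
metrics = {
    "HEIGHT": "339739/1048576",
    "ITALIC-CORRECTION": "27263/524288",
    "WIDTH": "409993/1048576",
}

# Round to three decimals, in units of the design size.
rounded = {name: round(design_units(value), 3) for name, value in metrics.items()}
print(rounded)
```

So the character is roughly 0.39 design sizes wide and 0.32 high, which
is in the range of an ordinary glyph rather than a dummy.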

In this font, there are kerning instructions with this character in both
the left and the right position. So I'm wondering whether it is really
appropriate to call it a "right" boundary character.

There is one ligature with it at the left position. Wait, the right
boundary character, at the left position?? ;-):
0 . 45
PASS-OVER     = 0

There is also one ligature with it at the right position. Wait, "don't
delete the RBC", which is not supposed to be typeset anyway??
115 . 0
COMPOSITE     = 99
PASS-OVER     = 0

This font is for Greek. I read something about different forms of
sigma, which may be related. I still don't see why the RBC would have
a height, unless it is actually typeset in some circumstances (unless
of course, the character metrics don't make any sense, or don't
really matter).

Apart from that, the vast majority of the other fonts seem to provide
RBCs with a width, but no height or depth. Random example:
tfm/public/poiretone/. These have kerning instructions for the RBC.
So now, my suspicion is that even if the boundary characters are not
typeset, their width is taken into account in the output.
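
For anyone who wants to check a font themselves, here is a minimal
sketch of how the RBC and its width can be read straight from a TFM
file. It is not necessarily how the figures above were obtained. It
assumes the standard TFM layout (twelve 16-bit lengths, then the header,
char_info, width, height, depth, italic and lig/kern tables) and the
convention that a first lig/kern instruction with skip byte 255 carries
the right boundary character in its next_char byte. The function name is
made up for the example:

```python
import struct
from fractions import Fraction


def right_boundary_width(tfm: bytes):
    """Return (code, width) for the right boundary character, or None.

    The width is in design-size units. It is None when the boundary code
    lies outside bc..ec or its char_info entry has width index 0, that
    is, when the font holds no real character at that code.
    """
    # Twelve big-endian 16-bit lengths open every TFM file.
    lf, lh, bc, ec, nw, nh, nd, ni, nl, nk, ne, np_ = struct.unpack(
        ">12H", tfm[:24]
    )
    char_info = 24 + 4 * lh                    # after the header words
    width_tab = char_info + 4 * (ec - bc + 1)  # after the char_info words
    lig_kern = width_tab + 4 * (nw + nh + nd + ni)
    if nl == 0:
        return None  # no lig/kern program, hence no boundary character
    skip, next_char = tfm[lig_kern], tfm[lig_kern + 1]
    if skip != 255:
        return None  # first instruction does not declare a boundary char
    if not bc <= next_char <= ec:
        return next_char, None
    width_index = tfm[char_info + 4 * (next_char - bc)]
    if width_index == 0:
        return next_char, None
    offset = width_tab + 4 * width_index
    (raw,) = struct.unpack(">i", tfm[offset:offset + 4])  # signed fix_word
    return next_char, Fraction(raw, 1 << 20)
```

Calling it on the bytes of each .tfm file in a tree and counting the
results with a non-zero width gives the kind of survey described above.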

> Because of that little non-orthogonal problem, there's the TFM font's
> so-called "false" boundary character, which is synthesized when the
> TFM file is read in. The false boundary character is the boundary
> character, unless the boundary character's width is non-zero, in which
> case the false boundary character is set to a not-a-character value.

  Oh my.
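
Just to make sure I read the rule correctly, here it is as a few lines
of code. This follows the wording quoted above only; TeX's own test may
be phrased differently (for instance in terms of whether the character
exists in the font), and the sentinel value 256 is my assumption for
"not a character":

```python
NON_CHAR = 256  # assumed sentinel: one past the largest TFM character code


def false_boundary_char(boundary_char: int, boundary_width) -> int:
    """Apply the rule as worded in the quoted text.

    boundary_width is the RBC's width, or 0 / None if it has none.
    """
    if boundary_width:  # non-zero width: the code is a real character
        return NON_CHAR
    return boundary_char  # zero width: the false and real codes coincide
```

With the Comfortaa font above (RBC code 0, non-zero width), this rule
would give a false boundary character of NON_CHAR.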

Resistance is futile. You will be jazzimilated.

Lisp, Jazz, Aïkido: http://www.didierverna.info
