[XeTeX] supplying missing glyphs?

Dave Howell groups.2009a at grandfenwick.net
Sat Jan 8 02:30:16 CET 2011


On Dec 6, 2010, at 3:35 , Michiel Kamermans wrote:

> Adam,
> 
> On 12/2/2010 4:20 PM, Adam McCollum wrote:
>> Dear list members,
>> I like the Hoefler Text font very much, but I see that it apparently doesn't have glyphs for a number of letters with diacritics, which I sometimes need for transliteration; please see the example text below. I've tried both unicode entry and the TeX way for entering these. Is there any way at all to "fake" these glyphs or otherwise supply them?
> 
> My thought would be to simply use a different font for those characters, by making use of the interchartok functionality of XeTeX.

I can't find any documentation to explain what this "interchartok" functionality is.  It looks like it might be an alternative (and possibly better) way of dealing with my problem than the solution I reached, which I think is very similar to the one Adam described. 

He was missing roman letters with diacriticals. My problem is a manuscript which requires characters from all over the place. There are a number of special symbols, upper and lower case Greek, and various Chinese ideograms. 

I should explain upfront that I'm not the original author; I'm a professional book designer and typesetter. Among other things, it means that "try using a different font" is not helpful advice. I have a large library of "Pro" fonts, and I know how to evaluate which ones have various glyphs. The final book was typeset in Garamond Premier Pro. The only fonts I had that were more complete were esthetically unacceptable. Also, the book in question (approx. 2400 typeset pages of medical textbook) is basically finished. I was able to find a couple of different ways to solve the multi-glyph problem. What I'd like to figure out is if there's a *better* one than the one I used. 

The "old school" way was to reference such characters with commands like \alpha (α), \leftarrow (←), \circlenum{1} (①), and \chinesechar{592A}. However, in these modern Unicode-enabled times, it is much easier for both the author and myself to just put the Unicode characters themselves into the manuscript. 

We're both using the Mac platform, and all the usual OSX programs use OSX's type libraries for rendering. One of the consequences of this is that if you're merrily typing along in some pretty but glyph-poor font, and you insert something that the current font doesn't have, the OS rendering code substitutes the glyph from some other font. I'm not sure how it decides *which* other font to use, but for making a manuscript, that's really convenient. The "ǎ" might look a little funny, but at least it's there. 

Of course, such haphazard substitution isn't acceptable for professional use, so I don't mind that XeTeX doesn't let AAT select glyphs for it. However, I was surprised that I couldn't find any pre-existing way to construct a composite font. 

I used to take care of this sort of thing with virtual fonts. I have a lot of older works with customized *.vf and *.tfm files that map in non-lining numbers, trademark and copyright symbols, and such; and a lot of scripts to transfer PostScript fonts with expert sets over into the TeX font system. I certainly appreciate XeTeX! But I couldn't find any way to specify "fallback" fonts for when the main font was missing a glyph. 

I ended up dealing with it in two ways. For the Chinese characters, I just defined their own font:

	\newfontfamily{\Chinese}[Scale=MatchLowercase]{STSong} 

This meant changing the manuscript from 
	. . . Taixi, “supreme stream” (太溪) . . .
to
	. . . Taixi, “supreme stream” ({\Chinese 太溪}) . . .


However, it seemed a little silly to have to explicitly specify this when XeTeX has the information it needs to handle this without requiring extra commands. For the various symbols and Greek letters, I took a different approach, and made all the necessary characters active, like this:
	\catcode"221E=\active 
	\def∞{\invokeglyph{221E}} 
The infinity symbol is Unicode code point U+221E, so I made that character active and then defined it. Here's my "invokeglyph" code:

% Uses \@tempcnta, so this needs \makeatletter ... \makeatother (or a .sty file).
\newcommand\invokeglyph[1]{%
	\begingroup
	\@tempcnta=\XeTeXcharglyph"#1\relax % glyph index in the current font; 0 means missing
	\ifnum\@tempcnta>0
		\char"#1\relax
	\else
		\expandafter\ifx\csname symbolfont\endcsname\relax\else % if \symbolfont is defined
			\csname symbolfont\endcsname
			\@tempcnta=\XeTeXcharglyph"#1\relax
		\fi
		\ifnum\@tempcnta>0
			\immediate\write17{Used symbolfont for Unicode character #1}%
			\char"#1\relax
		\else
			\expandafter\ifx\csname lastresortfont\endcsname\relax\else % if \lastresortfont is defined
				\csname lastresortfont\endcsname
				\@tempcnta=\XeTeXcharglyph"#1\relax
			\fi
			\ifnum\@tempcnta>0
				\immediate\write17{Used lastresortfont for Unicode character #1}%
				\char"#1\relax
			\else
				\immediate\write17{Glyph Warning: No glyph found for character #1}%
			\fi
		\fi
	\fi
	\endgroup
}

Basically, \invokeglyph checks whether the current ('normal') font has a glyph for character #1. If so, it just outputs it. If not, it switches to "\symbolfont" and checks THAT. If the glyph is still missing, it switches to "\lastresortfont" and checks THAT. For this particular book, I picked "Minion" as having Greek letters that were an acceptable match to Garamond Premier Pro. Normally, I use "Menlo" as the lastresortfont, but I found in this case, it was also missing too many glyphs. I'd originally selected Hiragino Mincho Pro for the Chinese, but, oddly, it was *also* missing characters: a single Chinese glyph used in the manuscript was absent. So STSong became both the Chinese font and the lastresortfont.
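
For concreteness, the font setup that goes with this might look like the sketch below. The family names are assumptions and have to match whatever names the fonts are installed under ("Minion Pro" in particular stands in for whichever Minion is available), and the Scale option on the two fallback families is optional rather than required:

	\usepackage{fontspec}
	\setmainfont{Garamond Premier Pro}
	% Fallback families consulted by \invokeglyph, in this order.
	% The font names here are assumed, not checked against any system.
	\newfontfamily\symbolfont[Scale=MatchLowercase]{Minion Pro}
	\newfontfamily\lastresortfont[Scale=MatchLowercase]{STSong}

With those two families defined and ∞ made active as shown earlier, a bare ∞ in the manuscript is set from the first of the three fonts that actually has the glyph, and the log records whenever a fallback was used.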

This allowed the manuscript to be much cleaner and more elegant, because we could just put the actual Unicode symbols in the text, and I could get the final camera-ready copy to supply those characters in the font of my choice. However, the behind-the-scenes code seems really lumpy and inefficient, in part because I have to assign an active catcode to, and write a definition for, every character I intend to use. This is why I didn't try doing it with the Chinese characters; I figured XeTeX would be unhappy if I tried to define 40,000+ new commands. 
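
The per-character boilerplate can at least be shortened. The sketch below assumes \invokeglyph is defined as above and introduces a hypothetical helper, \declarefallback (a made-up name, not part of XeTeX or any package), that makes a code point active and defines it in one step using the usual \lowercase trick. It does not remove the need to list every character; it only reduces each declaration to one line:

	% \declarefallback{HEX}: make U+HEX active and route it through \invokeglyph.
	% Relies on ~ being an active character, as it is by default in LaTeX.
	\newcommand\declarefallback[1]{%
		\catcode"#1=\active
		\begingroup
		\lccode`\~="#1\relax
		\lowercase{\endgroup\def~}{\invokeglyph{#1}}%
	}
	\declarefallback{221E}% ∞
	\declarefallback{2190}% ←

The \lowercase step turns the active ~ into an active token with the requested character code, so the \def applies to that character without it having to be typed literally in the definition.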


So! What does "\XeTeXinterchartoks" do, exactly, and is it a better tool for this than making every 'special' character require executing my "invokeglyph" macro? Or is there some other clever thingamabob that would work even better? 

