[XeTeX] fontspec and hyperref
Michiel Kamermans
pomax at nihongoresources.com
Thu Sep 24 06:49:18 CEST 2009
Jonathan Kew wrote:
> (It might be worth looking into the polyglossia package to manage
> language/script switching, but even without that, I think this should
> help.)
The problem is that I need to switch within a language/script. I need some
way to be able to indicate that Chinese glyph A is in extension block A
and thus needs font X, while Chinese glyph B from extension block B
needs to use font Y, simply because these blocks are so vast there are
no good looking unified fonts for them.
So far, it seems a toss-up between two approaches. The first is to use a
script that checks which Unicode block each glyph is from and prepends
the appropriate fontspec code where needed. The second is to rely on
interchartoks, because I'm using CJK, and then explicitly add fontspec
codes wherever I need to override the interchartoks behaviour.
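
A minimal sketch of what the first approach could look like, in Python.
The macro names \strokefont, \extafont, \extbfont and \rarefont are
hypothetical (each would be declared in the preamble with fontspec's
\newfontfamily), and the per-glyph table holds a single illustrative
entry for a rare character that lives in a common block:

```python
from itertools import groupby

# Unicode block ranges (inclusive) that need a font of their own.
# The macro names are hypothetical; each would be declared in the
# preamble with fontspec's \newfontfamily.
BLOCK_FONTS = [
    (0x31C0, 0x31EF, r"\strokefont"),   # CJK Strokes
    (0x3400, 0x4DBF, r"\extafont"),     # CJK Unified Ideographs Extension A
    (0x20000, 0x2A6DF, r"\extbfont"),   # CJK Unified Ideographs Extension B
]

# Per-glyph overrides for rare characters that sit in a common block
# but are missing from the main font (illustrative entry only).
GLYPH_FONTS = {"兦": r"\rarefont"}


def macro_for(ch):
    """Return the font macro for one character, or None for the main font."""
    if ch in GLYPH_FONTS:
        return GLYPH_FONTS[ch]
    cp = ord(ch)
    for start, end, macro in BLOCK_FONTS:
        if start <= cp <= end:
            return macro
    return None


def tag_fonts(text):
    """Wrap each run of characters that share a font macro in a TeX group."""
    parts = []
    for macro, run in groupby(text, key=macro_for):
        chunk = "".join(run)
        parts.append("{%s %s}" % (macro, chunk) if macro else chunk)
    return "".join(parts)


print(tag_fonts("㇄ top to bottom 兦, 山"))
```

Characters that need no special font are passed through untouched, so
only the runs that need a different font get wrapped in a group.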
Sadly, the first approach just breaks hyperref. While your suggestion of
defining new font families and then using their shorthand command works
in terms of no longer having xetex generate a terminal error, hyperref
still "throws away" the font change instructions, and so any text that
lacks glyphs in Palatino Linotype ends up either missing or using
'unknown' glyphs.
The second approach sadly also fails, because there doesn't seem to be a
way to dynamically turn interchartoks processing on and off. That is a
problem because the interchartoks behaviour appears to suppress fontspec
instructions: if I issue a font change with fontspec, the interchartoks
instructions seem to win over it, and text that was supposed to use
font X according to the fontspec command actually ends up in the font
that I told interchartoks to use when going from class 0/255 to 1/2/3,
or vice versa. (If there is a way to dynamically toggle interchartoks
behaviour, then that might be a solution for my problem, but I couldn't
find any mention of one after googling for anything interchartoks
related.)
It might sound like an obscure problem, but for CJK there's a very real
problem in that even though the CJK languages all share the same Unicode
code points, the glyphs can look radically different between the various
languages. This makes it impossible to use, for instance, a more
complete Chinese font when the text also contains Japanese, because a
lot of the glyphs will be plain wrong. So, essentially, neither (plain)
interchartoks, nor polyglossia, nor xeCJK is quite specific enough from
what I can tell from the documentation on each of them. None of them
seem to offer me the ability to say "for glyph ..., use font ..." or
even a more general "for unicode block ..., use font ...". Relying on
being able to only specify behaviour for language, or script, is simply
not good enough: unicode (and consequently, fonts that implement parts
of it) is divided into more specific categories, several of which can
be used by a single script or language =(
To make this problem more obvious, a bit from the material I'm working on:
------
stroke drawing order examples
㇄ top to bottom, then left to right, as one stroke 兦, 山
㇅ left to right, then top to bottom, then left to right 凹
㇇ left to right, then a hook curving down left 水
㇆ left to right, then top to bottom with a serif to the upper left 刀, 方
left to right, then top to bottom 囗
乚 top to bottom, then left to right with a serif upward at the end 礼
乁 top left to right, then curving down right with an upward serif at the end 虱, 丮
------
In unicode aware email clients, the above data shouldn't really pose any
problems except perhaps the character 囗.
This seems reasonably innocent data, but I'm racking my brain on how to
typeset this in a way that leads neither to a character not being drawn
(at best) nor to xetex throwing a terminal error (at worst).
On the first line, the first character is from the unicode CJK STROKES
block, the second character is from CJK UNIFIED IDEOGRAPHS, but is rare,
and thus not found in most Japanese fonts (thus requiring an explicit
font change), and the third is also from the CJK UNIFIED IDEOGRAPHS
block, but common and found in any Japanese font. The example character
for the fifth stroke is one from CJK UNIFIED IDEOGRAPHS EXTENSION B, and
isn't even found in most "complete" Japanese, Chinese or even "Unicode"
fonts like Code2000/1/2 or Arial Unicode, instead being only found in
special fonts that implement this particular extension block, because
it's huge (containing close to 43,000 glyphs). However, *all* of these
glyphs would fall under the "CJK", "Chinese", as well as "Japanese"
language/script headers... so this really is a very serious problem for me.
If there are alternative ways to do what I would like to do, then I'll
gladly use those instead, but by now I'm kind of running out of ideas on
how to get around the lack of font linking (of the explicit,
user-indicated link order type), where glyphs that are missing in one
font are substituted by glyphs from another font.
A last ditch attempt would be to create a huge list of
XeTeXinterchartoks definitions for all the various characters that fall
under CJK, but with over 50,000 characters to tag, this would be
madness O_o
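
The tagging itself could at least be generated rather than typed. A
rough sketch in Python, assuming one character class per Unicode block;
the class numbers 4 and 5 are arbitrary picks, chosen only to stay clear
of the 0, 1, 2, 3 and 255 mentioned earlier, and the output would still
need matching \XeTeXinterchartoks rules for each pair of classes:

```python
def charclass_lines(start, end, char_class):
    """Yield one \\XeTeXcharclass assignment per code point in [start, end]."""
    for cp in range(start, end + 1):
        # TeX's "-notation needs uppercase hexadecimal digits.
        yield '\\XeTeXcharclass "%X=%d' % (cp, char_class)


# Block ranges are from the Unicode standard; the class numbers are
# arbitrary assumptions, not anything XeTeX prescribes.
BLOCK_CLASSES = [
    (0x31C0, 0x31EF, 4),     # CJK Strokes
    (0x20000, 0x2A6DF, 5),   # CJK Unified Ideographs Extension B
]

lines = [line
         for start, end, char_class in BLOCK_CLASSES
         for line in charclass_lines(start, end, char_class)]

print(lines[0])
print(len(lines))
```

That still leaves tens of thousands of assignments for XeTeX to read on
every run, so it doesn't make this any less of a last resort.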
- Mike