[XeTeX] Re: XeLaTeX and jurabib

Jonathan Kew jonathan_kew at sil.org
Sat Oct 2 13:28:29 CEST 2004


On 2 Oct 2004, at 11:19 am, Simon Spiegel wrote:

>> You do need to replace all -- by –, --- by —, ` by ‘ and ' by ’ etc.
>> Same reason: XeTeX needs proper UTF-8 to work. But in that case
>> utf8accents.sty cannot help.
>
> That's a bit annoying. As I said, I used UTF8 before, 
> \usepackage[utf8]{inputenc} handled this properly...
>

To be more accurate: this had nothing to do with the use of UTF8 or 
with any inputenc package. The en- and em-dashes and quotation marks in 
standard TeX fonts are handled by ligature rules associated with the 
specific fonts (the lig/kern programs in the .TFM files). This is 
invisible to macro packages such as inputenc or utf8accents.

The equivalent behavior when using AAT/OT fonts directly in XeTeX would 
be achieved by including these ligature rules in the font tables, but 
that is not something you can expect to find in any standard font; this 
input convention is a TeX idiosyncrasy that Knuth devised to work 
around the limitations of the ASCII character set.

In principle, it would be possible to use tricky "active-character" TeX 
macro programming to achieve the same effect with other fonts, but I 
wouldn't recommend this: aside from the difficulty of writing the 
macros in the first place, they'd probably have undesired interactions 
with other macros, etc.

A possible future solution may be the use of "font mappings", a feature 
that Ross has requested and that may appear in a future version of 
XeTeX. This would provide, in effect, a way to layer additional 
behavior such as these ligature replacements onto existing AAT/OT 
fonts. But it doesn't exist yet.

For now, if you have text that uses the '---' and similar conventions, 
a trivial script in Perl (or sed or another tool) could be used to 
convert the source to standard Unicode characters.
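
As a rough sketch of such a conversion, here written in Python rather 
than Perl: the function names tex_to_unicode and convert_file and the 
mapping table are illustrative assumptions, not part of XeTeX or any 
package. The replacement is deliberately naive. It will also rewrite 
these sequences inside verbatim material, math mode (where ' is a 
prime), comments and macro arguments, so the output should be checked 
by hand.

```python
# Naive converter from TeX's ASCII input conventions to Unicode characters.
# Assumption: the whole input is ordinary running text.

# Ordered so that longer sequences are replaced before their prefixes
# ("---" before "--", doubled quotes before single ones).
TEX_TO_UNICODE = [
    ("---", "\u2014"),  # em dash
    ("--", "\u2013"),   # en dash
    ("``", "\u201c"),   # left double quotation mark
    ("''", "\u201d"),   # right double quotation mark
    ("`", "\u2018"),    # left single quotation mark
    ("'", "\u2019"),    # right single quotation mark / apostrophe
]


def tex_to_unicode(text):
    """Return text with the TeX ligature conventions replaced."""
    for ascii_seq, char in TEX_TO_UNICODE:
        text = text.replace(ascii_seq, char)
    return text


def convert_file(src_path, dst_path):
    """Read src_path as UTF-8, convert it, and write UTF-8 to dst_path."""
    with open(src_path, encoding="utf-8") as src:
        converted = tex_to_unicode(src.read())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(converted)
```

Calling convert_file("in.tex", "out.tex") (both file names are 
placeholders) would write a converted copy and leave the original 
untouched, which makes it easy to diff the two before switching over.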

Jonathan


