[XeTeX] Unicode in bib file?

Jonathan Kew jonathan_kew at sil.org
Fri Jun 11 14:04:41 CEST 2004


On 11 Jun 2004, at 12:51 pm, J P Blevins wrote:

> Let me ask another question that others must have grappled with 
> already.
>
> Since the input to XeTeX is unicode, umlauted characters like ü must be
> entered rather than the usual TeX \"{u}, etc. But how does one handle
> accented characters in references? Running BibTeX on a .bib file with
> \"{u} passes the wrong character to XeTeX, and gives errors like:
>
> LaTeX Font Warning: Font shape `OT1/AdobeGaramondPro/m/n' undefined
> (Font)              using `OT1/cmr/m/n' instead on input line 53.
>
> Yet converting the .bib file to UTF-8 and entering accents directly
> defeats BibTeX:
>
> This is BibTeX, Version 0.99c (Web2C 7.5.2)
> The top-level auxiliary file: aik2.aux
> The style file: lsa.bst
> Database file #1: bibx.bib
> Sorry---you've exceeded BibTeX's buffer size 5000
> Aborted at line 0 of file bibx.bib
> (That was a fatal error)
>

It seems odd that this would exceed the buffer size. Is BibTeX not even 
8-bit clean? Surely it must be. I would have expected it could handle 
UTF-8 data as streams of bytes, similarly to any 8-bit encoding, 
although it wouldn't have an accurate idea of "real" Unicode characters 
once you're outside the ASCII set.

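Could it be that in the course of converting to UTF-8, you also 
converted line-ends to a convention that your BibTeX doesn't like? If 
it finds no line-ends it recognises, it would presumably treat the 
whole file as one very long line, which could overflow the buffer.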
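
A quick way to check the line-end guess is to count the conventions 
used in the .bib file and look at the longest line. This is only a 
rough sketch in Python: the function name and the sample data are made 
up for illustration, and it assumes the reader splits lines on LF only, 
which may not match what your BibTeX actually does.

```python
def line_ending_report(data: bytes) -> dict:
    """Count line-end conventions and measure the longest LF-delimited line."""
    crlf = data.count(b"\r\n")
    # Bare CR (classic Mac OS) and bare LF (Unix) counts exclude CRLF pairs.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    # Assumption: a reader that only recognises LF sees everything
    # between two LFs as a single line.
    longest = max(len(line) for line in data.split(b"\n"))
    return {"crlf": crlf, "cr": cr, "lf": lf, "longest_line": longest}


if __name__ == "__main__":
    # Hypothetical sample: one entry whose lines end in bare CRs only.
    sample = b"@book{a,\r  title = {M\xc3\xbcller}\r}\r"
    print(line_ending_report(sample))
```

If the report for bibx.bib shows only bare CRs and a longest line above 
the 5000 quoted in the error message, converting the line-ends would be 
the first thing to try.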

> Running BibTeX on files with \"{u} and then making the unicode accent
> substitutions in the .bbl file provides a work-around, but has to be
> repeated each time you run BibTeX.
>
> Any better ideas? (I suppose that one could define a script that
> substitutes unicode glyphs for some of the most common TeX expressions,
> including those for dashes, quotes, accents, etc.)

You could try definitions such as

	\def\"#1{#1^^^^0308}  % base letter, then U+0308 COMBINING DIAERESIS

although this depends on whether the fonts you're using provide decent 
support for the Unicode combining marks (as opposed to using 
precomposed accented letters).
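
If the fonts only do well with precomposed letters, the substitution 
script you mention could be fairly small. Here is a minimal sketch in 
Python, to be run over the .bbl file after each BibTeX run. The 
function name and the accent table are my own illustration, the table 
covers only a handful of accents, and it handles only the simple forms 
\"{u} and \"u.

```python
import re
import unicodedata

# Combining marks for a few common TeX accent commands (far from complete).
COMBINING = {
    '"': "\u0308",  # diaeresis
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    "~": "\u0303",  # tilde
}

# Matches an accent command followed by a braced or bare ASCII letter.
ACCENT = re.compile(r"""\\(["'`^~])(?:\{([A-Za-z])\}|([A-Za-z]))""")


def tex_accents_to_unicode(text: str) -> str:
    """Replace simple TeX accent commands with precomposed Unicode letters."""
    def repl(match):
        base = match.group(2) or match.group(3)
        # NFC composes base + combining mark into one code point
        # where Unicode defines a precomposed form.
        return unicodedata.normalize("NFC", base + COMBINING[match.group(1)])
    return ACCENT.sub(repl, text)


if __name__ == "__main__":
    print(tex_accents_to_unicode('M\\"{u}ller and caf\\\'e'))
```

Outer braces as in {\"u} are left in place, which should be harmless to 
TeX. Where no precomposed form exists, the letter simply keeps the 
combining mark after it.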

Jonathan


