[XeTeX] how to do (better) searchable PDFs in xelatex?

Mojca Miklavec mojca.miklavec.lists at gmail.com
Mon Oct 15 11:27:48 CEST 2012

On Mon, Oct 15, 2012 at 10:32 AM, Joe Corneli wrote:
> but fi => fi, despite the latter
> copy-pasting as "fi".  Somehow this does seem like it's just an
> oversight on the part of the font developers.

It can also be an oversight on the part of PDF viewer developers.
Apple, for example, decomposes all accented Latin characters ("C"
followed by a "combining caron" instead of just "Č"). I always found
that horribly annoying. On the other hand it had zero problems with
infinity, other math symbols and Greek letters from pdfTeX-generated
documents, so I usually had no problems copy-pasting mathematical
formulas. I only had to add an extra pair of dollars to get a nicely
formatted formula.
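
A decomposed paste of this kind can usually be repaired after the fact.
The following is only a minimal sketch, assuming the pasted text really
is "C" followed by U+030C (combining caron); the helper name recompose
is made up for illustration, and it relies on nothing beyond Python's
standard unicodedata module:

```python
import unicodedata

def recompose(text: str) -> str:
    """Return text in NFC form, merging base letters with combining marks."""
    return unicodedata.normalize("NFC", text)

# Assumed shape of the decomposed paste: "C" + U+030C COMBINING CARON.
decomposed = "C\u030c"
composed = recompose(decomposed)  # single code point U+010C ("Č")
```

NFC only recombines sequences that have a precomposed form in Unicode,
so text that was already composed passes through unchanged.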

You should try "pdftotext", Adobe Acrobat, Apple's Preview (if you
have access to it), and some free viewers; the results often differ.
In my opinion any decent PDF viewer should be able to convert the "fi"
ligature into two separate letters when copy-pasting. This cannot be
the font designer's fault.
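
If a viewer does hand back the ligature code point itself, the text can
still be cleaned up afterwards. This is a rough sketch, not something
any particular viewer does: it assumes the paste contains the Unicode
ligatures U+FB00 to U+FB06, and the names LIGATURES and expand_ligatures
are hypothetical. Only those code points are expanded, so other
compatibility characters (such as superscripts) are left alone:

```python
import unicodedata

# Latin ligatures in the Alphabetic Presentation Forms block:
# ff, fi, fl, ffi, ffl, long s-t, st (U+FB00..U+FB06).
LIGATURES = {chr(cp) for cp in range(0xFB00, 0xFB07)}

def expand_ligatures(text: str) -> str:
    """Replace Latin ligature code points with their separate letters."""
    return "".join(
        # NFKC applies the compatibility decomposition, e.g. U+FB01 -> "fi".
        unicodedata.normalize("NFKC", ch) if ch in LIGATURES else ch
        for ch in text
    )

# Assumed sample of pasted text containing the "fi" ligature twice.
cleaned = expand_ligatures("\ufb01nd the of\ufb01ce")
```

Running NFKC over the whole string would be shorter, but it would also
rewrite characters that have nothing to do with ligatures.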


PS: It is 2012 ... and for the past couple of months Opera (web
browser) has failed to display even the most basic accented Latin
characters (like š & ž), even when the page encoding is properly set.
Yes, it's unbelievable.
