[XeTeX] Ligatures and searching in PDFs

Paul Foley paul at mises.com
Mon May 10 03:56:14 CEST 2010


Try the following:

\documentclass{article}
\usepackage{xltxtra}
\setmainfont[Mapping=tex-text,Numbers=OldStyle,Ligatures={Required,Common,Rare}]{Junicode}

\begin{document}
Fifty afflicted fjords.
\end{document}

Load the PDF and search for any of the words: the search fails.

The "fty", "ct" and "fj" ligatures aren't in Unicode, so the font keeps
them in the private-use area, and the PDF viewer has no way to decompose
private-use characters into their component letters.  The same problem
will occur for variant letter shapes, old-style digits, etc.

Scanned documents in PDF, however, often carry an invisible text layer
that can be searched and copied.  Is it possible to use the same technique
to put the decomposed letters over the visible private-use characters, so
that documents remain searchable (and copy/paste-able)?
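
One direction that might be worth trying, offered only as an untested
sketch: PDF lets a run of content carry an /ActualText replacement string,
and the accsupp package exposes this through \BeginAccSupp and \EndAccSupp.
The \searchable macro below is a made-up helper, not part of any package,
and it assumes that accsupp's driver works under XeTeX and that the viewer
honours /ActualText when searching and copying; neither is guaranteed.

```latex
\documentclass{article}
\usepackage{fontspec}
\usepackage{accsupp}% provides \BeginAccSupp and \EndAccSupp
\setmainfont[Ligatures={Required,Common,Rare}]{Junicode}

% \searchable{<text>} is a hypothetical helper for illustration only.
% It typesets <text> with the usual ligatures and attaches the same
% plain string as /ActualText, so the viewer can match the undecomposed
% letters instead of the private-use glyphs.
\newcommand*{\searchable}[1]{%
  \BeginAccSupp{method=escape,ActualText={#1}}%
  #1%
  \EndAccSupp{}%
}

\begin{document}
\searchable{Fifty} \searchable{afflicted} \searchable{fjords}.
\end{document}
```

Tagging one word at a time keeps each marked span short, so it is unlikely
to straddle a line or page break; marking every word by hand is clearly not
practical for a whole document, so this only shows the mechanism.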