Try the following (compile with XeLaTeX):

\documentclass{article}
\usepackage{xltxtra}
\setmainfont[Mapping=tex-text,Numbers=OldStyle,Ligatures={Required,Common,Rare}]{Junicode}

\begin{document}
Fifty afflicted fjords.
\end{document}

Open the resulting PDF and search for any of the words; the search fails.

The "fty", "ct" and "fj" ligatures have no Unicode code points of their own, so the font places them in the private-use area, and the PDF viewer cannot decompose private-use characters into their component letters. The same problem occurs for variant letter shapes, old-style digits, and so on.

Scanned documents in PDF, however, often carry an invisible text layer that can be searched and copied. Is it possible to use the same technique to place the decomposed letters over the visible private-use characters, so that the document remains searchable (and copy/paste-able)?
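
To make the idea concrete, here is a minimal sketch of one related approach. It is not the invisible text layer itself: it uses the PDF `/ActualText` attribute via the `accsupp` package, which attaches replacement text to the typeset glyphs. The `\searchable` macro is a hypothetical helper invented for this illustration. The sketch assumes that `accsupp` picks a suitable driver under XeLaTeX and that the PDF viewer honours `/ActualText` when searching and copying, which may not hold for every viewer.

```latex
% Compile with XeLaTeX; assumes Junicode and the accsupp package are installed.
\documentclass{article}
\usepackage{xltxtra}
\usepackage{accsupp}
\setmainfont[Mapping=tex-text,Numbers=OldStyle,Ligatures={Required,Common,Rare}]{Junicode}

% Hypothetical helper (not part of any package):
%   #1 = plain text to expose for search and copy/paste
%   #2 = material to typeset, ligatures and all
\newcommand*{\searchable}[2]{%
  \BeginAccSupp{method=plain,ActualText={#1}}#2\EndAccSupp{}%
}

\begin{document}
\searchable{Fifty}{Fifty} \searchable{afflicted}{afflicted} \searchable{fjords}{fjords}.
\end{document}
```

Wrapping whole words rather than individual ligatures keeps each search term in one piece, but marking up every word by hand is clearly impractical for a long document, so an automated way of doing this would still be needed.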