[XeTeX] searchable PDF with XeTeX

Jonathan Kew jonathan_kew at sil.org
Mon Nov 26 17:24:30 CET 2007


On 26 Nov 2007, at 2:57 pm, jadolov k wrote:

> I am writing a paper with XeLaTeX, and I am making extensive use of
> small caps and old-style numbers. But when I compile the PDF
> output, all the strings shaped with these font features become
> unsearchable, probably because they are made not of true glyphs but
> only of glyph variants that do not correspond to any Unicode
> character.
> So, how can I make them searchable? Or is that the price to pay for
> using glyph substitution via OpenType?

In principle, such text should still be searchable, as the glyphs can  
still be associated with the appropriate Unicode characters. However,  
the actual results may depend on the font used, and it's also  
possible that xdvipdfmx doesn't handle this as well as it should.

Have you tried a variety of different fonts, from different vendors,  
to see if the behavior is consistent in all cases? And have you tried  
various PDF readers, too? It's possible that they will have differing  
levels of support. This is an area I haven't looked at much, as I  
consider PDF to be primarily for viewing/printing; text extraction
and searching have always been problematic, especially once you deal
with non-Latin scripts...
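
One way to run such a comparison is with a minimal test file along the
following lines. This is only a sketch: it assumes the fontspec package
is available, and the font name is just a placeholder for whichever
installed OpenType font (with small caps and old-style figures) is
being tested.

```latex
% Minimal test file: compile with xelatex, then search the resulting
% PDF for "small caps" and "1234567890" in each reader being checked.
\documentclass{article}
\usepackage{fontspec}
% Placeholder font name (an assumption): substitute any installed
% OpenType font that provides the smcp and onum features.
\setmainfont{TeX Gyre Pagella}
\begin{document}
% Reference line set with the font's default glyphs.
Default glyphs: small caps 1234567890

% Same strings, now set with substituted glyphs.
Substituted glyphs: \textsc{small caps}
{\addfontfeature{Numbers=OldStyle}1234567890}
\end{document}
```

If a search finds the strings on the first line but not on the second,
the substituted glyphs are not being mapped back to their Unicode
characters for that particular font and reader. Repeating the test with
another font or another reader should show which of the two is
responsible.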

JK


