[XeTeX] searchable PDF with XeTeX
jonathan_kew at sil.org
Mon Nov 26 17:24:30 CET 2007
On 26 Nov 2007, at 2:57 pm, jadolov k wrote:
> I am writing a paper with XeLaTeX, and I make extensive use of small
> caps and old-style numerals. But when I compile the PDF output, all
> the strings shaped with these font features become non-searchable,
> probably because they are made not of true glyphs but only of glyph
> variants which do not correspond to any Unicode character.
> So, how could I make them searchable? Or is this the price to pay for
> using glyph substitution via OpenType?
In principle, such text should still be searchable, as the glyphs can
still be associated with the appropriate Unicode characters. However,
the actual results may depend on the font used, and it's also
possible that xdvipdfmx doesn't handle this as well as it should.
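
To make "associated with the appropriate Unicode characters" concrete: one common way such an association can be recovered is the glyph-naming convention from the Adobe Glyph List specification, where a variant suffix after the first period is dropped and ligature components are joined by underscores. The sketch below is a simplified illustration of that convention only. The lookup table and the glyph names (a.sc, one.oldstyle, and so on) are assumptions chosen for the example, and nothing here is meant to describe what xdvipdfmx or any particular PDF reader actually does.

```python
# Illustrative only: a simplified take on the glyph-name convention from the
# Adobe Glyph List specification. AGL_SAMPLE and the glyph names used below
# are assumptions for this example, not data read from a real font.

AGL_SAMPLE = {"a": "a", "b": "b", "f": "f", "i": "i", "one": "1", "two": "2"}

HEX_DIGITS = set("0123456789ABCDEF")


def component_to_text(component: str) -> str:
    """Map one glyph-name component to text, or '' if it is unknown."""
    if component in AGL_SAMPLE:
        return AGL_SAMPLE[component]
    # "uniXXXX" names carry the code point as four uppercase hex digits.
    # (The full convention allows longer forms; this sketch handles one.)
    digits = component[3:]
    if component.startswith("uni") and len(digits) == 4 and set(digits) <= HEX_DIGITS:
        return chr(int(digits, 16))
    return ""


def glyph_name_to_text(glyph_name: str) -> str:
    """Recover the underlying characters from a variant glyph name."""
    # Drop the variant suffix: everything from the first period onward.
    base = glyph_name.split(".", 1)[0]
    # Ligature names join their components with underscores.
    return "".join(component_to_text(part) for part in base.split("_"))


if __name__ == "__main__":
    for name in ["a.sc", "one.oldstyle", "f_i", "uni0041.sc"]:
        print(name, "->", repr(glyph_name_to_text(name)))
```

A font whose small-cap or old-style glyphs carry names that do not follow this pattern would give an empty result here, which is one plausible reason the outcome could differ from font to font.
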
Have you tried a variety of different fonts, from different vendors,
to see if the behavior is consistent in all cases? And have you tried
various PDF readers, too? It's possible that they will have differing
levels of support. This is an area I haven't looked at much, as I
consider PDF to be primarily for viewing/printing; text extraction
and searching have always been problematic, especially once you deal
with non-Latin scripts....
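
One way to run the font-by-font comparison suggested above without relying on a single viewer's search box is to extract the text with a command-line tool and look for the expected strings. The following is a minimal sketch, assuming the pdftotext utility (from Poppler or Xpdf) is installed and on the PATH. The helper names are hypothetical, and the matching is deliberately loose (case-insensitive, whitespace-collapsed), roughly mimicking what a viewer's search would accept.

```python
import re
import subprocess
import sys


def normalize(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace to one space."""
    return re.sub(r"\s+", " ", text).strip().lower()


def contains_phrase(extracted: str, phrase: str) -> bool:
    """Report whether the phrase occurs in the extracted text."""
    return normalize(phrase) in normalize(extracted)


def extract_text(pdf_path: str) -> str:
    """Run pdftotext (assumed to be installed) and return what it extracts."""
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],  # "-" writes to stdout
        check=True,
        capture_output=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__" and len(sys.argv) >= 3:
    # Usage: python check_pdf.py output.pdf "phrase one" "1234" ...
    text = extract_text(sys.argv[1])
    for phrase in sys.argv[2:]:
        status = "found" if contains_phrase(text, phrase) else "MISSING"
        print(f"{status}: {phrase}")
```

Compiling the same test paragraph once per font and running this against each PDF would show whether the small-caps and old-style strings go missing consistently or only with certain fonts. A different extractor or viewer may still give a different answer for the same file.
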