[XeTeX] accented character ṛ within \section{ṛ}

Michiel Kamermans pomax at nihongoresources.com
Mon Apr 26 21:56:21 CEST 2010


On 4/26/2010 12:41 PM, Ross Moore wrote:
> Hi Herb,
>> Just curious... what happens when you try to do search within or a 
>> copy from a pdf which has such combined characters?
>
> PDF has the /ActualText(...)  replacement tagging feature. This allows 
> you to capture a sequence of content characters
> and declare the whole collection to be equivalent to a single (or 
> sequence of) Unicode point(s).

But that only works if you add an /ActualText entry. As far as I can 
tell, using a compound glyph as discussed here will not be a problem in 
a search, *provided* that the software you're using implements the 
Unicode collation algorithm correctly. In that case it shouldn't need 
/ActualText for searching to work on this kind of thing.

That said, I have no idea how many PDF readers other than Adobe's 
Acrobat actually use a correct and complete implementation of the 
Unicode collation algorithm.
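
To make the point about search concrete, here is a rough Python sketch. 
It does not implement the full Unicode collation algorithm (Python's 
standard library has none). It only uses canonical normalization from 
the standard unicodedata module, which is the part that makes the 
precomposed ṛ (U+1E5B) and the sequence r + combining dot below 
(U+0323) compare as the same text. The helper names canonically_equal 
and find_normalized are made up for this illustration, and I am not 
claiming any particular PDF reader works this way.

```python
import unicodedata

PRECOMPOSED = "\u1e5b"   # LATIN SMALL LETTER R WITH DOT BELOW
DECOMPOSED = "r\u0323"   # "r" followed by COMBINING DOT BELOW


def canonically_equal(a, b):
    """Compare two strings after canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def find_normalized(needle, haystack):
    """Search after composing both strings (NFC).

    The returned index refers to the NFC-normalized haystack,
    not to the original string. Returns -1 if nothing matches.
    """
    norm_needle = unicodedata.normalize("NFC", needle)
    norm_haystack = unicodedata.normalize("NFC", haystack)
    return norm_haystack.find(norm_needle)


# Text as it might come out of a PDF that stores base letter + accent.
extracted = "k" + DECOMPOSED + "ta"

naive_hit = PRECOMPOSED in extracted                      # code-point comparison
normalized_hit = find_normalized(PRECOMPOSED, extracted)  # normalization-aware
```

A plain code-point search misses the match, while the normalized search 
finds it. A reader that does neither would need /ActualText to get the 
same result.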

- Mike "Pomax" Kamermans
nihongoresources.com

