[XeTeX] Copy and paste of oldstyle numbers of the latin modern font

Ulrike Fischer news2 at nililand.de
Mon May 5 18:34:25 CEST 2008


On Mon, 05 May 2008 17:44:18 +0200, Albert Kapune wrote:

>> Hello,
>>
>> if I run the following document and try to copy & paste from the pdf I
>> get:
>>
>> test .....
>> .
>>
>> \documentclass{article}
>> \usepackage{fontspec}
>> \setmainfont[Numbers=OldStyle]{Latin Modern Roman}
>>
>> \begin{document}%
>> test 12345
>> \end{document}
>>
>> Is there something one can do to correct this?
>>
>>
>> This is XeTeX, Version 3.1415926-2.2-0.998.1 (MiKTeX 2.7)
>> This is xdvipdfmx-0.6 svn 663 by Jonathan Kew and Jin-Hwan Cho
>>
> 
> If I compile that example it is shown as
> 	test 12345
> by Acrobat 8. When I copy that string into WORD 2003 it is presented as
> 	test kVkVkVkVkV
> (where each kV combination is a ligature in Arial Unicode).
> 
> WORD’s UTF-8 analysis identifies these ligatures as F731, F732, F733,
> F734, and F735, respectively. None of them is a valid Unicode character.

They are valid Unicode characters, but they come from the "Private Use
Area", where fonts can place arbitrary symbols. So the result of copying
depends a lot on the receiving font and application.

The main question is whether (and how) they can be mapped to the
normal digits for the purpose of search and copy & paste, in the same
way that U+FB01 (the fi ligature) is mapped to "fi":

\documentclass{report}
\usepackage{fontspec}
\begin{document}
\char"FB01 
\end{document} 
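
One possible workaround, given here only as an untested sketch: attach
the plain digits to the typeset text as PDF "ActualText", so that
viewers which honor ActualText copy and search for 12345 instead of the
private use code points. This assumes that the accsupp package is
installed and that its driver detection works under XeTeX with
xdvipdfmx. The macro name \copydigits is hypothetical and only serves
as an illustration; it is not part of fontspec or accsupp.

\documentclass{article}
\usepackage{fontspec}
% Assumption: accsupp selects a suitable driver under XeTeX by itself.
\usepackage{accsupp}
\setmainfont[Numbers=OldStyle]{Latin Modern Roman}

% Hypothetical helper: typesets #1 with the current font (oldstyle
% figures) and records the same string as ActualText for copy & paste.
\newcommand*{\copydigits}[1]{%
  \BeginAccSupp{ActualText={#1}}%
  #1%
  \EndAccSupp{}%
}

\begin{document}
test \copydigits{12345}
\end{document}

Whether the copied text really comes out as 12345 still depends on the
PDF viewer, and every number has to be marked up by hand, so this does
not replace a proper mapping in the PDF itself.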

-- 
Ulrike Fischer 


