[XeTeX] Stylistic sets: search & copy-paste

Peter Dyballa Peter_Dyballa at Web.DE
Tue Jan 12 10:54:18 CET 2010


Am 12.01.2010 um 01:02 schrieb David J. Perry:

> I too would like some clarification about this.  I am familiar with  
> the CMAP table(s) that are a part of any TrueType or OpenType font,  
> but you seem to be referring to something specifically in the TeX  
> world . . . . or am I misunderstanding?


The latter comes closer to the truth...

Vladimir Volovich, who is also on this list, created his CMap package
almost 10 years ago to solve a copy-and-paste problem: when text was
copied from PDF output created by TeX and pasted into some other
application, “ or » became something completely unrelated, and î did
not become i + ^ but whatever the non-ASCII DOTLESS I provoked. With a
"private" CMAP, TeX's PDF output could be copied and pasted into other
applications without producing nonsense. This can be extended in (La)TeX, and
actually it was, for example for Arabic and Farsi with standard  
(La)TeX. By "standard (La)TeX" I mean that you need to use pdfTeX;
dvips and Ghostscript might fail, although I've seen that recent
Ghostscript versions come with many CMAP files. OTOH dvipdfmx, the
extended version of dvipdfm for CJK use, has also learned to make use of
CMAPs. This is inherited by xdvipdfmx. It always includes some
standard CMAP, I think, and, when needed, can put another
"translation" on top of it. This would just state that many regions
are unchanged Unicode, but that this, that, and some other sequence of
code points stand for other characters.
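
For the pdfTeX case, a minimal sketch of how the package is typically
loaded might look like the following. It assumes pdfLaTeX, the T1 font
encoding and Latin Modern as a Type 1 font; none of these choices come
from the discussion above, they are only there to make the example
complete. As far as I know the package should be loaded early, before
fontenc.

```latex
% Minimal sketch, meant to be compiled with pdflatex.
% Assumptions (not from the thread): T1 encoding, Latin Modern fonts.
\documentclass{article}
\usepackage{cmap}         % load early, before fontenc
\usepackage[T1]{fontenc}  % the 8-bit font encoding the CMap describes
\usepackage{lmodern}      % assumption: a Type 1 font, not a bitmap one
\begin{document}
% Characters of the kind that used to come out wrong when pasted:
``quoted'' text, \guillemotright{} and \^{\i}.
\end{document}
```

Whether the pasted text then comes out right still depends on the PDF
viewer and on the application you paste into.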

It's not that different from what font files have. It's an adaptation  
to *TeX's internal mis-use.

--
Greetings

   Pete

UNIX is user friendly, it's just picky about who its friends are.



