[XeTeX] Conflict between xunicode and fontspec?

Fri Feb 8 10:40:20 CET 2008

On Fri, 8 Feb 2008 00:10:09 +0100, Peter Dyballa wrote:

>> I don't know if this is really your problem. Perhaps you are using
>> some other package which maps your input to commands like \^{u}. As
>> Bruno mentioned there is xunicode which handles the \^{u} case so
>> you should try it. If it doesn't help: a complete minimal example
>> and the resulting pdf would probably help to find what's going
>> wrong.
> 
> In GNU Emacs I can input the characters as themselves, so xunicode is  
> not needed. 

It has nothing to do with the editor or the input! Even the best editor
does not take a char from your tex-file and put it unchanged into a pdf.
tex-files are processed by latex/xelatex, then by drivers like dvips or
xdvipdfmx, then by readers like a pdf-reader, then by routines of your
OS when you try to copy and paste... and during all this processing the
chars of your input are moved around a lot, and a lot of weird things
can happen.

An "A" in your input can lead to a neat "A" in a pdf that copies fine.
But it can also lead to the Gettysburg address:

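\documentclass{article}
\begin{document}
\begingroup % keep the changes local
\catcode`\A\active % make "A" an active character, i.e. a macro
\def A{Four score and seven years ago \ldots}
A % expands to the definition above
\endgroup
\end{document}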

> I started LaTeX at a Sun keyboard with compose key –  
> there was no real need to learn using these macros (OK, TeX was also  
> patched to accept direct input).

You don't need to learn "these macros" (I guess you mean \^{u}). But you
should be aware that the fact that you don't use them in your input
doesn't mean that the macros aren't used at all. An input like û can
lead to such macros!  
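
To make this concrete, here is a small sketch. It assumes pdfLaTeX and a
file saved as UTF-8; it does not apply as such to XeLaTeX, where the char
normally goes straight to the font unless some package intervenes. With
inputenc the bytes of û are active and end up calling the accent macro \^.
Redefining that macro makes the hidden call visible (the replacement text
is purely illustrative):

```latex
\documentclass{article}
\usepackage[utf8]{inputenc} % assumption: pdfLaTeX, file saved as UTF-8
\begin{document}
û % typeset via the accent macro \^ applied to u

\begingroup
% Illustrative redefinition only: make the hidden macro call visible.
\renewcommand{\^}[1]{(circumflex over #1)}
û % should now print the replacement text instead of an accented letter
\endgroup
\end{document}
```

The input is the same in both places; only the macro behind it has changed.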

>>> At least today I cannot copy
>>> composed Unicode characters from any PDF file.

>> Does "any PDF" include arbitrary PDFs from the net?

> Yes! I have some German PDF documents, often created without pdfTeX  
> or XeTeX. (The worst of them is from pdfFactory 2.42 (Windows XP  
> Professional German) – it seems to use some encryption or scrambling,
> so that what you see & copy is not what you paste & see. The names of
> the ugly ragged fonts are hidden, except that they are TT.) But now  
> I've found a few causes/culprits ...
> 
> 	First problem is GNU Emacs – it does not like to compose.
> 	Second problem is Apple's PDFKit – it decomposes.
> 	Third problem are TextEdit and Adobe Reader – they behave correctly  
> (IMO).
> 
> When I copy from TeXShop (my preferred PDF viewer) to TextEdit the  
> characters are OK, composed as in GNU Emacs, i.e. the XeLaTeX source  
> file, or as displayed in the PDF viewer. Pasting the same into GNU  
> Emacs shows it decomposed.

On the whole your description sounds as if your PDFs don't use the
correct glyphs. In a correct PDF "û" is *not* a composed char but the
simple char "ucircumflex". I don't see any reason why or how an application
should decompose such a glyph. I think you should really make a complete
minimal example.
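
Such a minimal example might look like the sketch below. It assumes
XeLaTeX with fontspec and its default font; the four test lines are only a
suggestion. It puts the directly typed char, the macro form, the
precomposed code point and an explicitly decomposed sequence into one PDF,
so that copy and paste can be compared line by line:

```latex
% Compile with xelatex; file saved as UTF-8.
\documentclass{article}
\usepackage{fontspec} % assumption: the default font contains U+00FB
% On older installations \^{u} may additionally need \usepackage{xunicode}.
\begin{document}
direct input: û

macro input: \^{u}

precomposed by code point: \char"00FB

% May show a missing glyph if the font lacks U+0302.
decomposed (u + combining circumflex): u\char"0302
\end{document}
```

If the first three lines paste as a single "û" and only the last one
pastes as two chars, the PDF itself is fine and the decomposition happens
in the viewer or in the application you paste into.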

-- 
Ulrike Fischer