[XeTeX] PDF properties like title, author and keywords

Peter Dyballa Peter_Dyballa at Web.DE
Wed Aug 13 00:46:05 CEST 2008


On 11.08.2008 at 17:41, Ulrike Fischer wrote:

>> Gr\303e" does not tell the option's intention.
>
> It works perfectly with an utf8-source and miktex


Yes, it works correctly. The PDF file contains the string (a bit  
edited by me):

	feff·0047·0072·00c3·bc·00c3·9f·0065

which can be translated to:

	ZERO WIDTH NO-BREAK SPACE
	LATIN CAPITAL LETTER G			G
	LATIN SMALL LETTER R			r
	LATIN CAPITAL LETTER A WITH TILDE   +
	VULGAR FRACTION ONE QUARTER		ü
	LATIN CAPITAL LETTER A WITH TILDE   +
	<8 bit control>				ß
	LATIN SMALL LETTER E			e
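
The translation above can be reproduced mechanically. What follows is only a minimal sketch in Python, assuming the groups are exactly as printed in the edited dump; the helper name `describe_units` is made up for illustration. The last two lines check that the pairs marked "+" are the UTF-8 byte sequences of the intended letters.

```python
import unicodedata

# Groups exactly as printed in the edited dump above.
DUMP = "feff 0047 0072 00c3 bc 00c3 9f 0065"

def describe_units(dump):
    """Return (hex group, Unicode name) pairs, one per group of the dump."""
    rows = []
    for group in dump.split():
        code_point = int(group, 16)
        # Control characters such as U+009F have no name in the Unicode
        # database, so fall back to a placeholder.
        name = unicodedata.name(chr(code_point), "<control>")
        rows.append((group, name))
    return rows

for group, name in describe_units(DUMP):
    print(f"{group:>4}  {name}")

# The pairs marked "+" are the UTF-8 byte sequences of the intended letters.
print(bytes([0xC3, 0xBC]).decode("utf-8"))  # ü
print(bytes([0xC3, 0x9F]).decode("utf-8"))  # ß
```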

In a GNU Emacs *shell* buffer the sequence 'üÃ<control>' is reliably
reduced to \o303 = \d195 = \xC3. In ISO Latin encodings this byte is
often Ã, but since the shell runs in a UTF-8 environment the single
irregular byte is shown in octal.
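
The three notations name the same byte value, and a lone 0xC3 is not valid UTF-8 on its own, which would explain why a UTF-8 display falls back to an escape. A small check in Python, used here only for illustration; it says nothing about how Emacs implements this:

```python
lone = bytes([0o303])

# Octal 303, decimal 195 and hex C3 are the same value.
assert 0o303 == 195 == 0xC3

# In ISO Latin-1 the byte is a complete character.
print(lone.decode("latin-1"))  # Ã

# In UTF-8 it is only a lead byte that announces a two-byte sequence,
# so decoding it alone fails.
try:
    lone.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print(valid_utf8)  # False
```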

Adobe Reader 9 interprets "Grüße" as: GrÃ밀쎟e. That's knee-how.
But I think there is a bug in xdvipdfmx 0.7.3 that creates single-byte
codes where in UTF-8 there should be two. A different interpretation of
the string from the PDF file, grouping the bytes into 16-bit units, is:

	feff·0047·0072·00c3·bc00·c39f·0065

which leads to the HANGUL SYLLABLE MIL and the HANGUL SYLLABLE SSEH,  
i.e., "GrÃ밀쎟e."
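
This regrouping is what a UTF-16 decoder does with the bytes. A short sketch in Python, assuming the dump above is the complete string and that the reader treats a string starting with feff as big-endian UTF-16. The last lines show, for comparison, what a clean UTF-16 encoding of "Grüße" looks like.

```python
# Bytes of the string as found in the PDF file (separators removed).
raw = bytes.fromhex("feff00470072" "00c3bc00c39f" "0065")

# The "utf-16" codec consumes the leading byte order mark (fe ff) and
# then reads big-endian 16-bit units: 0047 0072 00c3 bc00 c39f 0065.
shown = raw.decode("utf-16")
print(shown)  # GrÃ밀쎟e

# For comparison: the intended text as big-endian UTF-16 with a BOM.
expected = b"\xfe\xff" + "Grüße".encode("utf-16-be")
print(expected.hex(" ", 2))  # feff 0047 0072 00fc 00df 0065
```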


Xdvipdfmx, although producing the best results, needs an update  
before being released on DVD as part of TeX Live 2008.
	

With pdfTeX the results are even more ridiculous, already in the PDF
file itself, even when the source file is UTF-8 encoded and the input
encoding is set to utf8 or utf8x.

--
With peaceful greetings

   Pete

They that can give up essential liberty to obtain a little temporary  
safety deserve neither liberty nor safety.
		-Benjamin Franklin, Historical Review of Pennsylvania.




