[XeTeX] Hyperref \hyperlink and \hypertarget not working with accented characters

Philip TAYLOR (Webmaster, Ret'd) P.Taylor at Rhul.Ac.Uk
Wed Nov 2 14:11:04 CET 2011



Heiko Oberdiek wrote:

>>      ä, ö, ü, ß
>>      \bye
>>
>> is anything other than a normal, everyday, document ?
>
> The mail header of your posting, sent by the list server,
> contains:
>    Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>
> Then I must have received a quite abnormal mail, outside the norm?

No, Heiko, what you received was my mail client trying to deal with
your mail client in an encoding acceptable to both.  My original
message, with l-bar and o-ogonek, was encoded as :

	text/plain; charset="utf-8"; format=flowed

your reply, with the l-bar and o-ogonek replaced by paired
interrogatives ("arbitrary rubbisch", to use your own words),
was encoded as :

	text/plain; charset="iso-8859-1"

Thereafter, my mail client, realising that your mail client
still lived in the dark ages, fell back on a legacy encoding
that was acceptable to both.
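
The loss described above can be reproduced with a short Python sketch.
It assumes the two characters in question are U+0142 (l-bar) and
U+01EB (o-ogonek); the variable names are illustrative only, and how
any particular mail client performs its conversion is not known from
this thread.  Neither character exists in ISO-8859-1, so a lossy
conversion has nothing to substitute but question marks.

```python
# Hypothetical sample text: l-bar (U+0142) followed by o-ogonek (U+01EB).
text = "\u0142\u01eb"

# In UTF-8 each of these characters is represented by two bytes.
utf8_bytes = text.encode("utf-8")

# ISO-8859-1 has no slot for either character; with errors="replace"
# the codec substitutes "?" for each one instead of raising an error.
latin1_bytes = text.encode("iso-8859-1", errors="replace")

print(utf8_bytes)    # b'\xc5\x82\xc7\xab'
print(latin1_bytes)  # b'??'
```

Without errors="replace" the second call raises UnicodeEncodeError,
which is the strict equivalent of the same mismatch.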

> From the last mails I found 477 lines with:
>    "Content-Type: text/plain; charset="..."
>
> us-ascii: 237
> UTF-8: 114
> ISO-8859-1: 60
> ISO-8859-2: 44
> windows-1252: 21
> windows-1256: 1

Doubtless if you looked harder, you could find some in BIG-5
as well.  That does not mean that the world has not moved on
(at least, parts of it; others, despite their diacritic-rich
heritage, seem remarkably and inexplicably opposed to progress).

> Of course, also a font in T1, ... encoding can be used.
> Or the input encoding might differ from the font
> encoding by mapping via macros, ...

We could even agree, Humpty-Dumpty-like, that when I write
the letter "e", it means exactly what I choose it to mean,
neither more nor less.  So today, I could choose it to mean
o-ogonek; tomorrow, l-bar.  The world is one's oyster, if
one is Humpty Dumpty.
>
> Back to XeTeX:
>
> Byte string means that the string consists of bytes 0-255 (or 1..255).
> Can you write them with XeTeX in a file or use as destination names
> without using a different encoding?

I do not understand the question.  There /is/ no "encoding" in a
byte string; it is a byte string, by definition.  What am I missing ?
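
A minimal Python sketch of the point, under the assumption that
"byte string" means nothing more than a sequence of values 0..255:
the bytes themselves carry no encoding label, and the same two bytes
yield different text depending on the encoding the reader chooses to
apply.  The names below are illustrative; the sketch says nothing
about how XeTeX or hyperref treat destination names.

```python
# Two raw bytes; nothing in the object records an encoding.
raw = bytes([0xC5, 0x82])

# Interpreted as UTF-8, the pair is a single character, U+0142 (l-bar).
as_utf8 = raw.decode("utf-8")

# Interpreted as ISO-8859-1, each byte is its own character:
# U+00C5 followed by the control character U+0082.
as_latin1 = raw.decode("iso-8859-1")

print(len(raw), len(as_utf8), len(as_latin1))  # 2 1 2
print(as_utf8 == as_latin1)                    # False
```

The encoding enters only when the bytes are turned into characters
(or characters into bytes), never in the byte sequence itself.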

** P.

