[XeTeX] XeTeX in lshort

Mon Sep 27 01:16:45 CEST 2010

Abstract: lshort needs a chapter/section of its own about Unicode.
From what I experience here, a lot of TeX users are so brainwashed into
using \v{c} and the like that they don't even realize that it is also
possible to use Unicode (or other 8-bit encodings for that matter).
First: this is unrelated to XeTeX; UTF-8 can/should just as well be
used in pdfLaTeX. Second: I'm not sure if a special section is needed
to mention all the zillions of methods to enter Unicode characters,
though that could be covered in a separate document.

On Sun, Sep 26, 2010 at 22:08, Tobias Schoel wrote:
>
> Agreed. But one should at least give a reference link to information about
> how to input Unicode in Windows, OS X and Linux respectively. There is no
> advantage in telling the people in lshort: "There is also XeLaTeX, which
> lets you input everything in Unicode and use any OpenType or TrueType font
> on your system.", if you don't tell them how to do this or at least where to
> find information about it. These people will simply say: "What the heck is
> Unicode again? I simply press the keys on my keyboard."
>
> The "usual" Windows user has little or no knowledge of input methods
> other than the one he is used to. Consequently he won't see any advantage in
> _being allowed to_ enter any Unicode character if he isn't _able to_.

Wait. Unicode is not only about US users who need to typeset an accent
every now and then, but also about users who know how to use their
local keyboard, but keep using cp-1250, cp-1252, ...

Maybe I'm wrong, but (talking about a couple of years ago) I found it
much more difficult to actually *save* the file in proper encoding
(with editors defaulting to some random local encoding) than to
properly enter the character that I needed for whatever reason.
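
A small Python sketch of why saving can be the harder part. The sample letter (č) and the choice of code pages are assumptions picked for illustration, not taken from any real file. The same character is stored as different bytes depending on the encoding, so an editor that silently picks the wrong one writes or shows something else:

```python
# Illustration only: the sample letter is an assumption for the sketch.
c_caron = "\u010d"  # LATIN SMALL LETTER C WITH CARON

utf8_bytes = c_caron.encode("utf-8")     # two bytes in UTF-8
cp1250_bytes = c_caron.encode("cp1250")  # one byte in the Central European code page

# The same single byte read under the Western European code page
# turns into a different letter (e with grave).
misread = cp1250_bytes.decode("cp1252")

# A lone 8-bit byte like this is not valid UTF-8 at all.
try:
    cp1250_bytes.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(utf8_bytes, cp1250_bytes, ascii(misread), valid_utf8)
```

Typing the letter is identical in every case; only the bytes that end up on disk differ, and that choice is usually made by the editor's defaults.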

Our keyboard on Windows is capable of producing most accented Latin
letters, but I don't think that anyone would want to explain how to
use every single keyboard in that short section.

There are two kinds of non-ASCII character usage: writing in one's native
language and writing foreign names. The second is a bit exceptional
and can often be dealt with by copy-paste\footnote{I maintain the package
with hyphenation patterns, which needs a lot of different Unicode
characters for various reasons, and I have created my own keyboard
layout, but I don't have the slightest idea how to input any accented
character apart from those used in my language and German, neither in my
OS nor in my text editor. Copy-paste fully serves the purpose. On
Windows we had a labeled keyboard with dead keys, but now I survive
without one.}. Those who need to type characters from their local alphabet
usually know how to do that. After all, they need to use other
applications and their keyboard is usually configured properly.

But all these users don't necessarily know anything about file encodings
(and why should they?). They just use whatever encoding works for
them. I kept using an 8-bit encoding for a long time after I already knew
that UTF-8 was a better choice, just because editors had enormous
problems with Unicode.

My favorite TeX editor (WinEdt) didn't support Unicode at all some
years ago, and even (g)Vim *still* simply defaults to cp-1250; it
is far from straightforward for a new user to convince it to use
UTF-8. My teacher, who uses a Mac, opened a document in UTF-8, edited a
whole bunch of stuff and saved it in MacRoman encoding,
irreversibly. Macs are well known for their
"Unicode"-awareness and TeXShop is considered to be a solid editor,
but well ... the defaults are still 8-bit.
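
One way such a save can become irreversible, sketched in Python. This assumes the text contains letters that MacRoman cannot represent and that the editor substitutes a placeholder instead of refusing to save; both the sample letters and the "replace" policy are assumptions for illustration, not a description of what TeXShop actually did:

```python
# Illustration only: sample letters and the "replace" policy are assumptions.
original = "\u010d\u0161\u017e"  # c, s and z with caron

# MacRoman has no slot for these letters, so a forced save has to
# substitute something; Python's "replace" policy writes "?".
saved = original.encode("mac_roman", errors="replace")

# Reading the file back cannot restore what was never written.
reopened = saved.decode("mac_roman")

print(saved, reopened == original)
```

Once the placeholders are on disk, no choice of encoding at reopening time brings the original letters back.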

Mojca