[XeTeX] Ugly output with ngerman when using quotation

Ross Moore ross at ics.mq.edu.au
Mon Feb 7 22:54:07 CET 2005


Hi Julius, Jonathan, and others.


On 08/02/2005, at 2:36 AM, dschuli at gmx.net wrote:

> Hello Ross, Jonathan and fellow XeTeXers,
> I've compiled some LaTeX files that test all the quotation marks I 
> could find.
> Included is frenchTest which includes typical french characters and 
> the same quotations;
> ngermanTest which includes all the german umlauts and quotations and
> germanWObabel without babel ngerman option.
> Problems occur *only* with babel german/ngerman.
> Corruption *only* occurs with straight quotations, e.g. Susy said: 
> "Shut the door.".
> All other quotation marks display correctly.

What you are seeing is the way " is defined by Babel, to produce
umlauts and other special effects.
e.g.  "u --> ü  and  "S --> SS   are quite intentional.

Of course the former is quite redundant, indeed wrong, in a
UTF8-encoded file as used with XeTeX.
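
The effect on straight quotes can be seen in a small test file.
This is only a sketch: it assumes the "s (sharp s) shorthand in
addition to the two mentioned above, and the sample sentence is the
one from Julius's report.

```latex
\documentclass{article}
\usepackage[ngerman]{babel}
\begin{document}
% Intended use of the shorthands:
M"uller, Stra"se   % "u gives u-umlaut, "s gives sharp s

% Unintended use: the opening quote followed by S is read as
% the shorthand "S, so the quote mark is not printed as typed.
Susy said: "Shut the door."
\end{document}
```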

Babel also does other things, such as define the names to use with
document section-parts, and the names of the months, perhaps
for more than one dialect:

e.g. in ngerman.ldf  (Austrian dialect):

\@namedef{captions\CurrentOption}{%
   \def\prefacename{Vorwort}%
   \def\refname{Literatur}%
   \def\abstractname{Zusammenfassung}%
   \def\bibname{Literaturverzeichnis}%
   \def\chaptername{Kapitel}%
   \def\appendixname{Anhang}%
   \def\contentsname{Inhaltsverzeichnis}%    % oder nur: Inhalt
   \def\listfigurename{Abbildungsverzeichnis}%
   \def\listtablename{Tabellenverzeichnis}%
   \def\indexname{Index}%
   \def\figurename{Abbildung}%
   \def\tablename{Tabelle}%                  % oder: Tafel
   \def\partname{Teil}%
   \def\enclname{Anlage(n)}%                 % oder: Beilage(n)
   \def\ccname{Verteiler}%                   % oder: Kopien an
   \def\headtoname{An}%
   \def\pagename{Seite}%
   \def\seename{siehe}%
   \def\alsoname{siehe auch}%
   \def\proofname{Beweis}%
   \def\glossaryname{Glossar}%
   }
\def\month@ngerman{\ifcase\month\or
   Januar\or Februar\or M\"arz\or April\or Mai\or Juni\or
   Juli\or August\or September\or Oktober\or November\or Dezember\fi}
\def\datengerman{\def\today{\number\day.~\month@ngerman
     \space\number\year}}
\def\datenaustrian{\def\today{\number\day.~\ifnum1=\month
   J\"anner\else \month@ngerman\fi \space\number\year}}


Thus it does make sense to load Babel, with language options,
even when using XeTeX.

However, you will need to deactivate the active characters,
if you want to use UTF8 quote characters in their natural form.



> This means:
> - often the character following a closing quotation mark is lost,
> - sometimes an opening quotation mark is turned into S,
> - "text" i is a problem
>
> I hope Ross will fix this soon. I hope it's not too much work!
> In the meantime I use the different quotations.


Babel has an internal macro for deactivating active characters:

\makeatletter
   \bbl@deactivate{"}
\makeatother

However, there is a catch:  you cannot deactivate " unless it
is active already, and that doesn't happen until \begin{document}.

So one solution is to add this code immediately after \begin{document}:

\begin{document}
\makeatletter
   \bbl@deactivate{"}
\makeatother
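
Babel also documents a user-level command, \shorthandoff, for
switching a shorthand character off. This is not the macro used
above; the sketch below swaps it in on the assumption that your Babel
version provides it, and places it after \begin{document} for the
same reason.

```latex
\documentclass{article}
\usepackage[ngerman]{babel}
\begin{document}
\shorthandoff{"}   % assumption: " is an ordinary character from here on
Susy said: "Shut the door."
\end{document}
```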



A better solution, IMHO, which applies to *all* characters that Babel
may activate, is to forget about  \bbl@deactivate  and load the
packages this way:


\usepackage{fontspec}   %  or  \usepackage{utf8accents}
\makeatletter
   \AtBeginDocument{\let\bbl@activate\@gobble}
\makeatother
\usepackage[ngerman]{babel}


This should work with current versions of Babel
and utf8accents.sty (which is loaded by  fontspec.sty ).

Beware, there may still be aspects of Babel usage that
I've not taken into account in this diagnosis and "fix".

So before I incorporate a way to build this "Babel off-switch"
into (the next version of) utf8accents.sty, would people
who regularly use Babel extensively please test it and
report any problems.
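
A test file along these lines might be a starting point. It is a
sketch only: the preamble is the one given above, while the sample
text and the choice of names to print are illustrative assumptions,
and the result may differ between Babel versions.

```latex
% Compile with xelatex; save the file as UTF-8.
\documentclass{article}
\usepackage{fontspec}
\makeatletter
   % From the message above: stop Babel activating any character.
   \AtBeginDocument{\let\bbl@activate\@gobble}
\makeatother
\usepackage[ngerman]{babel}
\begin{document}
% Straight quotes followed by letters that are Babel shorthands:
Susy said: "Shut the door."  "unter" "text" i

% Umlauts typed directly as UTF-8:
Müller, Jänner, März

% Check that Babel's names and dates are still in German:
\today, \chaptername, \contentsname
\end{document}
```

If the quotes print as typed and the date and names still appear in
German, the off-switch is doing what is intended for this language.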


>   ... Actually comes in handy that we have a choice.

Yes. With this in mind, it is arguably better that I do *not*
incorporate the "fix" into  utf8accents.sty  as this would
then impose it always when Babel is loaded.
It may be that someone wants to keep using the active form of "
(e.g., with older manuscripts) and get proper accents.
This would require different coding to make it happen.

So I would not encourage anyone to pre-empt me and include
the above ideas into any other package. It's important to
get a response from people who want to use both XeTeX and
Babel together, so that we can work out a way to keep
everyone satisfied, as much as possible.


>
> ;-)
>
> Julius


BTW,
would Hans van Maanen please test this with Dutch,
to see if it helps there at all.



Hope this helps,

	Ross


------------------------------------------------------------------------
Ross Moore                                         ross at maths.mq.edu.au
Mathematics Department                             office: E7A-419
Macquarie University                               tel: +61 +2 9850 8955
Sydney, Australia                                  fax: +61 +2 9850 8114
------------------------------------------------------------------------


