[XeTeX] Please help me with \XeTeXinterchartoks
Jonathan Kew
jonathan at jfkew.plus.com
Tue Dec 2 12:37:30 CET 2008
On 2 Dec 2008, at 15:08, VAFA KHALIGHI wrote:
> Dear Ross and Ulrike, thanks for your wonderful and helpful answers.
> it seems that I need to wait to see what Jonathan says.
>
> I just want to know if what I want to achieve is possible or not? so
> yes or no?
Your approach didn't really make sense to me. You don't want to give
every letter a different class; if you're doing that, you might as
well make each one a distinct macro. The point of *classes* is that
you can handle a whole collection of similarly-behaved characters as a
unit.
(Note that some default class assignments are already preloaded in the
format files; see unicode-letters.tex. You may want to avoid clashing
with those. We really should have a \newclass allocator; it just
hasn't gotten done yet.)
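In the absence of a built-in allocator, a minimal sketch of one might
look like the following. The names \lastcharclass and \newcharclass are
hypothetical, not part of XeTeX; the sketch assumes that classes 1-3
are the only preloaded ones and that 255 is reserved as the boundary
class, so 254 is the highest class it hands out.
\newcount\lastcharclass
\lastcharclass=3 % assumption: classes 1-3 are taken by unicode-letters.tex
\newcommand{\newcharclass}[1]{%
  \ifnum\lastcharclass<254 % assumption: 255 is the boundary class
    \global\advance\lastcharclass by 1
    \global\chardef#1=\lastcharclass
  \else
    \errmessage{No room for a new character class}%
  \fi}
\newcharclass\latinclass % \latinclass now stands for the number 4
A name allocated this way can be used wherever a class number is
expected, e.g. \XeTeXcharclass \n=\latinclass or
\XeTeXinterchartoks 0 \latinclass {\startlatin}.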
Anyhow, here is a small example:
\documentclass{article}
\usepackage{bidi}
\usepackage[cm-default]{fontspec}
\newfontfamily{\ar}[Script=Arabic]{Scheherazade}
% classes 1-3 are used in unicode-letters.tex, so we'll put the Latin
% letters in 4
\newcount\n
\n=`\A \loop \XeTeXcharclass \n=4 \ifnum\n<`\Z \advance\n by 1 \repeat
\n=`\a \loop \XeTeXcharclass \n=4 \ifnum\n<`\z \advance\n by 1 \repeat
% when we encounter class 4, we'll do \startlatin
\XeTeXinterchartoks 0 4 {\startlatin}
\XeTeXinterchartoks 255 4 {\startlatin}
% and when we encounter class 0, we'll do \finishlatin
\XeTeXinterchartoks 255 0 {\finishlatin}
\XeTeXinterchartoks 4 0 {\finishlatin}
\newif\iflatin
\newcommand{\startlatin}{\iflatin\else\bgroup\beginL\rm\latintrue\fi}
\newcommand{\finishlatin}{\iflatin\unskip\endL\egroup{ }\fi}
\XeTeXinterchartokenstate=1
\begin{document}
\setRL\ar
السلام عليكم
hello world
وعليكم السلام
\end{document}
However, I suspect you're not really going to be able to do this on a
large scale, because it will be too difficult to handle things like
punctuation and spacing at direction changes. In unidirectional text,
it may not matter whether the "language switch" happens before or
after the space (or punctuation mark), but with bidi it does matter. I
think in the end you're still going to need markup if you want to
reliably mix LR and RL scripts.
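For comparison, a minimal sketch of the explicit-markup alternative,
reusing the preamble of the example above. The macro \latin is a
hypothetical helper, not something bidi or fontspec provides; it wraps
its argument in \beginL...\endL inside a group and uses \normalfont to
switch back from the Arabic font. With markup, the author decides
exactly which spaces and punctuation belong to the left-to-right run.
\documentclass{article}
\usepackage{bidi}
\usepackage[cm-default]{fontspec}
\newfontfamily{\ar}[Script=Arabic]{Scheherazade}
% \latin is a hypothetical helper: a grouped left-to-right run in the default font
\newcommand{\latin}[1]{{\beginL\normalfont #1\endL}}
\begin{document}
\setRL\ar
السلام عليكم \latin{hello world} وعليكم السلام
\end{document}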
JK