[XeTeX] default char classes

Jonathan Kew jonathan_kew at sil.org
Sun Mar 9 17:07:59 CET 2008


On 9 Mar 2008, at 3:18 pm, Barry MacKichan wrote:

> Yes, that is how we do it now.
>
> I don't actually write multilingual documents myself, but we sell  
> software (Scientific WorkPlace, etc.) that does, and so we are  
> looking for ways to make things simpler for our customers.
>
> The main thing I'm after is to reinforce the concept in LaTeX of  
> separating content and form. The choice of a font for a particular  
> range of unicode characters is strictly a matter of form, yet the  
> author has to do different things in his document, depending on his  
> choice of fonts.
>
> 1. If he uses a font like Minion Pro, which contains Hebrew  
> characters, he needs to do nothing.

He still needs to get \beginR ... \endR (or some higher-level markup
that resolves to this) around the Hebrew text somehow, doesn't he?
That doesn't happen automatically.

Now someone will no doubt tell me that it should! Perhaps; but again,  
there's a limit to what can be done automatically. Given source text  
that contains

     latin latin HEBREW HEBREW latin latin HEBREW HEBREW latin latin.

do we have a Latin-script sentence containing two separate Hebrew  
phrases, or is that a single Hebrew phrase that itself contains an  
embedded Latin quote? There's no way to know without some kind of  
markup or higher-level information, and it matters for layout. In  
other words, there's a crucial difference between these two:

     latin latin \beginR HEBREW HEBREW \endR latin latin
     \beginR HEBREW HEBREW \endR latin latin.

     latin latin \beginR HEBREW HEBREW \beginL latin latin \endL
     HEBREW HEBREW \endR latin latin.

and only the author can tell us -- via markup -- which is intended.
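
For anyone who wants to try the two readings, here is a minimal plain XeTeX sketch. It assumes the TeX--XeT primitives are switched on with \TeXXeTstate=1, and it keeps HEBREW as a stand-in word: to see real right-to-left output you would substitute actual Hebrew text and select a font that covers it. The {} after \endR and \endL is my own addition to the schematic markup, there only so that the following space is not swallowed by the control word.

```tex
% Plain XeTeX sketch; compile with `xetex`.
% HEBREW is a placeholder: replace it with real Hebrew text and load a
% font that covers both scripts before judging the visual result.
\TeXXeTstate=1  % enable the TeX--XeT primitives \beginR, \endR, \beginL, \endL

% Reading 1: a Latin-script sentence containing two separate Hebrew phrases.
% Each right-to-left span is closed before the Latin text resumes.
latin latin \beginR HEBREW HEBREW\endR{} latin latin
\beginR HEBREW HEBREW\endR{} latin latin.

\medskip

% Reading 2: a single Hebrew phrase with an embedded Latin quote.
% The left-to-right span is nested inside one right-to-left span.
latin latin \beginR HEBREW HEBREW \beginL latin latin\endL{}
HEBREW HEBREW\endR{} latin latin.

\bye
```

The source characters are identical in both paragraphs; only the nesting of the direction markers differs, and that nesting is what decides the order in which the runs are laid out.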

Or to take a "simpler" example, if our source text is

     latin latin HEBREW HEBREW? latin latin.

are we looking at a single Latin-script sentence that contains a
Hebrew quote that ends with a question mark, or are we looking at a
Latin-script question (containing a couple of Hebrew words), and then
a second Latin-script sentence? The answer to this will determine
where the question mark appears in the reordered text -- is it part of
the Hebrew inclusion (in which case it appears to the left), or part
of the surrounding Latin script (and appears to the right)?
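
The same distinction can be written out in markup. This is a sketch under the same assumptions as before: plain XeTeX with \TeXXeTstate=1, HEBREW as a placeholder for real Hebrew text, and {} added after \endR only to protect the following space.

```tex
% Plain XeTeX sketch; compile with `xetex`.
% HEBREW is a placeholder for real Hebrew text in a font that covers it.
\TeXXeTstate=1  % enable \beginR ... \endR

% Reading 1: the question mark belongs to the Hebrew quote, so it sits
% inside the right-to-left span and is reordered along with it.
latin latin \beginR HEBREW HEBREW?\endR{} latin latin.

\medskip

% Reading 2: the question mark ends the Latin-script question, so it
% stays outside the right-to-left span and follows it.
latin latin \beginR HEBREW HEBREW\endR? latin latin.

\bye
```

The only difference between the two lines is which side of \endR the question mark falls on, which is exactly the information the unmarked source text cannot supply.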

JK


