[XeTeX] ThreeKingdomsCh1.tex example in XeLaTeX document

Fri Mar 14 01:05:02 CET 2008

Dear XeTeX mailing list members:
I am interested in using XeLaTeX to typeset some vertical Traditional
Chinese text in part of a document that uses the LaTeX class "memoir".
I was able to use the commands
\usepackage{fontspec},
\newfontfeature{Monospaced}{Text Spacing=Monospaced Text}, and
\setromanfont[Monospaced,Scale=2.0]{STSong}
to write characters formatted left to right, but I would like to use the
vertical typesetting produced by the ThreeKingdomsCh1.tex example on the
XeTeX website. Could someone get me started, or point me to instructions
on how to set up my XeLaTeX document to include the vertically formatted
Chinese text?

Also, I am using just TeX Live 2007 and compiling from the terminal. I get
the error:
## xdv2pdf: use of uninstalled fonts (specified by filename) such as
##   [/usr/local/texlive/2007/texmf-dist/fonts/opentype/public/lm/lmroman10-regular.otf]
## is not supported; try using the xdvipdfmx driver instead.
which is discussed in:

http://email.esm.psu.edu/pipermail/macosx-tex/2007-February/028793.html

However, since I don't have a ~/Library/TeXShop/Engines directory,
I do not know how to switch the output driver that xelatex uses. Is
there a configuration script I can run to make xelatex use xdvipdfmx
instead of xdv2pdf?

Thank you for any help you might be able to provide.
David Rangel

On Mar 13, 2008, at 4:00 AM, xetex-request at tug.org wrote:

> Today's Topics:
>
>   1. Re: Encoding of auxiliary files (Ulrike Fischer)
>   2. Re: Encoding of auxiliary files (Jonathan Kew)
>   3. Re: default char classes (Barry MacKichan)
>   4. Re: Encoding of auxiliary files (Ulrike Fischer)
>   5. Re: Encoding of auxiliary files (Jonathan Kew)
>   6. Re: default char classes (Jonathan Kew)
>   7. Re: Encoding of auxiliary files (Ulrike Fischer)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 12 Mar 2008 12:45:12 +0100
> From: Ulrike Fischer <news2 at nililand.de>
> Subject: Re: [XeTeX] Encoding of auxiliary files
> To: xetex at tug.org
> Message-ID: <s13utgendku7$.dlg at nililand.de>
> Content-Type: text/plain; charset="us-ascii"
>
> Am Wed, 12 Mar 2008 09:57:28 +0000 schrieb Jonathan Kew:
>
>>> as far as I can see Xe(La)TeX writes auxiliary files like the .aux
>>> and the .toc-file always in utf-8. Is this true?
>>
>> Yes.
>>
>>> If yes I think the \XeTeXdefaultencoding command is a bit useless as
>>> you will run into trouble if the auxiliary files contain chars
>>> outside the ASCII-range (which is quite probable in the case of
>>> .toc).
>
>> Right; this is quite limited. The main reason it exists is for cases
>> where you need to read an existing file that uses a legacy encoding,
>> and you can't modify the actual input file to declare the proper
>> \XeTeXinputencoding, so you need to set the encoding before giving
>> the \input command.
>
> Yes, that makes sense. Did I get it right that the setting is global,
> so I would have to reset it after the input?
>
> Btw: Is it correct that the following code
>
> \documentclass{scrreprt}
> \usepackage{fontspec}
> \begin{document}
> {\XeTeXdefaultencoding "cp1252"
> \XeTeXdefaultencoding "auto"}
>
> test
> \end{document}
>
> gives the message
> ### simple group (level 1) entered at line 5 ({)
> ### bottom level
>
> ?
> (It works fine if I add a \relax after the "auto").
>
>
> -- 
> Ulrike Fischer
>
>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 12 Mar 2008 12:41:00 +0000
> From: Jonathan Kew <jonathan_kew at sil.org>
> Subject: Re: [XeTeX] Encoding of auxiliary files
> To: news2 at nililand.de,	Unicode-based TeX for Mac OS X and other
> 	platforms <xetex at tug.org>
> Message-ID: <D597EC67-A1D9-483A-AE98-DBD3315AC92E at sil.org>
> Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
>
> On 12 Mar 2008, at 11:45 am, Ulrike Fischer wrote:
>
>> Am Wed, 12 Mar 2008 09:57:28 +0000 schrieb Jonathan Kew:
>>
>>>> as far as I can see Xe(La)TeX writes auxiliary files like the .aux
>>>> and the .toc-file always in utf-8. Is this true?
>>>
>>> Yes.
>>>
>>>> If yes I think the \XeTeXdefaultencoding command is a bit useless as
>>>> you will run into trouble if the auxiliary files contain chars
>>>> outside the ASCII-range (which is quite probable in the case of
>>>> .toc).
>>
>>> Right; this is quite limited. The main reason it exists is for cases
>>> where you need to read an existing file that uses a legacy encoding,
>>> and you can't modify the actual input file to declare the proper
>>> \XeTeXinputencoding, so you need to set the encoding before giving
>>> the \input command.
>>
>> Yes, that makes sense. Did I get it right that the setting is global,
>> so I would have to reset it after the input?
>
> Yes.
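>
> A minimal sketch of that pattern, assuming a hypothetical legacy file
> named legacy-chapter.tex that is encoded in cp1252 (the file name is an
> illustration only, not something from this thread). Because the setting
> is global, it is switched back to "auto" once the file has been read:
>
> ```latex
> % Declare the encoding for files opened from here on.
> \XeTeXdefaultencoding "cp1252"\relax
> % Hypothetical legacy file that cannot be edited to declare its encoding.
> \input{legacy-chapter}
> % The setting is global, so restore it for any files opened later.
> \XeTeXdefaultencoding "auto"\relax
> ```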
>
>> Btw: Is it correct that the following code
>>
>> \documentclass{scrreprt}
>> \usepackage{fontspec}
>> \begin{document}
>> {\XeTeXdefaultencoding "cp1252"
>> \XeTeXdefaultencoding "auto"}
>>
>> test
>> \end{document}
>>
>> gives the message
>> ### simple group (level 1) entered at line 5 ({)
>> ### bottom level
>>
>> ?
>> (It works fine if I add a \relax after the "auto").
>
> It is correct, though admittedly a little surprising. The issue here
> is that encoding names (which are treated like filenames as far as
> TeX's scanner is concerned) need to be terminated somehow. The quotes
> do not necessarily delimit them, because it's possible for a name to
> be constructed from several quoted fragments, as in
>
>   \def\name{"file name"}
>   \def\ext{".txt"}
>   \input \name\ext
>
> which should read "file name.txt", despite the scanner seeing
> "file name"".txt".
>
> A space (outside the quotes) would be adequate to terminate the name,
> though \relax may be nice in that it's more visible.
>
> This is really a manifestation of the same "surprise" as you get if
> you try (in either pdftex or xetex) to say
>
>   \setbox0=\vbox{\input filename}
>
> expecting the text from "filename" to be set in a box.
>
> The lesson: always provide a space or \relax to terminate the file
> (or font or encoding) name.
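>
> Applied to the test document quoted above, that might look as follows.
> This is only a sketch; the one change is the \relax after "auto", which
> ends the name before the scanner reaches the closing brace:
>
> ```latex
> \documentclass{scrreprt}
> \usepackage{fontspec}
> \begin{document}
> % The end of line after "cp1252" acts as a space and ends the name.
> {\XeTeXdefaultencoding "cp1252"
> % \relax ends the name, so the closing brace is not taken into it.
> \XeTeXdefaultencoding "auto"\relax}
>
> test
> \end{document}
> ```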
>
> JK
>
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 12 Mar 2008 06:31:02 -0700
> From: Barry MacKichan <barry at mackichan.com>
> Subject: Re: [XeTeX] default char classes
> To: xetex at tug.org
> Message-ID: <47D7DB16.5080105 at mackichan.com>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Jonathan, you have convinced me that language markup is needed.
> Actually, with our mostly-WYSIWYG front end, you have to specify RTL
> when appropriate in order to keep the cursor from jumping every time you
> type a space -- it gets the direction from the font but then thinks it
> has changed when it sees the space.
> What I am getting out of this discussion is that the user should not
> think that he is specifying a font with a tag -- with many Unicode fonts
> this is unnecessary -- but he is specifying a language. And the language
> determines much more than the font ...
>
> I am curious about Will's question. Are there efficiency concerns in
> defining lots of large token classes?
>
> --Barry
>> Message: 1
>> Date: Sun, 9 Mar 2008 16:07:59 +0000
>> From: Jonathan Kew <jonathan_kew at sil.org>
>> Subject: Re: [XeTeX] default char classes
>> To: barry.mackichan at mackichan.com,	Unicode-based TeX for Mac OS X and
>> 	other platforms <xetex at tug.org>
>> Message-ID: <320757AA-4287-4530-BDE5-AD6E330BD57E at sil.org>
>> Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
>>
>> On 9 Mar 2008, at 3:18 pm, Barry MacKichan wrote:
>>
>>
>>> Yes, that is how we do it now.
>>>
>>> I don't actually write multilingual documents myself, but we sell
>>> software (Scientific WorkPlace, etc.) that does, and so we are
>>> looking for ways to make things simpler for our customers.
>>>
>>> The main thing I'm after is to reinforce the concept in LaTeX of
>>> separating content and form. The choice of a font for a particular
>>> range of unicode characters is strictly a matter of form, yet the
>>> author has to do different things in his document, depending on his
>>> choice of fonts.
>>>
>>> 1. If he uses a font like Minion Pro, which contains Hebrew
>>> characters, he needs to do nothing.
>>>
>>
>> He still needs to get \beginR....\endR (or something higher-level
>> that resolves to this) around the Hebrew text somehow, doesn't he?
>> That doesn't happen automatically.
>>
>> Now someone will no doubt tell me that it should! Perhaps; but again,
>> there's a limit to what can be done automatically. Given source text
>> that contains
>>
>>     latin latin HEBREW HEBREW latin latin HEBREW HEBREW latin latin.
>>
>> do we have a Latin-script sentence containing two separate Hebrew
>> phrases, or is that a single Hebrew phrase that itself contains an
>> embedded Latin quote? There's no way to know without some kind of
>> markup or higher-level information, and it matters for layout. In
>> other words, there's a crucial difference between these two:
>>
>>     latin latin \beginR HEBREW HEBREW \endR latin latin \beginR
>> HEBREW HEBREW \endR latin latin.
>>
>>     latin latin \beginR HEBREW HEBREW \beginL latin latin \endL
>> HEBREW HEBREW \endR latin latin.
>>
>> and only the author can tell us -- via markup -- which is intended.
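>>
>> As a rough plain XeTeX sketch of those two readings, assuming the
>> uppercase words are placeholders for real Hebrew text and that a font
>> covering Hebrew has been selected. Note that \beginR and \endR are
>> only usable once \TeXXeTstate is set to 1:
>>
>> ```latex
>> \TeXXeTstate=1 % enable the TeX--XeT direction primitives
>> % Reading 1: a Latin sentence containing two separate Hebrew phrases.
>> latin latin \beginR HEBREW HEBREW \endR latin latin
>> \beginR HEBREW HEBREW \endR latin latin.
>>
>> % Reading 2: one Hebrew phrase with an embedded Latin quote.
>> latin latin \beginR HEBREW HEBREW \beginL latin latin \endL
>> HEBREW HEBREW \endR latin latin.
>> \bye
>> ```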
>>
>> Or to take a "simpler" example, if our source text is
>>
>>     latin latin HEBREW HEBREW? latin latin.
>>
>> are we looking at a single Latin-script sentence that contains a
>> Hebrew quote that ends with a question mark, or are we looking at a
>> Latin question (containing a couple of Hebrew words), and then a
>> second Latin sentence? The answer to this will determine where the
>> question mark appears in the reordered text -- is it part of the
>> Hebrew inclusion (in which case it appears to the left), or part of
>> the surrounding Latin script (and appears to the right)?
>>
>> JK
>>
>>
>>
>
>
>
> ------------------------------
>
> Message: 4
> Date: Wed, 12 Mar 2008 18:12:13 +0100
> From: Ulrike Fischer <news2 at nililand.de>
> Subject: Re: [XeTeX] Encoding of auxiliary files
> To: xetex at tug.org
> Message-ID: <tl0m8zixz75p.dlg at nililand.de>
> Content-Type: text/plain; charset="us-ascii"
>
> Am Wed, 12 Mar 2008 12:41:00 +0000 schrieb Jonathan Kew:
>
>>> Btw: Is it correct that the following code
>
>>> {\XeTeXdefaultencoding "cp1252"
>>> \XeTeXdefaultencoding "auto"}
>
>>> gives the message
>>> ### simple group (level 1) entered at line 5 ({)
>>> ### bottom level
>
>> It is correct, though admittedly a little surprising. The issue here
>> is that encoding names (which are treated like filenames as far as
>> TeX's scanner is concerned) need to be terminated somehow. The quotes
>> do not necessarily delimit them, because it's possible for a name to
>> be constructed from several quoted fragments,
>
> That's the problem when you do too much LaTeX: you forget that arguments
> are delimited differently when you use primitives. ;-)
>
> But if the quotes don't terminate the filename, then why use them at
> all? \XeTeXdefaultencoding cp1252\relax and \XeTeXdefaultencoding
> auto\relax seem to work fine.
>
>
> -- 
> Ulrike Fischer
>
>
>
> ------------------------------
>
> Message: 5
> Date: Wed, 12 Mar 2008 18:08:12 +0000
> From: Jonathan Kew <jonathan_kew at sil.org>
> Subject: Re: [XeTeX] Encoding of auxiliary files
> To: news2 at nililand.de,	Unicode-based TeX for Mac OS X and other
> 	platforms <xetex at tug.org>
> Message-ID: <057F8813-7239-4D9F-BF4F-833D3972D3D3 at sil.org>
> Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
>
> On 12 Mar 2008, at 5:12 pm, Ulrike Fischer wrote:
>
>> Am Wed, 12 Mar 2008 12:41:00 +0000 schrieb Jonathan Kew:
>>
>>>> Btw: Is it correct that the following code
>>
>>>> {\XeTeXdefaultencoding "cp1252"
>>>> \XeTeXdefaultencoding "auto"}
>>
>>>> gives the message
>>>> ### simple group (level 1) entered at line 5 ({)
>>>> ### bottom level
>>
>>> It is correct, though admittedly a little surprising. The issue here
>>> is that encoding names (which are treated like filenames as far as
>>> TeX's scanner is concerned) need to be terminated somehow. The quotes
>>> do not necessarily delimit them, because it's possible for a name to
>>> be constructed from several quoted fragments,
>>
>> That's the problem when you do too much LaTeX: you forget that arguments
>> are delimited differently when you use primitives. ;-)
>>
>> But if the quotes don't terminate the filename, then why use them at
>> all? \XeTeXdefaultencoding cp1252\relax and \XeTeXdefaultencoding
>> auto\relax seem to work fine.
>
> Right; they'd only really be necessary if you have a space in the
> encoding name. (Which probably doesn't apply to codepage names, but
> it applies to files and especially font names, of course. And they're
> all scanned in the same way by xetex. So I'm in the habit of putting
> quotes around any and all "filenames" in my source.)
>
> JK
>
>
>
> ------------------------------
>
> Message: 6
> Date: Wed, 12 Mar 2008 18:10:25 +0000
> From: Jonathan Kew <jonathan_kew at sil.org>
> Subject: Re: [XeTeX] default char classes
> To: barry.mackichan at mackichan.com,	Unicode-based TeX for Mac OS X and
> 	other platforms <xetex at tug.org>
> Message-ID: <CB740A65-16EB-4AF3-ACA0-BF22355B9EF2 at sil.org>
> Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
>
> On 12 Mar 2008, at 1:31 pm, Barry MacKichan wrote:
>
>> Jonathan, you have convinced me that language markup is needed.
>
> :-)
>
> There are, of course, simple cases where it's possible to get away
> without it, and cases where "magic" font-switching would be handy for
> specific purposes. But it's very hard to design a universal, robust
> system.
>
>> I am curious about Will's question. Are there efficiency concerns in
>> defining lots of large token classes?
>
> The main concern I'd have is that I suspect that in most cases, users
> of character class and inter-char tokens will really only be
> interested in a couple of scripts, and certain classes of characters
> within those scripts (e.g., opening and closing punctuation). So it's
> simplest for them if they define the specific classes that matter for
> their application, and leave everything else in a default "other" class.
>
> If we pre-assign all the Unicode characters to several dozen (at
> least) classes, based on script and on other character categories --
> in fact, we might easily hit 100 classes or more -- then packages
> like zhspacing that care about a certain script, and consider
> everything else "other", will have a lot of extra class-pairs to
> consider, for no obvious benefit. That seems like an extra burden on
> users/macro writers.
>
> What we probably should do, as part of the xetex and xelatex formats,
> is create a \newcharclass allocator (like plain TeX's \newcount,
> etc), to help people manage class numbers without conflict.
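>
> No such allocator exists at the time of this message, so the following
> is only a sketch of what it might look like. The names \lastcharclass,
> \openpunct and \closepunct, the starting value 3, and the 1pt kern are
> all assumptions made for illustration:
>
> ```latex
> % Hypothetical allocator modelled on plain TeX's \newcount.
> \newcount\lastcharclass
> \lastcharclass=3 % assumption: classes 1-3 are already in use
> \def\newcharclass#1{%
>   \global\advance\lastcharclass by 1
>   \global\chardef#1=\lastcharclass % #1 now names the new class number
> }
> % Allocate two classes without hard-coding their numbers.
> \newcharclass\openpunct
> \newcharclass\closepunct
> \XeTeXinterchartokenstate=1 % switch on inter-character token insertion
> \XeTeXcharclass `\( = \openpunct
> \XeTeXcharclass `\) = \closepunct
> % Tokens inserted between a class-0 character and an opening parenthesis.
> \XeTeXinterchartoks 0 \openpunct = {\kern 1pt }
> ```
>
> Every other character stays in the default class 0, which matches the
> "define only the classes that matter" approach described above.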
>
> If someone does want to try and implement comprehensive multi-script
> automatic font switching (despite my reservations!), there's nothing
> to stop them assigning all the Unicode chars to classes based on
> script, and even precompiling this into a format file. (The
> unicode-letters.tex file, and the Perl script that generates it -- found
> in the xetex source tree -- could give some ideas how to go about this.)
>
> JK
>
>
>
> ------------------------------
>
> Message: 7
> Date: Wed, 12 Mar 2008 19:42:14 +0100
> From: Ulrike Fischer <news2 at nililand.de>
> Subject: Re: [XeTeX] Encoding of auxiliary files
> To: xetex at tug.org
> Message-ID: <15bxomn0zvf48.dlg at nililand.de>
> Content-Type: text/plain; charset="us-ascii"
>
> Am Wed, 12 Mar 2008 18:08:12 +0000 schrieb Jonathan Kew:
>
>>> But if the quotes don't terminate the filename, then why use them at
>>> all? \XeTeXdefaultencoding cp1252\relax and \XeTeXdefaultencoding
>>> auto\relax seem to work fine.
>>
>> Right; they'd only really be necessary if you have a space in the
>> encoding name. (Which probably doesn't apply to codepage names, but
>> it applies to files and especially font names, of course. And they're
>> all scanned in the same way by xetex. So I'm in the habit of putting
>> quotes around any and all "filenames" in my source.)
>
> German users tend to avoid double quotes. They clash with the "
> shorthand that babel activates in ngerman/german.
>
> And I never use spaces in file names. ;-)
>
>
> -- 
> Ulrike Fischer
>
>
>
> ------------------------------
>
>
> End of XeTeX Digest, Vol 48, Issue 23
> *************************************