[XeTeX] Re: XeTeX & Unicode vs. standard LaTeX

Ross Moore ross at maths.mq.edu.au
Mon Oct 11 00:10:57 CEST 2004


Hi Christopher,

On 11/10/2004, at 5:52 AM, christopher ciotti wrote:

> On Oct 10, 2004, at 3:26 PM, Jonathan Kew wrote:
>
>> Hi Zsolt,
>>
>> Thanks for your message. A couple of comments below. (Copied to XeTeX 
>> list with Zsolt's permission, as I think the response will be of 
>> wider interest.)
>>
>
> I have been wondering about putting together a unicode version of 
> textcomp.sty.

This is already incorporated into  utf8accents.sty , along with a lot more
commands from the various other font-encoding (.enc) files.

>  I have a rudimentary collection of \[re]newcommand statements in a 
> sty file to make common text stuff easy after the other day when I ran 
> into trouble with quotes and the $ sign.  I'm not really sure about 
> how to properly implement this stuff but if anyone is interested in 
> what I have I'll post it.  At the very least, it might save some 
> typing.

Simply making \renewcommand definitions isn't really the right approach.
It's fine for a single document using just one kind of font.
But if you are mixing different fonts (e.g. because of mathematics,
computer code, multiple languages, etc.) then you may need a high-level
macro such as \textdollar to produce a different character depending
upon the font being used in the particular context.
(Is it a tfm-based CM or Euler font, or an AAT or OTF font?)


Thus you want the high-level definition to be done in such a way that
the current \fontencoding  is taken into account.

LaTeX provides commands for this:
   \DeclareTextCommand   \DeclareTextSymbol   \DeclareTextAccent
and 'Default' versions:
   \DeclareTextCommandDefault   \DeclareTextSymbolDefault   \DeclareTextAccentDefault
as well as
   \DeclareTextComposite   and   \DeclareTextCompositeCommand
and
   \DeclareTextFontCommand  for defining font-switching macros.


These are the commands that should be used, wherever possible.
Alternatively study the innards of how these work, and mimic that.
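
As a minimal sketch of the encoding-aware approach (not taken from
utf8accents.sty), the following declares one symbol macro whose slot depends
on the current font encoding. The macro name \exsection is hypothetical,
chosen only so that nothing standard is overwritten; the slot numbers are
those that standard LaTeX uses for \textsection in T1 and OMS.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}

% \exsection is a hypothetical name used only for this illustration.
\DeclareTextSymbol{\exsection}{T1}{159}     % section sign in T1-encoded fonts
\DeclareTextSymbol{\exsection}{OMS}{120}    % section sign in the OMS symbol font
\DeclareTextSymbolDefault{\exsection}{OMS}  % fallback for every other encoding

\begin{document}
% T1 is current here, so slot 159 of the text font is used.
\exsection\ typeset in T1.

% OT1 has no declaration of its own, so LaTeX switches to the
% OMS default just for this one symbol.
{\fontencoding{OT1}\selectfont \exsection\ typeset via the OMS default.}
\end{document}
```

A plain \renewcommand would hard-wire one of these slots and produce the
wrong glyph as soon as the encoding changed.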

The latter is what is done in  utf8accents.sty  with its commands

   \DeclareUTFcharacter
     (for a Unicode version of \DeclareTextSymbol)

and

  \DeclareEncodedCompositeCharacter
  \DeclareEncodedCompositeAccents

for handling accents and other composite-pair constructions.


Thus many issues of backwards-compatibility with existing (La)TeX
practices are solved for XeTeX simply by loading  utf8accents.sty .

As there have been quite a few requests for this lately,
here it is again (in version v0.4).

-------------- next part --------------
A non-text attachment was scrubbed...
Name: utf8accents.sty
Type: application/octet-stream
Size: 118053 bytes
Desc: not available
Url : http://tug.org/pipermail/xetex/attachments/20041011/47b4ada0/utf8accents-0001.obj
-------------- next part --------------




However  utf8accents.sty  doesn't solve the ligature problems,
which are of a quite different character (sic).
That's why the following is such great news ...

>> However, we obviously cannot expect mainstream font vendors to add 
>> support for TeX's unique keying conventions to their font tables. 
>> Therefore, I have just implemented a "font mapping" scheme (this was 
>> first suggested on the XeTeX list by Ross Moore, IIRC), which allows 
>> an arbitrary mapping of Unicode character sequences to be associated 
>> with a particular font. So having defined a mapping "tex-text" that 
>> includes entries such as:
>>
>>     U+002D U+002D         >  U+2013 ; endash
>>     U+002D U+002D U+002D  >  U+2014 ; emdash
>>     U+0060 U+0060         >  U+201C ; opening double quote
>>     ; etc....
>>
>> I can then load a font with a command like
>>
>>     \font\pal = "Palatino:mapping=tex-text" at 12pt
>>
>> and whenever this font is used, XeTeX will pass the Unicode character 
>> sequence to be typeset (at the lowest level, after all macro 
>> expansion, etc.) through this mapping, and the standard TeX ligatures 
>> will work as expected.
>>
>> This was just implemented on Friday, and seems to be working well. It 
>> will be present in the next release of XeTeX (along with that 
>> OpenType ligature bug-fix, and perhaps another feature or two). Stay 
>> tuned! :-)


With this, and Will's new .fd  files, and  utf8accents.sty ,
we should be very close to having full backward compatibility
with legacy LaTeX documents.
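
A small test file along these lines might make the effect of the mapping
visible. It is only a sketch: it assumes a XeTeX release that includes the
new mapping feature, that the "tex-text" mapping is available under that
name, and that Palatino is installed; the control-sequence names \pal and
\raw are arbitrary.

```latex
% Process with XeTeX's LaTeX format.
\documentclass{article}
\begin{document}

% Same font loaded twice: once with the mapping, once without.
\font\pal = "Palatino:mapping=tex-text" at 12pt
\font\raw = "Palatino" at 12pt

% With the mapping, the TeX input conventions should give curly
% quotes and real en/em dashes.
{\pal ``Quoted'' text -- an en dash --- an em dash.\par}

% Without it, the same input should appear as literal backquotes,
% apostrophes and hyphens.
{\raw ``Quoted'' text -- an en dash --- an em dash.\par}

\end{document}
```

Comparing the two output lines shows whether the mapping is being applied
to a given font.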

By this I mean that it should be possible to apply a new selection
of (Macintosh) fonts to old LaTeX documents, just by making
minimal changes to which packages are loaded in the preamble.

I'd urge everyone to try this with some of your old documents,
and report back to the list on special cases that are not being
processed correctly.


>>>  this sounds fantastic. Is this substitution scheme going to have a 
>>> syntax permitting the use of character ranges and maybe even 
>>> replacement patterns? So that one might be able to reorder character 
>>> positions saying something like (assuming syntax resembling grep):
>>>
>>>  ([U+0915-U+0939]) (U+0930) > \2\1
>>>
>>>  I suppose one could spell out these substitutions for each case, 
>>> but it would save time...
>>
>> Yes. For more on the mapping engine (primarily focused on 
>> byte<->Unicode encoding conversion, but being used here to do 
>> transformations of a Unicode text stream), see:
>>
>> 	http://scripts.sil.org/teckit
>>
>> The software currently there is primarily for Windows, but I'll post 
>> OS X versions too.
>>

  ... and this aspect should open up a whole new ball-game
for handling transliterations.




All the best,

	Ross



>>
>> Jonathan
>>
>> _______________________________________________
>> XeTeX mailing list
>> postmaster at tug.org
>> http://tug.org/mailman/listinfo/xetex
>>
>>
> -- 
> chris ciotti <chris_ciotti at yahoo.com>
> http://www.keyserver.net/en/
> Key ID: 0x0BD2B97A
------------------------------------------------------------------------
Ross Moore                                         ross at maths.mq.edu.au
Mathematics Department                             office: E7A-419
Macquarie University                               tel: +61 +2 9850 8955
Sydney, Australia                                  fax: +61 +2 9850 8114
------------------------------------------------------------------------


