[XeTeX] Re: New feature request for XeTeX
Ross Moore
ross at maths.mq.edu.au
Tue Jul 27 13:10:09 CEST 2004
Hi Jonathan,
On 27/07/2004, at 6:11 PM, Jonathan Kew wrote:
>
> On 27 Jul 2004, at 3:21 am, Ross Moore wrote:
>> Another, more mundane, is a simple solution for the "--" and "---"
>> *lack of ligature* problem:
>>
>> "---" could be mapped to ^^^^2014 (em dash)
>> "--" could be mapped to ^^^^2013 (en dash)
>>
>> and similarly for other TeX-specific ligature sequences.
>>
>
> This all depends what level of mapping is provided. My initial
> impression from your suggestion, Ross, was that we'd have a simple
> remapping of individual character codes, somewhat analogous to the
> .map files used by ttf2pt1, for example. This would allow each
> individual legacy codepoint to be mapped to a desired Unicode
> character in the font; but it wouldn't provide a solution for
> ligatures. And it might well be inadequate for many kinds of
> transliteration, where the relationship between scripts is not always
> a simple one-to-one correspondence.
A 1--1 mapping would indeed be easiest to implement, and would be
sufficient for T3 encoding, as in tipa.sty, which was my initial
motivating example.
A 1--many mapping would then be a trivial extension; e.g. for accents,
or struck-through characters constructed using the Unicode 'combining'
characters (hypothetical examples only):
B --> B^^^^0338 (B with slash)
b --> b^^^^0337 (b with slash)
where there is no single code-point to do the job.
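To make the 1--1 and 1--many cases concrete, here is a minimal sketch in
Python. It is purely illustrative: the table name ONE_TO_MANY and the
function remap_chars are made up for this example, and nothing here
reflects how XeTeX does or would implement such a mapping.

```python
# Hypothetical 1--many table: each input character maps to a string of
# one or more Unicode code points (base letter plus a combining mark).
ONE_TO_MANY = {
    "B": "B\u0338",  # B + U+0338 COMBINING LONG SOLIDUS OVERLAY
    "b": "b\u0337",  # b + U+0337 COMBINING SHORT SOLIDUS OVERLAY
}


def remap_chars(text, table=ONE_TO_MANY):
    """Replace each character independently; unmapped ones pass through."""
    return "".join(table.get(ch, ch) for ch in text)
```

Because every input character is looked up on its own, no ordering of
rules is needed; a 1--1 map is just the special case where every
replacement string has length one.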
A many--1 or many--many map is certainly a bit harder.
It requires:
(i) the rules to be ordered (e.g. --- must be tested and applied
before -- )
(ii) proper integration with the hyphenation routine.
For (ii) a definite rule is that there cannot be hyphenation between the
code-points returned by a *--many mapping replacement.
Apart from (i) and (ii), I don't see much difficulty in this,
and ligatures could then be handled very easily.
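Requirement (i) can be sketched as follows, again in Python and again
only as an illustration: RULES and apply_rules are hypothetical names,
and the sketch deliberately ignores requirement (ii), since hyphenation
lives inside the TeX engine and cannot be shown in a stand-alone
example.

```python
# Hypothetical ordered rule list: longer patterns come first, so that
# "---" is tested and applied before "--".
RULES = [
    ("---", "\u2014"),  # em dash
    ("--", "\u2013"),   # en dash
]


def apply_rules(text, rules=RULES):
    """Scan left to right, applying the first rule that matches."""
    out = []
    i = 0
    while i < len(text):
        for pattern, replacement in rules:
            if text.startswith(pattern, i):
                out.append(replacement)
                i += len(pattern)
                break
        else:
            # No rule matched at this position: copy one character.
            out.append(text[i])
            i += 1
    return "".join(out)
```

If the two rules were listed in the opposite order, "---" would come out
as an en dash followed by a plain hyphen, which is exactly the failure
that the ordering is meant to prevent.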
Of course I'm not looking from the same view-point as you, so I defer
to your experience in programming this kind of thing for a TeX engine.
> I'm also toying with ideas of more powerful mappings; it happens that
> I have a character-mapping engine, TECkit (see
> http://scripts.sil.org/teckit) that we could perhaps press into
> service. TECkit supports many-to-many mappings,
> contextually-determined mappings, even reordering of code sequences.
Wow; that's more than I was requesting --- at this stage!
> It was developed primarily to support complex mappings between legacy
> byte encodings and Unicode, but can also operate as a transducer
> entirely within Unicode, which is what would be needed here as XeTeX
> will already have interpreted the input text as Unicode when it was
> initially read from the file.
Looking at your TECkit overview, you seem to have already addressed the
kind of problem that I'm trying to solve. (No, I didn't already know of
this work!)
>
> Anyway, it's an interesting concept and may actually be implementable,
> too.... stay tuned.... but don't hold your breath. :-)
It's more than just interesting.
I think that it is indispensable, for the preservation of the ability
to interpret old documents, and continued use of well-established
encoding formats.
(i) New software needs to have the ability to interpret old data
formats.
(ii) Old, easy-to-use encoding formats will retain a life, so long as
the computers that they were designed for are still in service and/or
people who know how to use them are still active.
That may be for decades hence. Compatibility with more modern
systems will be needed by archivists & researchers for even longer
than this.
I find your response most promising indeed.
In the TeX world, holding your breath is never a good policy. :-)
All the best,
Ross
>
> Jonathan
------------------------------------------------------------------------
Ross Moore ross at maths.mq.edu.au
Mathematics Department office: E7A-419
Macquarie University tel: +61 +2 9850 8955
Sydney, Australia fax: +61 +2 9850 8114
------------------------------------------------------------------------