[XeTeX] anti-xunicode ;-)

Will Robertson wspr81 at gmail.com
Fri Jul 21 11:34:55 CEST 2006


Hi Adam,

Your expertise is very welcome :)

On 7/21/06, Adam Twardoch <list.adam at twardoch.com> wrote:
> For example, the sequences \u0045\u0323\u0301, \u0045\u0301\u0323,
> \u00C9\u0323, \u1EB8\u0301 are all valid Unicode representations of
> LATIN CAPITAL LETTER E WITH DOT BELOW AND COMBINING ACUTE ACCENT, which
> does not have a precomposed Unicode codepoint. Even for a common
> character such as LATIN CAPITAL LETTER E WITH ACUTE (\u00C9), the valid
> representations are both \u00C9 and \u0045\u0301.
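
The equivalence Adam describes can be checked mechanically. Below is a minimal Python sketch using the standard unicodedata module; the names spellings and canonical_forms are made up for illustration, and it assumes that comparing canonical decompositions (NFD) is an acceptable test for "same character".

```python
import unicodedata

# The four spellings listed in the quoted message.
spellings = [
    "\u0045\u0323\u0301",  # E, dot below, acute
    "\u0045\u0301\u0323",  # E, acute, dot below
    "\u00C9\u0323",        # precomposed E-acute, then dot below
    "\u1EB8\u0301",        # precomposed E-dot-below, then acute
]


def canonical_forms(strings):
    """Return the set of distinct NFD (canonical decomposition) forms."""
    return {unicodedata.normalize("NFD", s) for s in strings}


# Hypothetical helper name; collects the distinct canonical forms.
forms = canonical_forms(spellings)
```

If the four spellings really are canonically equivalent, forms holds a single string. Normalization puts the combining marks in a fixed order, so a consumer that normalizes first only has to deal with one spelling.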

This doesn't help us so much in XeTeX, since by the time you get to
\u0301, the other stuff has already been typeset :( This needs a
previously discussed (by me, but not here) feature of some future TeX
extension to query the last node (like \lastbox for individual
characters) and remove it. E.g., in pseudocode:
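  \catcode"0301=\active % U+0301 must be active before it can be \def'd
  \def^^^^0301{%
    \edef\temp{\unhbox\lastnode}% I'll bet THIS is fragile!
    \unnode % hypothetical primitive: drop the node just captured
    \fakeaccent{0301}{\temp}}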

(I don't know how this would work in practice. I'm pretty poor with
such low-level TeX. I'm assuming that \lastnode simply puts its
material in a box in lieu of having a new data type to store such
information.)

> Whenever a typesetting application finds a sequence of encoded
> characters that involve combining accents, it can have a multitude of
> options on how to produce the final rendered glyph. For example, if
> there is the sequence \u0045\u0301 in the stream, the application might:
[snip]
> (d) apply "heuristic positioning" in other ways.
[snip]
> All the heuristic positioning is certainly not specified by either
> Unicode or OpenType but the application is free to try and optimize the
> final appearance of the text that way.

Theoretically, Jonathan, is this something that could be handled by an
extension to XeTeX? That is, during the process taken to compose
\u0045\u0301, if no standard method is found, it could be sent back
for further macro processing? Or is this step too late in the output
routine?
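
To make the idea concrete outside TeX, here is a rough Python sketch of the "compose what you can, hand back the rest" step. It says nothing about how XeTeX works internally; split_for_fallback is a hypothetical name, and "no standard method" is assumed here to mean "a combining mark is still left after NFC composition".

```python
import unicodedata


def split_for_fallback(text):
    """Compose text with NFC and report the combining marks that remain.

    Returns (composed, leftover). The marks in leftover have no
    precomposed form with their base, so they are the ones that would
    need some other positioning method.
    """
    composed = unicodedata.normalize("NFC", text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    leftover = [ch for ch in composed if unicodedata.combining(ch) != 0]
    return composed, leftover


# E + acute composes fully; E + acute + dot below leaves the acute behind.
simple = split_for_fallback("\u0045\u0301")
stacked = split_for_fallback("\u0045\u0301\u0323")
```

For the stacked case the leftover list is not empty, which is the point where a macro-level fallback such as the \fakeaccent idea above would have to take over.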

Regards,
Will

