[XeTeX] Bug fixes and new features related to Unicode character codes, surrogates, etc

David Carlisle d.p.carlisle at gmail.com
Tue May 5 10:29:45 CEST 2015


On 4 May 2015 at 16:27, Jonathan Kew <jfkthame at gmail.com> wrote:

> ...
>
> A fix for this bug, so that \string generates single Unicode characters
> even for values above U+FFFF, is currently on the utf16-issues branch in
> the XeTeX repository on sourceforge.[1]
>
> A bug with characters above U+FFFF within \scantokens[2] is also fixed on
> this branch.
>
>
> There are also a couple of new primitives available in this branch:
>
> (1) \Uchar <number>
>
>     where <number> is a number in the range 0.."10FFFF
>
> is an expandable command that produces a character token with the given
> Unicode value, and catcode=12 (other character). This is different from
> TeX's \char primitive from a macro-programming point of view, in that it
> expands to a character token rather than being a typesetting command.
>
> (I believe this is similar to the \Uchar command available in luatex.)
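
A minimal sketch of that difference, assuming a xetex built from the
utf16-issues branch (the macro names \viaUchar and \viachar are made up
for illustration):

```tex
% Plain XeTeX sketch; run with: xetex file.tex
% Assumes a xetex binary that provides \Uchar (e.g. built from the branch).
% \viaUchar and \viachar are hypothetical names used only for this test.

% \Uchar is expandable, so inside \edef it is replaced by the character token.
\edef\viaUchar{\Uchar"1D400 }

% \char is a typesetting command and is not expandable, so \edef stores
% the \char token and its digits unchanged.
\edef\viachar{\char"1D400 }

% Show both definitions on the terminal and in the log for comparison.
\message{[\meaning\viaUchar]}
\message{[\meaning\viachar]}
\bye
```

The first \meaning should show a macro whose body is a single character
token, the second a macro that still contains \char followed by the
digits.
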
>
>
> (2) \Ucharcat <number1> <number2>
>
>     where <number1> is a number in the range 0.."10FFFF
>     and <number2> is a number in the ranges 1..4, 6..8, 10..12
>
> is an expandable command that produces a character token with Unicode
> value <number1> and catcode <number2>. This allows macro programmers to
> create character tokens with various catcode assignments much more easily
> than is otherwise possible.
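
As a rough illustration, again assuming a xetex built from the branch
(\letterus and \othera are made-up names), tokens with unusual catcodes
can be created without touching any \catcode assignments:

```tex
% Plain XeTeX sketch; run with: xetex file.tex
% Assumes a xetex binary that provides \Ucharcat (e.g. built from the branch).
% \letterus and \othera are hypothetical names used only for this test.

% An underscore (character code 95) with catcode 11 (letter).
\edef\letterus{\Ucharcat`\_ 11 }

% A lowercase "a" with catcode 12 (other).
\edef\othera{\Ucharcat`a 12 }

% \ifcat compares category codes: the stored underscore against an ordinary
% letter, and the stored "a" against an ordinary digit (catcode 12).
\ifcat\letterus a\message{underscore has the catcode of a letter}\fi
\ifcat\othera 1\message{a has the catcode of an other character}\fi
\bye
```

Only catcode values from the ranges listed above are used here.
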
>
>
> Feedback and testing is invited; but note that currently this will require
> pulling the code from sourceforge and building the new xetex, as binary
> packages are not available.
>
> If testing in the next day or two doesn't uncover any alarming problems,
> these fixes/features will be merged to the master branch and to TeXLive, in
> preparation for the TL2015 release.
>
> JK
>
>

Thanks for this!

I've built the version from this branch and it does appear to address all
the test cases I had for characters above "FFFF, and \Uchar and \Ucharcat
will be incredibly useful in defining expandable operations on token lists,
and for code that should be compatible with both luatex and xetex.

David