[tex-live] Fwd: Duplicate Thai patterns reported by XeTeX in TL 2015 pretest
Jonathan Kew
jfkthame at gmail.com
Wed Apr 15 10:17:51 CEST 2015
On 15/4/15 02:01, Dohyun Kim wrote:
> This issue might be related to the weird behaviour of xetex I have found:
>
> \toks0{유
> 술
> }\showthe\toks0
> \bye
>
> $ xetex test.tex
> This is XeTeX, Version 3.14159265-2.6-0.99992 (TeX Live 2015/dev)
> (preloaded format=xetex)
> restricted \write18 enabled.
> entering extended mode
> (./test.tex
>> \par .
> l.5 }\showthe\toks0
>
> ?
> )
> No pages of output.
> Transcript written on test.log.
>
> As shown, the characters `유' and `술' have disappeared and nothing is printed.
> Note that Unicode codepoints of these characters are U+C720 and U+C220
> respectively, with 0x20 in their lower bytes.
>
Note also that in the original report about Thai patterns, the error
messages show lines from the patterns file with their last character
U+0E20 (ภ) missing. Again, 0x20 in the lower byte (which will be the
leading byte in the buffer on little-endian platforms).
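
To make the byte-order point concrete, here is a small Python sketch (not anything from XeTeX itself) that encodes the three reported characters as UTF-16 in both byte orders. The helper name first_byte is made up for illustration.

```python
# The three characters from the reports, with the code points quoted above.
chars = {"유": 0xC720, "술": 0xC220, "ภ": 0x0E20}

def first_byte(ch, encoding):
    """Return the first byte of ch as stored in the given encoding."""
    return ch.encode(encoding)[0]

for ch, cp in chars.items():
    assert ord(ch) == cp  # sanity check against the stated code points
    le = ch.encode("utf-16-le")
    be = ch.encode("utf-16-be")
    # On little-endian storage the low byte (0x20) comes first.
    print(f"U+{cp:04X}  LE: {le.hex(' ')}  BE: {be.hex(' ')}")
```

In all three cases the little-endian form starts with the byte 0x20, which is also the byte value of an ASCII space.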
It looks to me like there could be something broken fairly early in the
input-scanning process (but after conversion from UTF-8 to UTF-16)
whereby a line-final UTF-16 code unit whose first byte in memory is 0x20
is being discarded as though it were a <space>. But this apparently doesn't
affect the Win32 binary, as the testcases work fine for Akira. (Is it
ONLY on OS X, or has this been observed on other platforms?)
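
As a purely hypothetical illustration of that kind of bug (this is NOT XeTeX's actual code, and the function names are invented), the following Python sketch contrasts a trailing-space trim that compares whole 16-bit units with one that only looks at the low byte of each unit:

```python
import struct

def to_units(line):
    """Return the line as a list of UTF-16 code units."""
    data = line.encode("utf-16-le")
    return list(struct.unpack(f"<{len(data) // 2}H", data))

def decode(units):
    """Turn a list of UTF-16 code units back into a string."""
    return struct.pack(f"<{len(units)}H", *units).decode("utf-16-le")

def strip_trailing_spaces(units):
    """Intended behaviour: drop only units equal to U+0020."""
    end = len(units)
    while end > 0 and units[end - 1] == 0x0020:
        end -= 1
    return units[:end]

def strip_trailing_spaces_buggy(units):
    """Hypothetical bug: test only the low byte (the first byte in
    little-endian memory), so any U+xx20 is mistaken for a space."""
    end = len(units)
    while end > 0 and (units[end - 1] & 0xFF) == 0x20:
        end -= 1
    return units[:end]

for line in ["\\toks0{유", "술"]:
    good = decode(strip_trailing_spaces(to_units(line)))
    bad = decode(strip_trailing_spaces_buggy(to_units(line)))
    print(repr(line), "->", repr(good), "vs", repr(bad))
```

Under this assumed bug the second test line would become empty, and an empty input line gives \par in TeX, which would at least be consistent with the \showthe output quoted above.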
Unfortunately, I don't have a current development build handy for
debugging purposes. Anyone....?
JK