[luatex] UTF-8 byte sequence 0xEF, 0xBF, 0xBD causes invalid sequence error whereas ^^^^fffd does not
luigi scarso
luigi.scarso at gmail.com
Sat Jan 14 10:27:39 CET 2023
On Fri, 13 Jan 2023 at 22:36, Vítek Novotný <witiko at mail.muni.cz> wrote:
> Dear LuaTeX developers,
>
> assume the following plain TeX document `example.tex`:
>
> \newwrite\outfile
> \openout\outfile\jobname.out
> \write\outfile{^^^^fffd}
> \closeout\outfile
> \bye
>
> Running `luatex example` will correctly produce file `example.out` with the
> UTF-8 encoding of U+FFFD: 0xEF, 0xBF, and 0xBD.
>
> $ hexdump -C example.out
> 00000000 ef bf bd 0a |....|
> 00000004
>
> Now, let's change `example.tex` as follows:
>
> \input\jobname.out
> \bye
>
> Running `luatex example` produces the following error:
>
> ! String contains an invalid utf-8 sequence.
>
> I would expect that LuaTeX would treat ^^^^fffd and the byte sequence 0xEF,
> 0xBF, and 0xBD the same. This issue was co-discovered by @lostenderman at
> <https://github.com/lostenderman/markdown/issues/34>.
>
>
hm, checking it now.
--
luigi
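
The byte-level claim in the report can be checked independently of LuaTeX. The following is a minimal Python sketch, not part of the original thread and making no assumption about LuaTeX internals. It confirms that 0xEF 0xBF 0xBD is the well-formed UTF-8 encoding of U+FFFD and that a strict decoder accepts it. The helper name `hex_bytes` is hypothetical and used only for illustration.

```python
# U+FFFD REPLACEMENT CHARACTER, the code point written as ^^^^fffd in the report.
replacement = "\ufffd"

# Encode it to UTF-8; the report says this should give 0xEF 0xBF 0xBD.
encoded = replacement.encode("utf-8")


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated lowercase hex (hypothetical helper)."""
    return " ".join(f"{b:02x}" for b in data)


# The contents of example.out as shown by hexdump: the encoding plus a newline.
file_contents = encoded + b"\n"
print(hex_bytes(file_contents))

# Decode the raw bytes strictly; an ill-formed sequence would raise
# UnicodeDecodeError here instead of returning a string.
decoded = bytes([0xEF, 0xBF, 0xBD]).decode("utf-8", errors="strict")
print(decoded == replacement)
```

Since a strict UTF-8 decoder accepts these three bytes, the sequence itself is well-formed, which is consistent with the expectation stated in the report that both spellings should be treated the same.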