[XeTeX] latin-1 encoded characters in commented out parts trigger log warnings

Ross Moore ross.moore at mq.edu.au
Sun Feb 21 22:42:59 CET 2021

Hi Ulrike,

On 22 Feb 2021, at 7:52 am, Ulrike Fischer <news3 at nililand.de> wrote:

> On Sun, 21 Feb 2021 20:26:04 +0000, Ross Moore wrote:
>
>> Once you have encountered the (correct) comment character,
>> what follows on the rest of the line is going to be discarded,
>> so its encoding is surely irrelevant.
>> Why should the whole line need to be fully tokenised,
>> before the decision is taken as to what part of it is retained?
>
> Well, you need to find the end of the line to know where to stop with
> the discarding, don't you? So you need to inspect the part after the
> comment char until you find something that says "newline".

My understanding is that this *is* done first, similarly to TeX's
\read to <csname>, which grabs a line of input from a file
before tokenising it and storing the result in <csname>
(see page 217 of The TeXbook).

If I’m wrong about this, for the sake of high-speed input, then yes, you need to know where to stop.
But that’s just as easy: you stop at a byte that is to be tokenised
as an end-of-line character, and these are known.
You need this check anyway, even when you have tokenised every byte.
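
As a rough illustration of this point, here is a minimal Python sketch, not XeTeX's actual implementation. It assumes the default comment character % (0x25), treats a backslash as escaping the following byte, and ignores catcode changes; the function name split_comment is made up for the example. It works on raw bytes only, which is safe because every byte of a multi-byte UTF-8 sequence is 0x80 or above, so it can never be mistaken for %, a backslash, or a line terminator.

```python
def split_comment(line: bytes, comment_byte: int = 0x25) -> tuple:
    """Split one raw input line into (code, comment) without decoding it.

    Simplifying assumptions (not taken from XeTeX): '%' is the only
    comment character, and a backslash escapes the next byte.
    """
    i = 0
    while i < len(line):
        if line[i] == 0x5C:          # backslash: skip the byte it escapes
            i += 2
            continue
        if line[i] == comment_byte:  # first unescaped '%': comment starts here
            return line[:i], line[i:]
        i += 1
    return line, b""


# The input is cut into lines first, at the byte level; only then is
# each line split into the part to keep and the part to discard.
raw = b"abc % caf\xe9\n50\\% off\n"
parts = [split_comment(line) for line in raw.splitlines()]
```

The Latin-1 byte 0xE9 ends up in the comment part and never has to be interpreted as UTF-8.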

So all we are saying is that when handling the bytes between
a comment and its end-of-line, just be a bit more careful.

It’s not necessary for each of those bytes to be tokenised as valid UTF-8.
Maybe change the (Warning) message, when you know that you are within
such a comment, to say so.  That would be more meaningful to a package-writer,
and to an author who uses the package, looks in the .log file, and sees the message.
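
A hedged Python sketch of what such a context-aware message could look like follows. The function name check_line and both message texts are invented for illustration and are not XeTeX's real wording; the sketch also simply treats the first % on the line as the start of the comment.

```python
def check_line(line: bytes, lineno: int) -> list:
    """Return log warnings for one raw input line (hypothetical wording)."""
    # Simplification: the first '%' starts the comment; \% is not handled.
    code, sep, comment = line.partition(b"%")
    warnings = []
    try:
        code.decode("utf-8")
    except UnicodeDecodeError:
        warnings.append(f"line {lineno}: invalid UTF-8 byte sequence")
    if sep:
        try:
            comment.decode("utf-8")
        except UnicodeDecodeError:
            # Same problem, but the text is discarded anyway, so say so.
            warnings.append(
                f"line {lineno}: invalid UTF-8 byte sequence inside a comment (ignored)"
            )
    return warnings
```

With this, a line such as b"abc % caf\xe9" yields only the "inside a comment" variant, so someone reading the .log file can tell at once that the typeset output is unaffected.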

None of this is changing how the file is ultimately processed;
it’s just about being friendlier in the human interface.

> Ulrike Fischer

All the best.


Dr Ross Moore
Department of Mathematics and Statistics
12 Wally’s Walk, Level 7, Room 734
Macquarie University, NSW 2109, Australia
T: +61 2 9850 8955  |  F: +61 2 9850 8114
M: +61 407 288 255  |  E: ross.moore at mq.edu.au
