[XeTeX] problem with discretionary

jfbu jfbu at free.fr
Sun Dec 3 11:58:34 CET 2017


Thanks Zdeněk!

Should I thus conclude from this that polyglossia + French is currently broken?
Indeed, the file gloss-french.ldf uses a hardcoded 255 at various locations.

I am a bit lost though, because my test MWE

\catcode`@ 11
\XeTeXinterchartokenstate=1
\newXeTeXintercharclass\french@punctthin
\XeTeXcharclass `\; \french@punctthin
     \XeTeXinterchartoks 255 \french@punctthin = {\nobreak\thinspace}%
\catcode`;\active
\def;{\discretionary{\char`\;}{}{\char`\;}}
a;b
\bye

compiles fine with current XeTeX, but not with TL2015 XeTeX.

(the @ thing is only to stay close to control sequence names from gloss-french.ldf)

To clarify, the \def;{\discretionary{\char`\;}{}{\char`\;}} is analogous to
the kind of thing Sphinx does in verbatim listings to allow linebreaks,
but it is not exactly what Sphinx does.

Anyway, it originates neither from polyglossia nor from
gloss-french.ldf; it is a Sphinx add-on inside code listings.

If the problem can be solved by a patch at macro level, that would
be best, because it would allow the CPython internationalization
team to build their PDF docs without worrying about which XeTeX
they use. I notice that some of their team use Debian 2013.
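
A minimal sketch of what such a macro-level patch might look like, untested
on old releases. It assumes the error comes from the boundary-class tokens
(\nobreak\thinspace) being inserted inside the \discretionary arguments, and
it simply switches interchar tokens off locally there. Each assignment stays
local to the group that \discretionary opens around its argument. It would
replace the \def; line of the MWE above:

% Assumption: with interchar tokens disabled inside each argument,
% nothing is injected before \char`\; so the discretionary list
% contains only the character itself.
\def;{\discretionary{\XeTeXinterchartokenstate=0 \char`\;}{}%
                    {\XeTeXinterchartokenstate=0 \char`\;}}

This would lose the French thin space before the semicolon in that context,
which seems acceptable inside code listings.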

Best

Jean-François

On 3 Dec 2017, at 11:01, Zdenek Wagner <zdenek.wagner at gmail.com> wrote:

> Hi,
> 
> Please note that the number of character classes was increased from 256 to 4096, so 255 no longer works as the boundary class; 4095 must be used instead. I use the following code that I took from some other package:
> 
> \edef\CSat{\the\catcode`\@} % in order to work in plain XeTeX
> \catcode`\@=11
> \ifdefined\e@alloc@intercharclass@top
>   \chardef\CSboundary=\e@alloc@intercharclass@top
> \else
>   \ifdefined\XeTeXinterwordspaceshaping
>     \chardef\CSboundary=4095 %
>     \def\newXeTeXintercharclass{%
>       \e@alloc\XeTeXcharclass\chardef
>               \xe@alloc@intercharclass\m@ne\@ucharclass@boundary}
>   \else
>     \chardef\CSboundary=255
>   \fi
> \fi
> \catcode`\@=\CSat
> 
> Afterwards I use \CSboundary instead of a fixed number. It thus works both with the old and new XeTeX.
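> 
> For instance, a sketch reusing the class name from the MWE quoted below (it assumes @ still has catcode 11 at that point and that \french@punctthin has already been allocated):
> 
> \XeTeXinterchartoks \CSboundary \french@punctthin = {\nobreak\thinspace}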
> 
> 
> Zdeněk Wagner
> http://ttsm.icpf.cas.cz/team/wagner.shtml
> http://icebearsoft.euweb.cz
> 
> 2017-12-03 10:19 GMT+01:00 jfbu <jfbu at free.fr>:
> Hi,
> 
> I need some help to identify which XeTeX release fixed
> the following problem. The MWE is
> 
> \catcode`@ 11
> \XeTeXinterchartokenstate=1
> \newXeTeXintercharclass\french@punctthin
> \XeTeXcharclass `\; \french@punctthin
>      \XeTeXinterchartoks 255 \french@punctthin = {\nobreak\thinspace}%
> \catcode`;\active
> \def;{\discretionary{\char`\;}{}{\char`\;}}
> a;b
> \bye
> 
> In real life it appeared in a Polyglossia+French context
> with the semicolon made active to insert a \discretionary
> similar to the above. There is no issue in lualatex.
> 
> It is currently seen at Python upstream (CPython) when
> they try to build French docs (via Sphinx)
> 
> https://bugs.python.org/issue31589
> 
> and it would be nice to pinpoint which XeTeX release
> precisely is ok. I know 0.99992 is bad and 0.99996 is good,
> but can't easily bisect.
> 
> Best,
> 
> Jean-François
