[XeTeX] xetex crash: interaction between interchar and linebreaklocale mechanisms

Shree Devi Kumar shreeshrii at gmail.com
Wed Sep 6 18:32:10 CEST 2023


You can try https://github.com/Pomax/ucharclasses

I have used it in the past with Devanagari, Tamil, and Gujarati scripts alongside English.
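
A minimal, untested sketch of how ucharclasses might be applied to the Burmese case quoted below. It assumes the Unicode block is exposed under the name `Myanmar`, that the font named in the quoted example is installed, and that fontspec accepts `Script=Myanmar`; `\burmesefont` is a hypothetical name introduced only for this illustration. Note that ucharclasses is itself built on XeTeX's interchar token mechanism, so it may well run into the same interaction with `\XeTeXinterwordspaceshaping` that is reported below.

```latex
\documentclass[12pt]{article}
\usepackage{fontspec}
% Loading without options defines classes for all Unicode blocks (slower, but
% it avoids guessing which option names the package accepts).
\usepackage{ucharclasses}

% Hypothetical font-switch command; the font name is the one used in the
% quoted example and is assumed to be installed.
\newfontfamily\burmesefont{Noto Serif Myanmar Regular}[Script=Myanmar]

% Assumption: the Myanmar block (U+1000..U+109F) is registered as "Myanmar".
% First argument: code run on entering the block; second: code run on leaving.
\setTransitionsFor{Myanmar}{\begingroup\burmesefont}{\endgroup}

\begin{document}
ထက်လုလ္လ thak·lulla သည် ၊ saññ·|
\end{document}
```

This only switches the font. Language-specific settings such as hyphenation or `\XeTeXlinebreaklocale` would still have to be set in the transition code or globally.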

On Wed, Sep 6, 2023, 11:23 AM Andrew Goldstone <andrew.goldstone at gmail.com>
wrote:

> Hello: I am attempting to assist a colleague, who is new to TeX, in
> typesetting a text which includes many passages in which Burmese and Latin
> scripts are closely intermixed. I wanted to make it possible for my
> colleague to enter his text fairly naturally, as he is used to doing in
> Word, by simply mixing the scripts, rather than having to type a macro to switch
> languages/fonts at nearly every word. On tex.stackexchange I found a
> suggestion to use XeTeX's interchar mechanism for this purpose and adapted
> the code example to my own purposes.
>
> Though this works fine on its own, it leads to problems, and sometimes
> crashes, in conjunction with two other desirable XeTeX features, namely its
> linebreak-locale and interword space-shaping mechanisms. The example below
> my signature demonstrates the following three-way interaction:
>
> (A) XeTeXlinebreaklocale="my"
> (B) XeTeXinterwordspaceshaping=2
> (C) XeTeXinterchartokenstate=1 (and accompanying char. class definitions)
>
> A       some ligatures render incorrectly, e.g. lla လ္ +လ
> B       ok, but must use explicit \selectlanguage{burmese}
> C       ok, but Burmese lines only broken on spaces (unidiomatic)
> A+B     ok, but must use explicit \selectlanguage{burmese}
> A+C     ligature renders incorrectly
> B+C     segfault if more than one switch to Burmese
> A+B+C   segfault if more than one switch to Burmese
>
> My system is macOS 13.5 on Apple M1 Pro, XeTeX 3.141592653-2.6-0.999995
> (TeX Live 2023).
>
> I can certainly help my colleague work around the crashing bug by
> postprocessing his source with a script to insert \selectlanguage{} next to
> the appropriate Unicode range, but the crash is frustrating. I believe this
> is the same issue as was raised on StackExchange in 2019:
>
> https://tex.stackexchange.com/questions/503498/trouble-with-stacked-consonants-burmese-script
>
> but I couldn't find any further discussion of a fix for the crash.
>
> Many thanks for any help: perhaps I've come at this all wrong. My own
> XeTeX experience has almost all been in the Latin alphabet. Best,
> Andrew Goldstone
>
> PS my example script--forgive the verbosity. The two Burmese words are
> just taken at random from my colleague's sample text, with the first
> repeated to fill out a line.
>
> \documentclass[draft,12pt]{article}
> \usepackage[english]{babel}
> \babelprovide[import]{burmese}
> \babelfont[burmese]{rm}{Noto Serif Myanmar Regular}
>
> \XeTeXlinebreaklocale "my"     % (A)
> \XeTeXinterwordspaceshaping=2  % (B)
>
> % (C)...
>
> \newXeTeXintercharclass\burmesesub
> \newcount\myCount
> \myCount="1000
> \loop\ifnum\myCount<"10A0  % Myanmar block is U+1000..U+109F inclusive
>   \XeTeXcharclass\myCount=\burmesesub
>   \advance\myCount by 1
> \repeat
>
> \XeTeXinterchartoks 0 \burmesesub = {\begingroup\selectlanguage{burmese}}
> \XeTeXinterchartoks 4095 \burmesesub = {\begingroup\selectlanguage{burmese}}
> \XeTeXinterchartoks \burmesesub 0 = {\endgroup}
> \XeTeXinterchartoks \burmesesub 4095 = {\endgroup}
>
> \XeTeXinterchartokenstate=1
>
> % ...(C)
>
> \begin{document}
>
>
> ထက်လုလ္လ
> thak·lulla
> ထက်လုလ္လ
> thak·lulla
> ထက်လုလ္လ
> thak·lulla
> ထက်လုလ္လ
> thak·lulla
>
> သည် ၊ saññ·|
>
> \end{document}