[XeTeX] Hyphenation of strings of more than 63 characters

Jonathan Kew jfkthame at gmail.com
Thu Mar 17 09:52:28 CET 2016


On 17/3/16 05:16, Peter Mukunda Pasedach wrote:
> Any news on this? If it's just one constant whose value I would have
> to increase in my private copy of the code, before recompiling, for
> testing purposes, which one would that be?

Unfortunately, it's not just a single constant; there are a number of 
hardcoded values that all need to be changed in a consistent way.

I'll try to get an experimental patch ready shortly. Or, of course, 
someone else is welcome to try. I don't think it's very hard, but it is 
more than just a single number.
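
For reference, the clamping of \lefthyphenmin and \righthyphenmin mentioned in the quoted discussion below can be sketched roughly as follows. This is only an illustration in Python, modelled on the norm_min function in tex.web (which, as far as I can tell, maps non-positive values to 1, so the effective stored range is 1..63). It is not XeTeX code, the sample values are made up, and it says nothing about the other hardcoded values that would have to change to lift the 63-character limit.

```python
# Illustrative sketch, not XeTeX source: mimics the clamping applied when
# \lefthyphenmin / \righthyphenmin are stored in a language node, modelled
# on the norm_min function in tex.web.
MAX_HYPHEN_MIN = 63  # largest value that gets stored, per the thread


def norm_min(h: int) -> int:
    # Values of 0 or less become 1; values of 64 or more become 63.
    if h <= 0:
        return 1
    if h >= 64:
        return MAX_HYPHEN_MIN
    return h


# Hypothetical user settings, chosen only to show the clamping.
for requested in (-5, 2, 63, 1000):
    print(requested, "->", norm_min(requested))
```

The parameter itself may hold a larger number; only the copy stored in the language node is clamped, which is why leaving the *-min limits alone should not get in the way of hyphenating longer strings.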

JK

>
> Peter
>
> On Wed, Mar 16, 2016 at 12:37 AM, Zdenek Wagner <zdenek.wagner at gmail.com> wrote:
>> 2016-03-16 0:06 GMT+01:00 Jonathan Kew <jfkthame at gmail.com>:
>>>
>>> On 15/3/16 18:04, Doug McKenna wrote:
>>>>
>>>> Simply changing the character count constant could cause some subtle
>>>> problems.
>>>>
>>>> In particular, the allocation size of a "whatsit" language node might
>>>> also need changing, which would require adjusting other code in the core
>>>> engine that assumes a default small size for that language node sub-type of
>>>> a "whatsit".
>>>>
>>>> Or not.  I can't tell from the TeX source what the bit sizes of these
>>>> node fields are.  But if they're too small to fit a pair of enhanced
>>>> character count limits for hyphenation, there will likely be bugs elsewhere
>>>> due to truncation or wraparound in the arithmetic.
>>>>
>>>> FWIW,
>>>>
>>>> Doug McKenna
>>>>
>>>
>>>
>>> AFAICS, this would only become an issue if we allow \lefthyphenmin and
>>> \righthyphenmin to be given larger values; currently, they're limited to the
>>> range 0..63 (actually, it's possible to set the parameters to larger values,
>>> but they'll be clamped when stored in a language node).
>>>
>>> But I don't think that's needed here; surely there's no realistic use-case
>>> that requires setting the *-min values greater than 63, even when
>>> hyphenating 1000-letter Sanskrit words.
>>
>>
>> The contrary is true. These long strings are not single words; several words
>> are glued together. For instance, the Bhagavadgita contains श्रीभगवानुवाच, which
>> is in fact three words: श्री भगवान् उवाच. It makes no sense to increase the
>> values of \lefthyphenmin and \righthyphenmin.
>>>
>>>
>>>
>>> JK
>>>
>>>
>>
>>
>> Zdeněk Wagner
>> http://ttsm.icpf.cas.cz/team/wagner.shtml
>> http://icebearsoft.euweb.cz
>>
>>
>>>
>>>
>>> --------------------------------------------------
>>> Subscriptions, Archive, and List information, etc.:
>>>   http://tug.org/mailman/listinfo/xetex


