Keno Wehr wehr at
Wed May 15 22:40:24 CEST 2019

On 15.05.19 at 19:53, Arthur Reutenauer wrote:
> On Tue, May 14, 2019 at 10:55:32PM +0200, Keno Wehr wrote:
>> Is it possible to adapt patgen for such huge lists?
>    If you’re able to compile patgen yourself, it should be enough to
> change trie_size and triec_size in, currently set to
> 10,000,000 and 5,000,000 respectively.  It is possible that the
> percentages will still look silly because they’re computed as
> 	100 * good_count / ((double) good_count + miss_count)
> so that the numerator could result in an integer overflow considering
> the orders of magnitude we’re talking about: with 11 million entries,
> good_count could easily be over 22 million, which multiplied by a
> hundred will be more than can fit in a signed 32-bit integer.

Thank you for your advice. I will give it a try.
It is perhaps better to use brackets in the calculation to avoid the overflow:

	100 * (good_count / ((double) good_count + miss_count))
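
A minimal sketch of why the brackets help, assuming good_count and miss_count are signed 32-bit integers (the actual types in the patgen source may differ). The function names are hypothetical and not part of patgen. Without brackets, 100 * good_count is evaluated first in integer arithmetic; with brackets, the division happens first in double, so the result stays between 0 and 100.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not from patgen): reports whether the integer
 * product 100 * good_count would exceed the range of a signed 32-bit
 * int. The check itself is done in 64 bits so it cannot overflow. */
bool numerator_overflows_int32(int32_t good_count)
{
    return (int64_t)100 * good_count > INT32_MAX;
}

/* Percentage computed with the bracketed form: the division is carried
 * out in double first, and only the resulting ratio (at most 1.0 for
 * non-negative counts) is multiplied by 100.
 * Assumes good_count + miss_count is not zero. */
double percent_good(int32_t good_count, int32_t miss_count)
{
    return 100 * (good_count / ((double) good_count + miss_count));
}
```

With good_count at 22 million, numerator_overflows_int32 reports an overflow, while percent_good still returns a sensible value. Writing 100.0 instead of 100 in the unbracketed expression would have the same effect, since the product would then be computed in double as well.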

> I am
> however not able to test it myself because the public repository for
> Classical Latin hyphenation currently only produces a list of a little
> over 2 million entries (I suppose you’re running patgen from the script
> in

The correct location is
All you need is the script "" (and lua5.3 installed).
Unfortunately, I had not pushed the most recent change, which extends the
list by a factor of 4. I have done that now.

