[luatex] how many bytes for fontdimens?

Tue Aug 3 12:49:03 CEST 2021

On 8/3/2021 9:45 AM, jfbu wrote:
> Hi,
> 
>> On 3 Aug 2021, at 09:09, Hans Hagen <j.hagen at xs4all.nl> wrote:
>>
>> On 8/2/2021 9:37 PM, jfbu wrote:
>>> forgot to mention that I am aware a \fontdimen is limited to 2**30 
>>> strictly anyhow
>>> but my question is whether such « arrays » are stored 32bits or 
>>> 64bits itemwise
>> it happens to be an array of 32 bit integers (that grows on demand) 
>> but such implementation details are unspecified (could as well have 
>> been a sparse array in which case each entry that is actually set has more
>>
>> also, the fact that it grows is a sort of side effect of the fact that 
>> tfm fonts can have 7 or more parameters: 7 are used for text fonts, and 
>> more for math fonts
>>
>> so, i wouldn't rely on these properties too much
> 
> Thanks!
> 
> Reason I asked is because I contributed an Eratosthenes Prime sieve to a 
> github site comparing a whole bunch of languages and it is asked there to 
> specify whether the « arrays » use 1bit, 8bits, 32bits, or 64bits (or 
> « unknown ») per (potential) prime.
> 
> https://github.com/PlummersSoftwareLLC/Primes/blob/drag-race/CONTRIBUTING.md#flag-storage 

ha, i saw that one a few weeks ago (after some yt video) and to be 
honest it's one of those useless speed comparisons (can be fun, but 
useless as one compares languages with different objectives, and doing 
some prime stuff is hardly representative of real usage)

> I am using luatex for the benchmark because even setting pdftex’s 
> font_mem_size at its maximum TeXLive setting, the memory is at risk of 
> being exhausted on current personal computers, given the condition that 
> the benchmark must iterate for at least 5 seconds and 
> each iteration re-allocates a \fontdimen « array » (in the case at 
> hand about 500,000 entries are needed to sieve up to 1,000,000 and 
> memory will get exhausted before the 300th pass)
> 
> Also it seems luatex runs comparatively faster once the sieving range is 
> large enough (the instantiation step which requires extending 
> dynamically does take some time).

hm, just allocate the largest fontdimen you need first, that will make 
the fontdimen array grow at the beginning only (instead of at each step)
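
A minimal plain TeX sketch of that suggestion. The font name \sievefont, the odd "at" size (used only to get a fresh font instance that nothing typesets with) and the index 500000 are assumptions for illustration, not taken from the actual benchmark code:

```tex
% Sketch only: \sievefont, the size and the indices are illustrative.
% A slightly unusual size forces a new font instance used purely as storage.
\font\sievefont=cmr10 at 10.001pt
% Touch the highest index first: the array is extended once, up front.
\fontdimen 500000 \sievefont=0sp
% Later assignments stay inside the range that already exists.
\fontdimen 1 \sievefont=1sp
\fontdimen 499999 \sievefont=1sp
\bye
```

In Knuth TeX and pdftex the parameter array can only be extended for the most recently loaded font, so there the first assignment has to come before any other \font is loaded.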

> I will thus modify the « bit count » tag of my « solution » from unknown 
> to 32bits, thanks to your answer, knowing though that this remains 
> officially unspecified. But the Dockerfile which I was asked to include, 
> and which their benchmarking uses, pulls a texlive-minimal based image 
> dating back to 2018.
> 
> Perhaps someone here will be interested in contributing a genuine 
> luatex (i.e. using Lua) solution (my code uses only Knuth TeX; there is 
> also LaTeX3 code on the github site).

a lua solution in luatex is just a lua solution -)

> There is already at least one Lua contribution. I don’t know if a 
> genuine luatex would have to be categorized under « PrimeTeX » or 
> « PrimeLua » ...
> 
> ... in particular a LuaTeX genuine solution may have a way to use an 
> « array » not based on font dimension parameters.

mixing lua and tex will also introduce lua call overhead so there is no 
gain there (maybe let lua do the sqrt, but then you may as well do it all in lua)

my guess is that the sqrt is the bottleneck

fontdimens are actually not that slow because they are 
(1) global so no save stack overhead, and (2) directly accessible 
because they are part of the font structure (so no tex dimen access 
overhead)
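
The global behaviour in point (1) can be seen with a small plain TeX sketch. The font name \flagfont and the index 8 are made up for illustration:

```tex
% Sketch only: \flagfont is a made-up name for a font used as storage.
\font\flagfont=cmr10 at 10.002pt
\fontdimen 8 \flagfont=0sp   % extend by one slot right after loading
\begingroup
  \fontdimen 8 \flagfont=1sp % assigned inside a group ...
\endgroup
% ... but not restored at the group end: \fontdimen assignments are global.
\message{fontdimen 8 = \number\fontdimen 8 \flagfont sp}
\bye
```

The message should still report the value set inside the group, since nothing was pushed on the save stack.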

also, using etex \dimexpr is slower than the simple operators
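
For comparison, here is the same small calculation written both ways, as a sketch only. The register name \len is made up, and the e-TeX extensions are assumed to be enabled (they always are in luatex):

```tex
% Sketch only: \len is a made-up register name.
\newdimen\len
% e-TeX expression: convenient, but the expression has to be scanned.
\len=3pt
\len=\dimexpr \len*2 + 1pt\relax
\message{with dimexpr: \the\len}
% The same arithmetic with the classic register operators.
\len=3pt
\multiply\len by 2
\advance\len by 1pt
\message{with operators: \the\len}
\bye
```

Both variants leave the same value in \len; which one is faster in a tight loop is something to measure rather than assume.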

> One particular point I don’t know is whether LuaTeX would allow a 
> « faithful » solution: this seems to mean roughly a class-encapsulated 
> one (it is hard to understand what they precisely mean in their 
> guidelines), which I could not really emulate in my code due to the global 
> nature of fontdimen assignments.

hm, do you really need local?

if you use csnames, then you can also consider using \chardef's for 
numbers (these obey grouping)
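
A minimal plain TeX sketch of that idea. The "flag:<n>" naming scheme is an assumption made up for illustration:

```tex
% Sketch only: the "flag:<n>" naming scheme is made up for illustration.
\begingroup
  % Define \csname flag:7\endcsname as the constant 1, locally.
  \expandafter\chardef\csname flag:7\endcsname=1
  \message{inside the group: \number\csname flag:7\endcsname}
\endgroup
% The definition is gone once the group ends, so \csname yields \relax here.
\message{outside: \expandafter\meaning\csname flag:7\endcsname}
\bye
```

Unlike a \fontdimen assignment, the \chardef is undone at \endgroup, which is what a locally scoped sieve would need.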

> (I also experimented with a csname based approach but never could reach 
> comparable speed to fontdimen arrays; and this required extending other 
> parts of the memory)

in luatex csname is costly because of the serialization (pdftex is 
probably faster because there is no utf related overhead)

> Here is a link to how the various implementations sort out currently on 
> one specific machine:
> 
> https://plummerssoftwarellc.github.io/PrimeView/?sc=dt&sd=True&rc=30 
the lua solution they post is not only somewhat slow but also makes some 
(imo wrong, but who am i to claim) assumptions about how lua stores data 
so it was not that hard to make a variant that was over 200 times faster

because i have a relatively old laptop i can't compare with the numbers 
for e.g. c there (of course lua will be slower) but as i consider these 
shootouts useless anyway, i didn't want to spend more time on it (all 
that docker stuff and such) nor comment on the posted lua code (i never 
comment on code anyway, unless I know someone well and we can discuss 
specific issues out of mutual interest)

(messing with bits and storing efficiently in lua probably costs more 
than it saves, and the same might be true in tex)

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------