[luatex] fio library byte order
Reinhard Kotucha
reinhard.kotucha at web.de
Mon Jun 29 19:45:17 CEST 2020
On 2020-06-28 at 13:52:56 +0200, Hans Hagen wrote:
> On 6/28/2020 3:26 AM, Reinhard Kotucha wrote:
>
> > > that adds passing parameters and checking them for each call
> > > ... you can then as well use lua's 'read' function and
> > > convert with string.byte/char which is then about equally
> > > fast
> >
> > This is what I actually did. It took 14 s to process a PNM file,
> > way too much if I have to process hundreds or thousands of files. I
> > ported the script to C and could process the file within 270 ms.
> > I can't imagine that obeying a variable in C can slow down
> > everything so much.
>
> how big a file ... also, i bet you do more than just reading, you
> don't define what 'process' is (270 ms for 100K files is still not
> fast I guess)

96 MiB per file. Processing means applying a lookup table and a 3×3
color matrix, both quite inexpensive operations. What takes most of the
time is extracting single bytes with string.sub() and converting them
to integers. Finally I have to convert everything back to uint16.

In C I convert to host byte order with ntohs(3) and access the color
triplets by pushing a pointer around. In both cases I read the file
line by line (30024 bytes per line).

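
The C approach described above might look roughly like the following
sketch. It is not the actual program: the function name process_row,
the helper clamp16, the flat 9-element matrix layout, and the order of
operations (lookup table first, then matrix) are all assumptions made
for illustration. It also assumes 16-bit samples stored in network
(big-endian) byte order, three per pixel, so that a 30024-byte line
would hold 5004 pixels.

```c
#include <arpa/inet.h>  /* ntohs(), htons() (POSIX) */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round and clamp a value to the uint16 range. */
static uint16_t clamp16(double v)
{
    if (v <= 0.0)     return 0;
    if (v >= 65535.0) return 65535;
    return (uint16_t)(v + 0.5);
}

/* Process one row of big-endian 16-bit RGB triplets in place.
 * lut: 65536-entry lookup table (assumed to be applied first).
 * m:   3x3 color matrix, row-major, 9 entries (assumed layout). */
void process_row(uint16_t *row, size_t npixels,
                 const uint16_t *lut, const double *m)
{
    uint16_t *end = row + 3 * npixels;

    /* Push a pointer along the row, one color triplet at a time. */
    for (uint16_t *p = row; p < end; p += 3) {
        /* Convert to host byte order, then look up. */
        double r = lut[ntohs(p[0])];
        double g = lut[ntohs(p[1])];
        double b = lut[ntohs(p[2])];

        /* Apply the matrix and store back in network byte order. */
        p[0] = htons(clamp16(m[0] * r + m[1] * g + m[2] * b));
        p[1] = htons(clamp16(m[3] * r + m[4] * g + m[5] * b));
        p[2] = htons(clamp16(m[6] * r + m[7] * g + m[8] * b));
    }
}
```

A caller would read one line into a buffer with fread(3), pass it to
process_row() with the pixel count, and write the buffer out again.
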
> > I'm not very familiar with C programming. You say that it's expensive
> > to pass arguments to a function. What I had in mind is that functions
> > obey a global variable at runtime which denotes whether byte order
> > conversion is necessary or not.
>
> passing variables in c is no issue (also because compilers are
> smart enough to deal with it)
>
> a global variable would not work because one can read several files
> at the same time interleaved with different properties
>
> i'm talking of picking up some optional argument passed by lua
> (passed on stack, checking needed, etc)
>
> anyway, there's nothing wrong with writing and using a c program if
> that is more suitable esp when you need to process that many files
> ... opening closing in lua is slower than in c, as is storing all
> your read bytes in lua variables (and i'm not even talking about
> the fact that a file metatable has to be looked up and type being
> checked for every read) plus some garbage collection every now and
> then
>
> as you can compile c, you can also write a dedicated library and add
> that to luatex (assuming you need to do this runtime from luatex)
>
> (you could consider using ffi)
Thanks for the info. I wasn't aware that reading bytes into lua
variables is expensive too. Maybe it's better indeed to stay with C.
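
The objection to a global byte-order variable quoted above (several
files may be read interleaved, each with different properties) suggests
keeping the flag with each open file instead. The following is only a
sketch of that idea in plain C, not the fio library's implementation;
the type u16_reader and the function read_u16 are hypothetical names.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical reader type: byte order is a property of each open
 * file rather than a global setting. */
typedef struct {
    FILE *fp;
    int   big_endian;  /* nonzero: most significant byte comes first */
} u16_reader;

/* Read one unsigned 16-bit integer.
 * Returns 1 on success, 0 on end of file or read error.
 * The value is assembled from single bytes, so the result does not
 * depend on the byte order of the host. */
int read_u16(u16_reader *r, uint16_t *out)
{
    int a = getc(r->fp);
    int b = getc(r->fp);

    if (a == EOF || b == EOF)
        return 0;

    *out = r->big_endian ? (uint16_t)((a << 8) | b)
                         : (uint16_t)((b << 8) | a);
    return 1;
}
```

Two readers with different big_endian settings can then be used in
alternation without affecting each other.
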
> I downloaded the 3.7 GB texlive iso and read integers from that one
>
> -- 360 sec : one byte integers + counting
> -- 224 sec : two byte integers + counting
> -- 166 sec : four byte integers + counting (160 no counting)
>
> But that's a lot of lua calls.
Reading 96 MB as two-byte integers would then take about 6 seconds
(96 MB at roughly 16.5 MB/s), much more than I expected.
> Then I downloaded the tug logo from the website
>
> -- string : .55 sec for 1000 times (including opening / loading)
> -- file : .67 sec for 1000 times (including opening / loading)
>
> So, that's milliseconds per file.
>
> Finally I processed the 3414 files in the 268M context distribution and
> read 2 byte integers from those till end of file which took 15 seconds
> for the lot. So, no complaints from my end.
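This means that file opening is quite fast. The throughput for many
small files is about the same as for one large file:
3700 MB / 224 s = 16.518 MB/s
268 MB / 15 s = 17.867 MB/s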
> I think it's not the file handling that is your bottleneck.
Yes, thanks for the info.
Regards,
Reinhard
--
------------------------------------------------------------------
Reinhard Kotucha Phone: +49-511-3373112
Marschnerstr. 25
D-30167 Hannover mailto:reinhard.kotucha at web.de
------------------------------------------------------------------