norbert at preining.info
Fri Apr 22 14:09:10 CEST 2022
please Cc, I am not subscribed to texhax.
On Thu, 21 Apr 2022, texhax-bounces at tug.org wrote:
> Today I started installing texlive and the install time will approximately
> be 3 hours and 30 minutes. It seems it is not sped up, compared to 5 years
> ago when I needed it as well.
Yes, that is long. There can be many reasons for that:
- you are not using LWP but wget/curl, for whatever reason.
LWP keeps one connection open and reuses it, which is considerably
faster than creating a new connection for every single package, which
amounts to about 6000 or so connections.
- your internet connection is bad, or hasn't improved considerably
compared to 5 years ago, while TL has increased considerably in size
- any combination of the above and other factors.
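
To illustrate the connection-reuse point, here is a minimal sketch in Python (the language proposed in the quoted mail), using only the standard library. It is not how the TeX Live installer does it (that is Perl with LWP); the function names `open_connection` and `fetch_many` are made up for this example, and it assumes the server supports HTTP keep-alive.

```python
import http.client
from urllib.parse import urlsplit


def open_connection(base_url, timeout=60):
    """Create one HTTP(S) connection object for the host named in base_url."""
    parts = urlsplit(base_url)
    cls = (http.client.HTTPSConnection if parts.scheme == "https"
           else http.client.HTTPConnection)
    return cls(parts.netloc, timeout=timeout)


def fetch_many(conn, paths):
    """Yield (path, body) for each path, sending every request over conn."""
    for path in paths:
        conn.request("GET", path)
        resp = conn.getresponse()
        body = resp.read()  # drain the response so the socket can be reused
        if resp.status != 200:
            raise RuntimeError(f"{path}: HTTP {resp.status}")
        yield path, body
```

With a hypothetical mirror, `conn = open_connection("https://mirror.example/")` followed by `for path, body in fetch_many(conn, paths): ...` would send all requests over one connection, as long as the server keeps it open.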
> I am wondering how complex it would be to speed up the installation.
Very complex, in particular because nobody has come up with a clear
indication where the bottleneck is.
> For example by leaving as much of the codebase as is but once the
> configuration is known (through the gui or cli) transferring the install
> control to a multithreaded / asynchronous script, for example in python,
> that installs the necessary packages and returns status updates.
That would of course help, but realizing it is not trivial, in
particular for a "spare time project". Furthermore, keep in mind that
the solution has to work on Windows, Mac, *BSD, Linux, ... - and that
also means that the programming language of choice has to be available
in all of them.
I considered Python back then - but it was non-standard and not
available by default in many cases. Since then my trust in Python as a
*stable* programming language has dropped even further (and I have
written quite a lot of Python code as an ML/AI engineer), since every
major version, and often even a minor version, breaks backward
compatibility. So my guess is that using Python would be a hell of a
problem producer rather than a solver. Throw in concurrency, which in
Python is in most cases fake anyway (due to the GIL).
> I have some experience in python, and python multithreading / asynchronous
> operations, so I would be glad to help with this.
Despite the above, we can still try it out! You can start any time; the
basics are trivial. The input is:
- a list of packages/package names
- a destination directory
and what the script has to do is:
- download the respective containers
- verify the checksum
- unpack them into the destination directory
and afterwards return reasonable error codes so that we can do
If you are interested, I will be more than happy to review some code
that implements this logic. Of course I can help with finding the
correct URLs etc etc.
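
As a starting point, the three steps above could be sketched roughly like this, using only the Python standard library. Everything specific here is an assumption for illustration: the URL layout `<base_url>/<name>.tar.xz`, the use of SHA-512 digests supplied by the caller, and the function names. Downloads and checksum verification run in a thread pool; unpacking is done one archive at a time in the main thread, so that no two threads write to the destination directory at once.

```python
import hashlib
import io
import tarfile
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from urllib.request import urlopen


def http_fetch(url):
    """Download one URL and return its bytes."""
    with urlopen(url, timeout=60) as resp:
        return resp.read()


def download_verified(name, base_url, expected_sha512, fetch):
    """Download <base_url>/<name>.tar.xz (assumed layout) and check its digest."""
    data = fetch(f"{base_url}/{name}.tar.xz")
    if hashlib.sha512(data).hexdigest() != expected_sha512:
        raise ValueError(f"{name}: checksum mismatch")
    return data


def unpack(data, dest):
    """Unpack an in-memory .tar.xz archive into dest."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:xz") as tar:
        if hasattr(tarfile, "data_filter"):
            tar.extractall(dest, filter="data")  # refuse unsafe members
        else:
            tar.extractall(dest)  # older Pythons without extraction filters


def install_all(packages, base_url, dest, fetch=http_fetch, workers=8):
    """Install packages (name -> expected SHA-512 hex digest) into dest.

    Returns a list of (name, message) pairs for the packages that failed.
    """
    Path(dest).mkdir(parents=True, exist_ok=True)
    failures = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        jobs = {pool.submit(download_verified, n, base_url, c, fetch): n
                for n, c in packages.items()}
        for job in as_completed(jobs):  # unpack one at a time, in this thread
            name = jobs[job]
            try:
                unpack(job.result(), dest)
            except Exception as exc:
                failures.append((name, str(exc)))
    return failures
```

A caller could turn the result into an exit code with something like `sys.exit(1 if failures else 0)`. Note that `urlopen` opens a fresh connection for every request, so this sketch does not address the connection-reuse point from earlier in this mail.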
Reminder - please cc, I am not subscribed.
All the best
PREINING Norbert https://www.preining.info
Mercari Inc. + IFMGA Guide + TU Wien + TeX Live
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13