[tlbuild] zstd as package compressor

Norbert Preining preining at logic.at
Thu Mar 5 00:49:56 CET 2020


Hi Henri,

> > looks like by far the most time is spent in TLUtils::download_file.  The

One more thing I noticed that puzzled me:
2769	9153	9.70ms			  my $success = 0;
2770	9153	13.4ms			  for my $downtype (@downloader_trials) {
2771	12834	5.18ms			    if ($downtype eq 'lwp') {
2772	9153	21.3ms	9153	14063s	      if (_download_file_lwp($url, $dest)) {
# spent 14063s making 9153 calls to TeXLive::TLUtils::_download_file_lwp, avg 1.54s/call
2773	5472	5.61ms			        $success = $downtype;
2774	5472	8.36ms			        last;
2775					      }
2776					    }
2777	7362	8.82ms	7362	8.10ms	    if ($downtype eq "custom" || TeXLive::TLUtils::member($downtype, @{$::progs{'working_downloaders'}})) {
# spent  8.10ms making 7362 calls to TeXLive::TLUtils::member, avg 1µs/call
2778	3681	10.8ms	3681	16420s	      if (_download_file_program($url, $dest, $downtype)) {
# spent 16420s making 3681 calls to TeXLive::TLUtils::_download_file_program, avg 4.46s/call
2779	3681	16.0ms			        $success = $downtype;
2780	3681	37.2ms			        last;
2781					      }
2782					    }
2783					  }

Why are some files first tried with lwp (line 2772), which obviously
failed in quite a few cases, and then downloaded again with
whatever program is set?

This really hurts. The profile shows that an LWP download is quite fast
on average (the HTTP connection is already established, so there is
considerably less overhead), while the download programs (wget, curl)
need to set up a new connection for each file.
	lwp	1.54s/call
	other	4.46s/call
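
A rough cross-check of these averages, as a sketch only. The counts and
times are copied from the profile above; the 5472 successful LWP downloads
are read off line 2773. The "per successful download" figure assumes that
the LWP calls which did not succeed cost almost nothing, which is an
assumption and not something the profile shows directly.

```perl
use strict;
use warnings;

# Figures copied from the NYTProf output above.
my %lwp     = (calls => 9153, seconds => 14063, successes => 5472);
my %program = (calls => 3681, seconds => 16420);

# Average over all LWP calls, including the ones that did not succeed.
my $lwp_per_call = $lwp{seconds} / $lwp{calls};

# ASSUMPTION: unsuccessful LWP calls are (nearly) free, so all of the
# LWP time is attributed to the successful downloads.
my $lwp_per_success = $lwp{seconds} / $lwp{successes};

# Average over the wget/curl downloads.
my $program_per_call = $program{seconds} / $program{calls};

printf "lwp, per call:                %.2fs\n", $lwp_per_call;
printf "lwp, per successful download: %.2fs\n", $lwp_per_success;
printf "program, per call:            %.2fs\n", $program_per_call;
```

Under that assumption a successful LWP download costs about 2.57s rather
than 1.54s, which is still clearly faster than the 4.46s of the external
programs.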

Looking into the _download_file_lwp function
https://www.henrimenke.com/files/nytprof/TLUtils-pm-42-line.html#2793
I see that in 3671 cases, the check
	defined($::tldownload_server) && $::tldownload_server->enabled
failed. The numbers further down show this is due to
	->enabled
being false. In
https://www.henrimenke.com/files/nytprof/TLDownload-pm-75-line.html#109
we see that a download error occurred 6 times, so the error count
was increased each time.

So my guess is the following:
- first LWP is used for quite some time (5479)
- then at some point LWP threw errors (6 times)
- LWP gets disabled and is never re-enabled
- curl/wget is used for the rest of the session
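
The guessed sequence can be modelled in a few lines. This is only a
sketch: SketchDownloader, its methods and the limit of 5 errors are
made up for illustration (the limit is chosen so that the sixth error
disables the object) and are not taken from TLDownload.pm.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the persistent LWP connection object.
package SketchDownloader;

my $MAX_ERRORS = 5;    # ASSUMPTION: the sixth error disables the object

sub new {
    my ($class) = @_;
    return bless { enabled => 1, errors => 0 }, $class;
}

# $ok simulates whether the transfer itself would work.
# Returns 1 on success, 0 on an error or when already disabled.
sub get {
    my ($self, $ok) = @_;
    return 0 unless $self->{enabled};
    return 1 if $ok;
    $self->{enabled} = 0 if ++$self->{errors} > $MAX_ERRORS;
    return 0;
}

package main;

# Try LWP for every request and count which method ends up being used.
sub simulate {
    my @outcomes = @_;
    my $dl   = SketchDownloader->new;
    my %used = (lwp => 0, fallback => 0);
    for my $ok (@outcomes) {
        $used{ $dl->get($ok) ? 'lwp' : 'fallback' }++;
    }
    return \%used;
}

# 100 good transfers, 6 errors, then 100 transfers that would work again.
my $used = simulate((1) x 100, (0) x 6, (1) x 100);
printf "lwp: %d, fallback: %d\n", $used->{lwp}, $used->{fallback};
```

In this model the last 100 requests all end up on the fallback although
the connection would work again, which matches the pattern in the
profile.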

Indeed, we could do better here: if the LWP session breaks down, we
could shut it down and reinitialize it. That should be faster than
doing wget/curl downloads for the rest of the run after LWP collapsed once.
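
A minimal sketch of that idea, comparing "leave it disabled" with
"replace the dead session by a fresh one". SketchSession, run() and the
limit of 5 errors are hypothetical names and values for illustration,
not the existing TLUtils/TLDownload code, and a real reinitialization
could of course fail again if the server is still unreachable.

```perl
use strict;
use warnings;

# Hypothetical session object standing in for the LWP connection.
package SketchSession;

sub new {
    my ($class) = @_;
    return bless { errors => 0, alive => 1 }, $class;
}

# $ok simulates whether the transfer itself would work.
sub get {
    my ($self, $ok) = @_;
    return 0 unless $self->{alive};
    return 1 if $ok;
    $self->{alive} = 0 if ++$self->{errors} > 5;    # ASSUMED error limit
    return 0;
}

package main;

# Try the session first and fall back to wget/curl for a file it cannot
# deliver. With $reinit set, a dead session is replaced by a new one.
# Returns the number of files that went through the fallback.
sub run {
    my ($reinit, @outcomes) = @_;
    my $session   = SketchSession->new;
    my $fallbacks = 0;
    for my $ok (@outcomes) {
        next if $session->get($ok);
        $fallbacks++;
        $session = SketchSession->new if $reinit && !$session->{alive};
    }
    return $fallbacks;
}

# 100 good transfers, 6 errors, then 100 transfers that would work again.
my @outcomes = ((1) x 100, (0) x 6, (1) x 100);
printf "fallbacks without reinit: %d\n", run(0, @outcomes);
printf "fallbacks with reinit:    %d\n", run(1, @outcomes);
```

With reinitialization only the six failing files go through the
fallback in this model; without it, every later file does as well.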

Thanks for your testing, that was very insightful!

Best

Norbert

--
PREINING Norbert                              https://www.preining.info
Accelia Inc. + IFMGA ProGuide + TU Wien + JAIST + TeX Live + Debian Dev
GPG: 0x860CDC13   fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13

