[tlbuild] wget in TL now needs https

Henri Menke henri at henrimenke.de
Tue Apr 27 09:01:34 CEST 2021


On 22/04/21, 16:36, Karl Berry wrote:
> Mojca, like all of us, I am just struggling to do the best we can to
> support as many people as we can.
> 
> I don't think it's appropriate to ask CTAN to expend their time
> maintaining a separate list of http mirrors purely for TL download
> purposes on systems that don't have an lwp or wget or curl that supports
> ssl. Feel free to ask them yourself if you wish.
> 
> I have no problem with leaving any or all of our TL wget binaries that
> only support http in place so that, as you say, users can explicitly
> give an http (or ftp for that matter) mirror.
> 
> We could pass --no-check-certificate when calling our tl-provided wget
> to avoid maintaining a cert database (which would be untenable). I am
> somewhat surprised that we do not already do that.

I see another possible issue here, which is HSTS (HTTP Strict Transport
Security). If the HTTP header

    strict-transport-security: max-age=<NUMBER>;

is present in a response received over HTTPS, it tells the client to
connect to that server ONLY through HTTPS for the next <NUMBER> seconds.
If you try to access the server through HTTP during that time, a
compliant client upgrades the request to HTTPS itself, and if the HTTPS
connection cannot be established it should produce an error instead of
falling back to HTTP.
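
To make the <NUMBER> a bit more tangible, here is a minimal sketch that
pulls max-age out of a single header line and converts it to days. The
helper name hsts_max_age is made up for illustration and is not part of
curl, wget, or TeX Live.

```shell
# hsts_max_age HEADER_LINE
# Hypothetical helper: print the max-age value (in seconds) of one
# Strict-Transport-Security header line; print nothing for other headers.
hsts_max_age() {
    # Header and directive names are case-insensitive, so lowercase first.
    printf '%s\n' "$1" | tr -d '\r' | tr 'A-Z' 'a-z' |
        sed -n '/^strict-transport-security:/s/.*max-age="\{0,1\}\([0-9][0-9]*\).*/\1/p'
}

hdr='strict-transport-security: max-age=31536000'
secs=$(hsts_max_age "$hdr")
echo "$secs seconds = $((secs / 86400)) days"
# prints: 31536000 seconds = 365 days
```

So a max-age of 31536000 pins a client to HTTPS for a full year after
the last time it saw the header.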

I determined that the following mirrors employ HSTS using a handy bash
"one-liner":

    $ curl -sSfL https://ctan.org/mirrors |
    > sed -n '/ *<a href="\([^"]*\)">https<\/a>/s//\1/gp' |
    > while read -r mirror; do
    >     if curl -s -I -X GET "$mirror" | grep strict-transport-security; then
    >         echo "$mirror"
    >     fi
    > done
    strict-transport-security: max-age=31536000;
    https://mirror.marwan.ma/ctan/
    strict-transport-security: max-age=63072000; preload
    https://mirrors.nju.edu.cn/CTAN/
    strict-transport-security: max-age=31536000
    https://mirrors.tuna.tsinghua.edu.cn/CTAN/
    strict-transport-security: max-age=63072000; includeSubdomains; preload
    https://mirror.unpad.ac.id/ctan/
    strict-transport-security: max-age=31536000
    https://mirror.datacenter.by/pub/mirrors/CTAN/
    strict-transport-security: max-age=31536000
    https://ctan.kako-dev.de/
    strict-transport-security: max-age=15768000; includeSubDomains
    https://ctan.mc1.root.project-creative.net/
    strict-transport-security: max-age=31536000
    https://mirror.dogado.de/tex-archive/
    strict-transport-security: max-age=15552000; includeSubDomains; preload
    https://ctan.javinator9889.com/
    strict-transport-security: max-age=31536000
    https://mirrors.chevalier.io/CTAN/
    strict-transport-security: max-age=15768000; includeSubDomains; preload
    https://ctan.ijs.si/tex-archive/
    strict-transport-security: max-age=63072000; includeSubDomains
    https://ctan.math.illinois.edu/
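
The grep in the loop above is case-sensitive, so a mirror that sends
"Strict-Transport-Security" in mixed case (common with HTTP/1.1) would
be missed. A slightly more defensive variant might look like the sketch
below. The function names has_hsts and survey_hsts and the 10 second
timeout are my own choices for illustration, and I have not rerun the
survey with it, so the list above may not be complete.

```shell
# has_hsts: hypothetical helper. Reads raw response headers on stdin and
# prints the HSTS header line(s); exits non-zero if there are none.
has_hsts() {
    tr -d '\r' | grep -i '^strict-transport-security:'
}

# survey_hsts: reads mirror URLs on stdin, one per line, and prints each
# mirror that sends an HSTS header, followed by the header itself.
survey_hsts() {
    while read -r mirror; do
        # -o /dev/null discards the body, -D - dumps the headers to stdout,
        # -m 10 gives up on unresponsive mirrors after 10 seconds.
        if hdr=$(curl -sS -m 10 -o /dev/null -D - "$mirror" | has_hsts); then
            printf '%s\n    %s\n' "$mirror" "$hdr"
        fi
    done
}

# Usage (needs network access), with the same sed filter as above:
#   curl -sSfL https://ctan.org/mirrors |
#       sed -n '/ *<a href="\([^"]*\)">https<\/a>/s//\1/gp' |
#       survey_hsts
```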

In my personal opinion, all HTTPS sites should use HSTS, and connecting
over HTTPS without a valid certificate should simply fail.  With
--no-check-certificate as the default, HTTPS is just worthless and not
worth the hassle.

That said, why do CTAN mirrors even need HTTPS? The tlmgr database is
signed and the signature is checked before doing anything, so even if
someone managed to MITM a mirror, there is no way to inject malicious
binaries, because the attacker does not have the signing key.
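
Roughly, the chain of trust is: signature, then database, then the
checksums of the individual packages recorded in that database. Below is
a minimal sketch of the last link only, assuming SHA-512 checksums and a
system sha512sum. The helper name verify_sha512 is hypothetical, and the
file names in the gpg comment reflect my understanding of the mirror
layout rather than anything tlmgr-internal.

```shell
# The first link would be a detached-signature check along the lines of
#   gpg --verify texlive.tlpdb.sha512.asc texlive.tlpdb.sha512
# (file names are an assumption). It is not run here because it needs
# the TeX Live public key and network access.

# verify_sha512 FILE EXPECTED_HEX
# Hypothetical helper: succeed only if FILE hashes to EXPECTED_HEX.
verify_sha512() {
    actual=$(sha512sum "$1" 2>/dev/null | cut -d ' ' -f 1)
    # An unreadable file yields an empty hash, which must never match.
    [ -n "$actual" ] && [ "$actual" = "$2" ]
}

# Demo with a throwaway file standing in for a downloaded package.
tmp=$(mktemp)
printf 'pretend package contents\n' > "$tmp"
good=$(sha512sum "$tmp" | cut -d ' ' -f 1)

verify_sha512 "$tmp" "$good" && echo "checksum ok"
printf 'tampered\n' >> "$tmp"
verify_sha512 "$tmp" "$good" || echo "checksum mismatch, rejecting file"
```

A man in the middle can change the file, but cannot change the expected
checksum without also forging the signature on the database.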

The only reason I can think of is privacy, because when I request

    http://mirrors.ctan.org/graphics/pgf/contrib/pgfplots/doc/pgfplots.pdf

over HTTP, any intermediary will be able to see the full request,
whereas over HTTPS, the path will be encrypted and only the host name

    mirrors.ctan.org

will be visible in clear text (through DNS and the TLS handshake).

Cheers, Henri

> tlmgr tries (system) LWP first, then (system) curl, before (either
> system or tl) wget. Although it may test for the presence of all of
> them, that's the order in which they'll actually be used, as far as I
> know.
> 
> Users can always specify their own programs, their own arguments, and/or
> adjust preferences using the envvars TEXLIVE_DOWNLOADER,
> TL_DOWNLOAD_PROGRAM, TL_DOWNLOAD_ARGS, TEXLIVE_PREFER_OWN.
> 
> https://tug.org/texlive/doc/tlmgr.html#ENVIRONMENT-VARIABLES
> (or http :) --best, karl.

