[tex-live] Re: ftp vs http

Nelson H. F. Beebe beebe@math.utah.edu
Sat, 16 Nov 2002 17:01:47 -0700 (MST)

Roozbeh Pournader <roozbeh@sharif.edu> writes on Sat, 16 Nov 2002
10:09:19 +0330 (IRT):

>> ...
>> On Fri, 15 Nov 2002, Kaja P. Christiansen wrote:
>> > The issue of accessing tlprod via http has come up on more than one
>> > occasion. Why not use ftp?
>> Because FTP is not network friendly. It puts unnecessary network load
>> on the network, the server, and the client. This is especially important
>> for very large files. (The possibility of a security breach of the server
>> because of a bad implementation is also higher, compared to solid http
>> servers like Apache.)
>> ...

I cannot let this misinformation pass unchallenged.

(1) ftp in fact puts less load on the network for smaller transfers,
    since, unlike http, it is not stateless, and requires fewer round
    trips for communication.

    For transfers of large amounts of data, both are comparable, since
    essentially the same bytes are sent in one direction; this can be
    readily confirmed by timing the grabbing of large files with
    ncftpget and wget.
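
    For readers without ncftpget and wget at hand, the same comparison
    can be sketched in Python.  This is only an illustration, not the
    measurement referred to above: the function name time_fetch is
    made up for this sketch, and it relies on the standard urllib
    module, which accepts both ftp:// and http:// URLs.

```python
import time
import urllib.request


def time_fetch(url, chunk_size=64 * 1024):
    """Fetch url, discard the data, and return (bytes_read, seconds).

    Hypothetical helper for this sketch: call it once with the ftp://
    URL and once with the http:// URL of the same large file, then
    compare the two results.
    """
    start = time.perf_counter()
    total = 0
    with urllib.request.urlopen(url) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:          # empty read means end of data
                break
            total += len(chunk)
    return total, time.perf_counter() - start
```

    Several runs per URL are advisable, since caching and transient
    network load can easily dominate a single measurement.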

(2) The security-breach point is a red herring.  Security problems
    have been found in both Web and FTP daemons, from all vendors.
    For example, a search at http://www.cert.org/ selecting categories

	* Advisories
	* Incident Notes
	* Security Improvement Modules
	* Vulnerability Notes

    found 230 reports for Apache, and 80 for wu-ftpd (one of the more
    widely-used FTP daemons, and the one that we have run for more
    than a decade on ftp://ftp.math.utah.edu).

    Statistics in a report that I prepared a few days ago showed that
    we transfer about ten times as many files by http as by ftp, but
    ten times as much data by ftp as by http.  By ftp, we transfer an
    average of 56GB/day (ranging from 14GB/day to 185GB/day), with
    some weeks having over a terabyte of traffic.

(3) Many ftp sites, including ours, support batch archive retrievals
    of directory trees, something that is not usually possible with
    http: it has to rely on hypertext links to find files, since the
    protocol itself provides no directory-listing service, and a Web
    server normally returns the index file, rather than a generated
    listing, when one is present. Thus a recursive wget invocation is
    NOT equivalent to an ftp archive get.

    By contrast, with ftp, I can do this:

	% ncftp ftp://ftp.math.utah.edu
	ncftp> cd /pub
	ncftp> get bibnet.tar.gz

    to retrieve the BibNet Project bibliography archive; the .tar.gz
    file is created on-the-fly, and I could have instead asked for
    .jar, .tar, .tar.Z, .tar.bz2, .trz, .tgz, .zip, or .zoo formats.
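
    The http side of point (3) can be illustrated with a small Python
    sketch of what a recursive http retrieval has to work from: only
    the links that appear in the pages it fetches.  The names
    LinkCollector and discover_links, and the example.org URL in the
    comment, are invented for this illustration; a file that exists in
    the directory but is not linked from the index page never shows up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the targets of <a href="..."> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def discover_links(html_text, base_url):
    """Return every URL reachable by a link in html_text, in page order."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    return collector.links


# Example with a made-up page: only a.txt and sub/ are discoverable,
# whatever else the server's directory actually holds.
# discover_links('<a href="a.txt">a</a> <a href="sub/">sub</a>',
#                "http://example.org/pub/")
```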

(4) With ftp, clients can do a directory listing and get time stamps
    and file sizes.  This is not usually possible with Web
    connections, because that information is hidden from the user.
    Good ftp clients preserve the time stamps, which is critically
    important for filesystem mirroring.
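
    As a rough sketch of what time-stamp preservation involves, the
    Python fragment below parses the reply to the FTP MDTM command
    (code 213 followed by a UTC time stamp of the form YYYYMMDDHHMMSS)
    and applies it to a local file.  It assumes that the server
    supports MDTM; the function names parse_mdtm and apply_timestamp
    are invented here, and this is not a description of how any
    particular ftp client does the job.

```python
import os
from datetime import datetime, timezone


def parse_mdtm(reply):
    """Convert an MDTM reply such as '213 20021116170147' to a UTC datetime.

    With the standard ftplib module, such a reply could be obtained
    with ftp.sendcmd("MDTM " + filename).
    """
    code, _, stamp = reply.strip().partition(" ")
    if code != "213":
        raise ValueError("unexpected MDTM reply: %r" % reply)
    stamp = stamp.split(".")[0]     # drop optional fractional seconds
    parsed = datetime.strptime(stamp, "%Y%m%d%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


def apply_timestamp(path, when):
    """Set the access and modification times of path to the given datetime."""
    seconds = when.timestamp()
    os.utime(path, (seconds, seconds))
```

    A mirroring script would call apply_timestamp() after each
    download, so that later runs can compare local and remote time
    stamps and skip files that have not changed.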

(5) Many ftp servers support the "quote site index" command to locate
    files.  Here is an example:

	% ncftp ftp://ftp.tex.ac.uk/
	ncftp / > quote site index bibclean
	index bibclean
	NOTE. This index shows at most 20 lines. for a full list of files,
	retrieve /pub/archive/FILES.byname
	1997/02/27 |      18375 | biblio/bibtex/utils/bibclean/bibclean-2.11.3.tar-lst
	1997/02/27 |    1464494 | biblio/bibtex/utils/bibclean/bibclean-2.11.3.tar.gz
	 (end of 'index bibclean')

    The same sort of thing works at ftp://ftp.math.utah.edu/,  but
    alas, not at ftp://ftp.tug.org/.  [Can someone get this working?
    I can provide guidance if needed.]
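
    Output in the format shown above is easy to process mechanically.
    The following Python sketch assumes only the three-field layout
    visible in that sample (date | size | path); the function names
    are invented for the illustration, and other servers may well
    format their index output differently.

```python
from datetime import date


def parse_index_line(line):
    """Parse one 'YYYY/MM/DD | size | path' line; return None otherwise."""
    parts = [field.strip() for field in line.split("|", 2)]
    if len(parts) != 3:
        return None                 # banner, note, or trailer line
    stamp, size, path = parts
    try:
        year, month, day = (int(x) for x in stamp.split("/"))
        return date(year, month, day), int(size), path
    except ValueError:
        return None                 # fields present but not in the expected form


def parse_index_output(text):
    """Return (date, size, path) tuples for every entry line in text."""
    entries = []
    for line in text.splitlines():
        entry = parse_index_line(line)
        if entry is not None:
            entries.append(entry)
    return entries
```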

- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- Center for Scientific Computing       FAX: +1 801 581 4148                  -
- University of Utah                    Internet e-mail: beebe@math.utah.edu  -
- Department of Mathematics, 110 LCB        beebe@acm.org  beebe@computer.org -
- 155 S 1400 E RM 233                       beebe@ieee.org                    -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe  -