[tex-live] Re: ftp vs http
Nelson H. F. Beebe
beebe@math.utah.edu
Sat, 16 Nov 2002 17:01:47 -0700 (MST)
Roozbeh Pournader <roozbeh@sharif.edu> writes on Sat, 16 Nov 2002
10:09:19 +0330 (IRT):
>> ...
>> On Fri, 15 Nov 2002, Kaja P. Christiansen wrote:
>>
>> > The issue of accessing tlprod via http has come up on more than one
>> > occasion. Why not use ftp?
>>
>> Because FTP is not network friendly. It puts unnecessary load on the
>> network, the server, and the client. This is especially important
>> for very large files. (The possibility of a security breach of the
>> server because of a bad implementation is also higher, compared to
>> solid http servers like Apache.)
>> ...
I cannot let this misinformation pass unchallenged.
(1) ftp in fact puts less load on the network for smaller transfers,
since, unlike http, it is not stateless, and so requires fewer
round trips for communication.
For transfers of large amounts of data, both are comparable, since
essentially the same bytes are sent in one direction; this can be
readily confirmed by timing the grabbing of large files with
ncftpget and wget.
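The timing check can be reproduced against any mirror that serves the
same file by both protocols. The sketch below shows the pattern; the
host and path in the comments are placeholders, and the executable
lines time a local copy of a fixed-size payload as a network-free
stand-in for the same harness:

```shell
# The real comparison, run against a mirror carrying the same file
# over both protocols (ftp.example.org and the path are placeholders):
#
#   time ncftpget ftp://ftp.example.org/pub/big-file.tar.gz
#   time wget     http://ftp.example.org/pub/big-file.tar.gz
#
# Network-free stand-in: create a fixed-size payload and time copying
# it; the timing harness is identical.
dd if=/dev/zero of=payload.bin bs=1k count=512 2>/dev/null
time cp payload.bin payload.copy
cmp payload.bin payload.copy && echo "payloads identical"
```

Either way, the payload bytes delivered are the same; only the
protocol overhead around them differs.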
(2) The security-breach point is a red herring. Security problems
have been found in both Web and FTP daemons, from all vendors.
For example, a search at http://www.cert.org/ selecting categories
of
* Advisories
* Incident Notes
* Security Improvement Modules
* Vulnerability Notes
found 230 reports for Apache, and 80 for wu-ftpd (one of the more
widely-used FTP daemons, and the one that we have run for more
than a decade on ftp://ftp.math.utah.edu ==
ftp://ctan.math.utah.edu).
Statistics in a report that I prepared a few days ago showed that
we transfer about ten times as many files by http as by ftp, but
we transfer ten times as much data by ftp as by http. By
ftp, we transfer an average of 56GB/day (ranging from 14GB/day to
185GB/day), with some weeks having over a terabyte of traffic.
(3) Many ftp sites, including ours, support batch archive retrievals
of directory trees, something that is not usually possible with
http: a client has to rely on hypertext links to find files, since
the protocol provides no directory-listing service (and a server
with an index file present returns that file instead of a
listing). Thus a recursive wget invocation is NOT
equivalent to an ftp archive get.
By contrast, with ftp, I can do this:
% ncftp ftp://ftp.math.utah.edu
ncftp> cd /pub
ncftp> get bibnet.tar.gz
to retrieve the BibNet Project bibliography archive; the .tar.gz
file is created on-the-fly, and I could have instead asked for
.jar, .tar, .tar.Z, .tar.bz2, .trz, .tgz, .zip, or .zoo formats.
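The server-side packing can be sketched locally. The fragment below
builds a scratch directory named bibnet (a stand-in for the real
server-side tree, not its actual contents) and packs it on the fly
the same way:

```shell
# Scratch tree standing in for the server's bibnet directory
mkdir -p bibnet/doc
echo "BibNet Project bibliography archive (placeholder)" > bibnet/README
echo "sample entry" > bibnet/doc/sample.bib

# Pack the whole tree on the fly, as the server does for
# "get bibnet.tar.gz" when bibnet is a directory
tar czf bibnet.tar.gz bibnet

# Listing the members confirms the entire directory tree was captured
tar tzf bibnet.tar.gz
```

One request thus retrieves a whole tree; over http, each file would
need its own request, discovered through links.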
(4) With ftp, clients can do a directory listing and get time stamps
and file sizes. This is not usually possible with Web
connections, because that information is hidden from the user.
Good ftp clients preserve the time stamps, which is critically
important for filesystem mirroring.
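What a directory listing carries can be seen from a typical
Unix-style line returned by the ftp LIST command (the sample line
below is fabricated for illustration); the size and timestamp fields
are exactly what a Web connection usually hides:

```shell
# A typical Unix-style line returned by the ftp LIST command
# (fabricated sample, for illustration only)
line='-rw-r--r--  1 ftp ftp  1464494 Feb 27  1997 bibclean-2.11.3.tar.gz'

# Fields 5 through 9 carry the byte count, the modification
# timestamp, and the file name
size=$(echo "$line" | awk '{print $5}')
stamp=$(echo "$line" | awk '{print $6, $7, $8}')
name=$(echo "$line" | awk '{print $9}')
echo "$name: $size bytes, modified $stamp"
```

A mirroring client can compare these sizes and timestamps against
local copies and fetch only what has changed.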
(5) Many ftp servers support the "quote site index" command to locate
files. Here is an example:
% ncftp ftp://ftp.tex.ac.uk/
ncftp / > quote site index bibclean
index bibclean
NOTE. This index shows at most 20 lines. for a full list of files,
retrieve /pub/archive/FILES.byname
1997/02/27 | 18375 | biblio/bibtex/utils/bibclean/bibclean-2.11.3.tar-lst
1997/02/27 | 1464494 | biblio/bibtex/utils/bibclean/bibclean-2.11.3.tar.gz
(end of 'index bibclean')
The same sort of thing works at ftp://ftp.math.utah.edu/, but
alas, not at ftp://ftp.tug.org/. [Can someone get this working?
I can provide guidance if needed.]
-------------------------------------------------------------------------------
- Nelson H. F. Beebe Tel: +1 801 581 5254 -
- Center for Scientific Computing FAX: +1 801 581 4148 -
- University of Utah Internet e-mail: beebe@math.utah.edu -
- Department of Mathematics, 110 LCB beebe@acm.org beebe@computer.org -
- 155 S 1400 E RM 233 beebe@ieee.org -
- Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe -
-------------------------------------------------------------------------------