Helmut Kopka's interpretation of the TDS

Michel LAVAUD twg-tds@tug.cs.umb.edu
Fri, 22 Nov 1996 13:09:54 +0100


To: twg-tds@tug.cs.umb.edu
Subject: Re: Helmut Kopka's interpretation of the TDS

> ML> I think TDS is adapted to Unix and not adapted to PC because of
> ML> (at least) two technical reasons, that are bound to each other:
> ML> 1 - there is no equivalent of "ln" Unix command, neither in DOS,
> ML> nor OS/2 nor Windows 95 nor Windows NT. Therefore, users can have
> ML> several TDS hierarchies if they have several disks.
> 
> Why do you need links? Do you want to turn a well-structured file
> system into a brush?
 
I do not want to enter into any quarrel about Unix and DOS file
systems or environment variables. I use both and I like both.
However, Unix filesystems are usually deep trees, while PC
filesystems usually consist of many shallow trees, one under the
root of each disk. That is simply a fact.
I agree it is possible to implement PC-like filesystems on Unix
machines and Unix-like filesystems on PCs. But this is not the
custom, and my analysis is that the difference is mainly due to
the existence of "ln" in Unix and its absence in DOS.
Maybe I am wrong, maybe there are other reasons.

Anyway, a filesystem is intimately linked with the OS or the
software used: on a hard disk, there is just a set of cylinders
and sectors, no filesystem. And for each OS, I think there are
habits that reflect "macroscopically" the internal "microscopic"
functioning of the OS. One can build a perfectly well-structured
tree of files out of 10000 files grouped into one single
directory, by using software with hypertext capabilities: HTML
browsers are one example, and the PC software I described in
articles at the TeX conferences in Prague and Aston is another.
The Unix OS is also such a piece of software, and DOS is another.

If you use Unix-like filesystems on a PC, there is a good chance
that something will break somewhere. The best-known reason is
the 8+3 filename limit of DOS, and this problem has been dealt
with by TDS. But there are others, and the problem with
environment variables is indirectly linked to that (still in my
analysis, you may disagree with it).
One can violate the habits on one's personal installation, when
one knows what one is doing. But when writing generic software,
meant to be usable on different machines over which you have no
control, it is best, in my opinion, to write it so as not to
violate the habits. Otherwise the software is more fragile, i.e.
even if it is completely bug-free, it can be broken more easily
by some manipulation on the user's part.

It is very difficult to write bug-free and efficient software,
with good and detailed documentation. I think emTeX falls into
this category. It works great as it is, and thousands of users
seem to be happy with it. So I do not want to go out of my way to
turn it into sloppy software just by trying to force it into
TDS. The situation may change if Eberhard changes his
hierarchy.

Add to this other, non-technical reasons, maybe unimportant for
implementors but essential for users: emTeX has about 300 pages
of well-written documentation. Try to look at it and find out how
many places would have to be changed. And the AsTeX Association
has translated all of it into French: the same problem,
multiplied by two. And for my own documentation (several hundred
pages) it is the same.

So, the result of switching to TDS would have been: much less
tested software, hundreds of pages of docs suddenly becoming
inaccurate, and longer directory paths leading to more fragile
installations. As I am not a die-hard fan of TDS, looking at it
with the eyes of Chimene for her lover, it seemed unreasonable to
me to switch to TDS. This is just a statement of the present
situation: everything is already done, on the net, on diskettes
and Plug & Play CD-ROM and in printed books. If Eberhard switches
to TDS I will certainly follow him, but my objections about the
length of environment variables would still stand.

> ML> 2 - Length allowed for environment variables is very short (127
> ML> characters in DOS and DOS emulators of OS/2 and Windows95/NT). 
> 
> Why do you need long environment variables?
> Does your TeX support no config files and no recursive searches?

The problem is not whether *I*, as an individual, use config
files or recursive search. The problem is that *users* can use
long environment variables; this is allowed. So your argument is
more or less the same as "to avoid car accidents, forbid people
to use their cars". It is a consequence of Murphy's law that hard
disks are almost always *full*, so that users might have several
TDS trees on their HD, and environment variables are very often
*too long* and truncated without notice. For environment
variables, the rule of thumb is: the shorter, the safer. TDS
makes some paths much longer than emTeX does, especially for
fonts.
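
To make the truncation risk concrete, here is a minimal Python
sketch (it is not part of emTeX or of any TeX distribution). It
joins a list of directories into a DOS-style search path and
reports which entries would be lost if the value were silently
cut at 127 characters, the limit quoted above. The function
name, the assumption that the limit applies to the value alone,
and the example directories are all illustrative assumptions.

```python
DOS_ENV_LIMIT = 127  # figure quoted in this message; assumed to apply to the value only


def lost_entries(dirs, limit=DOS_ENV_LIMIT):
    """Return the directories lost (or cut short) when the ';'-joined
    search path is silently truncated to `limit` characters."""
    value = ";".join(dirs)
    kept = value[:limit].split(";")
    # An entry survives only if it is still present, in full, at the same position.
    return [d for i, d in enumerate(dirs) if i >= len(kept) or kept[i] != d]


if __name__ == "__main__":
    # Hypothetical font directories: short emTeX-like names versus
    # longer TDS-like names for the same typefaces.
    short_style = [r"c:\emtex\tfm", r"d:\emtex\tfm", r"e:\mytfm"]
    long_style = [
        r"c:\texmf\fonts\tfm\public\cm",
        r"c:\texmf\fonts\tfm\public\latex",
        r"c:\texmf\fonts\tfm\ams\euler",
        r"d:\texmf\fonts\tfm\adobe\times",
        r"d:\texmf\fonts\tfm\adobe\courier",
    ]
    for name, dirs in (("short", short_style), ("long", long_style)):
        print(name, len(";".join(dirs)), lost_entries(dirs))
```

A check of this kind could be run by an installer before writing
the variable, so that the user gets a warning instead of a search
path that is quietly missing its last directories.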

And you forget that config files may have the same length limit
as environment variables. This is Eberhard's choice, as far as I
know, and it is the most reasonable one in my opinion, to avoid
bad mixtures of environment variables and config files (both
allowed for users). You forget also that recursive search often
makes things much slowwwwwer. Speed is a very important issue for
users. This is officially the reason why Sparc/Risc stations were
invented (although there may also be commercial reasons?).

BTW, is somebody planning to port TeX from C to Java, for
complete independence from platforms and to have only one set of
files, valid on all platforms? This might be a useful complement
to TDS, and would eliminate the multitude of binaries for various
Unix platforms which encumber the TeXLive CD-ROM, and which are
useless for PC and Mac users.

I would suggest that the TDS committee promote TDS not as
"the" filesystem that all implementors must adopt, willing or
unwilling, in their implementation (or they go to
hell/jail/gulag :-), but rather as the analogue of the Physics
Abstracts classification for scientific articles in physics:

A tree where developers of macros or programs can place their own
work, following the well-documented TDS description of the
various places in this tree, with several "keywords" (i.e.
directories, for TDS) proposed by the author to locate his work
in the tree. Then, for each platform, each implementor only has
to write an "index" to put files in his own tree, i.e. a script
consisting of instructions of the form "copy files in the TDS
tree into their corresponding place in the tree specific to the
implementation".
This is the mechanism I adopted for my P&P CD-ROM, to deal with
the files that are not ISO9660 compliant, and thus cannot be run
directly from the CD-ROM - nor renamed, except for storage
purposes.
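
Such an "index" could be sketched as follows, in Python. This is
only an illustration of the idea, not the script used on the P&P
CD-ROM: the mapping table, the directory names on both sides and
the function name are hypothetical, and files are flattened into
the target directory, so two files with the same name would
overwrite each other.

```python
import shutil
from pathlib import Path

# Hypothetical index: directory inside the TDS tree -> directory
# inside the implementation-specific tree.
INDEX = {
    "tex/latex": "texinput/latex",
    "fonts/tfm": "tfm",
}


def install_from_tds(tds_root, local_root, index=INDEX):
    """Copy every file found under each indexed TDS directory into the
    matching local directory. Returns the list of destination paths."""
    tds_root, local_root = Path(tds_root), Path(local_root)
    copied = []
    for tds_dir, local_dir in index.items():
        src_dir = tds_root / tds_dir
        if not src_dir.is_dir():
            continue  # this part of the TDS tree is not present
        for src in sorted(src_dir.rglob("*")):
            if src.is_file():
                # Flatten: supplier/typeface subdirectories are dropped.
                dest = local_root / local_dir / src.name
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dest)
                copied.append(dest)
    return copied
```

With this kind of table, the TDS tree stays the reference
classification, and each implementation keeps its own layout: for
example, install_from_tds("texmf", "emtex") would copy a file
texmf/fonts/tfm/public/cm/cmr10.tfm to emtex/tfm/cmr10.tfm.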

I think this proposal, which will probably appear as a big step
backwards to some (all?) people on this list, could in practice
be a step forward, transforming TDS from a source of confusion
for PC users (as it is now, in my opinion) into a source of
useful information. And it would be harmless for teTeX and
other implementations.

A last remark: I am pretty sure no physicist uses the Physics
Abstracts classification for his own bibliography: for his own
subject, it is not detailed enough; for subjects in which he is
not a specialist, it is far too detailed, and the classification
is very often ill-suited to his personal way of classifying
articles for easy retrieval. So, it would be more harmful than
useful for Science to force every physicist to adopt the PA
classification rather than his own.

But, once again, this is just my opinion, flames are welcome :-)
And please consider that my question about porting TeX to Java is
just a bad joke. I know it would make TeX unusable and sloppy, at
least in the present state of Java.

Michel Lavaud  (lavaud@univ-orleans.fr)