
Tutorial on file names and usage with IBM systems



On Tue, 5 Mar 1991 10:03 PST Don Hosek (DH>), Nelson Beebe (NB>), and
others wrote about file naming conventions under various operating
systems.  The information for the IBM systems VM/CMS and MVS contained
some minor errors.
 
Along with the file naming conventions comes the issue of accessing the
file from programs.  This is more of a concern for how fonts are grouped
and how searches are performed than it is for the names themselves, but
I feel that it is a reasonable topic for discussion on this list.  If the
topic is to be discussed, it would probably be beneficial to have the site
coordinators for different systems/environments join the list and take part.
 
NB> IBM CMS allows 8 + 4;
DH> 8+8 actually. No subdirectories.
 
VM/CMS does have directories in a sense.  Under VM, disk space is divided
into "mini-disks."  Mini-disks are assigned to a particular userid, but
various options exist for sharing them with other users.  Of interest here
is that a TeX account could own many disks (for different sets of fonts or
something) with the ability for other users to access the disks in read-
only mode.
 
To use a mini-disk, it is necessary for a user to link to and access it.
(This is typically done in an EXEC file, but can also be done from a
program.)  When accessing the mini-disk, the user (or EXEC) assigns a
single letter of the alphabet to refer to files on that mini-disk.  Thus,
a user may have up to 26 mini-disks accessed at any moment.  A reference
to a particular file would be written something like:
 
   CMR10 300PK A
 
which refers to a 300 dpi PK file for font cmr10 residing on the disk
currently accessed as disk A.  (The final letter, called a filemode,
can also be followed by a single digit from 0 through 6 indicating a
type of file.  However, this is not relevant to the discussion here.)
 
When a new disk is accessed, most EXECs assign the next unused letter in
the alphabet to it.  Thus, a mini-disk which is accessed as disk C by
one user might be accessed as disk F by another.  Because of this, CMS
allows a reference to a file to be written as:
 
   CMR10 300PK *
 
which calls for a 300 dpi PK file for cmr10 from the first mini-disk
in alphabetical order which contains the file.
 
Most programs use a filemode of * so that all accessed mini-disks are
searched.  An advantage of this is that the order in which the disks are
searched can be controlled by the letters assigned as the filemodes;
someone working on, say, tuning the blacker and fillin values for the
cm fonts could place the newly-created GF or PK file on his or her A
disk.  The new files would then be used in place of an older version
of the font.
 
By dividing fonts into different mini-disks based on the output device,
or font foundry, or something, mini-disks serve the same function as
subdirectories on other systems.  Use of * as a filemode provides a
user-controllable search path mechanism which is external to programs.
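
To make the search order concrete, here is a small model of the lookup in
Python.  It is only a sketch of the behavior described above, not CMS code:
the function name, the dictionary of accessed disks, and the sample files
are all invented for illustration.

```python
def resolve_cms_file(accessed, filename, filetype, filemode="*"):
    """Return the filemode letter of the disk that supplies the file, or None.

    `accessed` maps a filemode letter to the set of (filename, filetype)
    pairs found on the mini-disk accessed under that letter (a made-up
    stand-in for the real disk directories).
    """
    key = (filename.upper(), filetype.upper())
    if filemode == "*":
        letters = sorted(accessed)        # search A, B, C, ... in order
    else:
        letters = [filemode.upper()]      # search only the named disk
    for letter in letters:
        if key in accessed.get(letter, set()):
            return letter
    return None


# Hypothetical set-up: a shared font disk accessed as C, and a user's own
# A disk holding a freshly tuned copy of cmr10.
accessed = {
    "C": {("CMR10", "300PK"), ("CMBX10", "300PK")},
    "A": {("CMR10", "300PK")},
}

print(resolve_cms_file(accessed, "CMR10", "300PK"))    # found on the A disk
print(resolve_cms_file(accessed, "CMBX10", "300PK"))   # falls through to C
```

With a filemode of *, the copy on the A disk shadows the one on the shared
disk, while fonts that exist only on the shared disk are still found.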
 
 
DH> MVS allows 8*8 (that is their can be up to eight parts of the
DH> file name, each eight characters long. No element may start with
DH> a number).
 
Hmm.  Mostly correct, but with an important aspect not mentioned.
 
Dataset names under MVS consist of up to eight levels of up to eight
characters each.  The first character of each level must be a letter, @,
#, or $ while the others can be letters, digits, @, #, or $.  The dataset
name is written with the levels separated by periods.  The whole
name including periods cannot exceed 44 characters.
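
The rules above can be written down as a short check.  This is a sketch in
Python that encodes only the rules as I have stated them; the function name
is invented, and I am assuming names are given in upper case.

```python
import re

# One level: 1 to 8 characters; the first is a letter, @, #, or $,
# and the rest may also be digits.  Upper case is assumed.
LEVEL = re.compile(r"[A-Z@#$][A-Z0-9@#$]{0,7}")


def is_valid_dataset_name(name, max_levels=8, max_length=44):
    """Check a dataset name against the rules described in the text."""
    if len(name) > max_length:            # the limit counts the periods too
        return False
    levels = name.split(".")
    if len(levels) > max_levels:
        return False
    # Every level must match the pattern in full (an empty level fails).
    return all(LEVEL.fullmatch(level) is not None for level in levels)


print(is_valid_dataset_name("USR.X066.TX.FONTS.XEROX.PK300"))
print(is_valid_dataset_name("FONTS.300PK"))   # a level starts with a digit
```

Note that the 44 character limit is the tighter constraint: eight levels of
eight characters plus seven periods would come to 71 characters.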
 
Under MVS, each dataset occupies a whole number of tracks on disk.  For
the disks we use here, this means that datasets come in increments of
48K byte allocations!  For things like TFM files, this would be a gross
waste of disk space.  This leads to the aspect not mentioned: "partitioned
datasets" (or PDSs).
 
A PDS is simply a library of multiple "members."  In a sense, it is like
a subdirectory.  Member names follow the standard MVS pattern of being
from one to eight characters with the first being a letter, @, #, or $ and
the remaining being letters, digits, @, #, or $.  Since multiple members
can be written to the same track, this results in a considerable savings
of disk space.
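
A rough calculation shows the scale of the saving.  The Python sketch below
uses the 48K track size mentioned above; the file count and file size are
made-up figures, and the PDS model is deliberately simplified (members
packed end to end, with no allowance for the PDS directory or block
overhead).

```python
import math

TRACK_BYTES = 48 * 1024   # track size quoted in the text for our disks


def separate_datasets_bytes(file_sizes):
    """Space used if every file is its own dataset (whole tracks each)."""
    return sum(max(1, math.ceil(size / TRACK_BYTES)) * TRACK_BYTES
               for size in file_sizes)


def packed_pds_bytes(file_sizes):
    """Space used if the files are members of one PDS.

    Simplifying assumption: members are packed end to end and only the
    PDS as a whole is rounded up to a whole number of tracks.
    """
    return max(1, math.ceil(sum(file_sizes) / TRACK_BYTES)) * TRACK_BYTES


# Hypothetical figures: 75 TFM files of 1,500 bytes each.
tfm_sizes = [1500] * 75
print(separate_datasets_bytes(tfm_sizes))   # 75 tracks
print(packed_pds_bytes(tfm_sizes))          # 3 tracks
```

Under these assumptions the separate datasets take 75 tracks (3,686,400
bytes) while the single PDS takes 3 tracks (147,456 bytes).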
 
When datasets or members are referenced from within a program, the dataset
name is typically *not* used.  Instead, the dataset is associated (externally
to the program) with a one-to-eight character "DD name."  A program then
refers either to the DD name alone or to the DD name and a member name.
(From the program's point of view, this makes names under MVS 8+8.)  The
DD name is fixed, but the program can select different member names as
necessary.  The exact syntax varies with
the programming language and library routines used, but typical syntax
would be something like:
 
   PK300(CMR10)
 
which references member CMR10 from the dataset defined under the DD name
of PK300.  The actual dataset name might be something like:
 
   USR.X066.TX.FONTS.XEROX.PK300
 
Although it is possible to access datasets and members by dataset name
dynamically within a program (i.e., without pre-defining them externally),
this would normally not be done when working with the various font files:
it is less efficient and it disables an external search path capability.
 
A search path capability exists in that up to sixteen PDSs may be
associated with a single DD name.  When the system searches for a given
member, it searches the directory of each PDS in succession and selects
the first matching member that it finds.  This gives MVS an external search path
capability similar to that available under CMS.
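
The same kind of model works for the MVS side.  The Python sketch below is
only an illustration of the search described above, not of any real MVS
interface: the function names, the table of DD names, and the dataset name
USR.X066.TX.TEST.PK300 are invented, and the DDNAME(MEMBER) form is just the
typical syntax shown earlier.

```python
import re

MAX_CONCATENATED_PDS = 16   # limit stated in the text


def parse_reference(ref):
    """Split 'DDNAME(MEMBER)' into (ddname, member)."""
    match = re.fullmatch(r"([^()]+)\(([^()]+)\)", ref.strip())
    if match is None:
        raise ValueError(f"expected DDNAME(MEMBER), got {ref!r}")
    return match.group(1), match.group(2)


def find_member(dd_table, ref):
    """Return the name of the first PDS under the DD name holding the member.

    `dd_table` maps a DD name to an ordered list of
    (dataset name, set of member names) pairs.
    """
    ddname, member = parse_reference(ref)
    concatenation = dd_table[ddname]
    if len(concatenation) > MAX_CONCATENATED_PDS:
        raise ValueError("too many PDSs associated with one DD name")
    for dataset_name, directory in concatenation:
        if member in directory:
            return dataset_name       # first match wins
    return None


# Hypothetical set-up: a test library placed ahead of the production one.
dd_table = {
    "PK300": [
        ("USR.X066.TX.TEST.PK300", {"CMR10"}),
        ("USR.X066.TX.FONTS.XEROX.PK300", {"CMR10", "CMBX10"}),
    ],
}

print(find_member(dd_table, "PK300(CMR10)"))    # taken from the test library
print(find_member(dd_table, "PK300(CMBX10)"))   # taken from the production one
```

The program only ever names PK300 and a member; which libraries stand behind
that DD name, and in what order, is decided outside the program.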
 
---Tom Reid