This document describes how to install and use the programs in the Web2c implementation of the TeX system, especially for Unix systems. It corresponds to Web2c version 7.5.7, released in July 2008.
Web2c is the name of a TeX implementation, originally for Unix, but now also running under DOS, Amiga, and other operating systems. By TeX implementation, we mean all of the standard programs developed by the Stanford TeX project directed by Donald E. Knuth: Metafont, DVItype, GFtoDVI, BibTeX, Tangle, etc., as well as TeX itself. Other programs are also included: DVIcopy, written by Peter Breitenlohner, MetaPost and its utilities (derived from Metafont), by John Hobby, etc.
General strategy: Web2c works, as its name implies, by translating the WEB source in which TeX is written into C source code. Its output is not self-contained, however; it makes extensive use of many macros and functions in a library (the web2c/lib directory in the sources). Therefore, it will not work without change on an arbitrary WEB program.
Availability: All of Web2c is freely available: “free” both in the sense of no cost (free ice cream) and of having the source code to modify and/or redistribute (free speech). See unixtex.ftp for the practical details of how to obtain Web2c.
Different parts of the Web2c distribution have different licensing
terms, however, reflecting the different circumstances of their
creation; consult each source file for exact details. The main
practical implication for redistributors of Web2c is that the executables
are covered by the GNU General Public License, and therefore anyone
who gets a binary distribution must also get the sources, as explained
by the terms of the GPL (see Copying). The
GPL covers the Web2c executables, including tex, because the Free
Software Foundation sponsored the initial development of the Kpathsea
library that Web2c uses. The basic source files from Stanford, however,
have their own copyright terms or are in the public domain, and are not
covered by the GPL.
History: Tomas Rokicki originated the TeX-to-C system in 1987, working from the first change files for TeX under Unix, which were done primarily by Howard Trickey and Pavel Curtis. Tim Morgan then took over development and maintenance for a number of years; the name changed to Web-to-C somewhere in there. In 1990, Karl Berry became the maintainer. He made many changes to the original sources, and started using the shorter name Web2c. In 1997, Olaf Weber took over. Dozens of other people have contributed; their names are listed in the ChangeLog files.
Other acknowledgements: The University of Massachusetts at Boston (particularly Rick Martin and Bob Morris) has provided computers and ftp access to me for many years. Richard Stallman at the Free Software Foundation employed me while I wrote the original path searching library (for the GNU font utilities). (rms also gave us Emacs, GDB, and GCC, without which I cannot imagine developing Web2c.) And, of course, TeX would not exist in the first place without Donald E. Knuth.
Further reading: See References.
(A copy of this chapter is in the distribution file web2c/INSTALL.)
Installing Web2c is mostly the same as installing any other Kpathsea-using program. Therefore, for the basic steps involved, see Installation. (A copy is in the file kpathsea/INSTALL.)
One peculiarity to Web2c is that the source distribution comes in two files: web.tar.gz and web2c.tar.gz. You must retrieve and unpack them both. (We have two because the former archive contains the very large and seldom-changing original WEB source files.) See unixtex.ftp.
Another peculiarity is the MetaPost program. Although it has been
installed previously as mp, as of Web2c 7.0 the installed name is
now mpost, to avoid conflict with the mp program that does
prettyprinting. This approach was recommended by the MetaPost author,
John Hobby. If you as the TeX administrator wish to make it
available under its shorter name as well, you will have to set up a link
or some such yourself. And of course individual users can do the same.
For solutions to common installation problems and information on how to report a bug, see the file kpathsea/BUGS (see Bugs). See also the Web2c home page, http://www.tug.org/web2c.
Points worth repeating:
configure time,
as described in the first section below.
configure options
This section gives pointers to descriptions of the ‘--with’ and
‘--enable’ configure arguments that Web2c accepts. Some are
specific to Web2c, others are generic to all Kpathsea-using programs.
For a list of all the options configure accepts, run
‘configure --help’. The generic options are listed first, and the
package-specific options come last.
For a description of the generic options (which mainly allow you to
specify installation directories) and basic configure usage,
see Running configure scripts (a copy is in the file kpathsea/CONFIGURE).
mktextex.
configure does its best to guess). See Optional Features. A copy is in kpathsea/CONFIGURE.
In addition to the configure options listed in the previous
section, there are a few things that can be affected at compile-time
with C definitions, rather than with configure. Using any of
these is unusual.
To specify extra compiler flags (‘-Dname’ in this case), the simplest thing to do is:
make XCFLAGS="ccoptions"
You can also set the CFLAGS environment variable before
running configure. See configure environment.
Anyway, here are the possibilities:
Web2c has several Make targets besides the standard ones. You can invoke these either in the top level directory of the source distribution (the one containing kpathsea/ and web2c/), or in the web2c/ directory.
fmts, bases, and
mems variables. See the top of web2c/Makefile for the
possibilities.
To validate your TeX, Metafont, and MetaPost executables, run ‘make triptrap’. This runs the trip, trap, and mptrap “torture tests”. See the files triptrap/tripman.tex, triptrap/trapman.tex, and triptrap/mptrap.readme for detailed information and background on the tests.
The differences between your executables' behavior and the standard values will show up on your terminal. The usual differences (these are all acceptable) are:
Any other differences are trouble. The most common culprit in the past has been compiler bugs, especially when optimizing. See TeX or Metafont failing.
The files trip.diffs, mftrap.diffs, and mptrap.diffs in the triptrap directory show the standard diffs against the original output. If you diff your diffs against these files, you should come up clean. For example:
make trip >&mytrip.diffs
diff triptrap/trip.diffs mytrip.diffs
To run the tests separately, use the targets trip, trap,
and mptrap.
To run simple tests for all the programs as well as the torture tests, run ‘make check’. You can compare the output to the distributed file tests/check.log if you like.
Besides the configure- and compile-time options described in the previous sections, you can control a number of parameters (in particular, array sizes) in the texmf.cnf runtime file read by Kpathsea (see Config files).
Rather than exhaustively listing them here, please see the last section of the distributed kpathsea/texmf.cnf. Some of the more interesting values:
hash_extra.
Of course, ideally all arrays would be dynamically expanded as necessary, so the only limiting factor would be the amount of swap space available. Unfortunately, implementing this is extremely difficult, as the fixed size of arrays is assumed in many places throughout the source code. These runtime limits are a practical compromise between the compile-time limits in previous versions, and truly dynamic arrays. (On the other hand, the Web2c BibTeX implementation does do dynamic reallocation of some arrays.)
Many aspects of the TeX system are the same among more than one program, so we describe all those pieces together, here.
To provide a clean and consistent behavior, we chose to have all these
programs use the GNU function getopt_long_only to parse command
lines. However, we do use it in a restricted mode, where all the options
have to come before the rest of the arguments.
By convention, non-option arguments, if specified, generally define the name of an input file, as documented for each program.
If a particular option with a value is given more than once, it is the last value that counts.
For example, the following command line specifies the options ‘foo’, ‘bar’, and ‘verbose’; gives the value ‘baz’ to the ‘abc’ option, and the value ‘xyz’ to the ‘quux’ option; and specifies the filename -myfile-.
-foo --bar -verb -abc=baz -quux karl --quux xyz -- -myfile-
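The conventions above can be mimicked in a few lines of Bourne shell. This is only an illustrative sketch: the Web2c programs do their parsing in C with getopt_long_only, the function name parse_example is made up, and the option table (foo, bar, verbose, abc, quux) is simply the one from the example command line. Abbreviation matching is hard-wired for ‘verbose’ only.

```shell
# parse_example ARGS...: hypothetical emulation of the conventions described
# above (one or two dashes, '=' or a separate word for values, last value
# wins, '--' ends the options).  Not the real getopt_long_only.
parse_example() {
    flags= abc= quux=
    while [ $# -gt 0 ]; do
        case $1 in
            --) shift; break ;;          # explicit end of options
            -*) ;;                       # an option: handled below
            *)  break ;;                 # first non-option ends the options
        esac
        opt=${1#-}; opt=${opt#-}         # strip one or two leading dashes
        shift
        val= has_val=0
        case $opt in
            *=*) val=${opt#*=}; opt=${opt%%=*}; has_val=1 ;;
        esac
        case $opt in
            foo|bar) flags="$flags $opt" ;;
            verb|verbo|verbos|verbose) flags="$flags verbose" ;;
            abc|quux)
                # A value may follow '=' or be the next word.
                if [ "$has_val" -eq 0 ] && [ $# -gt 0 ]; then
                    val=$1; shift
                fi
                eval "$opt=\$val" ;;     # later values overwrite earlier ones
            *) echo "unknown option: $opt" >&2; return 1 ;;
        esac
    done
    printf 'flags:%s abc=%s quux=%s file=%s\n' "$flags" "$abc" "$quux" "${1-}"
}

parse_example -foo --bar -verb -abc=baz -quux karl --quux xyz -- -myfile-
```

The call prints ‘flags: foo bar verbose abc=baz quux=xyz file=-myfile-’: the second ‘quux’ value replaces the first, and everything after ‘--’ is treated as a filename even though it starts with a dash.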
All of these programs accept the standard GNU ‘--help’ and ‘--version’ options, and several programs accept ‘--verbose’. Rather than repeating identical descriptions for every program, we describe them here.
TeX, Metafont, and MetaPost have a number of additional options in common:
initex
resp. inimf resp. inimpost, although these variants
are no longer typically installed.
KPATHSEA_DEBUG environment variable (for all Web2c programs).
(The command line value overrides.) The most useful value is ‘-1’,
to get all available output.
All of the Web2c programs, including TeX, which do path searching use
the Kpathsea routines to do so. The precise names of the environment
and configuration file variables which get searched for particular file
formats are therefore documented in the Kpathsea manual
(see Supported file formats). Reading
texmf.cnf (see Config files), invoking
mktex... scripts (see mktex scripts), and so on are all handled by Kpathsea.
The programs which read fonts make use of another Kpathsea feature: texfonts.map, which allows arbitrary aliases for the actual names of font files; for example, ‘Times-Roman’ for ‘ptmr8r.tfm’. The distributed (and installed by default) texfonts.map includes aliases for many widely available PostScript fonts by their PostScript names.
All the programs generally follow the usual convention for output files. Namely, they are placed in the directory current when the program is run, regardless of any input file location; or, in a few cases, output is to standard output.
For example, if you run ‘tex /tmp/foo’, the output will be in ./foo.dvi and ./foo.log, not /tmp/foo.dvi and /tmp/foo.log.
You can use the ‘-output-directory’ option to cause all output files that would normally be written in the current directory to be written in the specified directory instead. See Common options.
If the current directory is not writable, and ‘-output-directory’
is not specified, the main programs (TeX, Metafont, MetaPost, and
BibTeX) make an exception: if the config file value
TEXMFOUTPUT is set (it is not by default), output files are
written to the directory specified.
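As a rough illustration of this rule, the hypothetical shell function below computes where the DVI file for a given input would land. It models only the two cases described above (the current directory, or the directory given with ‘-output-directory’) and ignores TEXMFOUTPUT; the name dvi_path is invented for this sketch, and nothing here runs TeX.

```shell
# dvi_path INPUT [OUTDIR]: print the expected DVI file name for INPUT.
# Hypothetical helper for illustration only.
dvi_path() {
    base=$(basename "$1" .tex)           # drop directory and any .tex suffix
    printf '%s/%s.dvi\n' "${2:-.}" "$base"
}

dvi_path /tmp/foo            # prints ./foo.dvi
dvi_path /tmp/foo.tex build  # prints build/foo.dvi
```

The directory part of the input name never appears in the result, which is the point of the convention: only the base name of the input matters.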
TeX, Metafont, and MetaPost have a number of features in common. Besides the ones here, the common command-line options are described in the previous section. The configuration file options that let you control some array sizes and other features are described in Runtime options.
The TeX, Metafont, and MetaPost programs each have two main
variants, called initial and virgin. As of Web2c 7, one
executable suffices for both variants, and in fact, the ini...
executables are no longer created.
The initial form is enabled if:
The virgin form is the one generally invoked for production use. It first reads a memory dump (see Determining the memory dump to use) and then proceeds with the main job.
The initial form is generally used only to create memory dumps (see the next section). It starts up more slowly than the virgin form, because it must do lengthy initializations that are encapsulated in the memory dump file.
In typical use, TeX, Metafont, and MetaPost require a large number of macros to be predefined; therefore, they support memory dump files, which can be read much more efficiently than ordinary source code.
The programs all create memory dumps in slightly idiosyncratic (though
substantially similar) ways, so we describe the details in separate
sections (references below). The basic idea is to run the initial
version of the program (see Initial and virgin), read the source
file to define the macros, and then execute the \dump primitive.
Also, each program uses a different filename extension for its memory dumps, since although they are completely analogous they are not interchangeable (TeX cannot read a Metafont memory dump, for example).
Here is a list of filename extensions with references to examples of creating memory dumps:
When making memory dumps, the programs read environment variables and configuration files for path searching and other values as usual. If you are making a new installation and have environment variables pointing to an old one, for example, you will probably run into difficulties.
The virgin form (see Initial and virgin) of each program always reads a memory dump before processing normal source input. All three programs determine the memory dump to use in the same way:
%&dump, and
dump is an existing memory dump of the appropriate type,
dump is used.
The first line of the main input file can also specify which character
translation file is to be used: %&-translate-file=tcxfile
(see TCX files).
These two roles can be combined: %&dump
-translate-file=tcxfile. If this is done, the name of the dump
must be given first.
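To make the first-line convention concrete, here is a small shell sketch that extracts the dump name and the TCX file name from a ‘%&’ line. It is not how the programs themselves do it (that logic lives in the C sources), it does not check that the dump actually exists, and the function name first_line_info is made up.

```shell
# first_line_info FILE: report the dump and TCX file requested on the first
# line of FILE, if that line starts with '%&'.  Illustrative only.
first_line_info() {
    first=
    IFS= read -r first <"$1" || [ -n "$first" ] || return 1
    case $first in
        '%&'*) first=${first#'%&'} ;;
        *)     return 1 ;;               # no %& line: nothing to report
    esac
    dump= tcx=
    set -f                               # no globbing while splitting
    for word in $first; do
        case $word in
            -translate-file=*|--translate-file=*) tcx=${word#*=} ;;
            -*) ;;                       # other options are ignored here
            *)  [ -n "$dump" ] || dump=$word ;;
        esac
    done
    set +f
    printf 'dump=%s tcx=%s\n' "$dump" "$tcx"
}

tmp=${TMPDIR:-/tmp}/first-line-demo.$$
printf '%s\n' '%&latex -translate-file=il1-t1.tcx' '\documentclass{article}' >"$tmp"
first_line_info "$tmp"
rm -f "$tmp"
```

For the sample file this prints ‘dump=latex tcx=il1-t1.tcx’. A file whose first line does not begin with ‘%&’ makes the function return a failure status and print nothing.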
By default, memory dump files are generally sharable between
architectures of different types; specifically, on machines of different
endianness (see Byte order). (This is a
feature of the Web2c implementation, and is not true of all TeX
implementations.) If you specify ‘--disable-dump-share’ to
configure, however, memory dumps will be endian-dependent.
The reason to do this is speed. To achieve endian-independence, the reading of memory dumps on LittleEndian architectures, such as PC's and DEC architectures, is somewhat slowed (all the multibyte values have to be swapped). Usually, this is not noticeable, and the advantage of being able to share memory dumps across all platforms at a site far outweighs the speed loss. But if you're installing Web2c for use on LittleEndian machines only, perhaps on a PC being used only by you, you may wish to get maximum speed.
TeXnically, even without ‘--disable-dump-share’, sharing of .fmt files cannot be guaranteed to work. Floating-point values are always written in native format, and hence will generally not be readable across platforms. Fortunately, TeX uses floating point only to represent glue ratios, and all common formats (plain, LaTeX, AMSTeX, ...) do not do any glue setting at .fmt-creation time. Metafont and MetaPost do not use floating point in any dumped value at all.
Incidentally, different memory dump files will never compare equal byte-for-byte, because the program always dumps the current date and time. So don't be alarmed by just a few bytes difference.
If you don't know what endianness your machine is, and you're curious,
here is a little C program to tell you. (The configure script
contains a similar program.) This is from the book C: A Reference
Manual, by Samuel P. Harbison and Guy L. Steele
Jr. (see References).
#include <stdio.h>
#include <stdlib.h>

int
main (void)
{
  /* Are we little or big endian?  From Harbison&Steele.  */
  union
  {
    long l;
    char c[sizeof (long)];
  } u;
  u.l = 1;
  if (u.c[0] == 1)
    printf ("LittleEndian\n");
  else if (u.c[sizeof (long) - 1] == 1)
    printf ("BigEndian\n");
  else
    printf ("unknownEndian\n");
  /* The exit status is 1 on a BigEndian machine, 0 otherwise.  */
  exit (u.c[sizeof (long) - 1] == 1);
}
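If no C compiler is handy, a similar check can be sketched in shell with the standard od utility, which prints multi-byte units in the host's byte order. This is an alternative to the program above, not something taken from the Web2c sources; the function name endianness is made up.

```shell
# endianness: print LittleEndian, BigEndian or unknownEndian for this host.
endianness() {
    # Write the bytes 01 00 and let od read them back as one 16-bit unit.
    word=$(printf '\001\000' | od -An -tx2 | tr -d ' \n')
    case $word in
        0001) echo LittleEndian ;;       # low-order byte came first
        0100) echo BigEndian ;;          # high-order byte came first
        *)    echo unknownEndian ;;
    esac
}

endianness
```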
TeX, Metafont, and MetaPost all (by default) stop and ask for user intervention at an error. If the user responds with e or E, the program invokes an editor.
Specifying ‘--with-editor=cmd’ to configure sets the
default editor command string to cmd. The environment
variables/configuration values TEXEDIT, MFEDIT, and
MPEDIT (respectively) override this. If ‘--with-editor’ is
not specified, the default is vi +%d %s.
In this string, ‘%d’ is replaced by the line number of the error, and ‘%s’ is replaced by the name of the current input file.
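The substitution can be pictured with the following shell sketch. It is a simplified stand-in for what the programs do internally: the helper name edit_command is hypothetical, only the first ‘%d’ and the first ‘%s’ are replaced, and the file name must not contain characters that are special to sed.

```shell
# edit_command TEMPLATE LINE FILE: substitute LINE for the first %d and FILE
# for the first %s in TEMPLATE.  Hypothetical helper for illustration; FILE
# must not contain '|', '&' or '\' because of the simple sed substitution.
edit_command() {
    printf '%s\n' "$1" | sed -e "s|%d|$2|" -e "s|%s|$3|"
}

# TEXEDIT, if set, overrides the built-in default, as described above.
edit_command "${TEXEDIT:-vi +%d %s}" 42 story.tex
```

With TEXEDIT unset, the last line prints ‘vi +42 story.tex’, the command that would be run after an error on line 42 of story.tex.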
\input filenames
TeX, Metafont, and MetaPost source programs can all read other source
files with the \input (TeX) and input (MF and MP)
primitives:
\input name % in TeX
The file name can always be terminated with whitespace; for
Metafont and MetaPost, the statement terminator ‘;’ also works.
(LaTeX and other macro packages provide other interfaces to
\input that allow different notation; here we are concerned only
with the primitive operation.)
As of Web2c version 7.5.3, double-quote characters can be used to include spaces or other special cases. In typical use, the ‘"’ characters surround the entire filename:
\input "filename with spaces"
Technically, the quote characters can be used inside the name, and can enclose any characters, as in:
\input filename" "with" "spaces
One more point. In LaTeX, the quotes are needed inside the braces, thus
\input{a b} % fails
\input{"a b"} % ok
This quoting mechanism comes into play after TeX has tokenized and expanded the input. So, multiple spaces and tabs may be seen as a single space, active characters such as ‘~’ are expanded first, and so on. (See below.)
On the other hand, various C library routines and Unix itself use the null byte (character code zero, ASCII NUL) to terminate strings. So filenames in Web2c cannot contain nulls, even though TeX itself does not treat NUL specially. In addition, some older Unix variants do not allow eight-bit characters (codes 128–255) in filenames.
For maximal portability of your document across systems, use only the characters ‘a’–‘z’, ‘0’–‘9’, and ‘.’, and restrict your filenames to at most eight characters (not including the extension), and at most a three-character extension. Do not use anything but simple filenames, since directory separators vary among systems; instead, add the necessary directories to the appropriate search path.
Finally, the present Web2c implementation does ‘~’ and ‘$’ expansion on name, unlike Knuth's original implementation and older versions of Web2c. Thus:
\input ~jsmith/$foo.bar
will dereference the environment variable or Kpathsea config file value ‘foo’ and read that file extended with ‘.bar’ in user ‘jsmith’'s home directory. You can also use braces, as in ‘${foo}bar’, if you want to follow the variable name with a letter, numeral, or ‘_’.
(So another way to get a program to read a filename containing whitespace is to define an environment variable and dereference it.)
In all the common TeX formats (plain TeX, LaTeX, AMSTeX),
the characters ‘~’ and ‘$’ have special category codes, so to
actually use these in a document you have to change their catcodes or
use \string. (The result is unportable anyway; see the
suggestions above.) The place where they are most likely to be useful
is when typing interactively.
TeX is a typesetting system: it was especially designed to handle complex mathematics, as well as most ordinary text typesetting.
TeX is a batch language, like C or Pascal, and not an interactive “word processor”: you compile a TeX input file into a corresponding device-independent (DVI) file (and then translate the DVI file to the commands for a particular output device). This approach has both considerable disadvantages and considerable advantages. For a complete description of the TeX language, see The TeXbook (see References). Many other books on TeX, introductory and otherwise, are available.
tex invocation
TeX (usually invoked as tex) formats the given text and
commands, and outputs a corresponding device-independent representation
of the typeset document. This section merely describes the options
available in the Web2c implementation. For a complete description of
the TeX typesetting language, see The TeXbook
(see References).
TeX, Metafont, and MetaPost process the command line (described here) and determine their memory dump (fmt) file in the same way (see Memory dumps). Synopses:
tex [option]... [texname[.tex]] [tex-commands]
tex [option]... \first-line
tex [option]... &fmt args
TeX searches the usual places for the main input file texname
(see Supported file formats), extending
texname with .tex if necessary. To see all the
relevant paths, set the environment variable KPATHSEA_DEBUG to
‘-1’ before running the program.
After texname is read, TeX processes any remaining
tex-commands on the command line as regular TeX input. Also,
if the first non-option argument begins with a TeX escape character
(usually \), TeX processes all non-option command-line
arguments as a line of regular TeX input.
If no arguments or options are specified, TeX prompts for an input file name with ‘**’.
TeX writes the main DVI output to the file basetexname.dvi, where basetexname is the basename of texname, or ‘texput’ if no input file was specified. A DVI file is a device-independent binary representation of your TeX document. The idea is that after running TeX, you translate the DVI file using a separate program to the commands for a particular output device, such as a PostScript printer (see Introduction) or an X Window System display (see xdvi(1)).
TeX also reads TFM files for any fonts you load in your document with
the \font primitive. By default, it runs an external program
named mktextfm to create any nonexistent TFM files. You can
disable this at configure-time or runtime (see mktex configuration). This is enabled mostly for the
sake of the EC fonts, which can be generated at any size.
TeX can write output files, via the \openout primitive; this
opens a security hole vulnerable to Trojan horse attack: an unwitting
user could run a TeX program that overwrites, say, ~/.rhosts.
(MetaPost has a write primitive with similar implications.) To
alleviate this, there is a configuration variable openout_any,
which selects one of three levels of security. When it is set to
‘a’ (for “any”), no restrictions are imposed. When it is set to
‘r’ (for “restricted”), filenames beginning with ‘.’ are
disallowed (except .tex because LaTeX needs it). When it is set
to ‘p’ (for “paranoid”) additional restrictions are imposed: an
absolute filename must refer to a file in (a subdirectory of)
TEXMFOUTPUT, and any attempt to go up a directory level is
forbidden (that is, paths may not contain a ‘..’ component). The
paranoid setting is the default. (For backwards compatibility, ‘y’
and ‘1’ are synonyms of ‘a’, while ‘n’ and ‘0’ are
synonyms for ‘r’.)
In any case, all \openout filenames are recorded in the log file,
except those opened on the first line of input, which is processed when
the log file has not yet been opened. (If you as a TeX administrator
wish to implement more stringent rules on \openout, modifying the
function openoutnameok in web2c/lib/texmfmp.c is intended
to suffice.)
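The three security levels can be summarized by the shell sketch below. It is a simplified model of the rules as stated in the text, not the actual openoutnameok function: the name openout_ok is made up, the dot-file test is assumed to apply to the last path component, and unknown levels are treated as paranoid.

```shell
# openout_ok LEVEL NAME: succeed if writing NAME would be allowed at LEVEL
# (a, r or p).  Simplified, hypothetical model of the rules in the text.
openout_ok() {
    level=$1 name=$2
    case $level in
        a|y|1) return 0 ;;               # any: no restrictions
        n|0)   level=r ;;                # old synonyms for restricted
    esac
    # Restricted and paranoid: no dot files, except .tex.
    # Assumption: the test applies to the last path component.
    case ${name##*/} in
        .tex) ;;
        .*)   return 1 ;;
    esac
    [ "$level" = r ] && return 0
    # Paranoid: no '..' component anywhere in the path.
    case /$name/ in
        */../*) return 1 ;;
    esac
    # Paranoid: absolute names must lie below TEXMFOUTPUT.
    case $name in
        /*) [ -n "${TEXMFOUTPUT-}" ] || return 1
            case $name in
                "$TEXMFOUTPUT"/*) ;;
                *) return 1 ;;
            esac ;;
    esac
    return 0
}

for f in notes.aux .rhosts ../notes.aux; do
    if openout_ok p "$f"; then echo "allowed: $f"; else echo "refused: $f"; fi
done
```

At the paranoid level the loop allows notes.aux and refuses both .rhosts and ../notes.aux. At level ‘r’ only .rhosts would be refused.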
The program accepts the following options, as well as the standard ‘-help’ and ‘-version’ (see Common options):
\mubyte. This can be used
to support Unicode UTF-8 input encoding. See
http://www.olsak.net/enctex.html.
These options are available only if the ‘--enable-ipc’ option was
specified to configure during installation of Web2c.
INITEX (see Initial and virgin), enable MLTeX
extensions such as \charsubdef. Implicitly set if the program
name is mltex. See MLTeX.
\write. (If you as a TeX administrator wish to implement
more stringent rules on what can be executed, you will need to modify
tex.ch.)
Using the first form of this option, the ‘\special’ commands are inserted automatically.
In the second form of the option, string is a comma separated list of the following values: ‘cr’, ‘display’, ‘hbox’, ‘math’, ‘par’, ‘parend’, ‘vbox’. You can use this list to specify where you want TeX to output such commands. For example, ‘-src-specials=cr,math’ will output source information every line and every math formula.
These commands can be used with the appropriate DVI viewer and text editor to switch from the current position in the editor to the same position in the viewer and back from the viewer to the editor.
This option works by inserting ‘\special’ commands into the token stream, and thus in principle these additional tokens can be recovered or seen by sufficiently tricky macros. If you run across a case, let us know, because this counts as a bug. However, such bugs are very hard to fix, requiring significant changes to TeX, so please don't count on it.
Redefining ‘\special’ will not affect the functioning of this option. The commands inserted into the token stream are hard-coded to always use the ‘\special’ primitive.
TeX does not pass the trip test when this option is enabled.
The initial form of TeX is invoked by ‘tex -ini’. It
does lengthy initializations avoided by the “virgin” (vir)
form, so as to be capable of dumping ‘.fmt’ files (see Memory dumps). For a detailed comparison of virgin and initial forms,
see Initial and virgin. In past releases, a separate program
initex was installed to invoke the initial form, but this is
no longer the case.
For a list of options and other information, see tex invocation.
Unlike Metafont and MetaPost, many format files are commonly used with TeX. The standard one implementing the features described in the TeXbook is ‘plain.fmt’, also known as ‘tex.fmt’ (again, see Memory dumps). It is created by default during installation, but you can also do so by hand if necessary (e.g., if an update to plain.tex is issued):
tex -ini '\input plain \dump'
(The quotes prevent interpretation of the backslashes by the shell.) Then install the resulting plain.fmt in ‘$(fmtdir)’ (/usr/local/share/texmf/web2c by default), and link tex.fmt to it.
The necessary invocation for generating a format file differs for each format, so instructions that come with the format should explain. The top-level web2c Makefile has targets for making most common formats: plain latex amstex texinfo eplain. See Formats, for more details on TeX formats.
TeX formats are large collections of macros, often dumped
into a .fmt file (see Memory dumps) by tex -ini
(see Initial TeX). A number of formats are in reasonably
widespread use, and the Web2c Makefile has targets to make the versions
current at the time of release. You can change which formats are
automatically built by setting the fmts Make variable; by default,
only the ‘plain’ and ‘latex’ formats are made.
You can get the latest versions of most of these formats from the CTAN archives in subdirectories of CTAN:/macros (for CTAN info, see unixtex.ftp). The archive ftp://ftp.tug.org/tex/lib.tar.gz (also available from CTAN) contains most of these formats (although perhaps not the absolute latest version), among other things.
TeX supports most natural languages. See also TeX extensions.
Multi-lingual TeX (mltex) is an extension of TeX originally
written by Michael Ferguson and now updated and maintained by Bernd
Raichle. It allows the use of non-existing glyphs in a font by
declaring glyph substitutions. These are restricted to substitutions of
an accented character glyph, which need not be defined in the current
font, by its appropriate \accent construction using a base and
accent character glyph, which do have to exist in the current font.
This substitution is automatically done behind the scenes, if necessary,
and thus MLTeX additionally supports hyphenation of words containing
an accented character glyph for fonts missing this glyph (e.g., Computer
Modern). Standard TeX suppresses hyphenation in this case.
MLTeX works at .fmt-creation time: the basic idea is to
specify the ‘-mltex’ option to TeX when you \dump a
format. Then, when you subsequently invoke TeX and read that
.fmt file, the MLTeX features described below will be enabled.
Generally, you use special macro files to create an MLTeX .fmt
file.
The sections below describe the two new primitives that MLTeX defines. Aside from these, MLTeX is completely compatible with standard TeX.
\charsubdef: Character substitutions
The most important primitive MLTeX adds is \charsubdef, used
in a way reminiscent of \chardef:
\charsubdef composite [=] accent base
Each of composite, accent, and base is a font glyph number, expressed in the usual TeX syntax: `\e symbolically, '145 for octal, "65 for hex, 101 for decimal.
MLTeX's \charsubdef declares how to construct an accented
character glyph (not necessarily existing in the current font) using two
character glyphs (that do exist). Thus it defines whether a character
glyph code, either typed as a single character or using the \char
primitive, will be mapped to a font glyph or to an \accent glyph
construction.
For example, if you assume glyph code 138
(decimal) for an e-circumflex
and you are using the Computer Modern fonts, which have the circumflex
accent in position 18 and lowercase `e' in the usual ASCII position 101
decimal, you would use \charsubdef as follows:
\charsubdef 138 = 18 101
For the plain TeX format to make use of this substitution, you have
to redefine the circumflex accent macro \^ in such a way that if
its argument is character `e' the expansion \char138 is used
instead of \accent18 e. Similar \charsubdef declarations
and macro redefinitions have to be done for all other accented
characters.
To disable a previous \charsubdef c, redefine c
as a pair of zeros. For example:
\charsubdef '321 = 0 0 % disable N tilde
(Octal '321 is the ISO Latin-1 value for the Spanish N tilde.)
\charsubdef commands should only be given once. Although in
principle you can use \charsubdef at any time, the result is
unspecified. If \charsubdef declarations are changed, usually
either incorrect character dimensions will be used or MLTeX will
output missing character warnings. (The substitution of a
\charsubdef is used by TeX when appending the character node
to the current horizontal list, to compute the width of a horizontal box
when the box gets packed, and when building the \accent
construction at \shipout-time. In summary, the substitution is
accessed often, so changing it is not desirable, nor generally useful.)
\tracingcharsubdef: Substitution diagnostics
To help diagnose problems with ‘\charsubdef’, MLTeX provides a
new primitive parameter, \tracingcharsubdef. If positive, every
use of \charsubdef will be reported. This can help track down
when a character is redefined.
In addition, if the TeX parameter \tracinglostchars is 100 or
more, the character substitutions actually performed at
\shipout-time will be recorded.
TCX (TeX character translation) files help TeX support direct input of 8-bit international characters if fonts containing those characters are being used. Specifically, they map an input (keyboard) character code to the internal TeX character code (a superset of ASCII).
Of the various proposals for handling more than one input encoding, TCX files were chosen because they follow Knuth's original ideas for the use of the ‘xchr’ and ‘xord’ tables. He ventured that these would be changed in the WEB source in order to adjust the actual version to a given environment. It turns out, however, that recompiling the WEB sources is not as simple a task as Knuth may have imagined; therefore, TCX files, which make it possible to change the conversion tables on the fly, have been implemented instead.
This approach limits the portability of TeX documents, as some implementations do not support it (or use a different method for input-internal reencoding). It may also be problematic to determine the encoding to use for a TeX document of unknown provenance; in the worst case, failure to do so correctly may result in subtle errors in the typeset output. But we feel the benefits outweigh these disadvantages.
This is entirely independent of the MLTeX extension (see MLTeX):
whereas a TCX file defines how an input keyboard character is mapped to
TeX's internal code, MLTeX defines substitutions for a
non-existing character glyph in a font with a \accent
construction made out of two separate character glyphs. TCX files
involve no new primitives; it is not possible to specify
that an input (keyboard) character maps to more than one character.
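Information on specifying TCX files:
You can name a TCX file in the first line of the main document:
%& -translate-file=tcxfile
You can also give one for a single run with the command-line option ‘-translate-file=tcxfile’.
TCX files are searched for along the WEB2C path.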
The Web2c distribution comes with a number of TCX files. Two important ones are il1-t1.tcx and il2-t1.tcx, which support ISO Latin 1 and ISO Latin 2, respectively, with Cork-encoded fonts (a.k.a. the LaTeX T1 encoding). TCX files for Czech, Polish, and Slovak are also provided.
One other notable TCX file is empty.tcx, which is, well, empty. Its purpose is to reset Web2c's behavior to the default (only visible ASCII being printable, as described below) when a format was dumped with another TCX being active, which is in fact the case for everything but plain TeX in the TeX Live and other distributions. Thus:
latex somefile8.tex
⇒ terminal etc. output with 8-bit chars
latex --translate-file=empty.tcx somefile8.tex
⇒ terminal etc. output with ^^ notation
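The ‘^^’ notation mentioned above can be illustrated with a short Python sketch. It is not taken from the Web2c sources; it only mirrors the usual TeX convention (codes below 128 are shown as ‘^^’ plus the character 64 positions away, higher codes as ‘^^’ plus two lowercase hex digits). The function names are made up for this illustration.

```python
def caret_notation(code):
    """Return the ^^ form TeX conventionally prints for a character code."""
    if not 0 <= code <= 255:
        raise ValueError("character code must be in 0..255")
    if code < 64:
        return "^^" + chr(code + 64)   # e.g. 9 (TAB) -> ^^I
    if code < 128:
        return "^^" + chr(code - 64)   # e.g. 127 (delete) -> ^^?
    return "^^%02x" % code             # e.g. 233 -> ^^e9


def show(code, printable):
    """Print a code as-is if it is in the printable set, else in ^^ form."""
    return chr(code) if code in printable else caret_notation(code)


if __name__ == "__main__":
    ascii_only = set(range(32, 127))   # the default printable range
    print(show(ord("A"), ascii_only))  # A
    print(show(233, ascii_only))       # ^^e9
```

With empty.tcx in effect, a character such as code 233 would appear in the second form in the log and on the terminal.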
An entry in a TCX file has the form:
src [dest [prnt]]
Finally, here's what happens: when TeX sees an input character with code src, it 1) changes src to dest; and 2) makes the dest code “printable”, i.e., printed as-is in diagnostics and the log file rather than in ‘^^’ notation.
By default, no characters are translated, and character codes between 32 and 126 inclusive (decimal) are printable.
Specifying translations for the printable ASCII characters (codes
32–126) will yield unpredictable results. Additionally, you shouldn't
make the following characters printable: ^^I (TAB), ^^J
(line feed), ^^M (carriage return), and ^^? (delete),
since TeX uses them in various ways.
Thus, the idea is to specify the input (keyboard) character code for src, and the output (font) character code for dest.
By default, only the printable ASCII characters are considered printable by TeX. If you specify the ‘-8bit’ option, all characters are considered printable by default. If you specify both the ‘-8bit’ option and a TCX file, then the TCX can set specific characters to be non-printable.
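The rules above can be summarized in a small Python sketch. It is a model of the described behavior, not the Web2c implementation. Several details are assumptions made for the illustration rather than statements from this section: that ‘%’ starts a comment, that codes may be written in decimal, octal (leading ‘0’) or hexadecimal (leading ‘0x’), that an omitted dest equals src, and that a prnt value of 0 marks dest as non-printable. The names parse_code and apply_tcx are hypothetical.

```python
def parse_code(token):
    """Parse one character code (assumed forms: decimal, 0octal, 0xhex)."""
    lowered = token.lower()
    if lowered.startswith("0x"):
        value = int(lowered, 16)
    elif len(lowered) > 1 and lowered.startswith("0"):
        value = int(lowered, 8)
    else:
        value = int(lowered, 10)
    if not 0 <= value <= 255:
        raise ValueError("code out of range: %s" % token)
    return value


def apply_tcx(lines, eight_bit=False):
    """Return (xord, printable) after applying 'src [dest [prnt]]' lines."""
    xord = list(range(256))  # by default, no characters are translated
    # Default printable set: codes 32..126, or everything with -8bit.
    printable = set(range(256)) if eight_bit else set(range(32, 127))
    for raw in lines:
        line = raw.split("%", 1)[0].strip()  # drop comment (assumed syntax)
        if not line:
            continue
        tokens = line.split()
        if len(tokens) > 3:
            raise ValueError("too many fields: %r" % raw)
        src = parse_code(tokens[0])
        dest = parse_code(tokens[1]) if len(tokens) > 1 else src
        prnt = int(tokens[2]) if len(tokens) > 2 else 1
        xord[src] = dest                     # 1) change src to dest
        if prnt:
            printable.add(dest)              # 2) make dest printable
        else:
            printable.discard(dest)          # TCX can also clear the flag
    return xord, printable


if __name__ == "__main__":
    xord, printable = apply_tcx(["0xe9 0xe9   % keep e-acute as-is",
                                 "0x80 0xa4 0 % translate, not printable"])
    print(xord[0x80] == 0xA4, 0xE9 in printable, 0xA4 in printable)
```

The eight_bit argument stands in for the ‘-8bit’ option: it changes only the starting set of printable codes, after which TCX entries are applied in the same way.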
Both the specified TCX encoding and whether characters are printable are saved in the dump files (like tex.fmt). So by giving these options in combination with ‘-ini’, you control the defaults seen by anyone who uses the resulting dump file.
When loading a dump, if the ‘-8bit’ option was given, then all characters become printable by default.
When loading a dump, if a TCX file was specified, then the TCX data from the dump is ignored and the data from the file used instead.
Patgen creates hyphenation patterns from dictionary files for use with TeX. Synopsis:
patgen dictionary patterns output translate
Each argument is a filename. No path searching is done. The output is written to the file output.
In addition, Patgen prompts interactively for other values.
For more information, see Word hy-phen-a-tion by com-puter by Frank Liang (see References), and also the patgen.web source file.
The only options are ‘-help’ and ‘-version’ (see Common options).
(Sorry, but I'm not going to write this unless someone actually uses this feature. Let me know.)
This functionality is available only if the ‘--enable-ipc’ option
was specified to configure during installation of Web2c
(see Installation).
If you define IPC_DEBUG before compilation (e.g., with ‘make
XCFLAGS=-DIPC_DEBUG’), TeX will print messages to standard error
about its socket operations. This may be helpful if you are, well,
debugging.
The base TeX program has been extended in many ways. Here's a partial list. Please send information on extensions not listed here to the address in Reporting bugs.
Metafont is a system for producing shapes; it was designed for producing complete typeface families, but it can also produce geometric designs, dingbats, etc. And it has considerable mathematical and equation-solving capabilities which can be useful entirely on their own.
Metafont is a batch language, like C or Pascal: you compile a Metafont program into a corresponding font, rather than interactively drawing lines or curves. This approach has both considerable disadvantages (people unfamiliar with conventional programming languages will be unlikely to find it usable) and considerable advantages (you can make your design intentions specific and parameterizable). For a complete description of the Metafont language, see The METAFONTbook (see References).
mf invocation
Metafont (usually invoked as mf) reads character definitions
specified in the Metafont programming language, and outputs the
corresponding font. This section merely describes the options available
in the Web2c implementation. For a complete description of the Metafont
language, see The Metafontbook (see References).
Metafont processes its command line and determines its memory dump (base) file in a way exactly analogous to MetaPost and TeX (see tex invocation, and see Memory dumps). Synopses:
mf [option]... [mfname[.mf]] [mf-commands]
mf [option]... \first-line
mf [option]... &base args
Most commonly, a Metafont invocation looks like this:
mf '\mode:=mode; mag:=magnification; input mfname'
(The single quotes avoid unwanted interpretation by the shell.)
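If you build such a command line from a script, the quoting can be left to a library. The following Python sketch assembles the command shown above with shlex.quote; the helper name mf_command is made up for this illustration, and cmr10 is used only as a sample input file.

```python
import shlex


def mf_command(mfname, mode, mag="1"):
    """Build an 'mf' command line, quoted for a POSIX shell."""
    # The backslash, semicolons and parentheses are all special to the
    # shell, so the whole first line must reach mf as a single argument.
    first_line = rf"\mode:={mode}; mag:={mag}; input {mfname}"
    return "mf " + shlex.quote(first_line)


if __name__ == "__main__":
    print(mf_command("cmr10", "ljfour", "magstep(.5)"))
    # mf '\mode:=ljfour; mag:=magstep(.5); input cmr10'
```

shlex.quote wraps the argument in single quotes, which is the same protection as in the hand-written example.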
Metafont searches the usual places for the main input file mfname
(see Supported file formats), extending
mfname with .mf if necessary. To see all the relevant
paths, set the environment variable KPATHSEA_DEBUG to ‘-1’
before running the program. By default, Metafont runs an external
program named mktexmf to create any nonexistent Metafont source
files you input. You can disable this at configure-time or runtime
(see mktex configuration). This is mostly
for the sake of the EC fonts, which can be generated at any size.
Metafont writes the main GF output to the file basemfname.nnngf, where nnn is the font resolution in pixels per inch, and basemfname is the basename of mfname, or ‘mfput’ if no input file was specified. A GF file contains bitmaps of the actual character shapes. Usually GF files are converted immediately to PK files with GFtoPK (see gftopk invocation), since PK files contain equivalent information, but are more compact. (Metafont outputs GF format rather than PK for historical reasons only.)
Metafont also usually writes a metric file in TFM format to basemfname.tfm. A TFM file contains character dimensions, kerns, ligatures, and spacing parameters. TeX reads only this .tfm file, not the GF file.
The mode in the example command above is a name referring to a
device definition (see Modes); for example, localfont or
ljfour. These device definitions must generally be precompiled
into the base file. If you leave this out, the default is proof
mode, as stated in The Metafontbook, in which Metafont outputs at
a resolution of 2602dpi; this is usually not what you want. The
remedy is simply to assign a different mode—localfont, for
example.
The magnification assignment in the example command above is a
magnification factor; for example, if the device is 600dpi and you
specify mag:=2, Metafont will produce output at 1200dpi.
Very often, the magnification is an expression such as
magstep(.5), corresponding to a TeX “magstep”; magsteps are
powers of 1.2, so magstep(.5) is the square root of 1.2, about 1.095.
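As a worked example of the magnification arithmetic, the Python sketch below computes the output resolution and the resulting GF file name of the form basemfname.nnngf. It assumes that magstep(n) is 1.2 raised to the power n and that nnn is the product of device resolution and magnification rounded to the nearest integer; the function names are made up for the illustration and cmr10 is only a sample font name.

```python
import math


def magstep(n):
    """TeX magstep n, taken here as the n-th power of 1.2."""
    return 1.2 ** n


def gf_name(basemfname, device_dpi, mag=1.0):
    """Name of the GF file for a given device resolution and magnification."""
    # Rounding to the nearest integer is an assumption of this sketch.
    nnn = math.floor(device_dpi * mag + 0.5)
    return "%s.%dgf" % (basemfname, nnn)


if __name__ == "__main__":
    print(gf_name("cmr10", 600, 2))             # cmr10.1200gf
    print(gf_name("cmr10", 600, magstep(0.5)))  # cmr10.657gf
```

So a 600dpi device with mag:=magstep(.5) gives about 657dpi, since 600 times the square root of 1.2 is roughly 657.27.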
After running Metafont, you can use the font in a TeX document as usual. For example:
\font\myfont = newfont
\myfont Now I am typesetting in my new font (minimum hamburgers).
The program accepts the following options, as well as the standard ‘-help’ and ‘-version’ (see Common options):
inimf is the “initial” form of Metafont, which does lengthy
initializations avoided by the “virgin” (vir) form, so as to
be capable of dumping ‘.base’ files (see Memory dumps). For
a detailed comparison of virgin and initial forms, see Initial and virgin. In past releases, a separate program inimf was
installed to invoke the initial form, but this is no longer the case.
For a list of options and other information, see mf invocation.
The only memory dump file commonly used with Metafont is the default ‘plain.base’, also known as ‘mf.base’ (again, see Memory dumps). It is created by default during installation, but you can also do so by hand if necessary (e.g., if a Metafont update is issued):
mf -ini '\input plain; input modes; dump'
(The quotes prevent interpretation of the backslashes by the shell.) Then install the resulting plain.base in ‘$(basedir)’ (/usr/local/share/texmf/web2c by default), and link mf.base to it.
For an explanation of the additional modes.mf file, see Modes. This file has no counterpart in TeX or MetaPost.
In the past, it was sometimes useful to create a base file cmmf.base (a.k.a. cm.base), with the Computer Modern macros also included in the base file. Nowadays, however, the additional time required to read cmbase.mf is exceedingly small, usually not enough to be worth the administrative hassle of updating the cmmf.base file when you install a new version of modes.mf. People actually working on a typeface may still find it worthwhile to create their own base file, of course.
Running Metafont and creating Metafont base files requires information that TeX and MetaPost do not: mode definitions which specify device characteristics, so Metafont can properly rasterize the shapes.
When making a base file, a file containing modes for locally-available devices should be input after plain.mf. One commonly used file is ftp://ftp.tug.org/tex/modes.mf; it includes all known definitions.
If, however, for some reason you have decreased the memory available in
your Metafont, you may need to copy modes.mf and remove the
definitions irrelevant to you (probably most of them) instead of using
it directly. (Or, if you're a Metafont hacker, maybe you can suggest a
way to redefine mode_def and/or mode_setup; right now, the
amount of memory used is approximately four times the total length of
the mode_def names, and that's a lot.)
If you have a device not included in mode