# [OS X TeX] Building new formats (MacTeX)

Rowland McDonnell rjmm-lists1 at fireflyuk.net
Sun Sep 17 22:23:50 CEST 2006

> Hi Rowland,
>

Fine by me - if it results in a nice clear explanation of something that
helps me understand something, I'll lap it up.

Thank you for putting the time in to try to help clear things up.

> the wrong question, please be as exact as you possibly can in asking

I've tried to do that.

> So what do I think your question is? How do I customise my TeX
> install to include the UK hyphenation patterns?

Not quite: what I'm trying to do is customize my TeX installation so
that I get UK hyphenation /by default/ in all documents, while deviating
from standard practice as little as possible.

I suspect the sensible way to do this is to build my new formats without
Babel being involved at all, and use my existing hyphen.cfg file.  That
gives \language=0 for US English, \language=1 for UK English, and
\language=1 as what you get for typesetting by default.

> Since I normally use
> i-Installer to install and customise my tex, I had to do quite a bit
> of digging to answer this without the short but for some reason
> unacceptable (to you anyway) "Just run the i-Installer configure
> stage".

<puzzled>  'For some reason'?  If you've got as far as understanding
that I don't want to use i-installer, surely you've got as far as
understanding the reasons?  The short reason is:

I cannot control i-installer.  Specifically, I do not have the ability
to use i-installer to configure TeX to do what I want.

I do not have the ability to learn what I need to learn about how
i-installer works, so I have to find out how to do the job some other
way.

[I'm not going to use some software that does what i-installer does:
make large changes to the data on my computer without me having the
faintest idea what it's doing.

Isn't that normal prudent computer management?]

> The answer it seems comes in two stages: fmtutil and its
> configuration, and the language configuration files for the various
> formats. Yes, careful digging reveals that there is more than one,
> and they are not all called "language.dat"
>
> So, one step at a time: fmtutil will take the appropriate actions for
> you to build a format.

Do you know how I can find out what it does?  I'm a Mac user, not a Unix
expert of any sort.  I need to know what steps are followed so that I
understand what's going on.  I might not be a Unix expert, but I *am* a
bit of a TeXnician.

> I would advise against doing it manually, even
> though pdfetex -ini and some other parameter calls is certainly
> possible.

I read the documentation, which does talk about using iniTeX and doesn't
mention fmtutil in any obvious places, so I decided to take the route
that the documentation suggested.

fmtutil does look like a better idea - but how can I learn about it?

>TeTeX however includes some management utilities to make
> life easier, just use them.

This is very easy to say, but it's very hard to do in practice - what
'management tools' does teTeX come with?  How does one use them?  There
is no documentation to explain that I can find, so I must say that these
tools are certainly not included with teTeX for the convenience of the
ordinary Mac user who wishes to look after a MacTeX installation -
perhaps they are there for the convenience of established expert Unix

I have printed out and read the documentation which tells me about the
manual ways of doing the job.  I can't find the documentation to even
tell me what 'management tools' teTeX comes with, let alone how to use
them!

These 'management tools' (I've come across things like updmap, fmtutil,
and a couple of others) are barely mentioned in any documentation, let
alone adequately documented - but since they're more widely used than
i-installer, I'm taking the line that I might be able to find out how to
use some of them (if only because <news://comp.text.tex> is less hostile
than this mailing list).

> OK fmtutil. This tool is a shell script, so in principle, one can
> figure out what it does.

In principle, one can figure out what any software does with a
decompiler, plenty of time, and the manuals and other wherewithal to
turn oneself into an expert programmer, and so on.

I'm a Mac user, and no sort of Unix expert.  You might as well tell me

> There is an --edit option to the tool, that
> you can use to edit the file,

How does the --edit option work?

> however, there are some permissions
> checks on that, and for me this doesn't work.

Er?

> Luckily, the default
> name for the configuration file can be found in the script:
> "fmtutil.cnf",

thing *is* to read the source code - is that what I'm expected to do

>and kpsewhich can be used to find which one is used:

'kpsewhich' returns only:

Missing argument. Try `kpsewhich --help' for more information.

> /usr/local/teTeX/share/texmf/web2c/fmtutil.cnf.

Which command line incantation did you use to get that answer?

kpsewhich <options> fmtutil.cnf

is what you used, correct?  Which options, and why?

> There are some notes at the top of the file, and I copied an example:
>
> # The format of the table is:
> # format  engine      pattern-file    arguments
> # The last part of "arguments" must be the name of the file to run
> # initex (or another "ini"-engine) on.
>
> pdflatex  pdfetex     language.dat    -translate-file=cp227.tcx *pdflatex.ini
>
> so the pdflatex format uses the pdfetex engine, uses language.dat for
> the language configuration, and needs some codepage translation. The
> macros themselves are loaded from pdflatex.ini.
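As a rough illustration of the four-column layout described in the
quoted notes, here is a minimal shell sketch that splits one entry into
its columns.  The function name describe_entry is made up for this
example and is not part of teTeX; the sample line is the pdflatex entry
quoted above, and the sketch assumes the file to run the ini-engine on
is always the last word of the line, as the notes say.

```shell
#!/bin/sh
# Hypothetical helper (not part of teTeX): split one fmtutil.cnf entry
# into the columns named in the file's own notes:
#   format  engine  pattern-file  arguments
# The last word of "arguments" is the file the ini-engine is run on.
describe_entry() {
    # $1 is one non-comment line of fmtutil.cnf
    printf '%s\n' "$1" | awk '{
        printf "format=%s engine=%s patterns=%s ini-file=%s\n", $1, $2, $3, $NF
    }'
}

# The pdflatex entry quoted above.
describe_entry 'pdflatex pdfetex language.dat -translate-file=cp227.tcx *pdflatex.ini'
```

Read this way, the engine column names the program that is run in its
"ini" mode, and the last column names the file that program reads to
build the format.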

Okay - I've not come across these *.ini files before.  I've read the
teTeX manual, and it doesn't mention them.

Can you suggest where I can look to find out about these files and how
they fit in to the system of TeX (ini/La/pdf/whatnot) that I know about?

Having read 'latex.ini', I worked out that when you wrote the following
(which caused me a good deal of bother when I read it because I had no
idea what you were referring to):

'The macros themselves are loaded from pdflatex.ini.'

you meant that the file pdflatex.ini is the file which loads the LaTeX
macros to build the format, using \input latex.ltx.

> The fmtutil loops
> over all formats that are not commented out, and uses these
> parameters to create the format in the right location.

What is fmtutil's idea of 'the right location', and how do I find out
what it is?

> So, for formats where there is a format and a language file listed
> here, it is as easy as finding the pattern descriptions file (mostly
> language.dat, sometimes it is something else), and ask which one to
> use with:
>
> kpsewhich -progname="engine" pattern-file
>
> where engine and pattern-file refer to the table columns given above.

Okay.  Which engine does one use as -progname?  It seems to me that
since iniTeX is used to build all formats, there's only one language.dat
file used, and that's the one that applies to iniTeX in all instances of
format building.

But that's just a logical interpretation of the way the system appears
to work from what I've been able to infer from the very limited amount
of information I've been able to find out - one seems to be expected to
work it all out as 'logical inferences' from a few hints, and I can't.
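One possible reading of the quoted advice is that the engine to give as
-progname is simply whatever appears in the engine column of the
fmtutil.cnf entry for the format being built.  Assuming that reading is
right, the following sketch prints the kpsewhich query for each active
entry rather than running anything.  pattern_queries is a hypothetical
helper name, and the sample entries are illustrative: the pdflatex line
is the one quoted above, and the tex line is inferred from the comments
in fmtutil.cnf.

```shell
#!/bin/sh
# Hypothetical helper: print one kpsewhich query per active fmtutil.cnf
# entry read from stdin.  Comment lines, blank lines and entries whose
# pattern-file column is "-" (no pattern file listed) are skipped.
pattern_queries() {
    awk '
        /^[ \t]*#/ || NF < 4 { next }   # comments, blanks, short lines
        $3 == "-"            { next }   # no pattern file for this format
        { printf "kpsewhich -progname=%s %s\n", $2, $3 }
    '
}

# Sample entries in the layout quoted above; a real run would read the
# fmtutil.cnf that kpsewhich reports instead of this here-document.
pattern_queries <<'EOF'
# format  engine   pattern-file  arguments
tex       tex      -             tex.ini
pdflatex  pdfetex  language.dat  -translate-file=cp227.tcx *pdflatex.ini
EOF
```

Each printed line can then be pasted into the Terminal to see which
copy of the pattern file that engine would actually pick up.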

> I seem to recall you wanted babel in plain tex as well. On Mac OS X I
> assume this to mean that you want babel support in pdftex.

I don't /need/ it - it's just that what I read indicated that
'everything' seemed to be using Babel, so I thought I'd investigate
doing it 'the standard way' for a change instead of just 'rolling my
own' as I've done in the past.

Now I've spent some time trying to find out how to get Babel to do what
I want, I am reminded of why I've always 'rolled my own': it's so damned
hard to find out enough about quite a lot of the standard methods to do
anything with 'em.

Writing your own solution might produce something less optimal in an
abstract, theoretical sense, but at least you know what it does and can
get it to do what you want, more or less.  This makes it far more fit
for purpose than the standard methods, which 'just don't work' due to my
inability to learn how to use 'em.

>This is a
> bit more tricky, as the following fragment shows. Well, at least the
> instructions are there…
>
> # Change "tex.ini -> bplain.ini" and "- -> language.dat"
> # if you want babel support in tex. Add -translate-file=cp227.tcx
> # before tex.ini
> # if you want to make all characters directly "printable" for
> # any \write (instead of ^^xy).

if you want to make all characters directly "printable" for

Erm.  Do I want that?  What does it mean?

> So to add babel to tex and pdftex, change the tex, resp. pdftex
> format lines to:
>
> tex   tex language.dat    -translate-file=cp227.tcx bplain.ini
> pdftex    pdfetex language.dat    -translate-file=cp227.tcx bplain.ini
>
> I hope that takes care of the format creation.

Well, not really, but it's a start - at least you've pointed me in the
right direction by letting me know that I need to learn about fmtutil
and related issues.  I don't know how I'm going to do that learning yet
- let's see if I can learn enough to proceed with that method, eh? ;-)

> Now on to the language definition files. This can be slightly
> different, depending on the exact flavour of the tex format itself. I
> won't go into Context here, partly because I don't know the finer
> details, partly because I didn't see you mention it.
>
> The formats you seem to be using are plain tex and latex.

At the moment, I mostly use LaTeX with occasional Plain TeX.  I'd like
to look at ConTeXt one day, when I've got everything else working, and I
might well look at other formats.  But I've got to get things set up and
working first.

> They both
> use language.dat (at least after the changes listed above), which
> shortens the discussion. You'll want to have the language.dat file in
> the texmf.local tree.

No, I want to have my language.dat file(s?) somewhere else.

I want them in my local texmf tree, not texmf.local where they're likely
to be over-written by something automatic in the future.

> Copy the language.dat file given by kpsewhich
> to /usr/local/teTeX/share/texmf.local/tex/generic/config/language.dat
> and open it in your text editor. If you want to deal with other
> formats, please copy the appropriate language files into the local
> tree before editing as well. First of all it gives you a backup, and
> secondly it will prevent i-Installer from trampling all over your
> changes, should you decide later on to use it anyway.
>
> For UK English, we find in that file:
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> % UK english, TWO LINES! To enable these lines, remove %! and the space.
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> %! british    ukhyphen.tex % unavailable in teTeX due to license problem!
> %! =UKenglish
>
> The license problem has been commented upon before. This defines the
> (babel) name british and loads the patterns given in ukhyphen.tex
> under that label. The second line defines the name UKenglish to be
> the same as british. All you need to do here is follow the
> instructions: remove '%! ' at the start of the two lines.

Yes - that bit's easy enough.
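For anyone who would rather script that edit than do it by hand, here is
a minimal sketch using sed.  It assumes the two entries look exactly
like the lines quoted above, with '%! ' at the very start of each.  The
function name enable_uk is made up for this example, and the sketch only
writes the edited text to standard output, so the original file is left
untouched.

```shell
#!/bin/sh
# Hypothetical helper: uncomment the two UK English entries by stripping
# the leading "%! ".  Reads a language.dat on stdin and writes the edited
# copy to stdout; every other line passes through unchanged.
enable_uk() {
    sed -e 's/^%! \(british[[:space:]]\)/\1/' \
        -e 's/^%! \(=UKenglish\)/\1/'
}

# Sample input: the comment line and the two entries quoted above.
enable_uk <<'EOF'
% UK english, TWO LINES! To enable these lines, remove %! and the space.
%! british    ukhyphen.tex % unavailable in teTeX due to license problem!
%! =UKenglish
EOF
```

On a real file, the output would be redirected to a copy in the local
tree (for example enable_uk < language.dat > language.dat.new) and
checked by eye before being put into use.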

> There is another bit of instruction in the file:
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> % CAUTION: the first language will be the default if no style-file
> %          (e.g. german.sty) is used.
> % Since version 3.0 of TeX, hyphenation patterns for multiple
> % languages are possible. Unless you know what you are doing, please
> % let the american english patterns be the first ones. The babel
> % system allows you to easily change the active language for your
> % texts. For more information, have a look to the documentation in
> % texmf/doc/generic/babel.
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> activate the UK english hyphenation patterns.

That text you quote does not appear in the language.dat files that came
with MacTeX that I'm looking at.

And I'm not sure what you're advising me to do.  You refer to using the
Babel package to activate the UK English hyphenation patterns.  Okay,
but in what way?  I want to use the Babel package at format-building
time to set up LaTeX so that it defaults to UK English hyphenation, but
with all the other usually available hyphenations also available and set
up pretty much 'as normal'.

Are you saying that I would be committing some kind of LaTeX sin by
setting up Babel to give me UK English hyphenation patterns as the
default?  What I'd like to do is have US English as language 0 (as is
standard), with UK English as 'A.N. Other' language number, and the
language number for UK English being the default.  This maximises my
personal convenience: I want my computer to speak my language by
default.

That's my current setup - done with this hyphen.cfg file for LaTeX:

=============================================================
\language=0% Make bloody sure that the language is set right.
%
\InputIfFileExists{hyphen.tex}%
   {% Patterns found: load them as language 0 (US English).
    \language=0
    \lefthyphenmin=2 \righthyphenmin=3 }%
   {\errhelp{The configuration for hyphenation is incorrectly
             installed.^^J%
             If you don't understand this error message you need
             to seek^^Jexpert advice.}%
    \errmessage{OOPS! I can't find any hyphenation patterns for
                US english.^^J \space Think of getting some or the
                latex2e setup will never succeed}\@@end}
%
%
%
\language=1% Make bloody sure that the language is right.
\InputIfFileExists{ukhyphen.tex}%
   {% Patterns found: load them as language 1 (UK English).
    \language=1
    \lefthyphenmin=2 \righthyphenmin=3 }%
   {\errhelp{The configuration for hyphenation is incorrectly
             installed.^^J%
             If you don't understand this error message you need
             to seek^^Jexpert advice.}%
    \errmessage{OOPS! I can't find any hyphenation patterns for
                UK english.^^J \space Something is wrong.}\@@end}
%
%
%
% Leave language 1 (UK English) selected as the default.
\language=1
\message{UK hyphenation patterns installed as language 1; US hyphenation
as language 0.  Default language set to 1.}
\endinput
=============================================================

Would I be better off removing Babel and sticking with this?

> Enable all other languages you care to use, and save the file.
>
> Now run sudo -H -u root fmtutil-sys --all to re-create the format,
> with the configuration you just specified. The sudo is needed for the
> permissions, the -H -u root options for sudo set the username and
> home directory to the actual root user so that files in your ~/
> Library/texmf directory are not found. The rest is the actual command
> itself.

Righto.

> I hope this clears things up for you. As to the other implied
> question: how do you figure this out?

That's what I'd really like to learn.

>Some experience with the
> cryptic output of --help, -? and -h of the various tools helps. A
> constant reminder that man pages list only what you need to know, and
> not a comma extra, and some experience with maintaining a Linux
> system at work.

Man pages contain much less than I need to know.  I know this because
most man pages don't make much sense to me at all.  I don't have any
experience with the tools, or with running any sort of Unix system.  I'm
a Mac user coming to this cold, with no experience of anything relevant.

Man pages tell you what you need to know if you are a Unix expert - but
only if you are a Unix expert.  It seems to me that man pages are
in general useless unless you already know about whatever the man page
is explaining.

It seems that my suspicion was right: one /does/ have to be a Unix
sysadmin to learn how to set up and maintain a modern Mac TeX
distribution.  :-/  Ho hum.  I was hoping to avoid learning all that
Unix stuff.

> There are two files in most TeX distributions (well, Unix
> distributions anyway) that control everything: texmf.cnf and
> fmtutil.cnf.

Except, of course, that there are multiple files called texmf.cnf and
fmtutil.cnf, aren't there?  So the number is higher than two, and
confusion reigns as a result.

> Some starting text can be found in
> /usr/local/teTeX/share/texmf.gwtex/README.howtexfindsfiles.txt, with
> quite a few details in the comments in both texmf.cnf and fmtutil.cnf

Righto - so one has to ferret around the documentation, find things out,
and generally work like a flippin' detective to discover basic
information about using and configuring the software?  Oh joy.

> Oh, and
> some experience with programming helps: this system was developed by
> programmers, and as you have figured out by now: user friendliness
> wasn't at the top of their list of priorities.

Given that I have no experience programming modern computers, and have
been using Macs exclusively - computers noted for user-friendliness -
since about 1994 - I'm a bit stuffed, aren't I?

[Some decent documentation clearly needs to be written.  What's the
point of writing software without paying attention to usability?
Programmers shouldn't be let out on their own, that's what I reckon.

Ho hum.]

Thanks for your help.  It's just going to be horribly slow and painful
for me to learn, isn't it?

Cheers!
Rowland

> Regards,
>
> Maarten
>
> PS, I've added some minor remarks without explanation below.

Thank you.

> Right
> near the end there is a question on how MacTeX deals with a pre-
> existing texmf.local tree when updating: is it overwritten or not?

I haven't a clue.  It seems that MacTeX is meant to be updated by using
i-installer.   I infer that since MacTeX installs i-installer.  There is
no information supplied with MacTeX that I can see which explains how
one is meant to update any part of it.

I was intending to upgrade MacTeX manually, since I could see no other
method of updating it with any sort of idea what was going to be
modified.  Since texmf.local is filled with stuff when you install
MacTeX, it's reasonable to assume that any updating will modify that
directory, so system-wide local additions must be kept somewhere else.

MacTeX has no provision for such additions, so I had to roll my own
solution (with help off-list from Gerben Wierda).

> On 15-sep-2006, at 22:32, Rowland McDonnell wrote:
>
> > According to a note in the language.dat file, ukhyphen.tex is
> > unavailable in teTeX due to licensing problems.  I'd guess the same
> > applies to other distributions - are you sure UKEnglish is included
> > with TeXLive?
>
> Yes, in gwtex. Thomas Esser and Gerben Wierda have slightly different
> opinions on what can or can't be included.

Righto.

>Otherwise you probably can
> obtain it from CTAN: http://www.ctan.org/search.html

I've had the file for years, which should be obvious since I've been
saying I want to get my new TeX set up to work like the old TeX.  Access
to the file ukhyphen.tex is not really an issue.

> >>> I was thinking about modifying the Babel setup so that I could
> >>> have the existing languages plus the one I need, with the one I
> >>> need set up as the default.
> >>>
> >>> I see that to do this, I need to edit the appropriate
> >>> language.dat file.
> >>
> >> a simple find command in the terminal gives that these are all
> >> language.dat files in the texmf trees.
> >
> > Okay - I've printed out the 'find' man page and learnt how to do
> > this (details on how to do so at the end of this email).  But how
> > can I tell which file is used for which format?
>
> The one returned by kpsewhich, since that is the same routine used by
> tex itself.

'kpsewhich' returns:

It turns out that to use kpsewhich in an intelligent fashion, you need
to learn a lot about what it is that you're looking for.  I've not yet
learnt that, so I cannot find out how to find out which language.dat
file gets used by what for what.

I do know that:

Hattie:teTeX rowland$ kpsewhich language.dat
/usr/local/teTeX/share/texmf.gwtex/tex/generic/config/language.dat

and

Hattie:teTeX rowland$ find . -name "language.dat"
./share/texmf.gwtex/tex/generic/config/language.dat
./share/texmf.tetex/tex/generic/config/language.dat
./share/texmf.tetex/tex/lambda/config/language.dat
./share/texmf.tetex/tex/platex/config/language.dat

which doesn't help me much.  What gets used for what, and when, and why?
I don't know and can't find out yet.

You need to understand teTeX thoroughly to use kpsewhich, and I don't.
I suspect the only people who can use kpsewhich properly are those
capable of identifying the file they're after manually.

> > At some point, iniTeX will run, and I need to make sure it'll read
> > the appropriate file when it runs - for every format I'm
> > rebuilding. Finding language.dat files isn't the problem: the
> > problem is making sure that the one I want to be read, is the one
>
> Sorry, I was barking up the wrong tree, I hope the discussion of
> fmtutil above did in fact help.

Not a lot, but at least I now know what I need to find out about.

> >> /usr/local/teTeX/share/texmf.local/tex/generic/config/language.dat
> >> /usr/local/teTeX/share/texmf.gwtex/tex/generic/config/language.dat
> >> /usr/local/teTeX/share/texmf.tetex/tex/generic/config/language.dat
> >> /usr/local/teTeX/share/texmf.tetex/tex/lambda/config/language.dat
> >> /usr/local/teTeX/share/texmf.tetex/tex/platex/config/language.dat
> >>
> >> The first one shadows the next two for all tex formats, except
> >> lambda and platex.
> >
> > I don't understand what you mean by this.  Could you tell me where
> > it's explained in the documentation (if anywhere)?
>
> The howtexfindsfiles mentioned above, plus comments in texmf.cnf. Or
> the output of kpsewhich, which is the routine actually used by tex &
> friends.

I see.  Well, I've read all those, and I don't understand what you
wrote.  I'll just keep at it.  I might crack the code one day.  If
that's the only documentation that's available, I think it's quite
reasonable for me to be as baffled as I am.
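For what it's worth, 'shadows' here can be read as 'is found first, so
the later copies are never looked at'.  The following is a deliberately
simplified model of that idea, not a description of how kpsewhich is
actually implemented: it walks a list of trees in order and reports the
first copy of a file it finds.  The function name first_match and the
throwaway directories are made up for the demonstration, and the order
of the trees is an assumption.

```shell
#!/bin/sh
# Simplified model of "shadowing": search the trees in the order given
# and report the first copy found.  Usage: first_match FILE TREE...
first_match() {
    file=$1
    shift
    for tree in "$@"; do
        if [ -f "$tree/$file" ]; then
            printf '%s\n' "$tree/$file"
            return 0
        fi
    done
    return 1
}

# Demonstration with throwaway directories standing in for the real trees.
demo=$(mktemp -d)
rel=tex/generic/config/language.dat
mkdir -p "$demo/texmf.local/tex/generic/config" \
         "$demo/texmf.gwtex/tex/generic/config"

# Only the gwtex copy exists, so that is the one reported.
: > "$demo/texmf.gwtex/$rel"
first_match "$rel" "$demo/texmf.local" "$demo/texmf.gwtex"

# Once a local copy exists it is found first and shadows the gwtex copy.
: > "$demo/texmf.local/$rel"
first_match "$rel" "$demo/texmf.local" "$demo/texmf.gwtex"

rm -rf "$demo"
```

On that reading, putting a language.dat in a tree that is searched
earlier is all that is needed for it to be used in place of the others,
and the real search order is whatever texmf.cnf says it is.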

> >> So it seems that in practice, you can just limit yourself
> >> to the first one.
> >
> > What I actually get is this:
> >
> > Hattie:teTeX rowland$ find . -name "language.dat"
> > ./share/texmf.gwtex/tex/generic/config/language.dat
> > ./share/texmf.tetex/tex/generic/config/language.dat
> > ./share/texmf.tetex/tex/lambda/config/language.dat
> > ./share/texmf.tetex/tex/platex/config/language.dat
> >
> > [snip]
> >
> > [Gerben Wierda says elsewhere:]
> >> The first one probably does not exist on a pristine install of
> >> MacTeX. The second one does, but it should be copied to the
> >> location of the first one before being edited or it will be
> >> overwritten on a next install.
> > [end Gerben Wierda]
> >
> > What I'd like to do is have a single local language.dat file that is
> > always read when I run iniTeX to rebuild all formats.
>
> See discussion of fmtutil.cnf

I have done.  Doesn't help much.

> > (and some sort of babel config file - damned if I can find any info
> > on how to set up Babel to give me a default language other than US
> > English)
>
> Babel is not part of the ...TeX format, so there is no default there.
> Load the package with \usepackage[english]{babel} and in latex you'll
> get the right patterns.

That is not what I want to do.  In any event, surely with 'english', I'd
get US English?

> For pdftex (plain), you'll need to \input
> some file, and \def some things, but I'm at a loss here. However, at
> least I now know how to include the languages in the first
> place.

Righto - well, I'd got the idea that things had been set up so that
Plain TeX used Babel 'as a semi-routine matter', but it seems not.  I
might as well stick with my methods rather than use Babel for Plain TeX.

[snip]

> >> The choice of the name 'format' in this context is a bit
> >> unfortunate: you want the -programme argument, not the -format
> >> argument.
> >
> > If I knew why you were explaining this, I might be able to follow
> > you better.
> >
> > Could you explain why you are telling me that there are
> > '-programme' and '-format' options?
> >
> > I'm completely baffled by this.  Ah!  No, having read my original
> > email on the subject, I am now less baffled.   Perhaps you could
> > [snip] less of the original text?
>
> I'd rather not, you can always read your original message, but
> snipping too little makes for rather tedious reading, and makes it a
> lot harder to follow the line of my explanation.

You are wrong: the fact is that the excessive snipping you did made it
impossible for me to follow the line of your explanation, until I
thought to re-read my original message (and just finding that message
took time - I don't have a threading email client).

The point is that you snip too much, and by doing so, you make it much

were saying, which is why I asked you to snip less.

Of course I can always read my original message, but the point of having
quoted text in a reply is so that I don't have to, to make it easier for
me to follow what you have to say.

[snip]

> > I need to find out how to replace all the format files used for
> > Plain TeXing and LaTeXing with the MacTeX distribution, and do so
> > using my local language.dat (and perhaps other config) files.
> >
> >> The format of the language.dat file is documented in the file
> >> itself
> >
> > There is no explanation of the format of the file that I can see -
> > not an inadequate explanation: absolutely no explanation of any
> > sort.
>
> No, not for the format. Because I hardly ever do this by hand, I'm
> stumbling as I go along. That is the reason this message is somewhat
> delayed:

I don't see the time it took to arrive as 'delayed' - any large lump of
text takes time to write, and - well <shrug> you're offering free help
to a stranger.  It's welcome, whenever it arrives.

> decided to dive deeper into fmtutil.

Righto.

> >> (the languages are all there, but with most commented out, so
> >> there is no need to figure out what a language should be called).
> >
> > It's not the modifying of the file that's the problem, but working
> > out where to put my copy of the modified file so that it's 1) used
> > and 2) not overwritten; then I need to work out how to re-build all
> > the formats.
>
> 1) use kpsewhich with the engine and file-name to see if the file
> you had in mind is indeed found.

Yes, but I don't understand exactly what 'engine' means in this context
- which engine needs specifying for any given search, and why?  I don't

> 2) copy that file to reside within the texmf.local tree, and try (1)
> again to see if it is found instead.

Very bad idea - texmf.local is full of 'other people's stuff' when you
install MacTeX.

In my case, I'll copy to within /Users/Shared/texmf.rjmm/.

> Now, depending on how exactly you are going to update your tex: i-
> Installer will leave texmf.local alone. I don't know about MacTeX,
> perhaps someone else can comment on this.

I can't find anything at all to suggest how one might do updates using
MacTeX.  I'm going to worry about that later.

>You may want to copy your
> fmtutil.cnf into the texmf.local tree as well.

Definitely not, for the reasons I give above.

> With the commands I
> gave it is not possible to store the configuration in your home
> directory.

Why not?  Not that that's how I'm going to do things.

> > You say that using the fmtutil command is the right thing to do - is
> > there anything to explain what I can expect to see happen if I use
> > it as directed on a default MacTeX installation?  The fmtutil man
> > page is an unusually terse piece of work, and I can find no other
> > documentation for fmtutil.
>
> See above, with some explanation on how I figured this out myself

Righto.  I'll have to look elsewhere for information, I suppose.
Cheers!
Rowland.