Re: raw font encodings
- To: email@example.com
- Subject: Re: raw font encodings
- From: firstname.lastname@example.org (Pierre MacKay)
- Date: Fri, 8 Apr 1994 16:22:02 -0700
- Cc: email@example.com, firstname.lastname@example.org, email@example.com
- In-Reply-To: Alan Jeffrey's message of Fri, 8 Apr 94 14:27 BST <m0ppGbG-00003QC@csrj.crn.cogs.susx.ac.uk>
> a) in some fonts, the composite glyphs aren't the same as the glyphs
> built from the appropriate simplexes, for example <Aacute> and
> <aacute> may use a different sized <acute> accent.
This is true for a very small number of fonts. I have Rynnings's observation
that it happens to be true for Adobe Garamond and Adobe Caslon, but
I have not myself seen either of these. For the foreseeable future,
I suspect that the pattern I see in my 300 or so text fonts (I get all
the Monotype package deals, even though I would not dream of using most
of what they contain) is the norm. Is it really possible to make the special
cases the basis of a general practice?
> b) in any Type 1 font with hints, there can be no hints between glyphs
> built from simplexes, so the <cedilla> and <c> in <ccedilla> may end
> up colliding, which they wouldn't in a hinted <ccedilla>.
I specifically recommend keeping Ccedilla and ccedilla as simplex, for
this and other reasons. I would be quite happy to change the recommendation
to an ukaz.
> c) the PostScript produced by using virtual composite characters is
> longer. Printing out the Cork alphabet with dvips uses 1.3K with raw
> composite characters, and 2.1K with virtual ones. So I'd guess that a
> long PostScript document with heavy accenting would be 20--50% longer
> if the accents are produced virtually.
Undoubtedly true, but is it really a severe impediment?
> One possibility (suggested off-line by Don Hosek) would be to have the
> raw font encoded with the same encoding as the virtual font, so for T1
> (Cork) encoded Adobe Times the raw font would contain all the T1
> glyphs which Adobe provide, and then the VF would `fill in the gaps'.
For the single example of Cork encoding mapped onto Cork encoding, that works.
I started out that way, but found that I got seriously tangled
up when I needed to make adjustments to supply the two dozen characters
which Near Eastern Language scholarship has used for more than a century
and which are unavailable even in Unicode. The reason for adopting
something as close to Adobe Standard Encoding as possible was that
I could then make up virtuals with reasonable assurance that I knew
where the original raw characters were.
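For concreteness, the `fill in the gaps' idea amounts to VPL entries which set the raw base letter, back up, and set the raw accent. A hedged sketch (the octal slots assume Cork on the virtual side and Adobe StandardEncoding on the raw side; the width and the MOVERIGHT amount are illustrative, not measured from any real font):

```
(CHARACTER O 347 (COMMENT Cork ccedilla)
   (CHARWD R 0.444)
   (MAP
      (SETCHAR O 143)      (COMMENT raw `c', StandardEncoding slot "63)
      (MOVERIGHT R -0.389) (COMMENT back up under the c; amount illustrative)
      (SETCHAR O 313)      (COMMENT raw `cedilla', StandardEncoding slot "CB)
      )
   )
```

VPTOVF turns this into the VF/TFM pair; DVI drivers then expand the one virtual character into the two raw setchar commands, which is exactly where the extra PostScript bulk of point c) comes from.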
> The problem with this is that if you want to use the same font with a
> number of different virtual encodings, you need a different raw font.
Exactly my reason for using ASE(X) for raw fonts. As I say, I got
hopelessly entangled in conflicting codings. Now I know that all
type1 text fonts are encoded in the raw form in exactly the same way.
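That `one raw layout everywhere' discipline is easy to state in PostScript terms. A minimal re-encoding sketch (the font and the name /Times-Raw are placeholders; a real ASE(X) vector would be a hand-extended copy of StandardEncoding rather than StandardEncoding verbatim):

```postscript
% Copy the font dictionary (everything except FID), swap in the
% common raw encoding, and register the result under a new name.
/Times-Roman findfont
dup length dict begin
  { 1 index /FID ne { def } { pop pop } ifelse } forall
  /Encoding StandardEncoding def  % or the extended ASE(X) array
currentdict end
/Times-Raw exch definefont pop
```

Once every raw font answers to the same vector, the virtual fonts can be written once and pointed at any of them.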
> a) Provide a raw encoding containing all the Adobe Standard and ISO
> Latin-1 glyphs.
> b) Have the T1 VFs point to a sparse T1-encoded raw font.
The advantage of sparse coding is that if new characters get added
in standard or quasi-standard fonts, there is room to fit them
in. A fully populated map can be very frustrating.
> c) Don't include composite glyphs in the raw encoding.
> The respective problems are:
> a) Requires the use of slots 0--31, which some previewers can't
> handle.
This is surprising, if true. The avoidance of 0--31 and, indeed,
much of the oddity of the sparse coding of 128--255 have a lot to
do with half-witted keyboard input modules, and so affect the
operation of word-processors, but when you get to a previewer that
is capable of handling DVI at all, I don't see how you can
lose all those code positions. I find it hard to imagine that any
marketable postscript interpreter would prevent you from using
the entire encoding vector. I should think Adobe might object to such
an interpreter being given the PostScript name. Early HP printers
made you dodge around some holes in the range 0--255, but if we are
talking about type1 fonts, we are talking about PostScript, and surely
the HP postscript (which is far from faultless) no longer has that
limitation.
> b) Requires (at least) twice as much disk space for raw fonts,
> produces PostScript which uses more raw fonts.
That is very true, and it isn't just the use of space. It is the
complexity it adds to the problem of managing a huge font library. An
additional advantage gained by restricting the size and number of raw
fonts is that if they are few and small they can be downloaded into a
reasonably capacious lot of ram in the printer, and remain resident
for a day's work.
> c) Doesn't allow hinting or designed glyphs, produces longer
> PostScript.
OK, you can't hint, which is not usually much of a problem at hard-copy
resolutions, but can cause difficulties at display resolutions.
You can modify glyphs though. For example, I use one font family
for both Turkish and Western European documentation. It doesn't
have designed accent composites and isn't likely to get them. You will hardly
ever see squatty caps and modified cap accents in Turkish books, partly
because the choice of accents is limited, and they can be and are pulled
down very close to the underlying letter. By contrast, acute, grave and
worst of all ring accents can either adjust leading in undesirable ways
or risk bleeding into the descenders of the previous line. I keep
different virtual fonts for the two instances. In the West European font
the caps for composites are subjected to a transform that makes them
squat down just a bit, the accents are tilted to a flatter angle, and
the circumflex (and hacek) are slightly flattened. (A similar trick
could be used to eliminate the rather unfortunate ovoid fullstop and
comma in slanted Computer Modern, incidentally. Not easy with the comma,
but doable.) All this is done with inline postscript, and involves
yet longer PostScript files of course. It also involves repeated makefont
operations, which one is usually advised against, but I have timed
the operation, and in an admittedly very low-volume operation I
find that the hit is not all that serious. PostScript remembers the
old transform, and uses very little time after the first effort.
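The squatting and tilting described above amount to nothing more than non-uniform makefont matrices. A minimal sketch (the factors 9.2 and 1.67 are illustrative numbers, not the ones I actually use):

```postscript
% A 10 pt instance with caps squashed ~8% vertically, for use
% under pulled-down accents:
/Times-Roman findfont [10 0 0 9.2 0 0] makefont setfont
(A) show
% The same face sheared flatter (slant 0.167) via the third
% matrix entry -- the trick that would also unround the slanted
% Computer Modern fullstop:
/Times-Roman findfont [10 0 1.67 10 0 0] makefont setfont
(A) show
```

As noted, the interpreter remembers the transformed font, so repeating makefont with the same matrix costs very little after the first call.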
> ... are there other options? ...
I have undoubtedly gone on too long already.