`limitations' of OzTeX (was: fontinst with 8y.etx)

Melissa O'Neill oneill@cs.sfu.ca
Wed, 17 Jun 1998 12:44:03 -0700 (PDT)


Using data from my own tools, I wrote: 
>> My own custom encodings, which are based on the PDFDocEncoding,
>> wouldn't work particularly well on these Windows DVI previewers. If we
>> rate an encoding's compatibility with Windows ANSI as N+M, where N is
>> the number of slot clashes, and M is the number of glyphs that map to
>> empty slots in Windows ANSI (and thus lower numbers are better), we get:
>> 
>> TeXBase1Encoding (aka 8r):      4+21
>> TeXnANSIEncoding (aka LY1/8y):  7+35
>> PDFDocEncoding:                 23+16
>> my current custom encoding:     28+26
>> ECEncoding (roughly T1):        63+41
>> 
>> Thus we can see that 8r is most conciliatory towards these Windows
>> programs.

... leading Berthold to reply:
> I am not sure what those numbers mean, since the obvious interpretations
> lead to some contradictions :-)
> 
> First of all, 8r and 8y have  the same overall glyph complement, so the
> numbers must be wrong.  Each includes the 15 glyphs missing from 
> Windows ANSI yet found in the standard 228 of typical text fonts.
> Each also includes ff, ffi, ffl, dotlessj.  So the +21 for 8r and +35 for 8y
> should be BOTH +19 by my counting.  I wonder whether you have
> added into the 8y total the glyphs that are repeated for convenience?

8r and 8y *don't* have the same glyph complement: 8y includes `cwm',
`nbspace' and `sfthyphen', which are not in 8r. Also, I suppose I
shouldn't have said ``the number of glyphs that map to empty slots''
and should instead have said ``the number of empty slots into which
glyphs are placed''.
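
The corrected N+M rating can be sketched in a few lines of Python. This
is only an illustration, not the tool that produced the figures above:
the function name rate_against, the representation of an encoding as a
list of glyph names indexed by slot, and the use of `.notdef' to mark
an empty slot are all assumptions.

```python
EMPTY = ".notdef"  # assumed marker for an unused slot


def rate_against(candidate, reference):
    """Return (N, M) for `candidate` measured against `reference`.

    N counts slot clashes: both encodings fill the slot, with
    different glyphs.
    M counts slots that are empty in `reference` but filled in
    `candidate`.
    Slots that `candidate` leaves empty do not affect either number.
    """
    clashes = 0
    fills = 0
    for cand, ref in zip(candidate, reference):
        if cand == EMPTY:
            continue
        if ref == EMPTY:
            fills += 1
        elif cand != ref:
            clashes += 1
    return clashes, fills


# Toy four-slot encodings, invented for illustration only.
reference = ["quotesingle", EMPTY, "grave", "a"]
candidate = ["quoteright", "ff", EMPTY, "a"]
n, m = rate_against(candidate, reference)
print(f"{n}+{m}")
```

Counting slots rather than glyphs means that a glyph repeated in two
otherwise empty slots contributes twice to M.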

> As for conflicts, both 8r and 8y take the `standard' departure from 
> Latin 1 (and hence Windows ANSI) by replacing quotesingle in 39
> with quoteright, and replacing grave accent with quoteleft in 92.
> They also both have asciicircum instead of circumflex accent in 94,
> and asciitilde instead of tilde accent in 126.  So I count 4 conflicts
> with Windows ANSI in *both* 8r and 8y.

Perhaps my tools were overly zealous in their reporting here. This
problem again stems from those three glyphs above. There seems to be
disagreement on what the three glyphs that 8y calls `cwm', `nbspace'
and `sfthyphen' should be called.  Adobe refers to `nbspace' as
`nobreakspace' (e.g. in Adobe's chsttabl.pdf) but usually replaces it
with space in encodings; `sfthyphen' is sometimes called `softhyphen'
(e.g. on the HP LaserJet 4000); and `cwm'(*) is sometimes called
`compwordmark' (although I can't remember where right now).

	  * In fact, I've never seen a `compwordmark'/`cwm' glyph in any 
	    font, so I have no idea what it might look like, or whether it
	    really needs to be in 8y.
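
One way to keep a comparison tool from over-reporting in the face of
such naming differences is to fold the known aliases onto a single
name before comparing. The sketch below is hypothetical: the alias
table contains only the three pairs mentioned above, and treating the
8y spellings as canonical is an arbitrary choice.

```python
# Alternative spellings mentioned above, mapped to the names 8y uses.
ALIASES = {
    "nobreakspace": "nbspace",
    "softhyphen": "sfthyphen",
    "compwordmark": "cwm",
}


def canonical(name):
    """Return the canonical spelling of a glyph name."""
    return ALIASES.get(name, name)


def glyphs_only_in(first, second):
    """Glyph names found in `first` but not in `second`,
    compared after alias folding."""
    a = {canonical(g) for g in first}
    b = {canonical(g) for g in second}
    return sorted(a - b)


# With folding, `nbspace' and `nobreakspace' no longer count as different.
print(glyphs_only_in(["nbspace", "cwm", "a"], ["nobreakspace", "a"]))
```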

Another issue is that I don't have a definitive source for the Windows
ANSI Encoding. One source is the WinLatin1Encoding present in my HP
LaserJet 4000, and another is the WinAnsiEncoding described in the PDF 1.1
specification -- and neither has sfthyphen, nbspace or cwm.
 
Given the amount of naming disagreement, I wonder if there is any good
reason to include these glyphs in 8y.

    Melissa.

P.S. I've enclosed the output of my encoding comparison tool, so you can
see the kind of data I was working with.

Enc.

unix% compare-encodings 8r.enc texnansi.enc
TeXBase1Encoding vs. TeXnANSIEncoding

232 glyphs common to both encodings
 19 slot clashes
210 slot matches
 21 empty  slots in TeXBase1Encoding filled in TeXnANSIEncoding
  4 filled slots in TeXBase1Encoding empty  in TeXnANSIEncoding
  3 glyphs only in TeXnANSIEncoding
        cwm nbspace sfthyphen

Slot    TeXBase1Encoding            TeXnANSIEncoding
---------------------------------------------------------------
01      dotaccent
02      fi
03      fl
05      hungarumlaut                dotaccent
06      Lslash                      hungarumlaut
07      lslash                      ogonek
08      ogonek                      fl
09      ring
0A                                  cwm
0B      breve                       ff
0C      minus                       fi
0E      Zcaron                      ffi
0F      zcaron                      ffl
10      caron                       dotlessi
11      dotlessi                    dotlessj
12      dotlessj                    grave
13      ff                          acute
14      ffi                         caron
15      ffl                         breve
16                                  macron
17                                  ring
18                                  cedilla
19                                  germandbls
1A                                  ae
1B                                  oe
1C                                  oslash
1D                                  AE
1E      grave                       OE
1F      quotesingle                 Oslash
5E      asciicircum                 circumflex
7E      asciitilde                  tilde
7F                                  dieresis
80                                  Lslash
81                                  quotesingle
8D                                  Zcaron
8E                                  asciicircum
8F                                  minus
90                                  lslash
91                                  quoteleft
92                                  quoteright
9D                                  zcaron
9E                                  asciitilde
A0                                  nbspace
AD      hyphen                      sfthyphen
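
The summary counts in a report like the one above can be reproduced
with a short Python sketch. This is not the compare-encodings tool
itself: parse_enc and compare are hypothetical names, the parser
assumes a dvips-style file containing a single `/Name [ ... ] def'
vector with `%' comments, and `.notdef' is assumed to mark an empty
slot. A "match" is taken to mean a slot filled with the same glyph in
both encodings, which fits the figures above (19 + 210 + 21 + 4 = 254
slots accounted for).

```python
import re

EMPTY = ".notdef"  # assumed marker for an unused slot


def parse_enc(text):
    """Return (name, slots) from the text of an encoding file."""
    text = re.sub(r"%.*", "", text)  # drop comments to end of line
    m = re.search(r"/(\S+)\s*\[(.*?)\]", text, re.DOTALL)
    if m is None:
        raise ValueError("no encoding vector found")
    slots = re.findall(r"/([^\s/\[\]]+)", m.group(2))
    return m.group(1), slots


def compare(a, b):
    """Summarise how slot lists `a` and `b` differ."""
    filled_a = {g for g in a if g != EMPTY}
    filled_b = {g for g in b if g != EMPTY}
    report = {
        "common": len(filled_a & filled_b),
        "clashes": 0,
        "matches": 0,
        "empty_a_filled_b": 0,
        "filled_a_empty_b": 0,
        "only_in_b": sorted(filled_b - filled_a),
    }
    for ga, gb in zip(a, b):
        if ga == EMPTY and gb == EMPTY:
            continue  # unused in both encodings
        if ga == EMPTY:
            report["empty_a_filled_b"] += 1
        elif gb == EMPTY:
            report["filled_a_empty_b"] += 1
        elif ga == gb:
            report["matches"] += 1
        else:
            report["clashes"] += 1
    return report


# Toy four-slot vectors, invented for illustration only.
_, toy_a = parse_enc("% toy\n/ToyA [ /.notdef /fi /ff /a ] def\n")
_, toy_b = parse_enc("/ToyB [ /cwm /fi /fl /.notdef ] def\n")
print(compare(toy_a, toy_b))
```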