[Fontinst] Mapping all diacritics to actual glyphs rather than composites

Lars Hellström Lars.Hellstrom at residenset.net
Sun Jan 17 23:59:09 CET 2010


Christopher Adams skrev:
> I've finally pieced together that the solution is to write my own encoding
> vector.
> 
> Is there a good tutorial about how to do this?
> 
> As a test I simply made a copy of 8r.enc and 8r.etx. In the .enc file I
> replaced one of the /.notdef's with /eogonek, and then added a \setslot for
> this glyph in the .etx file. Then I changed \reencodefont{8r} to refer to my
> renamed, modified copy, ran the files through TeX and updated all the map
> files.
> 
> Sure enough, the correct *eogonek* comes out in the PDF (it's even
> searchable). Perfect!
> 
> I finally understand the reason that *eogonek* wasn't working is that it
> isn't defined by 8r.
> 
> I still have some questions.
> 
> 1) What's the best way to do what I need?

That depends on your priorities. If the priority is to get what you need done with 
as little effort as possible, then the "substitute a few glyphs" 
approach you later seemed to decide on is probably optimal. If the priority 
is instead best quality -- full font support for 
Eastern/Central European languages using the Latin script -- then the better 
course might be to embark on the project of producing a T1A (or 
whatever) encoding that can be used for those languages where T1 isn't 
quite sufficient.

> 2) If I need a glyph that is not in T1, like *iogonek*, in addition to the
> above, do I then simply have to modify t1.etx and add a slot for that glyph?

As far as fontinst is concerned, yes. If you use the new 
multislot.sty package (which I'm not sure I've ever announced 
properly), you can write an ETX which *only* sets the slots that are 
different from T1 and then does \inputetx{t1}, thus significantly 
reducing the amount of editing you need to do.
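
To illustrate the idea, here is a minimal sketch of such an ETX. Several things in it are assumptions made for the example rather than anything fixed by fontinst: the file name t1x.etx, the choice of giving up the Ng/ng pair (T1 slots 141 and 173), and the glyph names Scommaaccent/scommaaccent, which should match the names in the font's AFM. It also assumes multislot.sty is already active, so that a slot set here is not overridden when t1.etx later sets the same slot; how the package is loaded is not shown.

```latex
% t1x.etx -- hypothetical file name
\relax
\encoding

% Only the slots that differ from T1 are set here.
\nextslot{141}              % T1 normally has Ng in this slot
\setslot{Scommaaccent}
\endsetslot

\nextslot{173}              % T1 normally has ng in this slot
\setslot{scommaaccent}
\endsetslot

% Everything else is taken unchanged from the standard T1 ETX.
\inputetx{t1}

\endencoding
```

The capital and lowercase glyphs are placed 32 slots apart on purpose, the same spacing T1 uses for the letters it pairs in this range.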

When we get to LaTeX, however, things are quite different.

> 3) If I need a glyph that doesn't even have a TeX command, such as
> scommaaccent, what do I have to do to get access to it in my latex document?
> I know someone has written a \cb{} command that fakes comma below. Can I
> write my own \cb{} command? What would it look like? I really only need
> access to scommaaccent and tcommaaccent.

This is a matter of the encoding-specific "text commands", which are 
described in the "Encodings" section of fntguide.tex (part of the LaTeX 
distribution) and Section 7.11 of The LaTeX Companion (2nd edition). You 
would probably want to do something like

   \DeclareTextComposite{\k}{L01}{s}{...}

if following Hilmar's suggestion of using \k also for the comma accent; 
the above would (if ... is replaced by the right slot number) make 
\k{s} typeset an scommaaccent glyph.

What you can't get around, however, is the need to declare a new LaTeX 
encoding for your fonts, since you'll need some text commands to end up 
doing something different than they would under T1. In the example 
above I used L01 (local encoding 1) for this new encoding, as you'll 
(at least initially) probably only be using it privately, but in the 
long run I think a lot of people would benefit if someone stepped up to 
design an encoding that fully covers the Eastern European languages 
for which T1 provides only a partial solution. Hilmar?
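
As a rough sketch of what such a declaration could look like, a private encoding definition file might start as follows. The file name l01enc.def follows the usual <encoding>enc.def pattern; the family name myppl and the slot numbers 141/173 are placeholders for whatever your own installation and ETX actually use. The text commands carried over from T1 are only indicated by a comment, so this is a fragment and not a complete file.

```latex
% l01enc.def -- sketch of a private variant of T1 (hypothetical file)
\ProvidesFile{l01enc.def}[2010/01/17 private T1 variant (sketch)]
\DeclareFontEncoding{L01}{}{}
% Placeholder family; a matching l01myppl.fd must exist.
\DeclareFontSubstitution{L01}{myppl}{m}{n}

% ... the text command declarations of t1enc.def, with T1 replaced
% by L01, would go here. They include the base definition of \k
% that the composites below attach to. Any text symbol that used a
% sacrificed slot (here \NG and \ng) has to be left out ...

% Placeholder slots: use the numbers your ETX assigns.
\DeclareTextComposite{\k}{L01}{S}{141}
\DeclareTextComposite{\k}{L01}{s}{173}
```

A document would then load the encoding with \usepackage[L01]{fontenc}, after which \k{s} picks slot 173 whenever the current encoding is L01.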

> Fortunately, because I'm doing book work, I can make room in the encoding
> vector by discarding some math symbols and analphabetics.

Are you talking about 8r here? There isn't very much nonalphabetic 
material in T1 to get rid of. Getting something new in pretty much 
necessitates losing some of the precomposed letters...



Christopher Adams skrev:
 > Hi Hilmar,
 >
 > Thank you again for your thoughtful replies.
 >
 > 2010/1/17 Hilmar Schlegel
[who likes to keep his e-mail secret, wrote]
 >
>> Well, another question would be: What do you want to do?
>> Seriously, all depends on what you really need (and not
>> necessarily what you  believe now you might need).
 >
 > At this point I can define my goal quite narrowly: I need an
 > *a/eogonek* rendered as single outlines, as well as t/scommaaccent
 > (without losing
 > scedilla). I see now this should be quite easy to achieve.
 >
>>> 2) If I need a glyph that is not in T1, like /iogonek/, in
>>> addition to the above, do I then simply have to modify t1.etx and
>>> add a slot for  that glyph?
 >>
>> That is a practicable solution for some special glyphs. Since T1
>> is  full, those are replacements of other glyphs you don't need.
 >
> Since my target glyphs are few in number, it would appear this is
 > the best solution.
 >
> If I write a \setslot{scommaaccent} in my modified t1.etx file, I'm
> still not confident I could typeset it. My low-level TeX is not so
> good, and so I'm trying to puzzle out the code you sent. At some
> point in the code  do you
> have to refer to the glyph by its char number? Or is there another
> way to refer to whichever glyph occupies the "scommaaccent" slot by
> name?
 >
>>> 3) If I need a glyph that doesn't even have a TeX command, such
>>> as scommaaccent, what do I have to do to get access to it in my
>>> latex document?
>>> I know someone has written a \cb{} command that fakes comma
>>> below. Can I
>>> write my own \cb{} command? What would it look like? I really
>>> only need
>>> access to scommaaccent and tcommaaccent.
 >>
 >>
>> Well, you can simply use the ogonek accent command: \k{a} and
>> \k{e} are defined as standard Latex commands. Ogoneks are applied
>> to vowels (like iogonek) and for consonants you simply provide
>> commaaccents (g, k, n, r, s,
>> t). See the sample for using \k as ogonek/commaaccent for 
>> vowels/consonants.
 >>
 >> Here the commaaccent is defined as a specific char code.
 >> You can place it as it suits your needs.
 >
 >
> Am I correct that this code is drawing the glyphs by composing a 
> base letter with a diacritic? But it first checks to see if there is
>  a real eogonek? I apologize that I'm having trouble following the 
> code. Would it be feasible to have a command that gets \cb{s} and 
> \cb{t} to print the right glyphs, assuming that the .etx has slots 
> for s/tcommaaccent?
 >
>> s/tcommaaccent are a trivial correction for T1: replace scedilla 
>> (which will mean there is no longer Turkish language support) and
>> use a T1' with a commaaccent and redefine (extend) the \k{} command
>> as in the sample code.
 >>
 >
> It seems I can't quite go this route, as I can't lose Turkish as a
 > result.
 >
 >
 >>
>>> Fortunately, because I'm doing book work, I can make room in the
>>> encoding vector by discarding some math symbols and analphabetics.
>> 
>> Actually you are not free to assign the glyphs in Latex to
>> arbitrary character codes! All depends on the used hyphenation
>> patterns for  the needed
>> languages (Eastern European, Baltic, old Prussian &c). Therefore
>> T1 can be a reasonable template to start with since there is a 
>> high probability of language support for T1 encoded fonts.
 >
> Ok, this is good to know. I'm very curious to know where this "you
> are not free to assign..." mandate is written.

I've tried to document it in fontinst/doc/encspecs/encspecs.tex. The 
main problem is that TeX's \lccode table establishes a correspondence 
between upper and lower case letters, and if you don't respect these 
correspondences then the hyphenation can get screwed up.
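
A quick way to inspect the correspondence in question is to ask TeX for the \lccode of a pair of slots. The sketch below uses slots 141 and 173 purely as an example pair and writes the values to the terminal and log. As a rule of thumb, a slot holding a lowercase letter should map to itself, the slot holding its capital should map to the lowercase slot, and a value of zero marks a slot that TeX does not treat as a letter when hyphenating.

```latex
\documentclass{article}
\begin{document}
% Report the lowercase codes TeX currently associates with two slots.
\typeout{lccode of slot 141: \the\lccode141}
\typeout{lccode of slot 173: \the\lccode173}
\end{document}
```

If a glyph you substitute into an encoding breaks this pattern for its slot, words containing it may hyphenate differently from what the patterns intend.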

 >> From the pure typographic view I'd strongly suggest to consider
 >> i) the font Aldus [...]
 >> ii)  Palatino and Aldus Nova OT from Linotype. [...]
 >>
 >
> These are both considerate choices. I am very aware of both Aldus and
> Palatino Nova, and under different circumstances your suggestions
> would be entirely correct. But in this case I have already
> determined that the regular and bold weights of Palatino are
> precisely what I need (as a display face I'm using Sistina). Plus,
> the fact that I require embedding permissions means that I need
> Adobe. The fact that I have to tweak some glyphs means I need
> Palladio.
 >
 > If circumstances were different I probably wouldn't be using any of
> these. It just so happens that the work I am setting deals with
> literary and cultural history centered in Frankfurt from the 1950s 
> onward. It is as if the choice has been made for me!
 >
 > In point of fact this has all given me a much finer appreciation of
 > Zapf's accomplishment.
 >
 >
 >> iii) for the less strict user of existing implementations it is also
 >> worth having a look at Palatino Linotype, which is a TT
 >> (TrueType) font with a larger glyph set. In case you can take the
 >> trouble and make use of large TT fonts for Latex (extract the
 >> metrics, map with fontinst and embed them via Distiller) this can
 >> enrich the glyph set usable by Latex considerably.
 >
 > This is something I'm certainly interested in learning how to do.
 > Considering my modest skill-set I'm determined to stick with Type 1
 > fonts for now. Are there any good resources to learn more about this?
 >
 > My reasons for using fontinst stem from my reliance on the microtype
 > package in pdflatex. But now I'm really hooked on the flexibility it
 > offers for mashing up fonts. I'm really interested in learning how
 > OpenType is going to change the fontinst landscape. Any pointers?

Well, since you ask about the future... But bear in mind that this is 
probably double-danger material, and very much work in progress:

   http://abel.math.umu.se/~lars/fontinst/bigbase.dvi

Again, be warned: I had the spurt of fontinst development that resulted in 
this back in September--October, but from November on I've been 
doing other stuff. There's no reason to believe these mechanisms will 
be in anything resembling working condition within the foreseeable future.

Lars Hellström



