Comments wanted: AGL, Unicode, T1 vs. CE/Baltic

Hilmar Schlegel
Wed, 6 Sep 2000 00:38:22 -0400

Lars Hellström wrote:
> At 12.44 +0200 2000-09-05, Hilmar Schlegel wrote:
> [snip]
> >The really bad thing is that I do not see how to fix the clash of two
> >characters for one slot (actually 4 characters in two slots because of
> >uc/lc).
> >one could
> >- remove T/t-commaaccent and use the slot(s) for something else (e.g.
> >the Turkish lowercase Idotaccent = dotlessidot, Turkish uppercase
> >dotlessi = Dotlessi, which would be necessary to make hyphenation
> >patterns happy)
> >- make room at another slot for S/s-cedilla: e.g. remove Y/y-dieresis
> >(the problem would be again that the affected hyphenation patterns---if
> >there are any---would have to learn about the new/alternative position)
> >- introduce a slot for commaaccent to support at least the construction
> >of Baltic characters;
> If these characters are used in Romanian and Turkish, then why do you refer
> to them as Baltic characters?

g, k, l, n, r-commaaccent as well as a, e, i, u-ogonek are Baltic
characters. They are mentioned here because they require the same accent
to be present, in addition to what "text fonts" currently cover.
I.e. if the commaaccent is provided, you could write Baltic text (at
least by using the accent primitive).
Romanian would be covered by T1 encoding with the correct character
shapes (and without using accent).
> One could put the comma accent (and its upside-down form) in some slot in
> the TS1 encoding. This would be sufficient for use by the \accent primitive
> (and various \ialign constructions).

This would be at least one step forward. Not optimal for building
ligatures though.
> >  if this accent is not available I see no alternative to making
> >S/s-cedilla a "constructed-only" character, i.e. remove it from
> >T1-encoding and access it always via a TeX accent primitive.
> Do you mean S/s-comma? I cannot see why S/s-cedilla would have to be removed.

Because you *can* construct S/s-cedilla (cedilla is there).
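
A minimal sketch of such a construction, assuming a LaTeX document with T1-encoded fonts loaded via fontenc (slot 11 is where T1 keeps the cedilla accent; the \ooalign overlay for the capital is just one possible choice):

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
% Lowercase: the \accent primitive takes the accent's slot number
% (11 = cedilla in T1) followed by the base letter.
{\accent11 s}

% Uppercase: \accent would shift the cedilla upward by the amount the
% base letter exceeds the x-height, so overlay the accent instead.
{\ooalign{S\crcr\hidewidth\char11\hidewidth}}
\end{document}
```

Neither line touches the precomposed S/s-cedilla slots, which is the point: those slots would become free for other characters.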

> >Then there would be a compromise possible: all C/consonant-commaaccent
> >characters are accessed via a ligature C/consonant+ogonek.
> What??! T1 has no room for any more glyphs, so how could using ligatures
> solve anything?

Fill in all you need for Turkish and use it for Turkish, or drop Turkish
characters to make room for Romanian/Baltic. Due to the clash you
cannot have both at once.
> >This would
> >work with standard T1-encoding and T/S-commaaccent and as well with an
> >adopted Baltic variant (e.g. Turkish characters replaced by g, k, l, n,
> >r-commaaccent, Baltic hyphenation patterns should know about this).
> >There are only V/vowel-ogonek combinations in existence which require
> >the bona fide ogonek accent.
> >The advantage of this approach would be to cope consistently with all
> >the commaaccent characters *without* the introduction of a new accent in
> >the TeX/LaTeX scheme and assuring a high degree of compatibility on the
> >markup-level.
> It all sounds terribly complicated (and error-prone) to me. Wouldn't it be
> better to start working on a T1A (or whatever) encoding to support the
> needs of the languages that are partly left out in the cold by T1?

It is complicated, which is why I said that I do not see a solution
(otherwise I would have suggested it here).
Hence the few compromises, meant to get by with the least possible
irritation for the holy cow: the "standard" T1 font layout (to avoid the
term encoding here).

Two things remain essential. First, get rid of T-cedilla in T1 fonts,
irrespective of their format (Metafont, Adobe Type1, TrueType).
Second, allow a consistent, portable markup in LaTeX so that sources
remain usable once this issue is resolved some day.
Unicode & AGL have been fixed; this should be done one day (soon?) for
LaTeX too.
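
The portable-markup idea could be sketched roughly as follows: sources use a single macro for the comma-below letters, and only its definition changes once fonts provide the real glyphs. The macro name \commabelow is hypothetical (not an existing LaTeX command), and the offset and size of the faked accent are rough guesses that depend on the font.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
% \commabelow is a hypothetical name used only for this sketch.
% It centres a small, lowered comma under the base letter. Once a font
% offers proper glyphs, only this definition needs to be replaced.
\providecommand*{\commabelow}[1]{%
  \leavevmode
  {\ooalign{\null#1\crcr
    \hidewidth\raisebox{-.31ex}{\scriptsize ,}\hidewidth}}}
\begin{document}
\commabelow{S}\commabelow{s} \commabelow{T}\commabelow{t}
\end{document}
```

Because the markup names the accent rather than a font slot, the same source stays valid whether the letter is constructed, as here, or taken from a dedicated glyph.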

Hilmar Schlegel