[omega] Question about the paper published in EuroTeX 2005

Mon Mar 14 20:52:27 CET 2005

On Mon, Mar 14, 2005 at 01:31:27PM +0100, Yannis Haralambous:
> 
> My answer:
> 
> we need an atomic unit for algorithms (like OTPs, paragraph builder, 
> etc.) to operate. Let us take again the example of backen: I agree that 
> saying that "k" is a variant glyph of "c" is absurd. But suppose now 
> that in our font we have an `ak' ligature. Then, when the word is 
> hyphenated "bak-ken", I would indeed like to have that ligature in the 
> part "bak-". Which means that the algorithm must detect a "k" texteme 
> to be able to apply the ligature. So even if it sounds absurd to have a 
> "c" with a "k" glyph, maybe this is how the engine sees it.

Another guess: a ligature is between two glyphs, not between two
characters.

With this idea in mind, it sounds natural that the glyph string "ak"
can be ligatured in your font. I don't know how this fits into your
current texteme model, though... Perhaps you should have something like
what TeX does: in a node, TeX handles almost only glyphs, no longer
really characters. So the way it handles ligatures is almost correct
from this point of view.

> My argument about "pseudo" is that we weren't very precise yet on what 
> we mean when we talk of a "glyph". A TeX charnode contains a number which 
> is the position of a glyph in the current font. So in some sense this 
> is both concrete and abstract. Concrete because we obtain one and only 
> one image (when I say glyph 97 of font CMR10 at size 10 points, this is a 
> unique image, modulo the version changes by Knuth, and this no matter 
> how I get there, be it bitmap or vector outlines). Abstract because 
> being only a table position we can change fonts, and provided we have 
> the same glyph encoding we can get something else.

Well, yes, but in TeX a ligature is computed only from glyphs in the
same font. I feel this is a hint: ligatures live entirely in glyph
land.
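
To make that point concrete, here is a minimal Python sketch of a
ligature pass that works on glyph nodes only and never merges across a
font change. It is not TeX's actual lig/kern program: the GlyphNode
class, the font name "foo" and the ligature slot 300 are all made up
for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlyphNode:
    font: str   # font the glyph comes from (hypothetical name)
    index: int  # position of the glyph in that font


# Hypothetical ligature table: (font, left index, right index) -> ligature index.
# Here glyphs 97 ("a") and 107 ("k") of font "foo" combine into slot 300.
LIGATURES = {("foo", 97, 107): 300}


def apply_ligatures(nodes, table=LIGATURES):
    """Merge adjacent glyph pairs, but only when both are in the same font."""
    out = []
    for node in nodes:
        if out and out[-1].font == node.font:
            lig = table.get((node.font, out[-1].index, node.index))
            if lig is not None:
                # Replace the previous glyph by the ligature glyph.
                out[-1] = GlyphNode(node.font, lig)
                continue
        out.append(node)
    return out


# "bak" set entirely in font "foo": the last two glyphs are merged.
bak = [GlyphNode("foo", 98), GlyphNode("foo", 97), GlyphNode("foo", 107)]
merged = apply_ligatures(bak)
```

No character information is consulted anywhere in this pass, which is
the sense in which the ligature belongs to glyph land.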

> So the question is: how abstract must a "texteme glyph reference" be?

I would say it is not abstract at all. At least this is what I
understood from your paper...

> How many levels of specification do we have? When we say that we want 
> glyph=a, do we mean just any a (default font) at any size (default 
> size)? a in a font named foo? a in a font named foo version 1.01? in a 
> given size? using a specific outline?

According to your paper, I would say the glyph is the outline, as far as
the engine can know it: in a TFM font, it is only the index in the font;
in an OpenType context, it is something more complex; and in an SVG
context, it might be the outline itself.

> In TeX this problem was avoided by using *only* TFM. TeX knows only 
> about TFM and if we want to combine TFM with real-world fonts then all 
> the ambiguity of identifying a font is outside the scope of TeX. Omega 
> will soon read OpenType fonts directly. In a TTF-flavored OpenType font 
> you must go through GSUB/GPOS to access certain glyphs (not accessible 
> from cmap), otherwise the only way is through glyph indexes which are 
> not reliable, so the problem of identifying a glyph in a texteme is a 
> hot topic.
> 
> But first of all, we should be very clear about what we are talking 
> about when we say "glyph". (Vocabulary, terminology, taxonomy: sounds 
> like a talk by Chris, Joachim and Christina???)

One idea would be to clearly define two lands: that of the characters,
where certain modifications can happen, such as deciding that this is
the beginning of a word (OTPs); and that of the glyphs, where other,
mostly graphical transformations can happen (font-related
ligatures).

My guess is that the "initial" glyph would be defined by the "last
modified" char. Another point is then to decide which char is to be
sent to the final file. Probably the initial char.

By the way, in what you explained, you use OTPs to translate a
char string into a glyph string, which is perhaps not the right idea.
Perhaps you should use OTPs only for translations within char land,
transforming "X" into "X at the end of a word". Only then would the
transformation into a glyph happen. It may look like an OTP, but I
think it should have another name.

That would lead to 3 tools to do transformations:
- OTPs to do those on chars;
- XXXs to do those from char to glyph;
- YYYs to do those on glyphs.

I think the YYYs are already provided by the fonts (kern/lig program in
TFM, other mechanisms in OpenType). It might be interesting to provide
some more explicit ones, e.g. for poorly defined fonts. But I think a
different name would make things clearer.

I don't know if this might help in any way...

Regards,

	Benjamin.