[omega] Question about the paper published in EuroTeX 2005

Wed Mar 30 14:33:41 CEST 2005

On 29 March 2005, at 18:16, Chris Rowley wrote:

> I may have missed some of this `conversation' so I hope these ideas
> are still pertinent.
>
> 1.  Some years ago, Yannis published a useful analysis of `ligatures'.
>
> There he introduced (I hope I have the right names):
>
>   mandatory ligatures
>
>   aesthetic ligatures.
>
> In terms of the current discussion I would say that the `mandatory'
> ones are those that apply to `characters', whereas the `aesthetic'
> ones depend on the font used and on the typesetting tradition; and so
> these latter can be viewed as applying to glyphs or (much the same)
> as being part of the glyph-choice procedure.

And there are still many intermediate levels between "mandatory" and
"aesthetic". We all agree that "fi" is an aesthetic one (although there
is a Unicode character for it...) and that "æ" is a mandatory one (it is
even considered a "letter", and of course also a character). But what
about Arabic lam-alif? It is mandatory in every Arabic writing style,
and it is the only ligature introduced in all Arabic grammars. But still
it is not a "character".
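
The "fi" versus "æ" contrast can be checked against the Unicode data
itself. A minimal sketch, assuming Python's standard unicodedata module
(the helper name compat_decompose is only illustrative): the "fi"
ligature character has a compatibility decomposition into two letters,
whereas "æ" has none and so behaves as a letter in its own right.

```python
import unicodedata

FI_LIGATURE = "\ufb01"  # LATIN SMALL LIGATURE FI
AE_LETTER = "\u00e6"    # LATIN SMALL LETTER AE


def compat_decompose(text: str) -> str:
    """Apply Unicode compatibility decomposition (NFKD) to text."""
    return unicodedata.normalize("NFKD", text)


# The ligature dissolves into the plain letters "f" and "i" ...
print(compat_decompose(FI_LIGATURE) == "fi")
# ... while "æ" has no decomposition and is returned unchanged.
print(compat_decompose(AE_LETTER) == AE_LETTER)
```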

I disagree that ligatures result only from the glyph-choice procedure.
They may be aesthetic, but there are still rules about avoiding them: in
Turkish you *must* avoid the "fi" ligature, and in German you *must*
avoid it between word components. Unicode gives a solution to this, the
ZWNJ character, but IMHO this is the wrong way to deal with it: it is
not logical to introduce an extra character in order to avoid a
phenomenon at the glyph level.
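
To see what the ZWNJ approach does to the character stream, here is a
minimal Python sketch. The German compound "Kaufinteresse" (Kauf +
Interesse) is my own illustrative example, and the helper names are
hypothetical. The point is only that the glyph-level instruction ends up
stored as one more character of the text.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def block_ligature(left: str, right: str) -> str:
    """Join two word components with a ZWNJ between them."""
    return left + ZWNJ + right


def strip_zwnj(text: str) -> str:
    """Recover the plain character string by removing every ZWNJ."""
    return text.replace(ZWNJ, "")


word = block_ligature("Kauf", "interesse")

# The marked word is one character longer than the plain word,
# so string comparison and searching must strip the ZWNJ first.
print(len(word), len(strip_zwnj(word)))
print(word == "Kaufinteresse", strip_zwnj(word) == "Kaufinteresse")
```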

> 2.  It is probably better to continue to analyse glyph choice at the
> level of words (or, rather, sub-words if ligatures should be
> avoided at certain points, eg between parts of compound words).
>
> Thus each `word' has many different possible `visualisations': some
> of these variants depend on the fonts used, others on the
> typographic tradition in use, etc.  Some of these should only be used
> for `special cases' eg when a word is divided between two lines at a
> particular point.
>
> 3.  Thus it is, for example, the job of a `paragraph formatting'
> module to find enough feasible visualisations of the input text
> (character string) of that paragraph and then to choose a `good
> enough' visualisation (or maybe offer a choice to some other process,
> in a more complex system).
>
> This will require knowledge about feasible visualisations of
> (sub)words (including punctuation) and their metric (and perhaps
> other) properties and about the spaces between the words (plus a few
> other layout specs).

You are returning to an idea we were discussing (with John) when you
were in Brest: to act on words and to use a server of word
visualisations. This is not incompatible with textemes: the latter will
describe the structure of the former, and we can even imagine a link to
the "word visualisation" as a texteme property.
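
A rough sketch of that last idea, in Python. Everything here is an
assumption made for illustration: the Texteme class, its fields, the
"word_visualisation" key and the reference string are hypothetical and
are not taken from the paper or from any Omega code.

```python
from dataclasses import dataclass, field


@dataclass
class Texteme:
    """Hypothetical texteme: a character plus an open property list."""
    character: str
    properties: dict = field(default_factory=dict)


def word_to_textemes(word: str, visualisation_ref: str) -> list:
    """Describe a word as textemes, each linking to the word's
    visualisation through a (hypothetical) property."""
    return [
        Texteme(ch, {"word_visualisation": visualisation_ref})
        for ch in word
    ]


# "vis:fin" stands in for whatever identifier a server would issue.
textemes = word_to_textemes("fin", "vis:fin")
print([t.character for t in textemes])
print(textemes[0].properties["word_visualisation"])
```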

We had a similar idea about the Quran: a server providing authenticated
segments of Quranic text, every word of which would be a link to the
server. This can be brought down to the texteme level, and every Quranic
texteme in a document can be considered an instance of the "abstract
class" residing on the server. It is even compatible with religion:
every printed Quran is an instance of the abstract ONE, and this goes
down to every single letter.

> 3a. When using a font resource whose rendering engine must be accessed
> via `sequences of Unicode slot numbers' there will be an extra step
> needed in order to deliver the correct sequence to the font resource
> (and to ensure that the right settings (for use of ligatures etc etc)
> are used when the font resource interprets that sequence).  This seems
> to me to be a peculiarity of this week's technology, so I am not sure
> that this step should be part of a good general model of
> characters/glyphs/@@@emes.

You are referring to reordering as done by Uniscribe/ATSUI/Pango. This
should be feasible through an OTP, or maybe at a more global level so
that we can apply bidi across buffer boundaries. These are problems we
are still investigating.

> 5.  Some of this may be relevant to what the `objects' used by the
> model (and implementation) are called.  Another thing that may be
> relevant to choosing a name is that similar objects (ie with extensible
> property lists) and structures (eg `network graphs') will be needed at
> all levels: characters/glyphs, words, lines, paragraphs (in their many
> forms), columns, table entries, tables, pages, spreads, ...

... and the Universe can be seen in Jodie Foster's eye (as in the movie
Contact) :-)

Joking aside, I agree of course. But concerning Omega, let us not be
tempted (as others have been in previous years) to rebuild the World.
Our current experiments are as simple as possible, and they will become
more complex whenever there is a (concrete) need for it. Let us first
solve the issues of micro-typography; only when this is done will we (or
others) examine higher-level structures.

But of course I do understand that similarities between different levels
of structure are crucial for your taxonomy project. And I can very well
imagine that the Universe is contained in Jodie Foster's eye, she is so
beautiful!

> Note also that these do not form a nice tree-like hierarchy.

I don't think that you will find any tree-fundamentalist on this list :-)

> That's enough for now; maybe some more when I have read through the
> interesting details of the many messages.

I noticed you call our *** concept @@@emes; don't you like the word
"textemes"?

I had some discussions with linguists who disagreed with this choice,
but they were unable to suggest anything better.

I also thought of "micro-textemes", but *only* in case we want to
prevent confusion with the already existing notion of textemes as atomic
units of text in rhetoric.

cheers

>
>
> chris
>
>
>
--
+--------------------------------------------------------------------+
| Yannis Haralambous, Ph.D.      yannis.haralambous at enst-bretagne.fr |
| Directeur d'Études                   http://omega.enstb.org/yannis |
|                                          Tel. +33 (0)2.29.00.14.27 |
|                                          Fax  +33 (0)2.29.00.12.82 |
| Département Informatique                                           |
| École Nationale Supérieure des Télécommunications de Bretagne      |
| Technopôle de Brest Iroise, CS 83818, 29238 Brest CEDEX 3, France  |
+--------------------------------------------------------------------+
                          ...pour distinguer l'extérieur d'un aquarium,
                                         mieux vaut n'être pas poisson

                         ...the ball I threw while playing in the park
                                        has not yet reached the ground