[omega] Question about the paper published in EuroTeX 2005
Yannis Haralambous
yannis.haralambous at enst-bretagne.fr
Wed Mar 30 14:33:41 CEST 2005
On 29 March 2005, at 18:16, Chris Rowley wrote:
> I may have missed some of this `conversation' so I hope these ideas
> are still pertinent.
>
> 1. Some years ago, Yannis published a useful analysis of `ligatures'.
>
> There he introduced (I hope I have the right names):
>
> mandatory ligatures
>
> aesthetic ligatures.
>
> In terms of the current discussion I would say that the `mandatory'
> ones are those that apply to `characters', whereas the `aesthetic'
> ones depend on the font used and on the typesetting tradition; and so
> these latter can be viewed as applying to glyphs or (much the same)
> as being part of the glyph-choice procedure.
And there are still many intermediate levels between "mandatory" and
"aesthetic".
We all agree that "fi" is an aesthetic one (although there is a Unicode
character for it...) and that "æ" is a mandatory one (it is even
considered a "letter", and of course also a character). But what about
Arabic lam-alif? It is mandatory in every Arabic writing style, and it
is the only ligature introduced in all Arabic grammars. But still it is
not a "character".
I disagree that ligatures result only from the glyph-choice procedure.
They may be aesthetic but there are still rules about avoiding them: in
Turkish you *must* avoid the "fi" ligature, in German you *must* avoid
it between word components. Unicode gives a solution to this: the ZWNJ
character, but IMHO this is the wrong way to deal with it: it is not
logical to introduce an extra character in order to avoid a phenomenon
on the glyph level.
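
A minimal Python sketch of the character-level facts mentioned above,
using only the standard unicodedata module. The helper block_ligature
and the sample word "Kaufinteresse" (Kauf + Interesse) are assumptions
chosen for illustration; nothing here comes from Omega itself.

```python
import unicodedata

FI_LIGATURE = "\ufb01"  # LATIN SMALL LIGATURE FI: the "aesthetic" ligature has its own code point
AE = "\u00e6"           # LATIN SMALL LETTER AE: the "mandatory" one is a letter in its own right
ZWNJ = "\u200c"         # ZERO WIDTH NON-JOINER: Unicode's way of forbidding a ligature


def compat(text):
    """Apply compatibility normalisation (NFKC) to the text."""
    return unicodedata.normalize("NFKC", text)


def block_ligature(word, boundary):
    """Hypothetical helper: insert ZWNJ at a word-component boundary.

    The extra character lives in the character string even though the
    phenomenon it suppresses belongs to the glyph level.
    """
    return word[:boundary] + ZWNJ + word[boundary:]


# NFKC treats the fi ligature as a presentation variant of "f" + "i" ...
fi_normalised = compat(FI_LIGATURE)
# ... but leaves "æ" alone, since it has no decomposition.
ae_normalised = compat(AE)

# German compound: no "fi" ligature across the Kauf|interesse boundary.
marked = block_ligature("Kaufinteresse", 4)

print(repr(fi_normalised), repr(ae_normalised), repr(marked))
```

Note that the marked string is one character longer than the word it
represents, which is exactly the objection raised above: a glyph-level
instruction has been stored as a character.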
> 2. It is probably better to continue to analyse glyph choice at the
> level of words (or, rather, sub-words if ligatures should be
> avoided at certain points (eg between parts of compound words)).
>
> Thus each `word' has many different possible `visualisations': some
> of these variants depend on the fonts used, others on the
> typographic tradition in use, etc. Some of these should only be used
> for `special cases' eg when a word is divided between two lines at a
> particular point.
>
> 3. Thus it is, for example, the job of a `paragraph formatting'
> module to find enough feasible visualisations of the input text
> (character string) of that paragraph and then to choose a `good
> enough' visualisation (or maybe offer a choice to some other process,
> in a more complex system).
>
> This will require knowledge about feasible visualisations of
> (sub)words (including punctuation) and their metric (and perhaps
> other) properties and about the spaces between the words (plus a few
> other layout specs).
You are returning to an idea we were discussing (w/ John) when you were
in Brest: to act on words and to use a server of word visualisations.
This is not incompatible with textemes: the latter will describe the
structure of the former, and we can even imagine a link to the "word
visualisation" as a texteme property.
We had a similar idea about the Quran: a server providing authenticated
segments of Quranic text, every word of which would be a link to the
server. This can be brought down to the texteme level, and every Quranic
texteme in a document can be considered as an instance of the "abstract
class" residing on the server. It is even compatible with religion:
every printed Quran is an instance of the abstract ONE, and this goes
down to every single letter.
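
The idea of a texteme carrying a link back to a word on a server can be
sketched as follows. This is only an illustration under stated
assumptions: the field names (character, glyph, properties), the
"word_ref" property key and the in-memory WordServer class are all
hypothetical, and do not describe the actual Omega data structures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Texteme:
    """Hypothetical texteme: a character, a glyph and an open property list."""
    character: Optional[str]
    glyph: Optional[str] = None
    properties: Dict[str, str] = field(default_factory=dict)


class WordServer:
    """Hypothetical in-memory stand-in for a server of words."""

    def __init__(self) -> None:
        self._refs: Dict[str, str] = {}

    def register(self, word: str) -> str:
        """Return the reference for a word, creating one on first sight."""
        return self._refs.setdefault(word, f"word:{len(self._refs)}")

    def textemes_for(self, word: str) -> List[Texteme]:
        """Build one texteme per character, each linked to the word's reference."""
        ref = self.register(word)
        return [Texteme(character=c, properties={"word_ref": ref}) for c in word]


server = WordServer()
textemes = server.textemes_for("fin")
for t in textemes:
    print(t.character, t.properties["word_ref"])
```

Every texteme of the word points to the same reference, so each one is
an "instance" of the word held by the server, down to the single letter.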
> 3a. When using a font resource whose rendering engine must be accessed
> via `sequences of Unicode slot numbers' there will be an extra step
> needed in order to deliver the correct sequence to the font resource
> (and to ensure that the right settings (for use of ligatures etc etc)
> are used when the font resource interprets that sequence). This seems
> to me to be a peculiarity of this week's technology, so I am not sure
> that this step should be part of a good general model of
> characters/glyphs/@@@emes.
You are referring to re-ordering as done by Uniscribe/ATSUI/Pango. This
should be feasible through an OTP, or maybe at a more global level so
that we can apply bidi across buffer boundaries. These are problems we
are currently investigating.
> 5. Some of this may be relevant to what the `objects' used by the
> model (and implementation) are called. Another thing that may be
> relevant to choosing a name is that similar objects (ie with extensible
> property lists) and structures (eg `network graphs') will be needed at
> all levels: characters/glyphs, words, lines, paragraphs (in their many
> forms), columns, table entries, tables, pages, spreads, ...
... and the Universe can be seen in Jodie Foster's eye (as in the movie
Contact) :-)
Joking apart, I agree of course. But concerning omega, let us not be
tempted (as others have been in previous years) to rebuild the World.
Our current experiments are as simple as possible, and they will become
more complex whenever there is a (concrete) need for it. Let us first
solve the issues of micro-typography; only when this is done will we (or
others) examine higher-level structures.
But of course I do understand that similarities between different levels
of structure are crucial for your taxonomy project. And I can very well
imagine that the Universe is contained in Jodie Foster's eye, she is so
beautiful!
> Note also that these do not form a nice tree-like hierarchy.
I don't think that you will find any tree-fundamentalist on this list :-)
> That's enough for now; maybe some more when I have read through the
> interesting details of the many messages.
I noticed you call our *** concept @@@emes; don't you like the word
"textemes"?
I had some discussions with linguists who disagreed with this choice,
but they were unable to suggest anything better.
I also thought of "micro-textemes", but *only* in case we want to
prevent confusion with the already existing notion of textemes as atomic
units of text in rhetoric.
cheers
>
>
> chris
>
>
>
--
+--------------------------------------------------------------------+
| Yannis Haralambous, Ph.D. yannis.haralambous at enst-bretagne.fr |
| Directeur d'Études http://omega.enstb.org/yannis |
| Tel. +33 (0)2.29.00.14.27 |
| Fax +33 (0)2.29.00.12.82 |
| Département Informatique |
| École Nationale Supérieure des Télécommunications de Bretagne |
| Technopôle de Brest Iroise, CS 83818, 29238 Brest CEDEX 3, France |
+--------------------------------------------------------------------+
...pour distinguer l'extérieur d'un aquarium,
mieux vaut n'être pas poisson
...the ball I threw while playing in the park
has not yet reached the ground