[omega] Question about the paper published in EuroTeX 2005
Chris Rowley
C.A.Rowley at open.ac.uk
Wed Mar 30 15:08:47 CEST 2005
Yannis
> And there are still many intermediate levels between "mandatory" and
> "aesthetic".
Yes, indeed.
> We all agree that "fi" is an aesthetic one (although there is a Unicode
> character for it...)
> and that "æ" is a mandatory one (it is even considered as a "letter",
> and of course
> also as a character). But what about Arabic lam-alif? It is mandatory
> in every Arabic
> writing style, and it is the only ligature introduced in all Arabic
> grammars. But still it is not
> a "character".
>
Yes, I was not trying to claim that the analysis is simple. When you
get down to these types of considerations you start to ask `what is a
character?' (Or even, if you are me or Joachim: what is text?)
For the present I think it is good enough to say that a character is
almost (but with some exceptions in both directions) a `populated
Unicode slot', so that text is a sequence of such `characters'
together, importantly, with a property (which I call, for want of a
better word, its `language') that tells us how to give a meaning to
the sequence (`meaning', in this message, is undefined).
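
A minimal sketch of that working definition, for illustration only.
The names `is_populated_slot' and `Text' are hypothetical, and
`populated' is approximated here as `general category is not Cn' in
Python's Unicode database, which ignores the exceptions in both
directions mentioned above:

```python
import unicodedata
from dataclasses import dataclass


def is_populated_slot(ch: str) -> bool:
    """True if the code point is assigned in Python's Unicode database.

    Assumption: 'populated' is approximated as 'general category is
    not Cn' (unassigned); the exceptions in both directions are ignored.
    """
    return unicodedata.category(ch) != "Cn"


@dataclass(frozen=True)
class Text:
    """A sequence of characters together with its 'language' property."""
    chars: str
    language: str  # hypothetical label, e.g. "tr" or "de"

    def __post_init__(self):
        # Reject any code point that is not a populated slot.
        bad = [hex(ord(c)) for c in self.chars if not is_populated_slot(c)]
        if bad:
            raise ValueError(f"unpopulated slots: {bad}")
```

Under this sketch the same character sequence with two different
`language' values gives two different texts, which is the point of
carrying the property along with the sequence.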
> I disagree with ligatures resulting only from the glyph-choice
> procedure. They may be
> aesthetic but there are still rules about avoiding them: in Turkish you
> *must* avoid
> "fi" ligature, in German you *must* avoid it between word components.
I agree that another word is needed for this non-free choice, but
these can be viewed (for computing purposes) as rules that restrict
the glyph choice when typesetting a particular language. Benjamin B
will recognise this as part of a more general (and very long) LaTeX3
workshop where we tried to separate (not very successfully) `language
properties' from other `cultural, typographic and aesthetic'
conventions in typesetting.
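
One way to picture `rules that restrict the glyph choice' is sketched
below. This is only an illustration under stated assumptions: the
glyph names, the `FORBIDDEN' table and the `breaks' argument are all
hypothetical, and finding German word-component boundaries is left to
the caller.

```python
# Hypothetical glyph names; a real font resource would supply its own.
LIGATURES = {("f", "i"): "f_i", ("f", "l"): "f_l"}

# Assumption: per-language sets of ligatures that must not be formed.
FORBIDDEN = {"tr": {("f", "i")}}


def choose_glyphs(chars, language, breaks=frozenset()):
    """Map characters to glyph names, forming ligatures where allowed.

    `breaks` holds indices i such that no ligature may join chars[i-1]
    and chars[i] (e.g. a German word-component boundary supplied by
    the caller).
    """
    glyphs = []
    i = 0
    while i < len(chars):
        pair = tuple(chars[i:i + 2])
        allowed = (
            pair in LIGATURES
            and pair not in FORBIDDEN.get(language, set())
            and (i + 1) not in breaks
        )
        if allowed:
            glyphs.append(LIGATURES[pair])
            i += 2  # the ligature consumed two characters
        else:
            glyphs.append(chars[i])
            i += 1
    return glyphs
```

For example, `choose_glyphs("fin", "tr")` keeps `f' and `i' apart,
while `choose_glyphs("Auflage", "de", breaks={3})` suppresses the
`fl' ligature across the boundary between `Auf' and `lage'. No extra
character enters the text in either case.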
> Unicode
> gives a solution to this: the ZWNJ character, but IMHO this is the
> wrong way to deal
> with this: it is not logical to introduce an extra character for
> avoiding a phenomenon
> on the glyph level.
That bit of Unicode is a collection of such ad hoc kludges!
Recent versions of Unicode also support the concept of a `language
label' for text (not sure if that is the word they used: it is what
Frank and I called it when we explained its importance to them).
> You are returning to an idea we were discussing (w/ John) when you were
> in Brest: to act on words and to use a server of word visualisations.
> This is not
> incompatible with textemes: the latter will describe the structure of
> the former,
> and we can even imagine a link to the "word visualisation" as texteme
> property.
>
Agreed, but if the software model has this idea of `word' that may
affect how much and what information you put into each texteme.
> We had a similar idea about the Quran: a server providing authenticated
> segments
> of Quranic text, every word of which would be a link to the server.
> This can be
> brought down to the texteme level and every Quranic texteme in a
> document
> can be considered as the instance of the "abstract class" being on the
> server. It is
> even compatible with religion: every printed Quran is an instance of
> the abstract ONE,
> and this goes down to every single letter.
>
`In the beginning was The Word ...' (that is a kind of anti-pun:-).
> > 3a. When using a font resource whose rendering engine must be accessed
> > via `sequences of Unicode slot numbers' there will be an extra step
> > needed in order to deliver the correct sequence to the font resource
> > (and to ensure that the right settings (for use of ligatures etc etc)
> > are used when the font resource interprets that sequence). This seems
> > to me to be a peculiarity of this week's technology, so I am not sure
> > that this step should be part of a good general model of
> > characters/glyphs/@@@emes.
>
> You refer to re-ordering as done by Uniscribe/ATSUI/Pango. This should
> be
> feasible through an OTP, or maybe at a more global level so that we can
> apply bidi
> across buffer boundaries. These are problems we are working on.
>
More to the general idea that, as I understand it, one is not meant to
access glyphs in an OpenType-style of font resource
> > 5. Some of this may be relevant to what the `objects' used by the
> > model (and implementation) are called. Another thing that may be
> > relevant to choosing a name is that similar objects (ie with extensible
> > property lists) and structures (eg `network graphs') will be needed at
> > all levels: characters/glyphs, words, lines, paragraphs (in their many
> > forms), columns, table entries, tables, pages, spreads, ...
>
> ... and the Universe can be seen in Jodie Foster's eye (as in the
> movie Contact) :-)
>
> Joking apart, I agree of course. But concerning omega, let us not be
> tempted (as others
> in previous years have been tempted) to rebuild the World. Our current
> experimentations
> are as simple as possible, and they will become more complex whenever
> there is
> a (concrete) need for it. Let us first solve the issues of
> micro-typography; only when this
> is done will we (or others) examine higher-level structures.
>
Again, that is fine but therefore do not spend too much time on
finding a word for a particular class when it may eventually turn out to be
simply a specialisation of some much bigger and more useful class
and, in that wider context, its `true name' may become obvious.
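
To make the `specialisation of some much bigger class' point concrete,
here is a rough sketch. The names `Node' and `Atom' are placeholders
invented for this illustration, not proposals for the `true name', and
a plain tree stands in for the more general `network graphs':

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """Generic object: a kind, an extensible property list, children."""
    kind: str
    props: Dict[str, Any] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)

    def walk(self):
        """Yield this node, then all descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


class Atom(Node):
    """Lowest-level specialisation: created with properties only."""

    def __init__(self, **props):
        super().__init__(kind="atom", props=props)


# The same class serves at the word level and at the atomic level.
word = Node("word", {"language": "de"}, [Atom(char="f"), Atom(char="i")])
kinds = [n.kind for n in word.walk()]
```

Here the atomic objects need no machinery of their own beyond what
words, lines and paragraphs would also need.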
>
> I noticed you call our *** concept @@@emes, don't you like the word
> "textemes"?
>
No, I think @s are friendly characters...so I did not use ***emes:-)
I was merely trying to avoid that discussion, although the ending
`eme' as in `meme' is one I am quite fond of.
But if you want my analysis: the problem with `texteme' is that the `eme'
ending does not convey the atomicity that I think is an important
property of these objects; to me it sounds more like a bit of text
that has some wholeness (like a word or a phrase).
But here I am contradicting my injunction to not spend time on
getting the `true name' since that will become apparent when we really
understand the idea.
So I shall stop.
chris