[XeTeX] Σχετ: Re: Σχετ: Re: Assignment of codes (particularly \catcode) based on Unicode data

Julian Bradfield jcb+xetex at jcbradfield.org
Thu May 7 11:57:12 CEST 2015


On 2015-05-07, Apostolos Syropoulos <asyropoulos at yahoo.com> wrote:
> Well I do not know what Dendrinos says I just happen to know what people do in
> typography and in everyday practice. 

Which is not what the Unicode uppercase mapping is for. The uppercase
mapping in the data file gives a default mapping, which is appropriate
in the absence of any language-specific behaviour (although some
special cases of Greek are built in, particularly the behaviour of
iota subscript and adscript).
Case-conversion algorithms may use additional data appropriate to the
language and environment.
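
To make the "default mapping" concrete, here is a small sketch. It
assumes Python, whose str.upper() applies the language-independent
Unicode case mappings; it is only a stand-in for illustration and is
not how XeTeX assigns its codes. The variable names are mine.

```python
# Default (language-independent) Unicode uppercasing, via Python's str.upper().
accented_alpha = "\u03ac"             # GREEK SMALL LETTER ALPHA WITH TONOS
alpha_with_iota_subscript = "\u1fb3"  # GREEK SMALL LETTER ALPHA WITH YPOGEGRAMMENI

# The default mapping keeps the accent on the capital letter.
upper_accented = accented_alpha.upper()

# The built-in special case: the iota subscript becomes a full capital iota.
upper_iota = alpha_with_iota_subscript.upper()

print([f"U+{ord(c):04X}" for c in upper_accented])  # ['U+0386']
print([f"U+{ord(c):04X}" for c in upper_iota])      # ['U+0391', 'U+0399']
```

Note that nothing here drops the accent: that is a typographic
convention layered on top, not part of the default mapping.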

Many languages have conventions that diacritics may be omitted when
writing in all capitals.  For example, in French, it is quite common
to omit accents, or to omit accents unless confusion would occur.

>> The only mark that remains when making all capitals is the dieresis
>> (dialytika). All others vanish. This is common knowledge for people who

This is a different matter again. Conventions for all-capital writing
may well be different from conventions for casing in mixed-case text,
and in many languages diacritics are freely omitted in all-capital
text -- Unicode specifically observes that accents are often omitted
from Modern Greek all-capital text. This is something that needs to be
handled by the functions that do the conversion; it's not something to
be done by the basic uppercase mapping.
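
As a rough illustration of such a conversion function, the sketch
below (again Python, and again only an assumption about how one might
do it) applies the default uppercase mapping first and then removes
every combining mark except the dialytika. The name greek_all_caps is
hypothetical, and the rule is deliberately simplified: real Greek
typography has further conventions that this does not attempt to
cover.

```python
import unicodedata

COMBINING_DIAERESIS = "\u0308"  # the dialytika, once decomposed


def greek_all_caps(text: str) -> str:
    """Hypothetical helper: default uppercasing, then strip accents and
    breathings, keeping only the dialytika.

    Simplification: this strips combining marks from *all* scripts, so it
    should only be fed text known to be Greek.
    """
    # Step 1: the basic, language-independent uppercase mapping.
    decomposed = unicodedata.normalize("NFD", text.upper())
    # Step 2: the language-specific convention, applied afterwards.
    kept = [
        ch
        for ch in decomposed
        if unicodedata.combining(ch) == 0 or ch == COMBINING_DIAERESIS
    ]
    return unicodedata.normalize("NFC", "".join(kept))


print(greek_all_caps("Αθήνα"))  # ΑΘΗΝΑ
print(greek_all_caps("μαΐου"))  # ΜΑΪΟΥ
```

The point of the two-step structure is that the basic mapping stays
untouched, and the all-capitals convention lives in the function that
calls it.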

As it happens, the breathings and accents of polytonic Greek were
introduced into the script *before* it developed an upper/lower-case
distinction.

