[pdftex] OT: Unicode and typesetting

James Cloos cloos at jhcloos.com
Fri Apr 8 16:28:39 CEST 2005

|> But, even for conversions, why could not the Angstrom symbol in one
|> old encoding map to the same codepoint as the A-with-circle-above
|> does from another encoding?

At least one legacy encoding had both.

|> And, if they cannot, why cannot one of
|> the points be a 'symbolic link' to the other (i.e. "you can render
|> either code point with any (sensible) glyph you like, but you must
|> render the two codepoints with exactly the same glyph')?

The compatibility decompositions try to deal with that.
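
As a rough illustration, here is a minimal Python sketch using the
standard unicodedata module.  The helper name `equivalent` is made up
for this example.  Note that in the character data shipped with
Python, this particular pair (U+212B and U+00C5) is tied together by a
canonical singleton decomposition rather than one tagged as
compatibility, so every normalization form folds the two together;
other legacy duplicates get compatibility mappings instead.

```python
import unicodedata

ANGSTROM_SIGN = "\u212b"   # U+212B ANGSTROM SIGN
A_WITH_RING = "\u00c5"     # U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE


def equivalent(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after applying the same normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


if __name__ == "__main__":
    # Show each code point, its name, and its recorded decomposition.
    for ch in (ANGSTROM_SIGN, A_WITH_RING):
        print(f"U+{ord(ch):04X}", unicodedata.name(ch),
              "decomposition:", unicodedata.decomposition(ch))
    # The raw code points differ; the normalized strings do not.
    print("raw equal:      ", ANGSTROM_SIGN == A_WITH_RING)
    print("equal after NFC:", equivalent(ANGSTROM_SIGN, A_WITH_RING))
```

So the two code points stay distinct in storage, but software that
normalizes before comparing treats them as the same text, which is
about as close to a 'symbolic link' as the standard gets.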

|> (One could argue differently I suppose for conversion _from_
|> Unicode, but that is such a can of worms.)

Conversions have to work bidirectionally.  Consider the case of using
an editor that uses Unicode internally and must be able to deal
correctly with any legacy data the user needs to edit.  In almost all
cases the user needs to save the data in the same encoding it was
already in so that whatever else uses it can still do so.
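
A small sketch of that round-trip requirement, in Python.  The helper
`survives_round_trip` is hypothetical, and Python's shift_jis codec is
only standing in here for "some legacy encoding"; it happens to have
the Angstrom sign but, as far as I know, no A-with-ring.  The sketch
checks whether legacy bytes come back unchanged after a trip through
Unicode, with and without normalizing along the way.

```python
import unicodedata
from typing import Optional


def survives_round_trip(raw: bytes, encoding: str,
                        normalize: Optional[str] = None) -> bool:
    """Decode legacy bytes, optionally normalize, and re-encode.

    Returns True only if the re-encoded bytes match the input exactly.
    """
    text = raw.decode(encoding)
    if normalize is not None:
        text = unicodedata.normalize(normalize, text)
    try:
        return text.encode(encoding) == raw
    except UnicodeEncodeError:
        # The normalized text has no representation in the legacy set.
        return False


# Legacy bytes for the Angstrom sign, produced by the codec itself
# (assumption: the codec maps U+212B, as Python's shift_jis does).
LEGACY = "\u212b".encode("shift_jis")

if __name__ == "__main__":
    print("untouched: ", survives_round_trip(LEGACY, "shift_jis"))
    print("NFC first: ", survives_round_trip(LEGACY, "shift_jis", "NFC"))
```

If the editor folded U+212B into U+00C5 on load, the save step would
have nothing to map back to in that encoding.  Keeping the separate
code point is what lets the file go back out byte for byte.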

|> I suppose my (thus far) unspoken problem is that I cannot understand
|> the logic. If I am wrong I would like to understand. What I fear,
|> though, is that Unicode's foundations are not as firm as they
|> ought to be .... which will inevitably mean more changes in the
|> years to come ....

It is a compromise of unification vs transparent conversions to/from
legacy encodings.

The Unicode book should do a better job of explaining their
motivations than I.  If you cannot find a hard copy, the PDFs can be
grabbed from ftp.unicode.org.

|> P.P.S. Would have replied 'off list' but your mail server is
|> apparently broken: <cloos at jhcloos.com>: host
|> jfk.uu.jhcloos.net[] said: 554 Service
|> unavailable; Client host [blocked using sbl-xbl.spamhaus.org;
|> http://www.spamhaus.org/query/bl?ip=

Spamhaus is usually pretty good at only blocking IPs that are actually
sending spam, viruses, etc.  It has been stopping about a thousand such
each day for me, according to the logs.  Spamhaus shows via that URL
that it got that IP from the CBL, which claims to only list IPs that
have sent actual spam to spam traps.  You can have that IP removed by
following their automated removal link: 


I hate the need to check an RBL server and only started to use Spamhaus
because of several reviews all claiming an extremely low false-positive
rate, primarily by listing only sources of actual spam or viruses, etc.
sent to spamtraps and honeypots.


