[OS X TeX] TeXShop, perl and utf-8 encoding problems

Herb Schulz herbs at wideopenwest.com
Sun Feb 20 18:28:36 CET 2005


On 2/20/05 11:07 AM, "Stephen Moye" <stephenmoye at cox.net> wrote:

>>> Is it too much to ask of a computer program to have it say: "Ah, this
>>> fellow is using MacOSEncoding and has typed a ç; I just happen to
>>> know that that is Unicode 00E7, and that's what I'm going to give
>>> him."
>> 
>> Yes, that is too much to ask, as text without encoding happens to be
>> "just another bag of bits", and those bits are interpreted as whatever
>> TeXShop happens to think they mean. When the preferences are set
>> incorrectly, you simply get what you asked for. To quote a famous
>> computer scientist: "Computers are good at following instructions, but
>> not at reading your mind". (The TeXbook, page 9).
> 
> Does that mean that text using MacOSEncoding is essentially "text
> without encoding"?
> 
> Stephen Moye

Howdy,

Let me preface this by saying that I'm no expert on encodings. But here
goes...

My impression is that there are two ways to get text into TeXShop: typing
at the keyboard and reading a file. I believe that the display is UTF-8 and
that keyboard input is translated to UTF-8.

When TeXShop reads a file it has to make some assumption about the text
encoding in that file; unless told otherwise (via a ``%&encoding='' line
near the beginning of the file) it uses its default as set in its
preferences and translates that to the UTF-8 display.
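To make the "assumption" concrete, here is a small sketch in Python (my
choice for illustration only; it has nothing to do with how TeXShop is
actually written). It assumes Python's ``mac_roman'' codec is a fair
stand-in for MacOSEncoding. It shows the same ç stored under two encodings
and what happens when UTF-8 bytes are read under the wrong assumption:

```python
# The same character, ç, as stored on disk under two encodings.
text = "ç"
mac_bytes = text.encode("mac_roman")   # one byte in Mac OS Roman
utf8_bytes = text.encode("utf-8")      # two bytes in UTF-8

# Read back with the matching assumption, both recover the character.
assert mac_bytes.decode("mac_roman") == text
assert utf8_bytes.decode("utf-8") == text

# Read back with the wrong assumption, the bytes are still "valid",
# they just mean something else: UTF-8 bytes read as Mac OS Roman.
misread = utf8_bytes.decode("mac_roman")
print(repr(mac_bytes), repr(utf8_bytes), misread)

# The other direction fails outright: a lone Mac OS Roman byte for ç
# is not a legal UTF-8 sequence.
try:
    mac_bytes.decode("utf-8")
    strict_utf8_ok = True
except UnicodeDecodeError:
    strict_utf8_ok = False
```

Nothing in the bytes themselves says which reading is right, which is why
the program has to be told, either by its preferences or by a line in the
file.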

When TeXShop writes a file it will, again, use its default setting unless
told not to by the ``%&encoding='' line. If you do a ``Save As'' you can
choose the encoding manually, but that choice is not recorded anywhere, so
it is lost when the file is read back in. Maybe TeXShop could save that
information in the resource fork of the file, but then the text file would
no longer be a generic text file and could lose that information at any
time, since the UNIX side of things knows nothing about resource forks.

I don't believe MacOSEncoding is essentially ``text without encoding'', but
it may be the default interpretation you have set in TeXShop for reading
files. Actually, there is little difference between the encodings if you
stick to the ASCII characters usually used with TeX/LaTeX and do the accents
via the commands. I know that there is a problem with that in Plain XeTeX,
but I have no solution other than changing TeXShop's default input encoding
if the file is really UTF-8 encoded, or using the ``Open'' menu item to open
the file.
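On the point about sticking to ASCII: a quick check, again in Python and
again only as an illustration. The sample line and the list of encodings
are my own picks. Source that uses only ASCII characters and TeX accent
commands comes out as the same bytes whichever of these encodings is used
to save it, so a wrong guess on reading does no harm:

```python
# A line of TeX source that uses accent commands instead of typed accents.
source = r'Fran\c{c}ais, na\"{\i}ve'

# Encode it under several encodings (names are Python codec names).
encodings = ["ascii", "mac_roman", "latin_1", "utf-8"]
encoded = {name: source.encode(name) for name in encodings}

# All of them agree byte for byte, because every character is plain ASCII.
all_same = len(set(encoded.values())) == 1
print(all_same)
```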

Good Luck,

Herb Schulz
(herbs at wideopenwest.com)

--------------------- Info ---------------------
Mac-TeX Website: http://www.esm.psu.edu/mac-tex/
           & FAQ: http://latex.yauh.de/faq/
TeX FAQ: http://www.tex.ac.uk/faq
List Post: <mailto:MacOSX-TeX at email.esm.psu.edu>