[OS X TeX] Typesetting info for TeXShop 2.01 documents

Peter Dyballa Peter_Dyballa at Web.DE
Thu May 5 00:16:45 CEST 2005

On 04.05.2005 at 14:58, Herb Schulz wrote:

> If I set the default input format to UTF-8 in TeXShop but only use the
> standard ASCII set and standard LaTeX commands for accented characters
> as pdflatex input, will pdflatex be able to compile it; i.e., is the
> plain set the same in Mac and UTF-8 encoding? What about using your
> French keyboard and entering the extended set of characters; does it
> need an input encoding?

Hello Herb!

Although Terminal is set up to use (and is capable of using) the whole
UTF-8 character set, some applications are a bit restricted: vi, or
better, vim, for example. I know which files have some characters from
the range 128-255 (or, more exactly, from the range 160-255, because in
the ISO Latin encodings the range 128-159 holds the 8-bit control
codes, analogous to those in the 7-bit range 0-31; only Mac Roman has
no gap and only 7-bit controls). In vim, 8-bit and UTF-8 characters are
displayed as ?, and in pico too. Both are clever enough not to
re-encode the file when saving it. If you're stubborn enough to accept
that ? can mean á, â, à, or anything else, you just make your changes
in the 7-bit ASCII range and save them. That's all!
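
As a rough illustration of why such an ASCII-only edit is harmless, here
is a minimal Python sketch. The helper names (show_like_7bit_editor,
edit_ascii_only) and the sample text are made up for this example; the
sketch only assumes that the editor writes back untouched bytes
unchanged, as described above.

```python
def show_like_7bit_editor(raw: bytes) -> str:
    """Render every byte above 127 as '?', roughly what a 7-bit display shows."""
    return "".join(chr(b) if b < 128 else "?" for b in raw)


def edit_ascii_only(raw: bytes, old: str, new: str) -> bytes:
    """Replace an ASCII-only string at the byte level, leaving other bytes alone."""
    # .encode("ascii") raises UnicodeEncodeError if old/new contain non-ASCII text,
    # so this helper can never introduce or reinterpret an 8-bit byte.
    return raw.replace(old.encode("ascii"), new.encode("ascii"))


# The same (hypothetical) sentence stored in two different encodings.
latin1_doc = "café in Latex".encode("latin-1")
utf8_doc = "café in Latex".encode("utf-8")

print(show_like_7bit_editor(latin1_doc))
print(show_like_7bit_editor(utf8_doc))

# Fix the ASCII part only; the bytes for é are carried through untouched.
fixed_latin1 = edit_ascii_only(latin1_doc, "Latex", "LaTeX")
fixed_utf8 = edit_ascii_only(utf8_doc, "Latex", "LaTeX")

print(fixed_latin1.decode("latin-1"))
print(fixed_utf8.decode("utf-8"))
```

The replacement never needs to know the file's encoding, because in all
the encodings discussed here the bytes 0-127 mean the same thing and
the accented characters only use bytes above 127.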

It makes no difference which keyboard layout you're using or which
method you use to type an 'extended' character (you could use the
Character Palette too!); they all get translated by internal mechanisms
to show up as the same glyph. What is shown is just a presentation
layer. Another kind of presentation layer is the way these glyphs are
saved as bytes in a file: as one, two, three, or more bytes each ...
And only the first 128 code positions, the 7 bits behind the seven
mountains, are the same in the ISO Latin, Mac Roman, and MS Windows
encodings; the DOS Code Pages could replace the characters #, $, @, [,
], \, {, }, |, ^, _, `, or ~ with 'national' ones: German umlauts,
French accents, the British £. While ISO Latin, HP Roman-8, and Unicode
mirror the 7-bit control characters of the code position range 0-31
into the 8-bit extension, i.e. the code positions 128-159 are (again)
control characters, Mac Roman and MS Windows assign printable
characters to some or all of these positions. Compared to these ASCII
encodings, chaos is a well-structured and computable design.
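
To make this concrete, here is a small Python sketch that compares a
few encodings. It relies on Python's built-in codecs "latin-1",
"mac_roman", "cp1252", and "utf-8" as stand-ins for ISO Latin 1, Mac
Roman, MS Windows, and Unicode; the helper names classify and
summarize, and the sample strings, are invented for this illustration.

```python
import unicodedata

ENCODINGS = ("latin-1", "mac_roman", "cp1252", "utf-8")

# Pure ASCII input with a LaTeX accent command: identical bytes everywhere.
ascii_text = r"caf\'e"
ascii_bytes = {enc: ascii_text.encode(enc) for enc in ENCODINGS}

# The same glyph typed directly: different bytes (and byte counts) per encoding.
accented = "á"
accented_bytes = {enc: accented.encode(enc) for enc in ENCODINGS}


def classify(byte_value, encoding):
    """Return 'control', 'printable', or 'undefined' for a single byte."""
    try:
        ch = bytes([byte_value]).decode(encoding)
    except UnicodeDecodeError:
        return "undefined"
    # Unicode general category "Cc" marks control characters.
    return "control" if unicodedata.category(ch) == "Cc" else "printable"


def summarize(encoding, start=128, stop=160):
    """Count how the byte values start..stop-1 are used in a single-byte encoding."""
    counts = {"control": 0, "printable": 0, "undefined": 0}
    for b in range(start, stop):
        counts[classify(b, encoding)] += 1
    return counts


for enc in ENCODINGS:
    print(enc, ascii_bytes[enc], accented_bytes[enc])

# Only the single-byte encodings are summarized; in UTF-8 a lone byte
# above 127 is not a complete character.
for enc in ("latin-1", "mac_roman", "cp1252"):
    print(enc, summarize(enc))
```

As far as pdflatex input goes, the first half is the relevant part: a
file that contains only 7-bit ASCII and commands like \'e consists of
the same bytes whether it is saved as Mac Roman, ISO Latin 1, or UTF-8.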



Build a man a fire and he'll be warm for a night, but set a man on fire 
and he'll be warm for the rest of his life.
