[XeTeX] in XeTeX
Keith J. Schultz
keithjschultz at web.de
Mon Nov 14 15:17:33 CET 2011
You are absolutely right in your assessment. True plain text files are, or were, traditionally 7-bit.
Though, I have to tell you that nowadays even 8-bit files are considered plain text.
The verdict is still out on how far Unicode text files are plain text files, as Unicode is, well, Unicode, and
its encoding goes a little further than what can be considered "plain text". Yet, if you consider Unicode just as
an encoding of text, then Unicode is plain text.
On the other hand, in computer science there are, as you said, only bits and bytes. It is how we interpret them that
makes them text, or plain text, or binary code.
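To make that concrete, here is a minimal sketch in Python. The sample bytes and the choice of encodings are my own illustrative assumptions, not anything from this thread. It shows that a 7-bit file reads the same under every common encoding, while an 8-bit file changes meaning with the encoding you assume.

```python
# The same bytes, read in different ways. The sample data and the encodings
# chosen here are illustrative assumptions only.

def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte fits in 7 bits, i.e. the data is pure ASCII."""
    return all(b < 0x80 for b in data)

ascii_only = b"plain text"      # every byte is below 0x80
extended = b"caf\xc3\xa9"       # contains two bytes above 0x7F

# A 7-bit file decodes to the same characters whichever encoding we assume.
ascii_readings = {enc: ascii_only.decode(enc) for enc in ("ascii", "latin-1", "utf-8")}

# An 8-bit file needs the extra piece of information: which encoding?
extended_readings = {
    "utf-8": extended.decode("utf-8"),      # two bytes become one character
    "latin-1": extended.decode("latin-1"),  # each byte becomes its own character
}

if __name__ == "__main__":
    print(is_seven_bit(ascii_only), ascii_readings)
    print(is_seven_bit(extended), extended_readings)
```

The bytes never change between the two readings of `extended`; only the interpretation does.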
On 14.11.2011 at 14:48, Herbert Schulz wrote:
> Gosh, I hate to get into the middle of this but here's my interpretation of what a plain text file is and why.
> All files are, in fact, just a series of bytes (or even bits), and how these bytes are to be interpreted determines whether the file is a plain text file or not. Traditional TeX used the 7-bit ASCII set of bytes. Most extensions of that set have those same byte values representing the same characters, so 7-bit ASCII is usually a subset of those extensions (also known as encodings). A plain text file uses only the common 7-bit ASCII byte set, and virtually any application that can read that file interprets the meanings of the bytes correctly. The moment you use an extension of that 7-bit ASCII set, an additional piece of information must be given: which encoding is being used. (There are some heuristics for determining this on the fly, but none are 100% accurate.) Because that extra information must be given before an application can display the meaning of the file (i.e., replace the bytes by the characters), I don't consider those files as being plain text. Maybe text, because the interpretation of the bytes is characters of some sort, but not plain text.
> Notice that how those characters are interpreted by other applications has nothing to do with whether the file is plain text or other text. A text editor interprets the bytes simply as characters and displays them in some way, while pdflatex interprets byte strings as combinations of commands and text; same file, different interpretations.
> This is as far as I'm going in this since I really want to stay out of the argument. It's just my 0.0001 cents.
> Good Luck,
> Herb Schulz
> (herbs at wideopenwest dot com)