[latexrefman-commits] [SCM] latexrefman updated: r914 - trunk

karl at gnu.org.ua karl at gnu.org.ua
Wed May 26 21:38:00 CEST 2021


Author: karl
Date: 2021-05-26 19:38:00 +0000 (Wed, 26 May 2021)
New Revision: 914

Modified:
   trunk/ChangeLog
   trunk/aspell.en.pws
   trunk/latex2e.texi
Log:
utf8 discussion

Modified: trunk/ChangeLog
===================================================================
--- trunk/ChangeLog	2021-05-26 19:21:00 UTC (rev 913)
+++ trunk/ChangeLog	2021-05-26 19:38:00 UTC (rev 914)
@@ -1,5 +1,11 @@
+2021-05-26  Karl Berry  <karl at freefriends.org>
+
+	* latex2e.texi (inputenc package): tweak utf-8 discussion.
+
 2021-05-15  Karl Berry  <karl at freefriends.org>
 
+	* Makefile: website process doc
+
 	* latex2e.texi (About this document): urls were confused, with no
 	no link to the actual dev page with all the output formats.
 	Report from Paul A Norman, 15 May 2021 18:27:10.

Modified: trunk/aspell.en.pws
===================================================================
--- trunk/aspell.en.pws	2021-05-26 19:21:00 UTC (rev 913)
+++ trunk/aspell.en.pws	2021-05-26 19:38:00 UTC (rev 914)
@@ -268,3 +268,5 @@
 shellesc
 CLI
 adjustbox
+graphpap
+xr

Modified: trunk/latex2e.texi
===================================================================
--- trunk/latex2e.texi	2021-05-26 19:21:00 UTC (rev 913)
+++ trunk/latex2e.texi	2021-05-26 19:38:00 UTC (rev 914)
@@ -1480,7 +1480,7 @@
 
 
 @node fontenc package
-@section @file{fontenc} package
+@section @code{fontenc} package
 
 @cindex Font encoding
 @cindex UTF-8, font support for
@@ -4315,7 +4315,7 @@
 
 
 @node xr package
-@section @code{xr} Package
+@section @code{xr} package
 
 @findex @code{xr} package
 @findex @code{xr-hyper} package
@@ -6320,7 +6320,7 @@
 as an argument, as with @code{\put(1,2)@{...@}}, it is not enclosed in
 braces since the parentheses serve to delimit the argument.  Also,
 unlike in some computer graphics systems, larger y-coordinates are
-further up the page, ie.@: @math{y = 1} is @emph{above} @math{y = 0}.
+further up the page, for example, @math{y = 1} is @emph{above} @math{y = 0}.
 
 There are four ways to put things in a picture: @code{\put},
 @code{\multiput}, @code{\qbezier}, and @code{\graphpaper}.  The most
@@ -17262,7 +17262,7 @@
 * Text symbols::                Inserting other non-letter symbols in text.
 * Accents::                     Inserting accents.
 * Additional Latin letters::    Inserting other non-English characters.
-* Inputenc package::            Set the input file text encoding.
+* inputenc package::            Set the input file text encoding.
 * \rule::                       Inserting lines and rectangles.
 * \today::                      Inserting today's date.
 @end menu
@@ -18038,73 +18038,68 @@
 @end table
 
 
-@node Inputenc package
-@section Inputenc package
+@node inputenc package
+@section @code{inputenc} package
 
 @findex inputenc
 
-Synopsis, one of:
+Synopsis:
 
 @example
-\usepackage@{inputenc@}
+\usepackage[@var{encoding-name}]@{inputenc@}
 @end example
 
-or
+Declare the input file's text encoding. The default, if this package
+is not loaded, is UTF-8.
 
-@example
-\usepackage[@var{encoding-name}]@{inputenc@}
-@end example
+@cindex encoding, of input files
+@cindex character encoding
+@cindex Unicode
+In a computer file, the characters are stored according to a scheme
+called the @dfn{encoding}.  There are many different encodings.  The
+simplest is ASCII, which supports 95 printable characters, not enough
+for most of the world's languages. For instance, to typeset the
+a-umlaut character @"{a} in an ASCII-encoded @LaTeX{} source file, the
+sequence @code{\"a} is used. This would make source files for anything
+but English hard to read; even for English, often a more extensive
+encoding is more convenient.
 
-Declare the input file's text encoding.
+The modern encoding standard, in some ways a union of the others, is
+UTF-8, one of the representations of Unicode. This is the default for
+@LaTeX{} since 2018. 
 
-In a computer file, the characters are stored as binary according to
-some scheme, called the encoding.  There are many different encodings.
-The simplest is ASCII, but it does not accomodate many characters. For
-instance, to get the a-umlaut character @"{a} in an ASCII-encoded text
-file a user must enter @code{\"a}, which makes the file hard to read and
-also means that @TeX{} won't hyphenation the word containing that
-character.  Often a more inclusive encoding is more convenient.  The
-modern standard, in some ways a union of the others, is UTF-8.
+The @code{inputenc} package is how @LaTeX{} knows what encoding is
+used.  For instance, the following command explicitly says that the
+input file is UTF-8 (note the lack of a dash).
 
-In short, to enter material a user sets their file editor to use an
-encoding scheme and this package is how @LaTeX{} knows what encoding
-they used.  For instance, the following command says that the input file
-is UTF-8 (note the lack of a dash).
-
 @example
 \usepackage[utf8]@{inputenc@}
 @end example
 
-Caution: use this package only with the pdf@TeX{} engine (@pxref{@TeX{}
-engines}).  The Xe@TeX{} and Lua@TeX{} engines assume that the input
-file is UTF-8 encoded.  If you invoke @LaTeX{} with either the
-@command{xelatex} command or the @command{lualatex} command and use the
-above example line, then you will be warned @code{inputenc package
-ignored with utf8 based engines}.  And, if you instead declare a
-non-UTF-8 encoding such as @code{latin1} then you will get the error
-@code{inputenc is not designed for xetex or luatex}.
+Caution: use @code{inputenc} only with the pdf@TeX{} engine
+(@pxref{@TeX{} engines}).  (The Xe@TeX{} and Lua@TeX{} engines assume
+that the input file is UTF-8 encoded.)  If you invoke @LaTeX{} with
+either the @command{xelatex} command or the @command{lualatex}
+command, and try to declare a non-UTF-8 encoding with @code{inputenc},
+such as @code{latin1}, then you will get the error @code{inputenc is
+not designed for xetex or luatex}.
 
-In addition, @LaTeX{} releases since 2018 default to an equivalent of
-the above command.  So documents started after that time typically will
-not explicitly include this package.
-
-An @code{inputenc} package error like @code{Invalid UTF-8 byte "96}
+An @code{inputenc} package error such as @code{Invalid UTF-8 byte "96}
 means that some of the material in the input file does not follow the
 encoding scheme.  Often these errors come from copying material from a
-document that uses a different encoding than the input file; this one is
-a left single quote from a web page that uses @code{latin1} inside a
-@LaTeX{} input file that uses UTF-8.  The solution is to convert the
-character to the @LaTeX{} equivalent, in this case a left single quote,
-or to erase it and enter the character using the input file's encoding
-(consult your editor's documentation).
+document that uses a different encoding than the input file; this one
+is a left single quote from a web page using @code{latin1} inside a
+@LaTeX{} input file that uses UTF-8.  The simplest solution is to
+replace the non-UTF-8 character with its UTF-8 equivalent, or use a
+@LaTeX{} equivalent command or character.
 
-In some documents, such as in a collection of journal articles from a
+In some documents, such as a collection of journal articles from a
 variety of authors, changing the encoding in mid-document may be
 necessary.  Use the command
 @code{\inputencoding@{@var{encoding-name}@}}.  The most common values
-for @var{encoding-name} are: @code{ascii}, @code{latin1}, @code{latin2},
-@code{latin3}, @code{latin4}, @code{latin5}, @code{latin9},
-@code{latin10}, and@tie{}@code{utf8}.
+for @var{encoding-name} are: @code{ascii}, @code{latin1},
+@code{latin2}, @code{latin3}, @code{latin4}, @code{latin5},
+@code{latin9}, @code{latin10}, and@tie{}@code{utf8}.
 
 
 @node \rule


