[latexrefman-commits] [SCM] latexrefman updated: r914 - trunk
karl at gnu.org.ua
Wed May 26 21:38:00 CEST 2021
Author: karl
Date: 2021-05-26 19:38:00 +0000 (Wed, 26 May 2021)
New Revision: 914
Modified:
trunk/ChangeLog
trunk/aspell.en.pws
trunk/latex2e.texi
Log:
utf8 discussion
Modified: trunk/ChangeLog
===================================================================
--- trunk/ChangeLog 2021-05-26 19:21:00 UTC (rev 913)
+++ trunk/ChangeLog 2021-05-26 19:38:00 UTC (rev 914)
@@ -1,5 +1,11 @@
+2021-05-26 Karl Berry <karl at freefriends.org>
+
+ * latex2e.texi (inputenc package): tweak utf-8 discussion.
+
2021-05-15 Karl Berry <karl at freefriends.org>
+ * Makefile: website process doc
+
* latex2e.texi (About this document): urls were confused, with no
no link to the actual dev page with all the output formats.
Report from Paul A Norman, 15 May 2021 18:27:10.
Modified: trunk/aspell.en.pws
===================================================================
--- trunk/aspell.en.pws 2021-05-26 19:21:00 UTC (rev 913)
+++ trunk/aspell.en.pws 2021-05-26 19:38:00 UTC (rev 914)
@@ -268,3 +268,5 @@
shellesc
CLI
adjustbox
+graphpap
+xr
Modified: trunk/latex2e.texi
===================================================================
--- trunk/latex2e.texi 2021-05-26 19:21:00 UTC (rev 913)
+++ trunk/latex2e.texi 2021-05-26 19:38:00 UTC (rev 914)
@@ -1480,7 +1480,7 @@
@node fontenc package
-@section @file{fontenc} package
+@section @code{fontenc} package
@cindex Font encoding
@cindex UTF-8, font support for
@@ -4315,7 +4315,7 @@
@node xr package
-@section @code{xr} Package
+@section @code{xr} package
@findex @code{xr} package
@findex @code{xr-hyper} package
@@ -6320,7 +6320,7 @@
as an argument, as with @code{\put(1,2)@{...@}}, it is not enclosed in
braces since the parentheses serve to delimit the argument. Also,
unlike in some computer graphics systems, larger y-coordinates are
-further up the page, ie.@: @math{y = 1} is @emph{above} @math{y = 0}.
+further up the page, for example, @math{y = 1} is @emph{above} @math{y = 0}.
There are four ways to put things in a picture: @code{\put},
@code{\multiput}, @code{\qbezier}, and @code{\graphpaper}. The most
@@ -17262,7 +17262,7 @@
* Text symbols:: Inserting other non-letter symbols in text.
* Accents:: Inserting accents.
* Additional Latin letters:: Inserting other non-English characters.
-* Inputenc package:: Set the input file text encoding.
+* inputenc package:: Set the input file text encoding.
* \rule:: Inserting lines and rectangles.
* \today:: Inserting today's date.
@end menu
@@ -18038,73 +18038,68 @@
@end table
-@node Inputenc package
-@section Inputenc package
+@node inputenc package
+@section @code{inputenc} package
@findex inputenc
-Synopsis, one of:
+Synopsis:
@example
-\usepackage@{inputenc@}
+\usepackage[@var{encoding-name}]@{inputenc@}
@end example
-or
+Declare the input file's text encoding. The default, if this package
+is not loaded, is UTF-8.
-@example
-\usepackage[@var{encoding-name}]@{inputenc@}
-@end example
+@cindex encoding, of input files
+@cindex character encoding
+@cindex Unicode
+In a computer file, the characters are stored according to a scheme
+called the @dfn{encoding}. There are many different encodings. The
+simplest is ASCII, which supports 95 printable characters, not enough
+for most of the world's languages. For instance, to typeset the
+a-umlaut character @"{a} in an ASCII-encoded @LaTeX{} source file, the
+sequence @code{\"a} is used. This would make source files for anything
+but English hard to read; even for English, often a more extensive
+encoding is more convenient.
-Declare the input file's text encoding.
+The modern encoding standard, in some ways a union of the others, is
+UTF-8, one of the representations of Unicode. This is the default for
+@LaTeX{} since 2018.
-In a computer file, the characters are stored as binary according to
-some scheme, called the encoding. There are many different encodings.
-The simplest is ASCII, but it does not accomodate many characters. For
-instance, to get the a-umlaut character @"{a} in an ASCII-encoded text
-file a user must enter @code{\"a}, which makes the file hard to read and
-also means that @TeX{} won't hyphenation the word containing that
-character. Often a more inclusive encoding is more convenient. The
-modern standard, in some ways a union of the others, is UTF-8.
+The @code{inputenc} package is how @LaTeX{} knows what encoding is
+used. For instance, the following command explicitly says that the
+input file is UTF-8 (note the lack of a dash).
-In short, to enter material a user sets their file editor to use an
-encoding scheme and this package is how @LaTeX{} knows what encoding
-they used. For instance, the following command says that the input file
-is UTF-8 (note the lack of a dash).
-
@example
\usepackage[utf8]@{inputenc@}
@end example
-Caution: use this package only with the pdf@TeX{} engine (@pxref{@TeX{}
-engines}). The Xe@TeX{} and Lua@TeX{} engines assume that the input
-file is UTF-8 encoded. If you invoke @LaTeX{} with either the
-@command{xelatex} command or the @command{lualatex} command and use the
-above example line, then you will be warned @code{inputenc package
-ignored with utf8 based engines}. And, if you instead declare a
-non-UTF-8 encoding such as @code{latin1} then you will get the error
-@code{inputenc is not designed for xetex or luatex}.
+Caution: use @code{inputenc} only with the pdf@TeX{} engine
+(@pxref{@TeX{} engines}). (The Xe@TeX{} and Lua@TeX{} engines assume
+that the input file is UTF-8 encoded.) If you invoke @LaTeX{} with
+either the @command{xelatex} command or the @command{lualatex}
+command, and try to declare a non-UTF-8 encoding with @code{inputenc},
+such as @code{latin1}, then you will get the error @code{inputenc is
+not designed for xetex or luatex}.
-In addition, @LaTeX{} releases since 2018 default to an equivalent of
-the above command. So documents started after that time typically will
-not explicitly include this package.
-
-An @code{inputenc} package error like @code{Invalid UTF-8 byte "96}
+An @code{inputenc} package error such as @code{Invalid UTF-8 byte "96}
means that some of the material in the input file does not follow the
encoding scheme. Often these errors come from copying material from a
-document that uses a different encoding than the input file; this one is
-a left single quote from a web page that uses @code{latin1} inside a
-@LaTeX{} input file that uses UTF-8. The solution is to convert the
-character to the @LaTeX{} equivalent, in this case a left single quote,
-or to erase it and enter the character using the input file's encoding
-(consult your editor's documentation).
+document that uses a different encoding than the input file; this one
+is a left single quote from a web page using @code{latin1} inside a
+@LaTeX{} input file that uses UTF-8. The simplest solution is to
+replace the non-UTF-8 character with its UTF-8 equivalent, or use a
+@LaTeX{} equivalent command or character.
-In some documents, such as in a collection of journal articles from a
+In some documents, such as a collection of journal articles from a
variety of authors, changing the encoding in mid-document may be
necessary. Use the command
@code{\inputencoding@{@var{encoding-name}@}}. The most common values
-for @var{encoding-name} are: @code{ascii}, @code{latin1}, @code{latin2},
-@code{latin3}, @code{latin4}, @code{latin5}, @code{latin9},
-@code{latin10}, and@tie{}@code{utf8}.
+for @var{encoding-name} are: @code{ascii}, @code{latin1},
+@code{latin2}, @code{latin3}, @code{latin4}, @code{latin5},
+@code{latin9}, @code{latin10}, and@tie{}@code{utf8}.
@node \rule