[latex3-commits] [git/LaTeX3-latex3-latex2e] ltnew33: Manually merge branch 'gh478' into ltnew33 (29315f81)
Frank Mittelbach
frank.mittelbach at latex-project.org
Mon May 17 12:37:41 CEST 2021
Repository : https://github.com/latex3/latex2e
On branch : ltnew33
Link : https://github.com/latex3/latex2e/commit/29315f819fdab4203c0c5b6e151802c5db090d31
>---------------------------------------------------------------
commit 29315f819fdab4203c0c5b6e151802c5db090d31
Merge: 79ecdaf2 13cc212f
Author: Frank Mittelbach <frank.mittelbach at latex-project.org>
Date: Mon May 17 12:37:41 2021 +0200
Manually merge branch 'gh478' into ltnew33
>---------------------------------------------------------------
29315f819fdab4203c0c5b6e151802c5db090d31
base/doc/ltnews33.tex | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 75 insertions(+), 1 deletion(-)
diff --cc base/doc/ltnews33.tex
index 3d1119ce,0487d2e8..97f8d8e1
--- a/base/doc/ltnews33.tex
+++ b/base/doc/ltnews33.tex
@@@ -123,8 -87,8 +123,8 @@@
\providecommand\tubcommand[1]{}
\tubcommand{\input{tubltmac}}
-\publicationmonth{May}
-\publicationyear{2021}
+\publicationmonth{June}
- \publicationyear{2021 --- Draft Version 3a}
++\publicationyear{2021 --- Draft Version 3b}
\publicationissue{33}
@@@ -266,240 -159,139 +266,314 @@@ management
\githubissue{444}
-\subsection{Shipping out a page while bypassing hooks}
+\subsection{Change of font series/shape delayed until \cs{selectfont}}
+
+With the NFSS extensions introduced in 2020, the font series and shape
+settings can be influenced by changes to the font family. The
+settings of these two are now therefore delayed until \cs{selectfont}
+is executed; this avoids unnecessary or incorrect substitutions that
+may otherwise happen due to the order of declarations.
+%
+\githubissue{444}
+
+
+
+\section{Handling file names}
+
+
+
+\subsection[File names with spaces, multiple dots or\\
+ \acro{utf-8} characters]
+ {File names with spaces, multiple dots or \acro{utf-8} characters}
+
+In one of the recent \LaTeX{} releases we improved the interface
+for specifying file names so that they can now safely contain spaces
+(as is common these days),
+more than one dot character, and also UTF-8 characters
+outside the \acro{ascii} range.
+In the past this was only possible by applying a special syntax
+in the case of spaces,
+whilst file names with several dots often failed,
+as did most UTF-8 characters.
+
+
+\subsubsection{Consequences for file names in \cs{include}}
+
+\TeX{} has a built-in rule saying that you can normally leave out the
+extension if it is \texttt{.tex}. Thus \verb=\input{file}= and
+\verb=\input{file.tex}= both load \file{file.tex} (if it exists).
+While this is convenient most of the time, it is a little awkward in
+some scenarios (for example, when both \file{file} and \file{file.tex}
+exist) and also when you manually try to implement the rule.
+
+\LaTeX{} therefore had one special syntax for \cs{include} and
+\cs{includeonly}: they always expected that
+their argument contained a
+file name\footnote{In the case of \cs{includeonly}, a comma-separated list of such names.}
+with no extension given,
+ so that it had to be \texttt{.tex}. Thus,
+ when you mistakenly wrote
+\verb=\include{mychap.tex}= (for example,
+because you changed from \cs{input}
+to \cs{include}),
+\LaTeX{} went ahead and looked for the
+file \file{mychap.tex.tex} for inclusion and tried to
+use the file \file{mychap.tex.aux} for internal (auxiliary) information. The reason was that
+\cs{include} had to construct both
+of these file names from the given
+argument and it didn't bother to do
+anything special
+with the supplied
+extension \texttt{.tex}.
+
+With the new implementation this has
+changed:
+the extension \texttt{.tex}
+now gets removed/ignored if it was
+supplied.
+Thus \verb=\include{mychap.tex}= now
+no longer looks for \file{mychap.tex.tex}
+but loads
+\file{mychap.tex}
+and uses \file{mychap.aux}.
+%
+\githubissue{486}
+
+
+
+\subsection{Normalization of robust commands in file names}
+
+The handling of file names has been modified so that \verb|\string| is
+applied to normalize robust commands within the file name.
+Previously, for example, \verb|\input{\sqrt{2}}| would cause
+\LaTeX\ to loop indefinitely, whereas with the new normalization
+it looks for the file named \verb|sqrt {2}.tex|
+%%FMi
+(and reports a file not found failure instead).
+%
+\githubissue{481}
+
+
+
+
+\subsection{Fix for \env{filecontents} with \acro{utf-8}
+ chars in the file name}
+
+Since a few releases back, the \env{filecontents} environment has
+allowed \acro{utf-8} characters in the file name. There was, however,
+a bug that would not allow \emph{over}writing a file with \acro{utf-8}
+characters in its name. This has been fixed and now
+\env{filecontents} allows any characters in the file name.
+%
+\githubissue{415}
+
+
+
+
+\section{On characters \& encodings}
+
+
+
+\subsection{Improved copy\,\&\,paste support for \pdfTeX{} documents}
+
+When compiling with \pdfTeX{}, additional information is now added
+automatically to the PDF file in order to improve copying from, and
+searching in, text.
+
+This in particular allows the most common ligatures to be copied as
+intended from all generated PDF files without the need to explicitly
+load the package \pkg{cmap} or the file \texttt{glyphtounicode.tex}.
+
+%%FMi the above sounds as if ``in most cases things work without
+%%glyphtounicode but sometimes you need it ... which is wrong
+%%(sometimes you need cmap but glyphtounicode is now part of the
+%%kernel!
+
+%
+%%FMi \iffalse
+%%FMi %%CCC Since this has been integrated into the kernel,
+%%FMi %%CCC
+%%FMi This means that most documents %%CCC should
+%%FMi no
+%%FMi longer need to load the %%CCC
+%%FMi package \pkg{cmap} or input
+%%FMi %%CCC
+%%FMi the file
+%%FMi \texttt{glyphtounicode.tex}.
+%%FMi \fi
+%
+\githubissue{465}
+
+
-In the 2020 October release several hooks were added to the page
-shipout process, e.g., to add some background or foreground material
-to some or all pages. We now also added a \cs{RawShipout} command that
-bypasses most of these hooks during the shipout. Some essential
-internal bookkeeping still takes place such as updating the
-\texttt{totalpages} counter or adding \texttt{shipout/firstpage} or
-\texttt{shipout/lastpage} material if the page happens to be the first
-or last.
+\subsection{Support for more Unicode characters}
+
+
+\LaTeX\ is quite capable of typesetting characters such as
+\enquote{\d{m}}, but until now it could not access some Unicode
+characters from the Latin Extended Additional block. This meant that,
+for example, there were no Unicode mappings for some characters that
+are used to write Sanskrit words in Latin transliteration (as seen in
+books about yoga, Buddhist philosophy, etc.).
+%
+These characters have now been added so that they can be entered
+directly instead of using \verb=\d{m}=, etc.
+%
+\githubissue{484}
+
+
+
+
+
+\subsection{More ``dashes'' in encodings \texttt{OT1},
+ \texttt{T1} and \texttt{TU}}
+
+When pasting in text from external sources, one can encounter these
+three Unicode characters
+%
+\texttt{"2011} (non-breaking hyphen),
+\texttt{"2012} (figure dash) and
+\texttt{"2015} (horizontal bar),
+%
+in addition to the more common
+%
+\texttt{"2013} (en-dash) and \texttt{"2014} (em-dash).
+%
+In the past, these first three characters produced an error message
+when used with \pdfTeX{} (since they are not available in \texttt{OT1}
+or \texttt{T1} encoded fonts). They now typeset an approximation to
+the glyph: e.g., the `figure dash' is approximated by an en-dash.
+
+With Unicode engines they either work (when the glyph is contained in
+the selected Unicode font) or they typeset nothing, producing a
+``Missing character'' warning in the log file.
+
+Additionally, with all engines, these characters can now be accessed
+with the command names \cs{textnonbreakinghyphen}, \cs{textfiguredash}
+and \cs{texthorizontalbar}, respectively.
+%
+\githubissue{404}
+
+
+
+
+\subsection{Poor man's \cs{textasteriskcentered}}
+
+The \cs{textasteriskcentered} symbol, used as part of the set of
+footnote symbols in \LaTeX{}, is assumed to be implemented by every
+font with the \texttt{TS1} encoding (when \pdfTeX{} is used) or with
+the \texttt{TU} encoding for the Unicode engines. That assumption is
+unfortunately not correct for all fonts since, for example, the
+\texttt{stix2} fonts don't provide this glyph. A result is that one
+gets missing glyph messages when using \cs{thanks}, etc.
+
+Therefore \cs{textasteriskcentered} now checks whether there is such a
+glyph and, if not, uses a normal \enquote{*}, but slightly enlarged
+and lowered. This may not be perfect in all cases, but it is
+certainly better than no glyph showing up.
+%
+\githubissue{502}
-\section{A note on the \texttt{TS1} encoding}
++\subsection{A note on the \texttt{TS1} encoding}
+
+ The \enquote{text symbol encoding} (\texttt{TS1}) was originally
+ designed at the Cork Conference as a companion to the \texttt{T1}
+ encoding. In it various symbols that are not subject to hyphenation
+ got assembled and the \pkg{textcomp} package was developed to make
+ them accessible. Unfortunately the \TeX{} community was a bit too
+ enthusiastic and included several symbols only available in a few
+ \TeX{} fonts and some, such as the capital accents, not available at
+ all but developed as part of the reference font implementation.
+
+ In hindsight that was a very bad idea because it meant that other
+ existing fonts (at the time) and later new fonts that got developed
+ were unable to provide the full set of glyphs that made up the
+ \texttt{TS1} encoding. For existing free PostScript fonts people went to
+ the extra effort and produced virtual fonts that faked some of
+ the missing glyphs. But this was and is a time-consuming effort so it
+ was done only for a few basic fonts. But even then, only some fonts
+ included all glyphs from \texttt{TS1} so the \pkg{textcomp} package already
+ back then contained a long list, dividing fonts into 5 categories
+ according to which glyphs were implemented and which were missing.
+
+ A couple of releases back the functionality of the \pkg{textcomp}
+ package got integrated into the core code of the \LaTeX{} kernel so
+ that its glyph definitions, e.g., \cs{textcopyright}, \cs{texteuro} or
+ \cs{textyen}, are now automatically available without the need to load
+ an additional package in the preamble.
+
+ At the time this happened many new free fonts had appeared and
+ unfortunately the chaos around the question \enquote{which glyphs of
+ the \texttt{TS1} encoding are implemented by which font} had
+ increased with it. Not only did one find many new holes, it was also next to
+ impossible to order the set of fonts into a reasonable set of
+ sub-encodings that are contained in each other in a single sequence.
+
+ In the end we decided on nine or ten sub-encodings with a reasonable
+ number of fonts in each so that all fonts implemented all glyphs of the
+ sub-encoding they got mapped to. Thus when typesetting with a font one
+ could be sure that a command like \cs{textcopyleft} would either
+ typeset the requested character (if the glyph was part of the
+ sub-encoding the font belonged to) or it would raise an error, saying
+ that the glyph is unavailable in that font. The mapping would ensure
+ that \LaTeX{} always errs on the side of caution, because it might
+ claim a glyph is unavailable even though in fact it is.
+
+ For example, the old \texttt{pcr} (PostScript Courier) font (as well
+ as most other older PS fonts) is mapped to sub-encoding 5 and
+ therefore claims that \cs{textasciigrave} is unavailable even though
+ in fact for Courier this is not true. If one uses such a font and this
+ becomes an issue then there are a couple of (suboptimal) possibilities.
+ For one, one can alter the mapping of Courier and pretend that it belongs
+ to a fuller sub-encoding, e.g.
+ \begin{verbatim}
+ \DeclareEncodingSubset{TS1}{pcr}{2}
+ \end{verbatim}
+ The downside is that \LaTeX{} then believes other glyphs that are in fact
+ unavailable are also there, so that it is important to check that the
+ final document doesn't have some missing glyphs.
+
+ An alternative is to pretend that \cs{textasciigrave} can always be
+ taken from the \texttt{TS1} encoding (no questions asked):
+ \begin{verbatim}
+ \DeclareTextSymbolDefault{\textasciigrave}{TS1}
+ \end{verbatim}
+ Again there is a danger that this is not true when it is used with a
+ different font and would then generate a missing glyph.
+
+ Finally, and possibly the best solution, if not impossible for other
+ reasons, is to simply use a different font, for example, to use the
+ \TeX{} Gyre Cursor font (a reimplementation of Courier but with a
+ much more complete glyph set).
+
+
+
-\section{Other changes to the \LaTeX{} kernel}
-
-
-\subsection{Support several footnote marks to one footnote}
-
-It is sometimes necessary to reference the same footnote several
-times, i.e., produce several footnote marks with the same number or
-symbol. This is now always possible by placing a \cs{label} into the
-\cs{footnote} and reference it with the command \cs{footref}
-elsewhere. This way marks referring to footnotes anywhere on the page
-(including those in \texttt{minipage}s) can be generated. In the past
-this command was only available with certain classes or when loading
-the \pkg{footmisc} package.
-%
-\githubissue{482}
+\section{New or improved commands}
-\subsection{Improved copy \& paste support for \pdfTeX{} documents}
-When compiling with \pdfTeX{}, additional information is added to the
-PDF file to improve copying from and searching in text. This especially
-allows ligatures to copy correctly from \pdfTeX{} generated PDF files in
-most cases.
+\subsection{Adjusting \env{itemize} labels with \cs{labelitemfont}}
-Since this has been integrated into the kernel, most documents should no
-longer need to load the \pkg{cmap} package or input \texttt{glyphtounicode}.
-%
-\githubissue{465}
-
-
-\subsection{Customize \env{itemize} labels with \cs{labelitemfont}}
-
-The command \cs{labelitemfont} was in fact already introduced with the
-\LaTeX\ release 2020-02-02, but back then we forgot to describe it, so
+The command \cs{labelitemfont} was introduced already with the
+\LaTeX\ release 2020-02-02, but back then we forgot to describe it so
we do this now. Its purpose is to resolve some bad formatting issues
-with the \env{itemize} environment and at the same time make it easier
-to adjust its layout if necessary. What could happen in the past was that the
-\env{itemize} labels, e.g., the \textbullet{}, would sometimes react to
-surrounding font changes and could suddenly change shape, for example
-to \textit{\textbullet}.
-
-Now \cs{labelitemfont} is applied to each
-label defaulting to \cs{normalfont} which will prevent this behavior.
-By choosing a different settings other effects can be achieved, for example
+with the \env{itemize} environment and also to make it easier to
+adjust the layout when necessary. What could happen in the past was
+that the \env{itemize} labels (e.g., the \textbullet{}) would
+sometimes react to surrounding font changes and could then suddenly
+change shape, for example to \textit{\textbullet}.
+
+This new command \cs{labelitemfont}, which defaults to
+\cs{normalfont}, is now used
+%%FMi to typeset
+when typesetting %%FMi to seems wrong (it is only additionally applied
+each label. Thus by choosing
+different settings other effects can be achieved. Here are two
+examples:
\begin{verbatim}
\renewcommand\labelitemfont
{\normalfont\fontfamily{lmss}\selectfont}
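
The user-visible changes described in the draft above can be tried out with a small test document. The following is only a sketch and is not part of the commit. It assumes pdfLaTeX with a kernel that already contains these changes; the file name mychap.tex and the bold label setting are hypothetical choices made purely for illustration.

```latex
% Sketch only, not part of the commit. Assumes pdfLaTeX with a kernel that
% already contains the changes described in this draft of LaTeX News 33.

% Hypothetical chapter file, written to disk so the example is self-contained.
\begin{filecontents}{mychap.tex}
\section{Included chapter}
Text from the included file.
\end{filecontents}

\documentclass{article}

% Illustrative setting: keep itemize labels upright but make them bold.
\renewcommand\labelitemfont{\normalfont\bfseries}

% Pretend that Courier (pcr) implements TS1 sub-encoding 2, as suggested in
% the draft. Check the log for missing glyphs when doing this.
\DeclareEncodingSubset{TS1}{pcr}{2}

\begin{document}

% The .tex extension is now ignored: this loads mychap.tex and uses mychap.aux.
\include{mychap.tex}

% Command names for the three additional dashes.
non\textnonbreakinghyphen breaking, 10\textfiguredash 20,
\texthorizontalbar{} a horizontal bar.

% According to the draft, the remapping above makes this glyph available
% for Courier instead of reporting it as unavailable.
{\fontfamily{pcr}\selectfont \textasciigrave}

% The labels should ignore the surrounding italic shape.
{\itshape
\begin{itemize}
\item first item
\item second item
\end{itemize}
}

\end{document}
```

After a run, the log file is the place to look for "Missing character" messages or encoding errors, which would indicate that one of the assumptions above does not hold for the installed kernel or fonts.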