[latex3-commits] [git/LaTeX3-latex3-latex2e] ltnew33: Manually merge branch 'gh478' into ltnew33 (29315f81)

Mon May 17 12:37:41 CEST 2021

Repository : https://github.com/latex3/latex2e
On branch  : ltnew33
Link       : https://github.com/latex3/latex2e/commit/29315f819fdab4203c0c5b6e151802c5db090d31

>---------------------------------------------------------------

commit 29315f819fdab4203c0c5b6e151802c5db090d31
Merge: 79ecdaf2 13cc212f
Author: Frank Mittelbach <frank.mittelbach at latex-project.org>
Date:   Mon May 17 12:37:41 2021 +0200

    Manually merge branch 'gh478' into ltnew33


>---------------------------------------------------------------

29315f819fdab4203c0c5b6e151802c5db090d31
 base/doc/ltnews33.tex | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 75 insertions(+), 1 deletion(-)

diff --cc base/doc/ltnews33.tex
index 3d1119ce,0487d2e8..97f8d8e1

--- a/base/doc/ltnews33.tex
+++ b/base/doc/ltnews33.tex
@@@ -123,8 -87,8 +123,8 @@@
  \providecommand\tubcommand[1]{}
  \tubcommand{\input{tubltmac}}
  
 -\publicationmonth{May}
 -\publicationyear{2021}
 +\publicationmonth{June}
- \publicationyear{2021 --- Draft Version 3a}
++\publicationyear{2021 --- Draft Version 3b}
  
  \publicationissue{33}
  
@@@ -266,240 -159,139 +266,314 @@@ management
  \githubissue{444}
  
  
 -\subsection{Shipping out a page while bypassing hooks}
 +\subsection{Change of font series/shape delayed until \cs{selectfont}}
 +
 +With the NFSS extensions introduced in 2020, the font series and shape
 +settings can be influenced by changes to the font family.  The
 +settings of these two are now therefore delayed until \cs{selectfont}
 +is executed; this avoids unnecessary or incorrect substitutions that
 +may otherwise happen due to the order of declarations.
 +%
 +\githubissue{444}
 +
 +
 +
 +\section{Handling file names}
 +
 +
 +
 +\subsection[File names with spaces, multiple dots or\\
 +            \acro{utf-8} characters]
 +           {File names with spaces, multiple dots or \acro{utf-8} characters}
 +
 +In one of the recent \LaTeX{} releases we improved the interface
 +for specifying file names so that they can now safely contain spaces
 +(as is common these days),
 +more than one dot character, and also UTF-8 characters
 +outside the \acro{ascii} range. 
 +In the past this was only possible by applying a special syntax
 +in the case of spaces, 
 +whilst file names with several dots often failed, 
 +as did most UTF-8 characters.
 +
 +
 +\subsubsection{Consequences for file names in \cs{include}}
 +
 +\TeX{} has a built-in rule saying that you can normally leave out the
 +extension if it is \texttt{.tex}.  Thus \verb=\input{file}= and
 +\verb=\input{file.tex}= both load \file{file.tex} (if it exists).
 +While this is convenient most of the time, it is a little awkward in
 +some scenarios (for example, when both \file{file} and \file{file.tex}
 +exist) and also when you manually try to implement the rule.
 +
 +\LaTeX{} therefore had one special syntax for \cs{include} and
 +\cs{includeonly}: they always expected that 
 +their argument contains a
 +file name\footnote{In the case of \cs{includeonly}, a comma-separated list of such names.} 
 +with no extension given,
 +  so that it had to be \texttt{.tex}.  Thus,
 + when you mistakenly wrote
 +\verb=\include{mychap.tex}= (for example,
 +because you changed from \cs{input}
 +to \cs{include}),
 +\LaTeX{} went ahead and looked for the
 +file \file{mychap.tex.tex} for inclusion and tried to
 +use the file \file{mychap.tex.aux} for internal (auxiliary) information.  The reason was that
 +\cs{include} had to construct both
 +of these file names from the given
 +argument and it didn't bother to do
 +anything special
 +with the supplied 
 +extension \texttt{.tex}.
 +
 +With the new implementation this has
 +changed:
 +the extension \texttt{.tex}
 +now gets removed/ignored if it was
 +supplied.
 +Thus \verb=\include{mychap.tex}= now 
 +no longer looks for \file{mychap.tex.tex} 
 +but loads
 +\file{mychap.tex} 
 +and uses \file{mychap.aux}.
 +%
 +\githubissue{486}
 +
 +
 +
 +\subsection{Normalization of robust commands in file names}
 +
 +The handling of file names has been modified so that \verb|\string| is
 +applied to normalize robust commands within the file name.
 +Previously, for example, \verb|\input{\sqrt{2}}| would cause
 +\LaTeX\ to loop indefinitely whereas with the new normalization
 +it looks for the file named \verb|sqrt {2}.tex|
 +%%FMi
 +(and reports a file not found failure instead).
 +%
 +\githubissue{481}
 +
 +
 +
 +
 +\subsection{Fix for \env{filecontents} with \acro{utf-8} 
 +  chars in the file name}
 +
 +Since a few releases back, the \env{filecontents} environment has
 +allowed \acro{utf-8} characters in the file name.  There was, however,
 +a bug that would not allow \emph{over}writing a file with \acro{utf-8}
 +characters in its name.  This has been fixed and now
 +\env{filecontents} allows any characters in the file name.
 +%
 +\githubissue{415}
 +
 +
 +
 +
 +\section{On characters \& encodings}
 +
 +
 +
 +\subsection{Improved copy\,\&\,paste support for \pdfTeX{} documents}
 +
 +When compiling with \pdfTeX{}, additional information is now added
 +automatically to the PDF file in order to improve copying from, and
 +searching in, text.
 +
 +This in particular allows the most common ligatures to be copied as
 +intended from all generated PDF files without the need to explicitly
 +load the package \pkg{cmap} or the file \texttt{glyphtounicode.tex}.
 +
 +%%FMi the above sounds as if ``in most cases things work without
 +%%glyphtounicode but sometimes you need it ... which is wrong
 +%%(sometimes you need cmap but glyphtounicode is now part of the
 +%%kernel!
 +
 +%
 +%%FMi \iffalse
 +%%FMi %%CCC Since this has been integrated into the kernel, 
 +%%FMi %%CCC
 +%%FMi This means that most documents %%CCC should 
 +%%FMi no
 +%%FMi longer need to load the %%CCC 
 +%%FMi package \pkg{cmap} or input 
 +%%FMi %%CCC  
 +%%FMi the file
 +%%FMi \texttt{glyphtounicode.tex}.
 +%%FMi \fi
 +%
 +\githubissue{465}
 +
 +
  
 -In the 2020 October release several hooks were added to the page
 -shipout process, e.g., to add some background or foreground material
 -to some or all pages. We now also added a \cs{RawShipout} command that
 -bypasses most of these hooks during the shipout. Some essential
 -internal bookkeeping still takes place such as updating the
 -\texttt{totalpages} counter or adding \texttt{shipout/firstpage} or
 -\texttt{shipout/lastpage} material if the page happens to be the first
 -or last.
  
  
 +\subsection{Support for more Unicode characters}
 +
 +
 +\LaTeX\ is quite capable of typesetting characters such as
 +\enquote{\d{m}}, but until now it could not access some Unicode
 +characters from the Latin Extended Additional block.  This meant that,
 +for example, there were no Unicode mappings for some characters that
 +are used to write Sanskrit words in Latin transliteration (as seen in
 +books about yoga, Buddhist philosophy, etc.).
 +%
 +These characters have now been added so that they can be entered
 +directly instead of using \verb=\d{m}=, etc.
 +%
 +\githubissue{484}
 +
 +
 +
 +
 +
 +\subsection{More ``dashes'' in encodings \texttt{OT1},
 +  \texttt{T1} and \texttt{TU}}
 +
 +When pasting in text from external sources, one can encounter these
 +three Unicode characters
 +%
 +\texttt{"2011} (non-breaking hyphen),
 +\texttt{"2012} (figure dash) and
 +\texttt{"2015} (horizontal bar),
 +%
 +in addition to the more common 
 +%
 +\texttt{"2013} (en-dash) and \texttt{"2014} (em-dash).
 +%
 +In the past, these first three characters produced an error message
 +when used with \pdfTeX{} (since they are not available in \texttt{OT1}
 +or \texttt{T1} encoded fonts).  They now typeset an approximation to
 +the glyph: e.g., the `figure dash' is approximated by an en-dash.
 +
 +With Unicode engines they either work (when the glyph is contained in
 +the selected Unicode font) or they typeset nothing, producing a
 +``Missing character'' warning in the log file.
 +
 +Additionally, with all engines, these characters can now be accessed
 +with the command names \cs{textnonbreakinghyphen}, \cs{textfiguredash}
 +and \cs{texthorizontalbar}, respectively.
 +%
 +\githubissue{404}
 +
 +
 +
 +
 +\subsection{Poor man's \cs{textasteriskcentered}}
 +
 +The \cs{textasteriskcentered} symbol, used as part of the set of
 +footnote symbols in \LaTeX{}, is assumed to be implemented by every
 +font with the \texttt{TS1} encoding (when \pdfTeX{} is used) or with
 +the \texttt{TU} encoding for the Unicode engines.  That assumption is
 +unfortunately not correct for all fonts since, for example, the
 +\texttt{stix2} fonts don't provide this glyph.  A result is that one
 +gets missing glyph messages when using \cs{thanks}, etc.
 +
 +Therefore \cs{textasteriskcentered} now checks whether there is such a
 +glyph and, if not, uses a normal \enquote{*}, but slightly enlarged
 +and lowered.  This may not be perfect in all cases, but it is
 +certainly better than no glyph showing up.
 +%
 +\githubissue{502}
  
  
 -\section{A note on the \texttt{TS1} encoding}
++\subsection{A note on the \texttt{TS1} encoding}
+ 
+ The \enquote{text symbol encoding} (\texttt{TS1}) was originally
+ designed at the Cork Conference as a companion to the \texttt{T1}
+ encoding. In it various symbols that are not subject to hyphenation
+ got assembled and the \pkg{textcomp} package was developed to make
+ them accessible. Unfortunately the \TeX{} community was a bit too
+ enthusiastic and included several symbols only available in a few
+ \TeX{} fonts and some, such as the capital accents, not available at
+ all but developed as part of the reference font implementation.
+ 
+ In hindsight that was a very bad idea because it meant that other
+ existing fonts (at the time) and later new fonts that got developed
+ were unable to provide the full set of glyphs that made up the
+ \texttt{TS1} encoding. For existing free PostScript fonts people took
+ the extra effort and produced virtual fonts that faked (some) of
+ the missing glyphs. But this was and is a time-consuming effort so it
+ was done only for a few basic fonts. But even then, only some fonts
+ included all glyphs from \texttt{TS1} so the \pkg{textcomp} package already
+ back then contained a long list, dividing fonts into 5 categories
+ according to which glyphs were implemented and which were missing.
+ 
+ A couple of releases back the functionality of the \pkg{textcomp}
+ package got integrated into the core code of the \LaTeX{} kernel so
+ that its glyph definitions, e.g., \cs{textcopyright}, \cs{texteuro} or
+ \cs{textyen}, are now automatically available without the need to load
+ an additional package in the preamble.
+ 
+ At the time this happened many new free fonts had appeared and
+ unfortunately the chaos around the question \enquote{which glyphs of
+   the \texttt{TS1} encoding are implemented by which font} had
+ increased with it. Not only did one find many new holes, it was also next to
+ impossible to order the set of fonts into a reasonable set of
+ sub-encodings that are contained in each other in a single sequence.
+ 
+ In the end we decided on nine or ten sub-encodings with a reasonable
+ number of fonts in each so that all fonts implemented all glyphs of the
+ sub-encoding they got mapped to. Thus when typesetting with a font one
+ could be sure that a command like \cs{textcopyleft} would either
+ typeset the requested character (if the glyph was part of the
+ sub-encoding the font belonged to) or it would raise an error, saying
+ that the glyph is unavailable in that font. The mapping would ensure
+ that \LaTeX{} always errs on the side of caution, because it might
+ claim a glyph is unavailable even though in fact it is.
+ 
+ For example, the old \texttt{pcr} (PostScript Courier) font (as well
+ as most other older PS fonts) is mapped to sub-encoding 5 and
+ therefore claims that \cs{textasciigrave} is unavailable even though
+ in fact for Courier this is not true. If one uses such a font and this
+ becomes an issue then there are a couple of (suboptimal) possibilities.
+ For one, one can alter the mapping of Courier and pretend that it belongs
+ to a fuller sub-encoding, e.g.
+ \begin{verbatim}
+   \DeclareEncodingSubset{TS1}{pcr}{2}
+ \end{verbatim}
+ The downside is that \LaTeX{} then believes other glyphs that are in fact
+ unavailable are also there, so that it is important to check that the
+ final document doesn't have some missing glyphs.
+ 
+ An alternative is to pretend that \cs{textasciigrave} can always be
+ taken from the \texttt{TS1} encoding (no questions asked):
+ \begin{verbatim}
+   \DeclareTextSymbolDefault{\textasciigrave}{TS1}
+ \end{verbatim}
+ Again there is a danger that this is not true when it is used with a
+ different font and would then generate a missing glyph.
+ 
+ Finally, possibly the best solution, unless it is ruled out for other
+ reasons, is to simply use a different font, for example, the
+ \TeX{} Gyre Cursor font (a reimplementation of Courier but with a
+ much more complete glyph set).
+ 
+ 
+ 
  
 -\section{Other changes to the \LaTeX{} kernel}
 -
 -
 -\subsection{Support several footnote marks to one footnote}
 -
 -It is sometimes necessary to reference the same footnote several
 -times, i.e., produce several footnote marks with the same number or
 -symbol. This is now always possible by placing a \cs{label} into the
 -\cs{footnote} and reference it with the command \cs{footref}
 -elsewhere.  This way marks referring to footnotes anywhere on the page
 -(including those in \texttt{minipage}s) can be generated.  In the past
 -this command was only available with certain classes or when loading
 -the \pkg{footmisc} package.
 -%
 -\githubissue{482}
  
 +\section{New or improved commands}
  
 -\subsection{Improved copy \& paste support for \pdfTeX{} documents}
  
 -When compiling with \pdfTeX{}, additional information is added to the
 -PDF file to improve copying from and searching in text. This especially
 -allows ligatures to copy correctly from \pdfTeX{} generated PDF files in
 -most cases.
 +\subsection{Adjusting \env{itemize} labels with \cs{labelitemfont}}
  
 -Since this has been integrated into the kernel, most documents should no
 -longer need to load the \pkg{cmap} package or input \texttt{glyphtounicode}.
 -%
 -\githubissue{465}
 -
 -
 -\subsection{Customize \env{itemize} labels with \cs{labelitemfont}}
 -
 -The command \cs{labelitemfont} was in fact already introduced with the
 -\LaTeX\ release 2020-02-02, but back then we forgot to describe it, so
 +The command \cs{labelitemfont} was introduced already with the
 +\LaTeX\ release 2020-02-02, but back then we forgot to describe it so
  we do this now. Its purpose is to resolve some bad formatting issues
 -with the \env{itemize} environment and at the same time make it easier
 -to adjust its layout if necessary. What could happen in the past was that the
 -\env{itemize} labels, e.g., the \textbullet{}, would sometimes react to
 -surrounding font changes and could suddenly change shape, for example
 -to \textit{\textbullet}.
 -
 -Now \cs{labelitemfont} is applied to each
 -label defaulting to \cs{normalfont} which will prevent this behavior.
 -By choosing a different settings other effects can be achieved, for example
 +with the \env{itemize} environment and also to make it easier to
 +adjust the layout when necessary. What could happen in the past was
 +that the \env{itemize} labels (e.g., the \textbullet{}) would
 +sometimes react to surrounding font changes and could then suddenly
 +change shape, for example to \textit{\textbullet}.
 +
 +This new command \cs{labelitemfont}, which defaults to
 +\cs{normalfont}, is now used
 +%%FMi to typeset
 +when typesetting   %%FMi to seems wrong (it is only  additionally applied
 +each label.  Thus by choosing
 +different settings other effects can be achieved.  Here are two
 +examples:
  \begin{verbatim}
    \renewcommand\labelitemfont
       {\normalfont\fontfamily{lmss}\selectfont}