[luatex] Adding a callback before trailing spaces are removed from a line of input

Vítek Novotný witiko at mail.muni.cz
Sun Aug 29 18:38:05 CEST 2021

On Sun, Aug 29, 2021 at 12:11:44PM +0200, Hans Hagen wrote:
> On 8/29/2021 12:43 AM, Vítek Novotný wrote:
> > Hello all,
> > 
> > in Knuth's TeX, trailing spaces are removed very early on when a line is
> > being put to the input buffer. [1]  According to Eijkhout's TeX by
> > Topic, this is because "these spaces are hard to see in an editor" [2].
> > 
> >   [1]: https://texdoc.org/serve/tex.pdf/0#page=15
> >   [2]: http://mirrors.ctan.org/info/texbytopic/TeXbyTopic.pdf
> > 
> > I develop and maintain the Markdown package [3] for plain TeX, ConTeXt,
> > and LaTeX. The package makes it possible to use the lightweight markup
> > of markdown [4] in TeX documents. In markdown, a hard line break can be
> > inserted by ending a line with two or more spaces. However, since
> > trailing spaces are removed by TeX, hard breaks are only recognized when
> > we are inserting an external markdown file, not when markdown is typed
> > in the top-level document. This deficiency is known and documented [5],
> > but I am hoping we could resolve it with LuaTeX.
> i wonder how this double spaces works out in practice, for instance one
> needs to 'visualize' them in the editor to see them and also make sure that
> the editor is not in 'prune space at the end of line' mode

Dear Hans,

thank you for your response. To visualize trailing spaces in Vim, I have
these three lines at the bottom of my ~/.vimrc:

    """ Highlight trailing spaces
    highlight TrailingSpaces ctermbg=red guibg=red
    match TrailingSpaces /\s\+$/

In my experience, most editors don't automatically prune trailing
spaces. If yours does, you can move the markdown text to an external
file and set up your text editor to behave differently for TeX files
and markdown files.

> >   [3]: https://github.com/witiko/markdown
> >   [4]: https://daringfireball.net/projects/markdown
> >   [5]: https://mirrors.ctan.org/macros/generic/markdown/markdown.pdf#page=20
> > 
> > In LuaTeX, the `process_input_buffer` callback [6] can be used to
> > intercept the text coming *out* of the input buffer. However, the
> > trailing spaces have already been removed by this point.
> > 
> > By adding a callback right after a line has entered the input buffer
> > [1], we could either replace the trailing space characters with tabs,
> > or place a character such as the zero-width non-joiner (U+200C) to the
> > right of the trailing spaces.
> i fear that this will introduce a performance hit

Hopefully, the performance hit would be negligible when no callback is
registered.
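
To make the quoted proposal concrete, here is a minimal sketch in Python
(not LuaTeX code). It is only a simplified model of how a line enters the
input buffer: the names read_line, protect_trailing_spaces, and
has_hard_break are hypothetical and exist neither in LuaTeX nor in the
Markdown package. The sketch assumes that the callback runs before
trailing spaces are removed and that, when no callback is registered,
the only extra cost is a single check.

```python
ZWNJ = "\u200c"  # zero-width non-joiner (U+200C), as in the proposal


def protect_trailing_spaces(line):
    """Hypothetical callback: place U+200C to the right of trailing spaces."""
    if line.endswith(" "):
        return line + ZWNJ
    return line


def read_line(raw, callback=None):
    """Simplified model of putting one line of input into the buffer."""
    line = raw.rstrip("\n")
    # With no registered callback, this check is the only added work.
    if callback is not None:
        line = callback(line)
    # Model of the engine removing trailing spaces from the line.
    return line.rstrip(" ")


def has_hard_break(line):
    """Markdown hard line break: the line ends with two or more spaces."""
    if line.endswith(ZWNJ):
        line = line[: -len(ZWNJ)]
    return line.endswith("  ")


plain = read_line("Hello  \n")
protected = read_line("Hello  \n", protect_trailing_spaces)
print(has_hard_break(plain), has_hard_break(protected))
```

Without the callback, the two trailing spaces are gone by the time the
markdown reader sees the line; with it, they survive because the last
character of the line is no longer a space.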

> >   [6]: https://www.pragma-ade.com/general/manuals/luatex.pdf#page=176
> > 
> > Is this something you would consider---if not for LuaTeX then perhaps
> > for LuaMetaTeX?
> for a while i had an endofline handler but removed it because i never used it
> and it made no sense (it's all too unpredictable)
> (i wanted to backport it but it doesn't really fit in now that luatex also
> has some special \par handling added)
> concerning context, how is \startmarkdown defined? I'm pretty sure that this
> issue can be handled without adding callbacks

We implement \startmarkdown in terms of the \markdownReadAndConvert [7]
plain TeX macro, which scans and buffers the input using TeX. For
consistency, we don't use ConTeXt buffers: the implementation of
\markdownReadAndConvert is shared between the plain TeX, LaTeX, and
ConTeXt Markdown packages. For compatibility with non-Lua TeX engines,
we don't use Lua to buffer the input either.

 [7]: https://github.com/Witiko/markdown/blob/main/markdown.dtx#L18070

We could use the ConTeXt buffers [8] to define \startmarkdown, but these
too gobble the trailing spaces, so this does not improve the status quo.

 [8]: https://wiki.contextgarden.net/Command/startbuffer

> (i never needed/use(d) markdown myself so i can only guess here) but
> we can discuss the needs off-luatex-list

And yet, you are included in the copyright line of the Markdown package,
since you contributed to the lunamark parser:

    $ docker run --rm -i witiko/markdown markdown-cli -v
    markdown-cli.lua (Markdown) 2.10.0-64-ge9b5180
    Copyright (C) 2009-2016 John MacFarlane, Hans Hagen
    Copyright (C) 2016-2021 Vít Novotný
    License: LPPL 1.3c

> Hans


> -----------------------------------------------------------------
>                                           Hans Hagen | PRAGMA ADE
>               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
>        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
> -----------------------------------------------------------------