[pdftex] Caching intermediate compilation results for near-real-time PDF re-renders during editing

Jim Diamond jdiamond at acadiau.ca
Fri Jan 14 18:01:22 CET 2022


Ross,

It seems your mail program clobbers the quoting in the plain text
part.

All,

At the cost of incorrect quoting below, I'll carry on with the email as-is.


On Fri, Jan 14, 2022 at 14:06 (+1100), Ross Moore wrote:

> Hi Jim, Karl and others.

> From: Jim Diamond via pdftex <pdftex at tug.org>
> Date: 14 January 2022 at 12:26:27 pm AEDT
> To: Karl Berry <karl at freefriends.org>, pdftex at tug.org
> Subject: Re: [pdftex] Caching intermediate compilation results for near-real-time PDF re-renders during editing
> Reply-To: Jim Diamond <jdiamond at acadiau.ca>

> Hi all,

> On Thu, Jan 13, 2022 at 16:36 (-0700), Karl Berry wrote:

> Hi Jamie - thanks for the interesting message. Thanh could say more, but
> FWIW, here are my reactions (in short, "sounds impossible").

> That statement may be true in one very limited sense only.
> In typical real-world documents, anything that occurs anywhere
> can have an effect on any other page of the final PDF.

This is true.  But see below.

> I recognize that you are employing hyperbole for effect here.  But
> thinking about the OP's question, I wonder... just how many variables
> are in play after a shipout?

> A shipout of a page does *not* mean that what comes afterwards is like
> a whole separate stand-alone document.

> To process what comes next still relies on everything that has been set up
> earlier, in terms of how macros will expand, or even what is defined.
> Think about targets of cross-references, citations, hyperlinks, etc.

Good point.

> There is no finite set of “variables” whose values can be saved.

Surely the collection of registers, macros and other objects
defining the state of the computation after a shipout is finite.

> You would need a snapshot of a portion of the memory,
> as well as a way to sensibly make use of it.

Which gets us back to the OP's question.
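
To make that a little more concrete, here is a rough sketch (Python used
purely as pseudo-code) of what one per-page snapshot might have to hold.
All of the names below are made up for illustration; the real state lives
in the engine's internal data structures (eqtb, token memory, string pool,
and so on) and would have to be serialized from there.

    # Purely illustrative: none of these names exist in pdfTeX.  A real
    # checkpoint would have to serialize the engine's actual memory after
    # each \shipout, not a tidy little record like this.
    from dataclasses import dataclass, field

    @dataclass
    class PageCheckpoint:
        page_number: int                                # last page shipped out
        registers: dict = field(default_factory=dict)   # \count, \dimen, \skip, \box, ...
        macros: dict = field(default_factory=dict)      # control sequences and their meanings
        pending_material: dict = field(default_factory=dict)  # marks, insertions, held-over floats
        aux_offsets: dict = field(default_factory=dict) # how far into .aux/.toc/... we had written
        source_position: tuple = ("", 0)                # (file, line) where input resumed after shipout

    # One of these would be recorded after every page of a full run:
    checkpoints: list[PageCheckpoint] = []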


> Suppose a small change is then made to the latex source, such that the
> compiler determines this change would first affect page k.

> I can't imagine how that could be determined without retypesetting the
> entire document.

> Agreed.
> Theoretically, it is like the Halting problem for Turing machines.
> While the output is sometimes predictable, in general
> you can only know what a program will do by running it.
> And will it even stop? ... allowing you to examine the complete output
> that it will produce?  In general, NO.

I'd argue that solving the halting problem is sufficient but not
strictly necessary: we don't need to predict in advance what a change
will affect, only to re-run from some earlier point and observe what it
does.

> Currently, various editors and document viewers can do lookups and
> reverse lookups so that one can go from a point in the source doc to
> the corresponding line in the compiled document, and vice versa.

> There is no way to predict what any given change will affect.

Perhaps not,  reports of the features of BaKoMa TeX (which I have
never used) notwithstanding.

> Is that really true?  ***Except for multi-page paragraphs (bah!),
> tables and similar***, is it possible for a change on page N of a
> document to affect anything before page N-1?

> Absolutely.
> That’s what the  .aux  file is typically used for.
> And many packages write their own auxiliary files, so that extra
> information is available to possibly affect any page N-k
> (for any value of k) on the next processing run.

It is true that if a change to the source causes the .aux (and
similar) files to change, then any or all of the document might change
the next time it is compiled.  (For example, a \pageref early in the
document whose target moves from page 99 to page 100 becomes one digit
wider, which can change line and page breaks on that early page.)
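
In practice this is essentially how wrapper tools such as latexmk decide
when to stop: re-run until the auxiliary files stop changing.  A minimal
sketch of that rule (the file names, extensions and checksum choice are
just illustrative, not latexmk's actual code):

    import hashlib
    import pathlib
    import subprocess

    def digest(path: pathlib.Path) -> str:
        """Checksum of an auxiliary file; a missing file hashes to the empty string."""
        return hashlib.md5(path.read_bytes()).hexdigest() if path.exists() else ""

    def compile_until_stable(tex_file: str, aux_exts=(".aux", ".toc", ".lof"), max_runs=5):
        """Re-run pdflatex until the auxiliary files stop changing (or we give up)."""
        base = pathlib.Path(tex_file).with_suffix("")
        aux_files = [base.with_suffix(ext) for ext in aux_exts]
        before = [digest(p) for p in aux_files]
        for _ in range(max_runs):
            subprocess.run(["pdflatex", "-interaction=nonstopmode", tex_file], check=True)
            after = [digest(p) for p in aux_files]
            if after == before:      # nothing fed back through the aux files changed
                return
            before = after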

But should we conclude that we can't do anything useful here,
following the OP's question?  I think we can.  (See below.)


> (While I can see change
> to a word in a continued paragraph on page N changing the typesetting
> of that paragraph, and thus possibly how page N-1 should be laid out,
> can page N-2 be affected?)

> This kind of thing is simple enough, but still not always.
> If the editing forces something onto the next page,
> the overall effect can get more and more complicated.

<snip>


> I’ve been a TeX user since the late 1980s.
> In that time the speed of processing has increased considerably – due mostly
> to the speed of the (laptop or desktop) computer doing the processing.

> Today we do things that would have been inconceivable back last
> century, precisely because of the greater speed and available
> memory.  This growth is known as Moore’s Law – though not due to me,
> nor any known relative.

<snip>

> I believe very strongly in the benefits of ultra-fast recompilation

> If your jobs are not compiling quickly enough for you, then the best option
> could well be to update your hardware, rather than fiddle with
> the fundamental design of the software.

I've been a TeX user since the early 1980s.  Yes, things have sped up
considerably.  However, to be blunt, I think the suggestion "get
faster hardware" is a bit on the obnoxious side.  While the OP may
have more money than God, I have heard that there are lots of people
on the planet with limited financial means, and some of them may have
to do their computing with a low-end Raspberry Pi or even something
less capable.  (And, IMHO, people buying into the "just get
more/faster hardware" mantra is why we need 8 GB of RAM to look at a
web page.)


Anyway, for what it's worth, here is a thought on how compilation could
be sped up to help someone quickly preview their documents.

There could be two types of "re-compilation":
(1) A full (re-)compilation, perhaps running pdf(la)tex the usual
    number of times to ensure all the ToC entries, cross references,
    and so on are done correctly.
    These runs (only the last one is really relevant) could save
    whatever computation state is needed at the end of each page.
    Further, similar to synctex, the run could record a correspondence
    between source locations and PDF locations.
(2) A "best-effort", "fast" re-compilation could look at where in the
    source the first change is since the last "full" (re-)compilation;
    the editor would have to keep track of this.  Suppose the point of
    change was found at page N in the most recent full re-compilation.
    Recognizing that this change *might* have affected previous pages,
    it could boldly load the saved state of the universe at the end of
    page N-1 and carry on the compilation from there.  (A rough sketch
    of this decision logic follows the list.)
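
To make (2) a bit more concrete, here is a rough sketch of the decision
an editor-side driver might make.  To be clear, all of it is
hypothetical: the saved per-page states, the line-to-page map and
resume_from all assume engine support that pdfTeX does not currently
have.

    # Hypothetical "fast path" driver for scheme (2) above.  Assumes the last
    # full run produced (a) one saved engine state per shipped-out page and
    # (b) a synctex-like map from source lines to the page they landed on.

    def fast_recompile(changed_line: int,
                       line_to_page: dict[int, int],
                       checkpoints: list) -> None:
        # Page on which the first edited source line appeared in the last full run.
        n = line_to_page.get(changed_line, 1)

        if n <= 1:
            full_recompile()             # nothing earlier to resume from
            return

        # Boldly assume pages 1 .. N-1 are unaffected: restore the state saved
        # just after page N-1 was shipped out and typeset onward from there.
        state = checkpoints[n - 2]       # checkpoints[0] = state after page 1
        resume_from(state, first_page=n)

    def full_recompile() -> None:
        ...                              # the usual multi-pass pdflatex run

    def resume_from(state, first_page: int) -> None:
        ...                              # would need new engine support

The hard part, of course, is hiding inside resume_from: serializing and
restoring enough of the engine's memory that typesetting can continue as
though the run had never stopped.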

This best-effort compilation would allow the user to quickly see how
local changes look, at the cost of the user needing to recognize that
a full re-compilation may change things dramatically.  But in many
cases, this might be enough to make a user happy.

Jamie: having said all that, I would guess the development effort to
do this would be considerable for someone who thoroughly knows the
internals of the TeX engine, and perhaps a tremendous effort for
someone starting from scratch.  I'd further guess that unless some grad
student out there finds this to be an interesting project, and his or
her supervisor thinks it can be turned into a thesis and/or publishable
material, it is unlikely to happen, even though what I described may be
(far?) less than what BaKoMa TeX does anyway; the author of that has
undoubtedly given it a *lot* more thought than I have.


Cheers.
                                Jim

