<br><br><div class="gmail_quote">On Sun, Mar 8, 2009 at 10:01 AM, Taco Hoekwater <span dir="ltr">&lt;<a href="mailto:taco@elvenkind.com">taco@elvenkind.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<br>
Hi,<div class="im"><br>
<br>
Scott Pakin wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Karl Berry wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
I also have an idea to propose. In short, the idea is a parallel<br>
version of TeX. My plan is to use OpenMP for parallelization. It<br>
should make TeX run faster without much more complication in the<br>
code base. Please give me some thoughts.<br>
<br>
The basic TeX code is monolithic and relies almost entirely on global<br>
variables. There have been some discussions of parallelizing certain<br>
small parts of the code, such as the output routine, but even this is<br>
fraught with extreme difficulty. I am not sure what can be usefully<br>
parallelized.<br>
</blockquote>
<br>
While it's true that TeX is monolithic and relies almost entirely on<br>
global variables, the real problem for parallelism as I see it is<br>
that TeX implements an extremely sequential state machine. Each<br>
character must be read and processed before proceeding with the next<br>
character. This is because earlier processing can affect later<br>
processing (cf. \catcode). One can't even process pages in<br>
parallel because it is not known in advance where the page breaks<br>
will wind up.<br>
</blockquote>
<br></div>
After a lot of talk and discussion with lots of people, I have<br>
collected some ideas where parallelization seems possible,<br>
at least in theory (I am talking about pdftex|luatex. None of<br>
this applies to tex82, and I don't know enough about xetex).<br>
<br>
<br>
A list of ideas, more or less in descending order of potential<br>
gain and increasing order of implementation complexity:<br>
<br>
* The backend can run in a separate thread. This is the big one:<br>
I estimate that up to a 20% speedup is possible. All other<br>
points below will score lower than this except for specially<br>
crafted documents (and a lot lower, like maybe 2-3% gain in<br>
'good' cases).<br>
<br>
* Some types of images require conversion or extensive parsing<br>
(like inclusion of pdf pages). These could be prefetched by a<br>
separate thread.<br>
<br>
* Font subsetting at the end of the run could happen in parallel<br>
for multiple fonts.<br>
<br>
* Processing can usually continue after a \font\cs assignment. And even<br>
if the font turns out to be needed immediately, at least the guard<br>
code can be simple: you only have to block the \cs and \fontdimen<br>
access to it.<br>
<br>
* The paragraph builder typically does two runs over a paragraph:<br>
first without hyphenation, then with. That second run is not always<br>
needed, but it could be done in a second thread in any case, thus<br>
saving time when it was indeed necessary.<br>
<br>
* Input reading can continue during the display math list<br>
conversion, because the result of that conversion is not needed<br>
until the next display or \par or one of a few primitives<br>
(\unskip, \prevgraf, and maybe a few more) is encountered.<br>
<br>
* The same may be true for input reading during paragraph breaking,<br>
but there the list of blocking events is much longer.<br>
<br>
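The speculative second pass from the paragraph-builder item above can be sketched as follows.<br>
This is a toy in Python, not pdftex or luatex code: break_lines and build_paragraph are<br>
hypothetical names, the greedy breaker and the fixed-size "hyphenation" are stand-ins for the<br>
real paragraph builder, and a thread pool is assumed in place of an engine's own threading layer.<br>
<pre>
```python
from concurrent.futures import ThreadPoolExecutor


def break_lines(words, width, hyphenate):
    """Toy greedy breaker (hypothetical stand-in for the paragraph builder).

    Returns a list of lines, or None when a word is wider than the line
    and hyphenation is switched off. Assumes width is at least 2.
    """
    lines, current = [], ""
    for word in words:
        pieces = [word]
        if len(word) > width:
            if not hyphenate:
                return None  # this pass fails, as pass one may in TeX
            # Naive "hyphenation": cut into chunks that fit, adding "-".
            step = width - 1
            chunks = [word[i:i + step] for i in range(0, len(word), step)]
            pieces = [c + "-" for c in chunks[:-1]] + [chunks[-1]]
        for piece in pieces:
            candidate = piece if not current else current + " " + piece
            if len(candidate) > width:
                lines.append(current)
                current = piece
            else:
                current = candidate
    if current:
        lines.append(current)
    return lines


def build_paragraph(words, width):
    """Run pass one here while pass two runs speculatively in a thread."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Start the hyphenating pass before knowing whether it is needed.
        second = pool.submit(break_lines, words, width, True)
        first = break_lines(words, width, False)
        if first is not None:
            second.cancel()  # no effect if already running; result is dropped
            return first, "first pass"
        return second.result(), "second pass"


if __name__ == "__main__":
    print(build_paragraph(["a", "bb", "ccc"], 5))
    print(build_paragraph(["extraordinarily", "long"], 6))
```
</pre>
The hyphenating pass is always started here and its result is dropped when the first pass<br>
succeeds, so the speculative work is wasted in the common case. In standard CPython builds with<br>
the global interpreter lock the two passes do not truly overlap for CPU-bound code, so this<br>
shows the control flow only, not a speedup.<br>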
<br>
The problem: I fear that for someone not intimate with tex-like<br>
engines and their complex build systems, the simplest of these<br>
(image prefetching or font subsetting) will take the full GSoC period,<br>
and may not even get done in that time. Worse still, we could end up<br>
with some patch that only works on Linux, nowhere else. I definitely<br>
cannot mentor such a project because it would in all likelihood eat<br>
up so much time that I could do nothing else in the period.<br>
<br>
Best wishes,<br><font color="#888888">
Taco<br>
<br>
<br>
</font></blockquote></div>Thank you for your thoughts. It seems not to be worth the effort, since the best case is only a 20% speedup.<br><font color="#888888"><br>Kittipat</font>