[pdftex] understanding "split.tex"

Laurent Siebenmann Laurent.Siebenmann at math.u-psud.fr
Thu Oct 17 13:24:41 CEST 2002


Dear Michael Chapman,  

You wrote (Wed, 9 Oct 2002):

> I am probably missing something very basic, but is 
> there a method that will convert BOOK.pdf to 
> PAGE_1.pdf, PAGE_2.pdf, ...... , PAGE_N-1.pdf, 
> PAGE_N.pdf ? 

Indeed. Heiko Oberdiek and Andreas Matthias have provided a
clever pdfTeX script to accomplish this (search for \pdfximage
in the October list transactions). Blinded by science, I
asked for a bit more explanation of how *pdfTeX primitives*
contribute to that solution:

> ... Why not give *both* [working code *plus* some explanation] 
> at moderate extra cost?

I have been left to fulfill my own request.  Here goes. (At
moderate cost!)

As documented, pdfTeX primitives ending in "ximage" allow
one to pull out as an image any whole page from any named PDF
file. So it is not necessary to go back to the TeX source (the
obvious last resort).

On the other hand, pdfTeX, like classical TeX, is limited to
one binary output file per job; so we *do* need to run as many
jobs as there are pages in BOOK.pdf.  Here is (essentially)
Heiko's job file for page 7, with my comments.

 %%% PAGE_7.tex for Plain TeX to produce PAGE_7.pdf %%%
 \immediate\pdfximage\space page 7 {BOOK.pdf}%
 %% ^ Embed the page 7 data into PAGE_7.pdf
 %% and put the identifying number into the integer register
 %% \pdflastximage, but without displaying the page as yet.
 \setbox0=\hbox{\pdfrefximage\pdflastximage}%
 %%  ^ Pop the page into a TeX box register using the
 %% image placement command \pdfrefximage 
 \pdfpagewidth=\wd0 \pdfpageheight=\ht0 %% obviously!
 \pdfhorigin=0pt \pdfvorigin=0pt
 %% ^ so top left of next box shipped out 
 %% will go to TeX position (0,0)
 \shipout\box0 %% create the unique output page
 \end

Classical TeX programming can quickly produce all these TeX 
page job files, except for one minor detail: finding the number 
of pages in BOOK.pdf. For that, use \pdfximage{BOOK.pdf},
which puts the page count into the *undocumented*(!?) integer
register \pdflastximagepages.
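
As a minimal sketch of that "classical TeX programming" (my own
illustration, not Heiko's code; the file name makejobs.tex and the
names \pageno, \lastpage and \job are invented for the example),
the following Plain pdfTeX job reads the page count and then
writes one job file per page:

 %%% makejobs.tex for Plain pdfTeX: writes PAGE_1.tex ... PAGE_N.tex %%%
 \pdfoutput=1 %% make sure we are in PDF mode (assumed necessary)
 \pdfximage{BOOK.pdf}%
 %% ^ opens BOOK.pdf; the page count lands in \pdflastximagepages
 \newcount\pageno \newcount\lastpage
 \lastpage=\pdflastximagepages
 \message{BOOK.pdf has \the\lastpage\space page(s)}%
 \newwrite\job
 \pageno=0
 \loop \ifnum\pageno<\lastpage
   \advance\pageno by 1
   \immediate\openout\job=PAGE_\the\pageno.tex
   %% \string keeps each primitive from being expanded by \write
   \immediate\write\job{\string\immediate\string\pdfximage
     \space page \the\pageno\space{BOOK.pdf}}%
   \immediate\write\job{\string\setbox0=\string\hbox
     {\string\pdfrefximage\string\pdflastximage}}%
   \immediate\write\job{\string\pdfpagewidth=\string\wd0
     \string\pdfpageheight=\string\ht0 }%
   \immediate\write\job{\string\pdfhorigin=0pt \string\pdfvorigin=0pt }%
   \immediate\write\job{\string\shipout\string\box0 }%
   \immediate\write\job{\string\end}%
   \immediate\closeout\job
 \repeat
 \end

Running "pdftex makejobs.tex" should produce no PDF of its own
(no page is shipped out), only the job files, each of which is
then run separately as described above.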

To avoid running all these files individually, one wants to
have instead one "superjob".  Heiko and Andreas manage
this feat (for unix) using the recent *system-dependent*
\write18 TeX extension (suggested by Knuth I believe), which,
roughly speaking, writes to the system command line.  But that
has nothing to do with the primitives that pdfTeX introduced...
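
For orientation only, here is a rough sketch of what such a
superjob could look like. It is not Heiko's split.tex. It assumes
that the job files PAGE_1.tex ... PAGE_N.tex already exist in the
current directory, that the program is called "pdftex" and is on
the search path, and that \write18 has been enabled (on web2c
based systems, by the -shell-escape command line option). The
names superjob.tex and \pageno are invented for the example.

 %%% superjob.tex for Plain pdfTeX %%%
 %%% run as:  pdftex -shell-escape superjob.tex
 \pdfoutput=1 %% make sure we are in PDF mode (assumed necessary)
 \pdfximage{BOOK.pdf}%
 %% ^ only needed here to set \pdflastximagepages
 \newcount\pageno
 \pageno=0
 \loop \ifnum\pageno<\pdflastximagepages
   \advance\pageno by 1
   \immediate\write18{pdftex PAGE_\the\pageno.tex}%
   %% ^ hands "pdftex PAGE_n.tex" to the system; the child job
   %% makes PAGE_n.pdf before the loop continues
 \repeat
 \end

If \write18 is disabled, the loop still runs but no child jobs are
started, so no PAGE_n.pdf files appear.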

Now you have the explanations I requested, and are more ready
to understand Heiko's perfected splitting program "split.tex"
of Thu Oct 10 17h. Enjoy!

     Laurent Siebenmann

Question. Where does one find the clearest specs for \write18?



