[pdftex] redundant objects with includegraphics

Otfried Cheong otfried at cs.uu.nl
Tue May 1 17:32:24 CEST 2001


 > >There is a variant of \pdfximage that would include a PDF file and
 > >create XObjects for each of its pages.  These XObjects are numbered
 > >consecutively: \pdflastximage returns the number of the first page,
 > >and \pdflastximagepages (or so) returns the number of pages read.
 > >
 > >The TeX document can now use the pages of the document using
 > >\pdfrefximage (with a number in the range \pdflastximage
 > >.. \pdflastximage + \pdflastximagepages - 1).  
 > >
 > >All resources used by ALL PAGES of the PDF file would be included in
 > >the pdftex output either immediately by \pdfximage or when the first
 > >page is referenced.  
 > 
 > But, if only one page is needed, then _all_ resources will be
 > included instead of perhaps ten percent.

No, because the user (or macro package) would use "\pdfximage page 17
{a.pdf}" instead of the proposed variant "\pdfximage page * {a.pdf}"
to include a single page.

The tricky case is 2 .. n-1 pages, where the user has a choice
between including them separately (with the risk of duplicated
resources), or all at once (with the risk of including unused
resources).

Clearly both problems can be fixed with a postprocessor.  One might
even argue that my proposed solution works better with a
postprocessor, since it is easier to identify unused resources than to
remove duplicates (duplicates are not necessarily literally identical:
think about indirect references inside resources).
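
To illustrate why unused resources are the easier problem, here is a
rough Python sketch.  It models a PDF file as a hypothetical dictionary
from object number to the list of object numbers it references (a
strong simplification of real PDF objects; the names "reachable" and
"unused" are mine, not anything in pdftex), and finds unused objects by
a plain reachability walk from the pages that are actually used.

```python
def reachable(objects, roots):
    """Return the set of object numbers reachable from roots.

    objects maps an object number to the object numbers it references
    (a toy stand-in for indirect references in a PDF file).
    """
    seen = set()
    stack = list(roots)
    while stack:
        num = stack.pop()
        if num in seen:
            continue
        seen.add(num)
        stack.extend(objects.get(num, []))
    return seen


def unused(objects, roots):
    """Objects that no used page refers to, directly or indirectly."""
    return sorted(set(objects) - reachable(objects, roots))


# Toy file: pages 1 and 2 share font 10, page 3 uses font 11.
# Each font refers to its descriptor (20, 21) indirectly.
objects = {1: [10], 2: [10], 3: [11], 10: [20], 11: [21], 20: [], 21: []}

# Only pages 1 and 2 are used, so page 3 and its resources are unused.
print(unused(objects, roots=[1, 2]))   # [3, 11, 21]
```

Duplicates are harder in this model: two separately included copies of
the same font would show up as, say, 10: [20] and 12: [22].  The
entries differ literally, and only a recursive comparison of what they
refer to would show that they are the same.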

If one were really ambitious, one could extend my suggestion to a
format that would handle arbitrary subsets of pages, such as:

\pdfximage page 17 {a.pdf}
\pdfximage page 17-42 {a.pdf}
\pdfximage page 1-12,24,40-42 {a.pdf}
\pdfximage page * {a.pdf}
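
To make the subset syntax concrete, here is a minimal Python sketch of
how such a page specification might be expanded into page numbers.  The
function name parse_page_spec and its error handling are assumptions
for illustration only; nothing like this exists in pdftex.

```python
def parse_page_spec(spec, page_count):
    """Expand a page spec such as "1-12,24,40-42" or "*" into a sorted
    list of 1-based page numbers (hypothetical helper, not pdftex code).
    """
    spec = spec.strip()
    if spec == "*":
        # "*" selects every page of the file.
        return list(range(1, page_count + 1))
    pages = set()
    for part in spec.split(","):
        # "17" is a single page, "17-42" an inclusive range.
        first, sep, last = part.strip().partition("-")
        lo = int(first)
        hi = int(last) if sep else lo
        if not 1 <= lo <= hi <= page_count:
            raise ValueError(
                f"bad page range {part!r} for a {page_count}-page file")
        pages.update(range(lo, hi + 1))
    return sorted(pages)
```

With a 50-page a.pdf, the spec "1-12,24,40-42" would expand to 16
pages, and overlapping ranges are merged, so no page is listed twice.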

 > I agree with Reinhard, that a post processor can do a better job:
 > [...]

Clearly such a postprocessor would be a powerful tool, and we'll all
be happy if this mythical volunteer with lots of free time comes to
our rescue. (If any PDF-interested student at our department reads
this - come to me to discuss a graduation project :-)

But not everything that can be done by a postprocessor needs to be
deferred to that stage.  Pdftex produces excellent PDF as it is, and
has several features that could have been left to a postprocessor
(compression is a prime example). 

My suggested extension to \pdfximage can be implemented with very
little effort.  In fact, I just looked at the pdftex sources and found
that the suggested command "\pdflastximagepages" to retrieve the number
of pages in an included PDF file already exists!  It works fine on
14h-pretest-20010310, but is not yet in the pdftex manual.

So, if people agree that my suggested extension to \pdfximage is
useful, and nobody more qualified takes it up, I will go and try to
implement it in pdftex.  I guess I could do this in a day, while a
postprocessor looks like a bit more work :-)

Best wishes,
  Otfried Cheong




