[XeTeX] xdvipdfmx, page subsets, pgf, transparency

Mon May 16 14:57:14 CEST 2011

On Mon, 16 May 2011, Heiko Oberdiek wrote:
> take some actions arising specials. But these actions might
> have effects on the whole document. Sometimes this is good
> to avoid missing object declarations and other things. But sometimes
> extra stuff or wrong stuff is added because of pages that the user
> has explicitly excluded.

Thanks for investigating this.  It's a complicated and pretty obscure
problem that would only affect a few users, so I appreciate your paying
attention.

I excluded the *page*.  I didn't exclude the *special*.  I think it's a
bug because A. the software detects that it is writing incorrect output,
and B. it gives me no way to get correct output.  There's no way to
include the special except by including the page.

> > that won't meet your needs instead!" responses:  The small PDFs will be
> > used with the pdfpages package in a second run of XeLaTeX to generate PDFs
> > for printing the large document as a multi-volume set of books.
>
> The purpose of the second step is not clear to me:
> The final result consists of several PDF files, one for each book?
> And the first XDV file is just the contents of the books and perhaps
> some pages are reused for each book, thus that the second run
> is only used for putting pages together?

The second run of XeTeX also adds some additional graphics, most notably
thumb-tabs along the sides of the pages, and it changes the page size to
accommodate that, so that the tabs will bleed all the way off the paper
edge when printed and trimmed.  It's not purely a page-subsetting
operation.  And I'd rather not make it purely a page-subsetting operation
(by adding the thumb-tabs and changed size to the original large file)
because that would mean generating two large files (one with thumb-tabs
and one without; I also have a use for the large file in its current form)
and the generation of the large file can't be parallelized (it's three
long XeTeX runs that must be done sequentially on a single CPU).  The way
I'm currently doing it means much of the work can be performed
simultaneously on multiple CPUs and I can do at least some of the testing
on just a single volume without having to generate the whole thing every
time.

> Then I would suggest writing a program that deals with the XDV file:
> a) splitting the single master .xdv file into the book .xdv files
> b) analyzing the specials to add the missing ones to the .xdv file.

I'll do this if forced to, but since xdvipdfm is documented as being able
to generate a page subset, I'd like to use it for its documented behaviour.

> If I have understood the second XeTeX run, then this step wouldn't
> be necessary, saving you much time.

Unfortunately, I don't think it'll be trivial to eliminate the recombining
step, because it does more than pasting the small PDFs together.  I'll
continue playing with it, though; there may be a way to either eliminate
the pasting step or use something faster than the pdfpages package.
Right now the pdfpages package seems to be the real trouble spot for
speed.  If there is something similar to it that could take pages from an
XDVI file then I could eliminate the intermediate run of xdvipdfm, but
then I'd still need to either do the page subsetting outside of XeTeX, or
have the thing that pastes together the XDVI files also do page subsetting
and not have the same problem with specials that xdvipdfm has.

I proceeded with the "dummy image on an early page" workaround and that
seems to work pretty well.  I put on an early page a TikZ image consisting
of a semi-transparent white circle, which is invisible against the white
page background.  That causes the "pgfopacities" object to be emitted for
that page.  Then I added that page to all the subsets generated with
xdvipdfm, and had all my invocations of pdfpages include pages starting
with the second page of each subset, instead of all pages.
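
For anyone who wants to try the same thing, here is a rough sketch of the
idea, not my actual files.  The picture contents, the circle size, and the
file name volume1.pdf are placeholders made up for illustration; the part
that matters is that the dummy picture sets an opacity on an early page.

```latex
\documentclass{article}
\usepackage{tikz}
\begin{document}

% Dummy first page: a white circle on a white page, so nothing is visible,
% but the opacity setting is still used and its object gets emitted here.
\begin{tikzpicture}
  \fill[white, opacity=0.4] (0,0) circle (1pt);
\end{tikzpicture}
\clearpage

% Real content follows; this picture stands in for the actual images.
\begin{tikzpicture}
  \fill[blue, opacity=0.4] (0,0) rectangle (2,1);
\end{tikzpicture}

\end{document}

% In the second-run document (with \usepackage{pdfpages} loaded), skip the
% dummy page of each subset.  volume1.pdf is a hypothetical file name:
%   \includepdf[pages=2-]{volume1.pdf}
```

The "pages=2-" range asks pdfpages for page 2 through the last page, so the
dummy page rides along in every subset but never reaches the printed
volumes.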

One small gotcha is that there apparently is a separate "pgfopacities"
object generated for each distinct numerical value of opacity.  My actual
images used "opacity=0.4" and the dummy image had to use that value too -
an earlier attempt in which the dummy image used "opacity=0.1" generated a
separate object that wasn't helpful in eliminating the error.  It seems
ridiculous to me that a single small number is a thing that must be
created as a callable object and referred to by a name much longer than
the number itself, instead of being included literally when it's used - I
can't see how this indirection saves any time or space - but of course
that's a PGF issue, nothing to do with xdvipdfm.
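
One way to keep the dummy image and the real images from drifting apart
would be to put the opacity value in a single TikZ style and use that
style everywhere.  This is only a sketch under that assumption; the style
name "shared opacity" is invented here and is not something PGF provides.

```latex
\documentclass{article}
\usepackage{tikz}

% Hypothetical style: the one place where the opacity value is written down.
\tikzset{shared opacity/.style={opacity=0.4}}

\begin{document}

% Dummy page, using the shared value.
\begin{tikzpicture}
  \fill[white, shared opacity] (0,0) circle (1pt);
\end{tikzpicture}
\clearpage

% A real image, using the same shared value.
\begin{tikzpicture}
  \fill[red, shared opacity] (0,0) rectangle (2,1);
\end{tikzpicture}

\end{document}
```

If a document used several different opacity values, the dummy page would
presumably need one invisible shape per value, since each value seems to
get its own object.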
-- 
Matthew Skala
mskala at ansuz.sooke.bc.ca                 People before principles.
http://ansuz.sooke.bc.ca/