[pdftex] Redundant objects: patch available
Han The Thanh
thanh at informatics.muni.cz
Fri May 4 14:56:46 CEST 2001
> > The file (test-14hpatch.pdf) created with Otfried's patch is even
> > smaller than the PDF file from Adobe. Seems like Adobe has some
> > duplicated resources in their files. Now pdftex is first :-)
>
> Before you jump to conclusions about Adobe's PDF generators: If
> PDFRef.pdf contains duplicated resources, those wouldn't be merged by
> pdftex. My patch stops resources from being embedded more than once,
> it does not actively search for things to merge...
>
> There are several possible reasons why test-14hpatch.pdf is nearly a
> megabyte smaller than PDFRef.pdf. Remember, objects are only copied
> if they are referenced (directly or indirectly) from a page of the
> document.
>
> (1) There could be unused objects.
>
> (2) Document outlines, thumbnails, threads and named destinations
> are not copied. (PDFRef.pdf contains lots of named links, one
> per page, figure, table, etc., occupying a total of about
> 400kB.)
>
> (3) Known Type1 fonts are embedded by pdftex itself, so if the
> document contains extra resources for these, they are not
> copied. This explains why the "ToUnicode" resource of these
> fonts is lost.
>
> (4) The hint tables of linearized PDF are lost (and the output is of
> course not linearized).
>
> (5) FrameMaker might embed additional information for its own use.
>
> (6) There could be a bug in the copying code :-)
>
> I still don't know exactly where the missing megabyte went. I think
> I'll have to write a small tool to compute PDF statistics (how many
> bytes in what kind of data).
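
The copying rule quoted above (only objects referenced, directly or
indirectly, from a page are copied) amounts to a graph traversal. Below
is a minimal sketch on a made-up object graph; it is not pdftex's actual
code. The function name, the dictionary layout and the toy object
numbers are all assumptions for illustration. Because each object is
visited once, a font shared by two pages is also copied only once.

```python
def reachable_objects(objects, roots):
    """Return the ids of all objects reachable from the given roots.

    `objects` maps an object id to the list of ids it references.
    (Hypothetical representation, not pdftex's internal one.)
    """
    seen = set()
    stack = list(roots)
    while stack:
        obj_id = stack.pop()
        if obj_id in seen or obj_id not in objects:
            continue  # already copied, or a dangling reference
        seen.add(obj_id)
        stack.extend(objects[obj_id])
    return seen


# Toy document (assumed numbers, for illustration only).
toy_objects = {
    1: [2, 3],  # page 1 -> content stream 2, font 3
    2: [],      # content stream
    3: [4],     # font -> font descriptor 4
    4: [],      # font descriptor
    5: [3],     # page 2, shares font 3 with page 1
    6: [7],     # outline root -> outline item 7
    7: [1],     # outline item pointing at page 1
    8: [],      # unused object
}
pages = [1, 5]

copied = reachable_objects(toy_objects, pages)
skipped = sorted(set(toy_objects) - copied)
print(sorted(copied))  # [1, 2, 3, 4, 5]
print(skipped)         # [6, 7, 8]
```

The outline objects (6, 7) and the unused object (8) are never reached
from a page, so they are dropped. In a real PDF a page also points back
to the page tree via /Parent, so an actual implementation would have to
avoid following such back-references.
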
The PDF spec is quite a `rich' PDF with a lot of outlines, annotations and
the like. I think these elements can occupy quite a lot of space
(400KB seems too small).
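
One way to check such figures would be a small statistics tool of the
kind mentioned above. The following is only a crude sketch under several
assumptions: it scans the raw bytes for "N G obj ... endobj" spans and
groups them by the /Type name in the object's dictionary. The function
name and the sample data are made up, objects without a /Type entry are
lumped together, and binary stream data that happens to contain "endobj"
would confuse it.

```python
import re
from collections import Counter

# "N G obj ... endobj" spans; non-greedy so each object is matched separately.
OBJ_RE = re.compile(rb"\b(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj\b", re.DOTALL)
# The /Type entry of a dictionary, e.g. "/Type /Outlines".
TYPE_RE = re.compile(rb"/Type\s*/([A-Za-z0-9]+)")


def pdf_byte_stats(data):
    """Return a Counter mapping /Type name -> total bytes of such objects."""
    stats = Counter()
    for match in OBJ_RE.finditer(data):
        body = match.group(3)
        # Only inspect the part before any stream data.
        header = body.split(b"stream", 1)[0]
        found = TYPE_RE.search(header)
        kind = found.group(1).decode("ascii") if found else "(untyped)"
        stats[kind] += len(match.group(0))
    return stats


# Tiny hand-written sample (not a complete, valid PDF file).
sample = (
    b"%PDF-1.3\n"
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R /Outlines 3 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n"
    b"3 0 obj\n<< /Type /Outlines /Count 0 >>\nendobj\n"
    b"4 0 obj\n<< /Length 5 >>\nstream\nhello\nendstream\nendobj\n"
)

stats = pdf_byte_stats(sample)
for kind, nbytes in stats.most_common():
    print(f"{kind:12s} {nbytes:6d} bytes")
```

For a real file one would read the bytes with open(path, "rb").read()
and pass them to pdf_byte_stats. A proper tool would parse the
cross-reference table instead of pattern matching.
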
Regards,
Thanh