[pdftex] pdftex compression -- proposed addition to manual

Reinhard Kotucha reinhard at kammer.uni-hannover.de
Sun Aug 26 03:38:48 CEST 2001


>>>>> "Ben" == Ben Crowell <crowell01 at lightandmatter.com> writes:

    > I'd like to suggest adding the following to the pdftex manual.

    > [...]

    > pdftex only supports Flate. If your compress_level is set
    > appropriately, pdftex will use Flate compression. Flate
    > compression does a good job of compressing text and line art.

    > For bitmapped images, however, Flate compression isn't enough to
    > produce good compression. If your input images are uncompressed,
    > Flate will compress them somewhat, but not as much as a lossless
    > compression algorithm designed for images. If your input images
    > are in a compressed format such as JPEG, Flate compression does
    > not produce very much improvement.
    > [...]

This is what I would expect.  Recently I converted some jpegs to pdf
because the direct inclusion of jpg-files produced a pdf file that
couldn't be printed.

I was amazed how small the pdf files were compared to the original
jpegs.  Here are the compression ratios I got:

file  1:  32.73 %
file  2: 205.75 %
file  3:  30.11 %
file  4:  29.27 %
file  5:  23.99 %
file  6:  25.76 %
file  7:  29.62 %
file  8:  30.19 %
file  9:  29.75 %
file 10:  29.62 %
file 11:  33.44 %
file 12:  29.18 %
file 13:  30.22 %
file 14:  25.93 %
file 15:  23.48 %
file 16:  23.25 %
file 17:  28.32 %
file 18:  33.20 %
file 19:  24.93 %
file 20:  33.84 %
file 21:  28.74 %
file 22:  28.55 %
file 23: 103.98 %
file 24:  58.20 %

The compression ratio was calculated as
100*filesize(pdf)/filesize(jpg), so values below 100 % mean that the
pdf is smaller than the original jpeg.
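
For anyone who wants to repeat the measurement, here is a minimal
sketch of that calculation in Python.  The function names are
assumptions made for illustration; they are not part of the setup
described in this mail.

```python
import os


def compression_ratio(original_size: int, converted_size: int) -> float:
    """Return 100 * converted_size / original_size, in percent.

    Values below 100 mean the converted file is smaller than the original.
    """
    if original_size <= 0:
        raise ValueError("original size must be positive")
    return 100.0 * converted_size / original_size


def file_ratio(jpg_path: str, pdf_path: str) -> float:
    """Apply compression_ratio to the on-disk sizes of a jpg/pdf pair."""
    return compression_ratio(os.path.getsize(jpg_path),
                             os.path.getsize(pdf_path))
```

Calling file_ratio("photo.jpg", "photo.pdf") (hypothetical file names)
gives the same kind of percentage as in the list above.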

I do not know where all of the jpegs come from, but most of them were
produced by digital cameras.

The conversion was done with the following Makefile:
###
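# Register the suffixes used by the suffix rules below.
.SUFFIXES : .jpg .eps .pdf

# Build one pdf for every jpg in the current directory (needs GNU make).
all : $(patsubst %.jpg,%.pdf,$(wildcard *.jpg))

# jpg -> eps with ImageMagick's convert.
.jpg.eps :
	convert $< $@

# eps -> pdf; epstopdf names the output after the input file.
.eps.pdf :
	epstopdf $<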
###

convert is part of ImageMagick and just puts a wrapper around the
jpeg.  epstopdf does the conversion to pdf by calling ghostscript,
which stores the image with the /DCTDecode filter, as this object from
one of the resulting pdf files shows:

7 0 obj
<</Subtype /Image
/ColorSpace /DeviceRGB
/Width 295
/Height 306
/BitsPerComponent 8
/Filter /DCTDecode
/Length 18663>>stream
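
To check the filters in a pdf without reading the file by hand, a
rough sketch in Python follows.  It is a naive byte scan, not a pdf
parser: it only sees /Filter entries in dictionaries that are stored
uncompressed, and the function name is an assumption made for this
example.

```python
import re
from collections import Counter

# Matches "/Filter /Name" as well as "/Filter [/Name1 /Name2]".
_FILTER_RE = re.compile(rb"/Filter\s*(\[[^\]]*\]|/[A-Za-z0-9]+)")
_NAME_RE = re.compile(rb"/([A-Za-z0-9]+)")


def count_stream_filters(pdf_bytes: bytes) -> Counter:
    """Count how often each filter name follows a /Filter key."""
    counts = Counter()
    for match in _FILTER_RE.finditer(pdf_bytes):
        for name in _NAME_RE.findall(match.group(1)):
            counts[name.decode("ascii")] += 1
    return counts


# The image dictionary shown above, used here as sample input.
sample = (b"<</Subtype /Image\n/ColorSpace /DeviceRGB\n/Width 295\n"
          b"/Height 306\n/BitsPerComponent 8\n/Filter /DCTDecode\n"
          b"/Length 18663>>stream\n")
filters = count_stream_filters(sample)
```

For a real file, read it in binary mode and pass the bytes to
count_stream_filters.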


If I understand your mail correctly, you want to discourage people
from applying further compression to jpeg files.  In theory, if a
compression algorithm were ideal, no other algorithm could compress
its output any further.  The numbers above show that this is clearly
not the case for jpeg.
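
One way to test how much a lossless pass such as Flate gains on a
given file is to deflate its bytes and compare the sizes.  The sketch
below assumes Python's zlib module as a stand-in for the Flate
compression in pdftex; the function name and the two sample inputs are
made up for illustration.  It measures lossless recompression only, so
it does not account for any change a converter makes to the image data
itself.

```python
import random
import zlib


def flate_ratio(data: bytes, level: int = 9) -> float:
    """Return 100 * len(deflated data) / len(data), in percent."""
    if not data:
        raise ValueError("data must not be empty")
    return 100.0 * len(zlib.compress(data, level)) / len(data)


# Illustrative inputs: highly repetitive text, and pseudo-random bytes
# standing in for data that is already well compressed.
repetitive = b"0 0 moveto 100 100 lineto stroke\n" * 500
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(16384))

repetitive_ratio = flate_ratio(repetitive)
noisy_ratio = flate_ratio(noisy)
```

To try it on a real image, read the jpeg in binary mode and pass its
contents to flate_ratio.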

My conclusion is that jpeg compression is far from optimal and that
people should try to apply further compression, at least for online
documents.

Regards,
  Reinhard

-- 
----------------------------------------------------------------------------
Reinhard Kotucha			               Phone: +49-511-751355
Berggartenstr. 9
D-30419 Hannover	              mailto:reinhard at kammer.uni-hannover.de
----------------------------------------------------------------------------
Microsoft isn't the answer. Microsoft is the question, and the answer is NO.
----------------------------------------------------------------------------




