[pdftex] pdftex compression -- proposed addition to manual

Ben Crowell crowell01 at lightandmatter.com
Mon Aug 20 14:03:08 CEST 2001

  Thanks, Siep Kroonenberg, for your comments! I took the liberty
  of incorporating them more or less verbatim into the proposed
  addition to the manual. Is this OK with you?

  However, this just makes me wish more for an answer to the question
  I previously posed about what pdflatex actually does with PNG
  images, since, AFAICT from the Adobe docs, PNG can't be incorporated
  in PDF without converting to some other format such as JPEG, CCITT,...

  Since the people who maintain the manual haven't responded, I'm
  wondering if this list is even the right place to discuss it. I couldn't
  find their e-mails on the manual's webpage
  (http://www.tug.org/applications/pdftex/) before, but now I notice that
  their addresses are given inside the manual. Do they read this list, or
  do I need to e-mail them?

  	proposed addition to manual

  If you want to make PDF files with their compression tuned
  precisely for your purposes, then you'll need to understand some of
  the technical details of the PDF format given below. If you want
  to skip the complexities and just produce reasonably well compressed
  output files, there are two main things you should know.
  First, you should check that compress_level is set to 9.
  Second, you should prepare bitmapped graphics input files
  in a compressed format (typically JPEG or PNG) that makes an appropriate
  tradeoff between compression and image quality; pdftex retains
  the compression of the input image, but doesn't do much further
  compression.

  PDF format has some generic lossless compression capabilities.
  Old versions of the format only allowed the LZW compression
  algorithm, which is patent-encumbered. Newer versions also allow
  the use of the Flate algorithm. Because of the patent issues,
  pdftex only supports Flate. If your compress_level is set
  appropriately, pdftex will use Flate compression. Flate
  compression does a good job of compressing text and
  line art.
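
  As a rough illustration of the point about text, here is a minimal
  Python sketch using the standard zlib module, which implements the
  same Flate (deflate) algorithm. The sample string is a made-up
  stand-in for a page content stream, and the idea that zlib's levels
  0-9 correspond to compress_level is an assumption here, not
  something checked against the pdftex sources.

```python
import zlib

# Hypothetical sample: repetitive text standing in for a PDF page content stream.
line = "BT /F1 10 Tf 72 720 Td (The quick brown fox jumps over the lazy dog.) Tj ET\n"
text = (line * 200).encode("ascii")


def flate_sizes(data, levels=(0, 1, 6, 9)):
    """Return {level: compressed size in bytes} for each zlib level."""
    return {level: len(zlib.compress(data, level)) for level in levels}


sizes = flate_sizes(text)
for level, size in sizes.items():
    print(f"level {level}: {size} bytes ({size / len(text):.1%} of original)")
```

  Level 0 stores the data without compressing it, so it comes out
  slightly larger than the input; the higher levels shrink this kind
  of repetitive text to a small fraction of its size.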

  For bitmapped images, however, Flate compression isn't enough
  to produce good compression. If your input images are
  uncompressed, Flate will compress them somewhat, but not
  as much as a lossless compression algorithm designed for
  images. If your input images are in a compressed format
  such as JPEG, Flate compression does not produce very much
  improvement. PDF format therefore allows the use of several
  different compression schemes for images: JPEG, CCITT, and
  JBIG2. CCITT and JBIG2 are meant for bilevel (black-and-white)
  images such as scanned text. JPEG is a more general-purpose
  lossy-compression format for greyscale and color images,
  but it is optimized for photographs.
  If you use a JPEG file as an input, pdftex simply copies
  it to the output, without changing its resolution or
  applying any further compression. (Flate compression will
  be applied if you've set compress_level appropriately,
  but it has very little effect.)
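
  The claim that Flate adds little on top of already-compressed data
  can be checked with a small Python sketch. The standard library has
  no JPEG encoder, so this uses a stream that has already been through
  Flate once as a stand-in for JPEG data; the synthetic "pixels" and
  the variable names are purely illustrative assumptions.

```python
import random
import zlib

# Stand-in for image data: 50,000 "pixels" drawn from a 16-value palette.
# A fixed seed keeps the run reproducible.
rng = random.Random(0)
pixels = bytes(rng.randrange(16) for _ in range(50_000))

once = zlib.compress(pixels, 9)   # first pass: removes the redundancy
twice = zlib.compress(once, 9)    # second pass over already-compressed data

print(f"raw:         {len(pixels)} bytes")
print(f"Flate once:  {len(once)} bytes")
print(f"Flate twice: {len(twice)} bytes")
```

  The first pass removes most of the redundancy; the second pass has
  almost nothing left to work with, which is the situation Flate is in
  when it is handed a JPEG stream.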

  A typical method of working with compressed images would
  be to maintain all your original images in a lossless
  format such as PNG, and produce JPEG versions as inputs
  to pdftex. You can tune up the resolution and compression
  level of the JPEG versions to achieve the desired tradeoff
  between compression and image quality in your output file.

  JPEG, however, is not always the best compressed format for
  images. If the image consists of flat areas and discrete colors
  (e.g. screenshots or diagrams) then lossless compressed formats
  such as PNG are quite efficient, whereas JPEG compression would
  introduce artifacts. The use of JPEG should be limited to
  photographic and comparable images, for which it produces
  compression much better than PNG (typically by about a factor
  of 4), without noticeably affecting the visual quality of the
  images.
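
  To see why lossless compression suits flat images but not detailed
  ones, here is a hedged Python sketch. PNG uses the deflate algorithm
  internally, so zlib serves as a rough proxy; the sketch leaves out
  PNG's prediction filters, and both synthetic images (four flat bands
  and uniform noise) are crude assumptions standing in for a diagram
  and for fine photographic detail.

```python
import random
import zlib

WIDTH, HEIGHT = 200, 200


def flat_image():
    """8-bit greyscale 'diagram': four flat horizontal bands."""
    return bytes((y // 50) * 80 for y in range(HEIGHT) for _ in range(WIDTH))


def noisy_image(seed=0):
    """8-bit greyscale noise, a crude stand-in for fine photographic detail."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(WIDTH * HEIGHT))


def ratio(data):
    """Compressed size divided by original size at the highest zlib level."""
    return len(zlib.compress(data, 9)) / len(data)


print(f"flat image:  {ratio(flat_image()):.3f}")
print(f"noisy image: {ratio(noisy_image()):.3f}")
```

  The flat image shrinks to a tiny fraction of its size, while the
  noisy one does not shrink at all. Real photographs fall somewhere
  in between, which is where a lossy format such as JPEG pays off.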
