[tex4ht] [bug #343] Package pdfpages

Karl Berry karl at freefriends.org
Sun Jan 22 19:52:44 CET 2017


Follow-up Comment #4, bug #343 (project tex4ht):

Regarding pdf to png conversion, I finally took a few minutes to try to
get to the bottom of it.  (Additional discussion on mailing list, 
http://tug.org/pipermail/tex4ht/2016q4/001682.html)

I started with pdflatex small2e.tex. Resulting PDF is 60587 bytes.
I saw the same basic results you did: convert small2e.pdf convert.png
resulted in a smaller file than your rungs invocation:

-rw-rw-r-- 1 karl root  9262 Jan 22 10:26 convert.png
-rw-rw-r-- 1 karl root 19189 Jan 22 10:15 rungs.png

I wondered if the precise gs invocation would make a difference.
So I ran
  strace -vfs 9999 convert small2e.pdf convert.png >&/tmp/str
where the options to strace make it display everything.
The (voluminous) output shows gs being invoked this way,
except with temporary filenames:


    gs -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT \
    -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 \
    -sDEVICE=pngalpha -dTextAlphaBits=4 -dGraphicsAlphaBits=4 \
    -r72x72 -sOutputFile=gsmagick.png \
    small2e.pdf

But running this by hand produces a file of the same size as rungs, which
surprised me:
-rw-rw-r-- 1 karl root 19189 Jan 22 10:29 gsmagick.png
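
For anyone who wants to repeat this from a script, here is a minimal Python
sketch that rebuilds the gs argument list quoted above and reports the size of
the output. The function names (gs_png_command, render_and_size) are made up
for illustration, and the sketch assumes gs is on the PATH; it returns None
rather than failing when gs or the input file is missing.

```python
import os
import shutil
import subprocess


def gs_png_command(pdf_path, png_path, resolution=72):
    """Return the gs argument list seen in the strace output (hypothetical helper)."""
    return [
        "gs", "-q", "-dQUIET", "-dSAFER", "-dBATCH", "-dNOPAUSE", "-dNOPROMPT",
        "-dMaxBitmap=500000000", "-dAlignToPixels=0", "-dGridFitTT=2",
        "-sDEVICE=pngalpha", "-dTextAlphaBits=4", "-dGraphicsAlphaBits=4",
        f"-r{resolution}x{resolution}", f"-sOutputFile={png_path}",
        pdf_path,
    ]


def render_and_size(pdf_path, png_path):
    """Run gs and return the PNG size in bytes, or None if gs or the PDF is absent."""
    if shutil.which("gs") is None or not os.path.exists(pdf_path):
        return None
    subprocess.run(gs_png_command(pdf_path, png_path), check=True)
    return os.path.getsize(png_path)
```

Calling render_and_size("small2e.pdf", "gsmagick.png") in the directory that
holds the PDF should give a number comparable to the ls output above, though
the exact byte count may depend on the gs version.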
 

Ok, so then I ran
  convert -debug all small2e.pdf convert.png >&/tmp/deb
to get a sense of what convert thought it was doing.

And indeed, I see it running gs as we expected, but then doing
postprocessing on the png file:
  Searching for module "PNG" using filename "png.la"
  ...
  Enter ReadPNGImage()
  ...

Ok, so I am led to believe that convert is smarter than gs about how to
use png compression features (or whatever), and this seems plausible.
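
To illustrate why two PNG files holding the same pixels can differ in size,
here is a small Python sketch that writes one synthetic RGBA page at several
zlib compression levels. It is only an illustration of the general point: it
does not reproduce what gs or convert actually do internally, and the page
contents and function names are invented for the example.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(kind, data):
    """Build one PNG chunk: length, type, data, then CRC over type and data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def encode_rgba_png(width, height, pixels, level):
    """Encode raw RGBA bytes as a PNG, using filter type 0 on every scanline."""
    stride = width * 4
    raw = b"".join(
        b"\x00" + pixels[y * stride:(y + 1) * stride] for y in range(height)
    )
    # IHDR: 8 bits per sample, color type 6 (truecolor with alpha).
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 6, 0, 0, 0)
    return (
        PNG_SIGNATURE
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"IDAT", zlib.compress(raw, level))
        + png_chunk(b"IEND", b"")
    )


def synthetic_page(width, height):
    """White opaque page with black half-width bars standing in for text lines."""
    white = b"\xff\xff\xff\xff" * width
    half = width // 2
    bar = b"\x00\x00\x00\xff" * half + b"\xff\xff\xff\xff" * (width - half)
    return b"".join(bar if y % 12 == 0 else white for y in range(height))


def sizes_by_level(width, height, levels=(0, 1, 9)):
    """Return {zlib level: PNG size in bytes} for the same synthetic page."""
    pixels = synthetic_page(width, height)
    return {lvl: len(encode_rgba_png(width, height, pixels, lvl)) for lvl in levels}
```

Printing sizes_by_level(612, 792) shows the spread for a page of the same
dimensions as the one here; level 0 stores the data uncompressed, so it is
much larger than the compressed variants.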

Finally, running it through netpbm results in an even smaller file:
  pngtopnm convert.png | pnmtopng >pngto.png; ls -l pngto.png
-rw-rw-r-- 1 karl root 4185 Jan 22 10:34 pngto.png

Meanwhile, identify shows that the netpbm output is "PseudoClass" (uses a
color table) rather than "DirectClass" (separate color per pixel):

  $ identify pngto.png convert.png
pngto.png PNG 612x792 612x792+0+0 8-bit PseudoClass 2c 4.18KB 0.000u 0:00.000
convert.png[1] PNG 612x792 612x792+0+0 8-bit DirectClass 9.26KB 0.000u 0:00.000
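
The effect of a color table on file size can be sketched in Python, assuming
that the PseudoClass/DirectClass distinction maps roughly onto PNG's palette
and truecolor color types (not verified here for these particular files). The
sketch encodes one two-color page both ways and reads the color type back
from the IHDR chunk. All names and the page contents are invented for the
example.

```python
import struct
import zlib

SIG = b"\x89PNG\r\n\x1a\n"
COLOR_TYPES = {0: "grayscale", 2: "truecolor", 3: "palette",
               4: "grayscale+alpha", 6: "truecolor+alpha"}


def chunk(kind, data):
    """Build one PNG chunk: length, type, data, then CRC over type and data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def pack_bits(row):
    """Pack 0/1 palette indices into bytes, leftmost pixel in the high bit."""
    out = bytearray((len(row) + 7) // 8)
    for x, bit in enumerate(row):
        if bit:
            out[x // 8] |= 0x80 >> (x % 8)
    return bytes(out)


def make_page(width, height):
    """Rows of palette indices: index 0 background, index 1 bars every 12 rows."""
    bar = [1] * (width // 2) + [0] * (width - width // 2)
    blank = [0] * width
    return [bar if y % 12 == 0 else blank for y in range(height)]


def encode_palette_png(rows, palette):
    """1-bit palette PNG (color type 3) with a PLTE chunk."""
    ihdr = struct.pack(">IIBBBBB", len(rows[0]), len(rows), 1, 3, 0, 0, 0)
    plte = b"".join(bytes(rgb) for rgb in palette)
    raw = b"".join(b"\x00" + pack_bits(r) for r in rows)
    return (SIG + chunk(b"IHDR", ihdr) + chunk(b"PLTE", plte)
            + chunk(b"IDAT", zlib.compress(raw, 9)) + chunk(b"IEND", b""))


def encode_rgba_png(rows, palette):
    """8-bit RGBA PNG (color type 6): four bytes stored for every pixel."""
    ihdr = struct.pack(">IIBBBBB", len(rows[0]), len(rows), 8, 6, 0, 0, 0)
    rgba = [bytes(rgb) + b"\xff" for rgb in palette]
    raw = b"".join(b"\x00" + b"".join(rgba[i] for i in r) for r in rows)
    return (SIG + chunk(b"IHDR", ihdr)
            + chunk(b"IDAT", zlib.compress(raw, 9)) + chunk(b"IEND", b""))


def png_color_type(data):
    """Return (color type name, bit depth) read from the leading IHDR chunk."""
    if data[:8] != SIG or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    _width, _height, depth, ctype = struct.unpack(">IIBB", data[16:26])
    return COLOR_TYPES[ctype], depth
```

With page = make_page(612, 792) and palette = [(255, 255, 255), (0, 0, 0)],
comparing len(encode_palette_png(page, palette)) against
len(encode_rgba_png(page, palette)) shows the palette version coming out
smaller for this synthetic page. png_color_type also works on the bytes of a
real file read with open(name, "rb").read().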

Some discussion at
http://www.imagemagick.org/discourse-server/viewtopic.php?t=16706.

And no doubt with additional options one could get imagemagick to do
that too, or netpbm not to, or whatever, but it doesn't matter :).



    _______________________________________________________

Reply to this item at:

  <http://puszcza.gnu.org.ua/bugs/?343>

_______________________________________________
  Message sent via/by Puszcza
  http://puszcza.gnu.org.ua/


