[l2h] Poor resolution for my .eps file contents in a .html file produced by LaTeX2HTML and viewed in a Web browser such as Konqueror

Pat Somerville l_pat_s at hotmail.com
Thu Jul 14 01:03:39 CEST 2011


Thanks for kindly writing to me, Professor Moore.  I apologize for two 
things here: 1) I think I misspelled mebibyte (2^20 bytes).  2) It appears 
that I sent one e-mail letter twice.  Sorry for those two mistakes. 
Anyhow, thanks again, Professor Moore, for kindly taking the time to write 
to me.

Pat

--------------------------------------------------
From: "Ross Moore" <ross.moore at mq.edu.au>
Sent: Friday, June 24, 2011 7:22 PM
To: "Pat Somerville" <l_pat_s at hotmail.com>
Subject: Re: [l2h] Poor resolution for my .eps file contents in a .html file 
produced by LaTeX2HTML and viewed in a Web browser such as Konqueror

> Hi Pat,
>
> On 25/06/2011, at 7:06 AM, Pat Somerville wrote:
>
> Getting good results from scanning old books is quite an art form.
> You definitely need to experiment quite a bit with settings on the
> scanner, and also with image manipulation software afterwards.
>
> I always advise doing this with just a few pages first, before embarking
> on a program to scan hundreds of pages. Otherwise, you'll end up
> having to repeat a lot of the physical work of placing and scanning
> many of the pages that did not give you the quality result that you
> desire.
>
>> I scanned a copy of one of my own drawings and caption on paper in both 
>> the Tagged Image File Format (.tif), in an attached figure called 
>> TestFig.tif, and the Portable Document Format (.pdf), in an attached 
>> figure called TestFig.pdf.
>
> Yes, these look really good, due to the scanning resolution.
>
> Note how much smaller TestFig.pdf (45 kB) is than TestFig.tif (2.7 MB).
>
>
>
>> Using the GNU Image Manipulation Program (GIMP) 2.6.11 I 
>> made Encapsulated PostScript (.eps) files of those figures, which are 
>> respectively the attached files TestFigTif.eps and TestFigPDF.eps; in 
>> that process I probably cropped the original figures.
>
> TestFigTif.eps worked fine, giving a good size reduction to 336 kB.
>
> But TestFigPDF.eps has expanded to 2.6 MB and displays at poor quality.
> If this really did come from TestFig.pdf then the compression used
> in that file certainly has not been preserved by the conversion to .eps.
> What displays seems to be a black&white bitmap version of the image,
> losing all grayscale information. But it is the use of grayscales that
> makes images seem clearer and cleaner (better quality!).
> The effect is also known as "anti-aliasing".
> The 2.6 MB .eps file must surely contain the grayscale information,
> presumably as well as the lo-res (Preview) bitmap version, but it is the
> lo-res that seems to be the image that my viewer shows. I'm not sure why;
> with more than one version of the image within the file, it could be that
> the viewer just chose to show the lo-res preview. A printer should use the
> hi-res description ...
>
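The anti-aliasing effect described above can be illustrated with a minimal
sketch. It is not what any viewer or LaTeX2HTML actually does internally; the
function names and the tiny 4x4 test bitmap are made up for illustration. It
shrinks a high-resolution black&white bitmap two ways: averaging each block
into a gray level, and hard-thresholding each block back to black or white.

```python
def downsample_gray(bitmap, factor):
    """Average each factor x factor block of a 0/1 bitmap (1 = ink)
    into an ink-coverage value from 0 (no ink) to 255 (all ink)."""
    rows, cols = len(bitmap), len(bitmap[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [bitmap[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(255 * sum(block) // len(block))
        out.append(out_row)
    return out


def downsample_bw(bitmap, factor):
    """Same block reduction, but each block becomes 1 only if more
    than half of its pixels are ink (no gray levels kept)."""
    gray = downsample_gray(bitmap, factor)
    return [[1 if g > 127 else 0 for g in row] for row in gray]


# Hypothetical test image: a one-pixel-wide vertical line.
thin_line = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]

gray_result = downsample_gray(thin_line, 2)
bw_result = downsample_bw(thin_line, 2)
```

With gray levels the thin line survives as a half-dark column; with the
black&white reduction it disappears entirely, which is the kind of loss
that thin-lined type suffers.
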
>
>> Then in the attached LaTeX file Throwaway11.tex I included those two .eps 
>> figures using the LaTeX epsfig software package.  I executed a latex2html 
>> command on Throwaway11.tex to produce Throwaway11.html and 11 other files 
>> which are all attached to this e-mail letter.  On viewing 
>> Throwaway11.html in a Konqueror Web browser I found the qualities of the 
>> two figures TestFigPDF.eps and TestFigTif.eps quite acceptable!
>
> ... as must have happened here.
>
>
>>
>> Then I noticed that some lines or curves of letters in some mathematics 
>> type in the book I was copying were probably thinner than the lines or 
>> curves of the drawing on paper to which I referred in the previous 
>> paragraph here.  On a "surface level" that could point to problems or 
>> complications associated with the original source on paper rather than to 
>> LaTeX2HTML or the GIMP, although to my human eye the type on the original 
>> book page was of excellent visual quality.  But I found ways to adjust my 
>> Epson Stylus CX3810's scanning program to deal with the thin-lined type 
>> on the page of the book in question: In the Epson scanner program's 
>> "Professional Mode" 1) for the "Auto Exposure Type" I probably switched 
>> from "Photo" to "Document."  2) For the "Image Type" I switched from 
>> "Black and White" to "8-bit Grayscale."  I still kept the resolution 
>> setting at 600 dots per inch (dpi).  Differences with various choices of 
>> "Image Type" could even be seen in a "Preview" or brief scan of the book 
>> page.  As a result, the full scan of the book page took considerably 
>> longer than previously, though the total scan time was still within a few 
>> minutes.  I guess that increased scan time might be a clue to why I 
>> was glad to have success with the scanner program setting "8-bit Grayscale"
>
> Yes.  8-bit implies 2^8 = 256 different possible shades of gray,
> ranging from white to black.
> When rasterising, you definitely want to capture this amount of
> information, otherwise you get "blocky" images.
>
> Later processing can cut down this information to something that still
> looks good on-screen. But you must start with a lot more.
>
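The bit-depth arithmetic above can be checked with a short sketch. The
function names are made up for illustration, and the quantizer is only a
simple model of what reducing the bit depth does, not the scanner's actual
algorithm.

```python
def gray_levels(bits):
    """Number of distinct shades representable with the given bit depth."""
    return 2 ** bits


def quantize(value, bits):
    """Map an 8-bit gray value (0..255) onto the nearest of 2**bits
    evenly spaced levels, returned on the same 0..255 scale."""
    step = 255 / (gray_levels(bits) - 1)
    return round(round(value / step) * step)


shades_8bit = gray_levels(8)   # "8-bit Grayscale"
shades_1bit = gray_levels(1)   # "Black and White"
```

In this model `quantize(100, 1)` and `quantize(200, 1)` snap to pure 0 and
pure 255, while `quantize(100, 8)` leaves the value unchanged, which is why
a 1-bit scan cannot represent the soft edges of thin strokes.
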
>> in the case of a thin-lined document!  I saved the scanned file as a .pdf 
>> file with a size of 1.1 MiB (mebibyte = 2^20 bytes).  As a result of the 
>> "8-bit Grayscale" setting I could see a blue or grey color near the 
>> book-binding side of the page in the scanned image file of that book 
>> page. In importing the .pdf file in the GIMP it might have been important 
>> to set the resolution to 600 pixels/inch instead of some possibly lower 
>> default setting.  In the GIMP I could crop away that unwanted grey or 
>> blue color by clicking on the GIMP's select tool and then clicking on the 
>> image and dragging the touch-pad pointer while holding down the left 
>> touch-pad button to enclose the interesting portion of the image of the 
>> book page in a dashed rectangle; then I could select "Image" and then 
>> "Crop to Selection" to accomplish the cropping.  Via "Image" and "Rescale 
>> Image" or something similar I could resize the image to the roughly 
>> 6.00-inch width I desired while keeping the aspect ratio or proportions 
>> of the figure unchanged.  A surprise for me was that the .eps file 
>> converted from this .pdf file using the GIMP had a large size of 33.2 
>> MiB.  In the GIMP I had to do something unusual: the GIMP directed me 
>> to export the file, which I found I could do by clicking on 
>> an "Export" button before finally saving the .eps file.  In my .tex file 
>> I changed the name of the .eps file I wanted to include to match the new 
>> .eps file name I made using the GIMP.
>
>
>> Then after executing latex .... and latex2html.... commands on that .tex 
>> file I was glad to find that this time the figure from the new .eps file 
>> had very good visual quality, with minor graininess visible especially 
>> within the letter "H."  The size of the folder produced by LaTeX2HTML was 
>> roughly equal to the 1.1-MiB size of the .pdf file with which I started.  
>> So apparently about 97 percent of the contents of the 33.2-MiB .eps file 
>> were discarded in LaTeX2HTML's process of generating the folder containing 
>> the output .html file.
>
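The size comparison above works out as follows. This is only the arithmetic
on the two sizes quoted in the message (33.2 MiB and roughly 1.1 MiB); the
variable and function names are made up for illustration.

```python
MIB = 2 ** 20  # one mebibyte in bytes


def discarded_fraction(input_mib, output_mib):
    """Fraction of the input size that does not appear in the output."""
    return 1 - output_mib / input_mib


eps_size_mib = 33.2      # size of the .eps file exported from the GIMP
output_size_mib = 1.1    # approximate size of the LaTeX2HTML output folder

fraction = discarded_fraction(eps_size_mib, output_size_mib)
print(f"{fraction:.1%} of the .eps data was not carried into the output")
```

The result is about 96.7 percent, in line with the rough figure of 97
percent.
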
> Sure.
> The resulting files img1.png  and  img2.png have down-sampled back
> to a simple black&white image rather than using grayscales.
> I'm not sure why this has happened. There should be options that you
> can give to LaTeX2HTML to affect the image processing, and get
> much better final graphics.
>
> Isn't there a switch  -antialias   that you can use?
>
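A minimal sketch of how the suggested switch might be passed is given below.
It assumes the LaTeX2HTML command-line options -antialias (for figures and
external graphics) and -antialias_text (for typeset material) are available
in the installed version; the helper function name is made up, and the file
name Throwaway11.tex is taken from earlier in this thread. The sketch only
builds and prints the command; it does not run latex2html.

```python
import shlex


def latex2html_command(tex_file, antialias=True, antialias_text=True):
    """Build a latex2html argument list with optional anti-aliasing switches."""
    cmd = ["latex2html"]
    if antialias:
        cmd.append("-antialias")        # assumed switch: smooth figure images
    if antialias_text:
        cmd.append("-antialias_text")   # assumed switch: smooth typeset images
    cmd.append(tex_file)
    return cmd


command = latex2html_command("Throwaway11.tex")
print(shlex.join(command))
```

Whether these switches improve img1.png and img2.png will depend on the
installed LaTeX2HTML and Ghostscript versions, so it is worth comparing the
output with and without them.
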
>
>>
>> I am grateful to now have a way to produce fairly good-quality figures in 
>> a .html file using LaTeX2HTML from .pdf scans of printed paper converted 
>> to .eps files using the GIMP.  But as the results in the first paragraph 
>> have shown, the "Black and White"
>
> Definitely do *not* use this setting.
>
>> and probably "Photo" settings of my scanner program were apparently 
>> adequate for lines and curves that were not very thin in the original 
>> source.
>
> Certainly try with "Photo", to see the differences.
> But "Grayscale" better tells the scanner what kind of material
> is being scanned.
> It may have been black&white once, but ink spreads into the paper,
> and the pages fade with time --- not to mention yellowing.
> So it really is appropriate to use Grayscale.
>
>>
>> So in retrospect thank you, Professor Moore, for your suggestion that I 
>> provide you with one or more of my example files.  That suggestion and 
>> the circumstance of a copyright were early steps which ultimately 
>> led me to a solution to the problem of how to generate fairly 
>> good-quality figures in a .html file produced by LaTeX2HTML.
>>
>> Pat
>
> I'm glad you are happy with what you have achieved.
>
> Preserving the content of old books --- particularly those with
> mathematical content that is still valid and relevant --- 
> is a very worthwhile undertaking.
>
>
> All the best,
>
> Ross
>
> ------------------------------------------------------------------------
> Ross Moore                                       ross.moore at mq.edu.au
> Mathematics Department                           office: E7A-419
> Macquarie University                             tel: +61 (0)2 9850 8955
> Sydney, Australia  2109                          fax: +61 (0)2 9850 8114
> ------------------------------------------------------------------------