[l2h] Poor resolution for my .eps file contents in a .html file produced by LaTeX2HTML and viewed in a Web browser such as Konqueror
Pat Somerville
l_pat_s at hotmail.com
Thu Jul 14 01:03:39 CEST 2011
Thanks for kindly writing to me, Professor Moore. I apologize for two
reasons here: 1) I think I misspelled mebibyte (2^20 bytes). 2) It appears
that I sent one e-mail letter twice. Sorry for those two mistakes.
Anyhow, again thanks, Professor Moore, for kindly taking the time to write
to me.
Pat
--------------------------------------------------
From: "Ross Moore" <ross.moore at mq.edu.au>
Sent: Friday, June 24, 2011 7:22 PM
To: "Pat Somerville" <l_pat_s at hotmail.com>
Subject: Re: [l2h] Poor resolution for my .eps file contents in a .html file
produced by LaTeX2HTML and viewed in a Web browser such as Konqueror
> Hi Pat,
>
> On 25/06/2011, at 7:06 AM, Pat Somerville wrote:
>
> Getting good results from scanning old books is quite an art form.
> You definitely need to experiment quite a bit with settings on the
> scanner, and also with image manipulation software afterwards.
>
> I always advise doing this with just a few pages first, before embarking
> on a program to scan hundreds of pages. Otherwise, you'll end up
> having to repeat a lot of the physical work of placing and scanning
> many of the pages that did not give you the quality result that you
> desire.
>
>> I scanned a copy of one of my own drawings and caption on paper in both
>> the Tagged Image File Format (.tif), in an attached figure called
>> TestFig.tif, and the Portable Document Format (.pdf), in an attached
>> figure called TestFig.pdf.
>
> Yes, these look really good, due to the scanning resolution.
>
> Note how much smaller TestFig.pdf (45 kB) is than TestFig.tif (2.7 MB).
>
>> Using the GNU Image Manipulation Program (GIMP) 2.6.11 I
>> made Encapsulated PostScript (.eps) files of those figures, which are
>> respectively the attached files TestFigTif.eps and TestFigPDF.eps; in
>> that process I probably cropped the original figures.
>
> TestFigTif.eps worked fine, giving a good size reduction to 336 kB.
>
> But TestFigPDF.eps has expanded to 2.6 MB and displays at poor quality.
> If this really did come from TestFig.pdf then the compression used
> in that file certainly has not been preserved by the conversion to .eps.
> What displays seems to be a black-and-white bitmap version of the image,
> losing all grayscale information. But it is the use of grayscales that
> makes images seem clearer and cleaner (better quality!).
> The effect is also known as "anti-aliasing".
> The 2.6 MB .eps file must surely contain the grayscale information,
> presumably as well as the lo-res (Preview) bitmap version, but it is the
> lo-res version that my viewer seems to show. I'm not sure why: with more
> than one version of the image within the file, it could be that the viewer
> just chose to show the lo-res preview. A printer should use the hi-res
> description ...
>
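To illustrate the anti-aliasing point above, here is a minimal Python sketch that is not taken from the thread. It reduces a small black-and-white bitmap in two ways: by averaging blocks of pixels (which yields shades of gray) and by then forcing the result back to pure black and white. The 8x8 test pattern, the 0.5 cutoff, and the function names are all made up for illustration; real image converters use more refined filters.

```python
def downsample_average(bitmap, factor):
    """Average each factor x factor block; results lie in [0.0, 1.0]."""
    rows = len(bitmap) // factor
    cols = len(bitmap[0]) // factor
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [bitmap[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def threshold(gray, cutoff=0.5):
    """Force every gray value to 1 (ink) or 0 (paper)."""
    return [[1 if v >= cutoff else 0 for v in row] for row in gray]


# Hypothetical 8x8 bitmap: 1 = ink, 0 = paper; a thin diagonal stroke.
bitmap = [[1 if 0 <= x - y <= 1 else 0 for x in range(8)] for y in range(8)]

gray = downsample_average(bitmap, 4)  # anti-aliased: keeps partial coverage
bw = threshold(gray)                  # black and white only

print(gray)
print(bw)
```

In this made-up example the averaged version keeps the thin stroke as intermediate grays, while the thresholded version loses it entirely, which is the kind of loss a thin-lined original would suffer.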
>
>> Then in the attached LaTeX file Throwaway11.tex I included those two .eps
>> figures using the LaTeX epsfig software package. I executed a latex2html
>> command on Throwaway11.tex to produce Throwaway11.html and 11 other files
>> which are all attached to this e-mail letter. On viewing
>> Throwaway11.html in a Konqueror Web browser I found the qualities of the
>> two figures TestFigPDF.eps and TestFigTif.eps quite acceptable!
>
> ... as must have happened here.
>
>
>>
>> Then I noticed that some lines or curves of letters in some mathematics
>> type in the book I was copying were probably thinner than the lines or
>> curves of the drawing on paper to which I referred in the previous
>> paragraph here. On a "surface level" that could point to problems or
>> complications associated with the original source on paper rather than to
>> LaTeX2HTML or the GIMP, although to my human eye the type on the original
>> book page was of excellent visual quality. But I found ways to adjust my
>> Epson Stylus CX3810's scanning program to deal with the thin-lined type
>> on the page of the book in question: In the Epson scanner program's
>> "Professional Mode" 1) for the "Auto Exposure Type" I probably switched
>> from "Photo" to "Document." 2) For the "Image Type" I switched from
>> "Black and White" to "8-bit Grayscale." I still kept the resolution
>> setting at 600 dots per inch (dpi). Differences with various choices of
>> "Image Type" could even be seen in a "Preview" or brief scan of the book
>> page. As a result the full scan of the book page took considerably
>> longer than previously, but with a total scan time of within a few
>> minutes. I guess that increased scan time might be a clue to why I
>> gratefully had success with the scanner program setting "8-bit Grayscale"
>
> Yes. 8-bit implies 2^8 = 256 different possible shades of gray,
> ranging from white to black.
> When rasterising, you definitely want to capture this amount of
> information, otherwise you get "blocky" images.
>
> Later processing can cut down this information to something that still
> looks good on-screen. But you must start with a lot more.
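The arithmetic behind "8-bit implies 2^8 = 256 shades" can be sketched in a few lines of Python. This is only an illustration, not anything from the scanner software: the function names and the sample brightness 0.4 are assumptions chosen for the example.

```python
def gray_levels(bits):
    """Number of distinct shades representable with the given bit depth."""
    return 2 ** bits


def quantize(value, bits):
    """Snap a brightness in [0.0, 1.0] to the nearest representable level."""
    steps = gray_levels(bits) - 1
    return round(value * steps) / steps


print(gray_levels(8), gray_levels(1))
# A mid-dark pixel survives at 8 bits but is forced to an extreme at 1 bit.
print(quantize(0.4, 8), quantize(0.4, 1))
```

With only one bit per pixel every value must become one of the two extremes, which is why partially covered pixels along thin strokes cannot be represented.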
>
>> in the case of a thin-lined document! I saved the scanned file as a .pdf
>> file with a size of 1.1 MiB (megibyte=2**20 bytes). As a result of the
>> "8-bit Grayscale" setting I could see a blue or grey color near the
>> book-binding side of the page in the scanned image file of that book
>> page. In importing the .pdf file in the GIMP it might have been important
>> to set the resolution to 600 pixels/inch instead of some possibly lower
>> default setting. In the GIMP I could crop away that unwanted grey or
>> blue color by clicking on the GIMP's select tool and then clicking on the
>> image and dragging the touch-pad pointer while holding down the left
>> touch-pad button to enclose the interesting portion of the image of the
>> book page in a dashed rectangle; then I could select "Image" and then
>> "Crop to Selection" to accomplish the cropping. Via "Image" and "Rescale
>> Image" or something similar I could resize the image to the roughly
>> 6.00-inch width I desired while keeping the aspect ratio or proportions
>> of the figure unchanged. A surprise for me was that the .eps file
>> converted from this .pdf file using the GIMP had a large size of 33.2
>> MiB. In the GIMP I had to do something unusual: as the GIMP directed, I
>> exported the file by clicking on an "Export" button before finally
>> saving the .eps file. In my .tex file
>> I changed the name of the .eps file I wanted to include to match the new
>> .eps file name I made using the GIMP.
>
>
>> Then after executing latex .... and latex2html.... commands on that .tex
>> file I gratefully found that this time the new .eps file had very good
>> visual quality with minor graininess visible especially within the letter
>> "H." The size of the folder produced by LaTeX2HTML was roughly equal to
>> the 1.1-MiB size of the .pdf file with which I started. So apparently
>> about 97 percent or more of the contents of the 33.2-MiB .eps file were
>> discarded in LaTeX2HTML's process of generating the folder containing the
>> output .html file.
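The "about 97 percent" estimate can be checked with a short calculation. The Python sketch below uses only the two sizes quoted above (33.2 MiB and 1.1 MiB); the function name is made up, and the output folder size is taken to be exactly 1.1 MiB, which the message only states approximately.

```python
MIB = 2 ** 20  # one mebibyte in bytes


def discarded_fraction(original_bytes, retained_bytes):
    """Fraction of the original size that does not appear in the output."""
    return 1 - retained_bytes / original_bytes


eps_size = 33.2 * MIB     # size of the .eps file exported from the GIMP
output_size = 1.1 * MIB   # approximate size of the LaTeX2HTML output folder

discarded = discarded_fraction(eps_size, output_size)
print(f"{discarded:.1%}")
```

Under those assumptions the result is roughly 96.7 percent, consistent with the rounded figure in the message.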
>
> Sure.
> The resulting files img1.png and img2.png have been down-sampled back
> to a simple black-and-white image rather than using grayscales.
> I'm not sure why this has happened. There should be options that you
> can give to LaTeX2HTML to affect the image processing, and get
> much better final graphics.
>
> Isn't there a switch -antialias that you can use?
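One way to try the switch mentioned above is to build the command line in a small Python helper. This is a hedged sketch: the thread only raises -antialias as a question, so whether a given LaTeX2HTML installation accepts it should be checked against its own help output or manual. The helper name is hypothetical, and Throwaway11.tex is the file name from the original message.

```python
def build_latex2html_command(tex_file, extra_options=("-antialias",)):
    """Return the argument list for a latex2html run with optional switches."""
    return ["latex2html", *extra_options, tex_file]


command = build_latex2html_command("Throwaway11.tex")
print(" ".join(command))
```

The resulting list could be passed to subprocess.run on a machine where latex2html is installed; the sketch deliberately stops short of running anything.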
>
>
>>
>> I am grateful to now have a way to produce fairly good-quality figures in
>> a .html file using LaTeX2HTML from .pdf scans of printed paper converted
>> to .eps files using the GIMP. But as the results in the first paragraph
>> have shown, the "Black and White"
>
> Definitely do *not* use this setting.
>
>> and probably "Photo" settings of my scanner program were apparently
>> adequate for lines and curves that were not very thin in the original
>> source.
>
> Certainly try with "Photo", to see the differences.
> But "Grayscale" better tells the scanner what kind of material
> is being scanned.
> It may have been black&white once, but ink spreads into the paper,
> and the pages fade with time --- not to mention yellowing.
> So it really is appropriate to use Grayscale.
>
>>
>> So in retrospect thank you, Professor Moore, for your suggestion that I
>> provide you with one or more of my example files. That suggestion and
>> the circumstance of a copyright were early steps which ultimately and
>> gratefully led me to a solution to the problem of how to generate fairly
>> good-quality figures in a .html file produced by LaTeX2HTML.
>>
>> Pat
>
> I'm glad you are happy with what you have achieved.
>
> Preserving the content of old books --- particularly those with
> mathematical content that is still valid and relevant ---
> is a very worthwhile undertaking.
>
>
> All the best,
>
> Ross
>
> ------------------------------------------------------------------------
> Ross Moore ross.moore at mq.edu.au
> Mathematics Department office: E7A-419
> Macquarie University tel: +61 (0)2 9850 8955
> Sydney, Australia 2109 fax: +61 (0)2 9850 8114
> ------------------------------------------------------------------------
>
>
>
>