[l2h] Poor resolution for my .eps file contents in a .html file produced by LaTeX2HTML and viewed in a Web browser such as Konqueror

Pat Somerville l_pat_s at hotmail.com
Fri Jun 24 23:21:48 CEST 2011

Thanks once again for kindly taking the time to reply to me, Professor
Moore.  There are conceivable complications with permission in quickly
sending you and numerous other people in the LaTeX2HTML users group a copy
of the figure in a published book with which I was working.  I am in the
eastern United States of America.  A couple of months ago I obtained
permission from the modern-day rights holder, now in Oxford, the United
Kingdom, for a book copyrighted in 1969, to send scanned copies of up to 25
pages of the book in question to one recipient.   Instead of dealing with
obtaining permission to send a copy of a figure from that book to multiple
recipients, at 600 dots per inch (dpi) I scanned a copy of one of my own
drawings and caption on paper in both the Tagged Image File Format (.tif),
in an attached figure called TestFig.tif, and the Portable Document Format
(.pdf), in an attached figure called TestFig.pdf.  Using the GNU Image
Manipulation Program (GIMP) 2.6.11 I made Encapsulated
PostScript (.eps) files of those figures, which are respectively the
attached files TestFigTif.eps and TestFigPDF.eps; in that process I probably
cropped the original figures.  Then in the attached LaTeX file
Throwaway11.tex I included those two .eps figures using the LaTeX epsfig
software package.  I executed a latex2html command on Throwaway11.tex to
produce Throwaway11.html and 11 other files which are all attached to this
e-mail letter.  On viewing Throwaway11.html in a Konqueror Web browser I
found the qualities of the two figures TestFigPDF.eps and TestFigTif.eps
quite acceptable!
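
For readers without the attachment, a file along the following lines would include two .eps figures with the epsfig package.  This is only a minimal sketch and not the contents of Throwaway11.tex: the document class, the \textwidth scaling, and the captions are assumptions made for illustration.

```latex
% Minimal sketch, NOT the actual Throwaway11.tex.
% Assumes TestFigTif.eps and TestFigPDF.eps are in the same directory.
\documentclass{article}
\usepackage{epsfig} % provides \epsfig and loads graphicx internally

\begin{document}

\begin{figure}
  \centering
  % Scaling to \textwidth is an assumed choice, not taken from the original.
  \epsfig{file=TestFigTif.eps,width=\textwidth}
  \caption{Drawing scanned to .tif and converted to .eps (assumed caption).}
\end{figure}

\begin{figure}
  \centering
  \epsfig{file=TestFigPDF.eps,width=\textwidth}
  \caption{Drawing scanned to .pdf and converted to .eps (assumed caption).}
\end{figure}

\end{document}
```

Such a file would then be processed with latex and latex2html in the way described above.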

Then I noticed that some lines or curves of letters in some mathematics type
in the book I was copying were probably thinner than the lines or curves of
the drawing on paper to which I referred in the previous paragraph here.  On
a "surface level" that could point to problems or complications associated
with the original source on paper rather than to LaTeX2HTML or the GIMP,
although to my human eye the type on the original book page was of excellent
visual quality.  But I found ways to adjust my Epson Stylus CX3810's
scanning program to deal with the thin-lined type on the page of the book in
question: In the Epson scanner program's "Professional Mode" 1) for the
"Auto Exposure Type" I probably switched from "Photo" to "Document."  2) For
the "Image Type" I switched from "Black and White" to "8-bit Grayscale."  I
still kept the resolution setting at 600 dots per inch (dpi).  Differences
with various choices of "Image Type" could even be seen in a "Preview" or
brief scan of the book page.  As a result the full scan of the book page
took considerably longer than previously, although the total scan time was
still within a few minutes.  I guess that increased scan time might be a clue to
why I gratefully had success with the scanner program setting "8-bit
Grayscale" in the case of a thin-lined document!  I saved the scanned file
as a .pdf file with a size of 1.1 MiB (mebibyte = 2**20 bytes).  As a result
of the "8-bit Grayscale" setting I could see a blue or grey color near the
book-binding side of the page in the scanned image file of that book page.
In importing the .pdf file in the GIMP it might have been important to set
the resolution to 600 pixels/inch instead of some possibly lower default
setting.  In the GIMP I could crop away that unwanted grey or blue color by
clicking on the GIMP's select tool and then clicking and dragging on the image
to enclose the interesting portion of the image of the book page in a dashed
rectangle; then I could select "Image" and then "Crop to Selection" to
accomplish the cropping.  Via "Image" and "Rescale Image" or something
similar I could resize the image to the roughly 6.00-inch width I desired
while keeping the aspect ratio or proportions of the figure unchanged.  A
surprise for me was that the .eps file converted from this .pdf file using
the GIMP had a large size of 33.2 MiB.  In the GIMP I had to do something
unusual: as directed by the GIMP, I had to export the file, which I found I
could do by clicking on an "Export" button, before finally saving the
.eps file.  In my .tex file I changed the name of the .eps file I wanted to
include to match the new .eps file name I made using the GIMP.  Then after
executing latex .... and latex2html.... commands on that .tex file I
gratefully found that this time the new .eps file had very good visual
quality with minor graininess visible especially within the letter "H."  The
size of the folder produced by LaTeX2HTML was roughly equal to the 1.1-MiB
size of the .pdf file with which I started.  So apparently about 97 percent
or more of the contents of the 33.2-MiB .eps file were discarded in
LaTeX2HTML's process of generating the folder containing the output .html
file.
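
The arithmetic behind two of the numbers above can be written out as follows.  This is only a rough check: it assumes the image kept the 600 pixels/inch resolution after being resized to a 6.00-inch width, and that the output folder is exactly 1.1 MiB, although I only know it to be roughly that size.

```latex
% Worked arithmetic using the values quoted in the text above.
% Assumption: the resized image is still at 600 pixels/inch.
\[
  6.00\,\mathrm{in} \times 600\,\mathrm{pixels/in} = 3600\,\mathrm{pixels}
\]
% Assumption: the LaTeX2HTML output folder is exactly 1.1 MiB.
\[
  1 - \frac{1.1\,\mathrm{MiB}}{33.2\,\mathrm{MiB}}
    \approx 1 - 0.033 = 0.967 \approx 97\%
\]
```

This compares file sizes only; it says nothing by itself about how much visible detail was kept or lost.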

I am grateful to now have a way to produce fairly good-quality figures in a
.html file using LaTeX2HTML from .pdf scans of printed paper converted to
.eps files using the GIMP.  But as the results in the first paragraph have
shown, the "Black and White" and probably "Photo" settings of my scanner
program were apparently adequate for lines and curves that were not very
thin in the original source.

So in retrospect thank you, Professor Moore, for your suggestion that I
provide you with one or more of my example files.  That suggestion and the
circumstance of a copyright were early steps which ultimately and gratefully
led me to a solution to the problem of how to generate fairly good-quality
figures in a .html file produced by LaTeX2HTML.

Pat

--------------------------------------------------
From: "Ross Moore" <ross.moore at mq.edu.au>
Sent: Thursday, June 23, 2011 8:56 PM
To: "Pat Somerville" <l_pat_s at hotmail.com>
Subject: Re: [l2h] Poor resolution for my .eps file contents in a .html file
produced by LaTeX2HTML and viewed in a Web browser such as Konqueror

> Hello Pat,
>
> On 24/06/2011, at 9:50 AM, Pat Somerville wrote:
>
>> Hello. On my Epson scanner I can lay paper with a figure, printing, or
>> writing on it; use my scanner to scan that sheet of paper; and then with
>> the scanner's computer program generate an image file of the contents on
>> the paper. The program's available file formats are MULTI-TIFF (.tif,
>> Tagged Image File Format), bitmap (.bmp), JPEG (Joint Photographic
>> Experts Group, .jpg), and .pdf (Portable Document Format). I commonly
>> selected the .tif format for my scanned image files; lately I selected
>> 600 dots per inch (dpi) for the resolution of black-and-white images.
>> Notice that Encapsulated PostScript (.eps), which can conveniently be
>> used in my .tex file with LaTeX and LaTeX2HTML, is not one of my scanner
>> program's available formats. But with both the GNU Image
>> Manipulation Program (GIMP) 2.6.11 and the LibreOffice program Draw I
>> could convert the .tif files to .eps files.
>
>
> Rather than going on with your long rambles about what you obtain and why
> you think you are getting it, could you please send screenshots of your poor
> output and point out those things that you would like improved.
>
> Most likely you have not chosen to use modern scalable fonts in the
> final output, or have rasterised to a less-than-optimal resolution.
> Show us your output and someone can more easily identify what might be
> going wrong for you.
>
>>
>> In my openSUSE-11.4, Linux operating system using GIMP my .eps image
>> files gratefully had good-looking, sharp letters and mathematical symbols
>> from the original sections of paper. In my .tex file I used the package
>> epsfig to deal with the .eps figures. After I executed latex and
>> latex2html commands on the .tex file, I viewed the resulting .html file
>> in Konqueror and Firefox Web browsers. Unfortunately the qualities of the
>> letters and symbols in the figure in both the .html file produced by the
>> program LaTeX2HTML and the .dvi (Device-Independent) file produced by the
>> program LaTeX were very poor. Unfortunately the quality of a .eps file
>> after conversion from a .tif file using the LibreOffice program Draw was
>> also poor when viewing the LaTeX2HTML-produced,  .html, output file in
>> probably the Konqueror Web browser.
>>
>> In my reading of postings on the Internet relating to file formats I
>> found complexities associated with this situation, especially regarding
>> file formats. I can foresee a few ways to understand or learn the
>> complexities in order to obtain good-looking figures in a
>> LaTeX2HTML-produced .html output file, namely, by 1) reading more on the
>> Internet, 2) experimenting myself with pdflatex and/or latex options,
>> packages, and various file formats; and/or 3) learning things from people
>> who on reading this letter can realize why I haven't always obtained
>> sharp-looking images in my .html file and how I could somehow do so. This
>> letter is an attempt toward approach 3, as well as to report things I
>> already tried and some ideas I imagined.
>>
>> Meanwhile I can present some of what I read and learned from the Internet
>> and a few puzzles of mine relating to this matter. From answer 13 at
>> http://thedailyreviewer.com/compsys/view/quality-of-picturesgraphics-using-graphicx-package-113295267
>> on the Internet I learned that the .bmp, .tif, and .jpg file formats that
>> my Epson scanner program can produce are classified as raster formats
>> which are composed of pixels; .png (Portable Network Graphics) and .jpg
>> are apparently two other raster formats. The .eps format with which LaTeX
>> and LaTeX2HTML can work is a vector format which works with points and
>> paths between points; .pdf, which my scanner can apparently produce,
>> .dvi, and .ps (PostScript) are apparently three other vector formats. The
>> answer-13 poster wrote that the quality was lost in converting from a
>> vector to a raster format. I wonder if a loss of quality is supposed to
>> occur in converting from a raster to a vector format, which is the type
>> of conversion I made from .tif to .eps
>
> No.
> .eps is used as a wrapper around many types of data format.
> If it started as raster as a .tif then it stays that way when
> wrapped up for inclusion within an .eps file.
>
> Once an image is rasterised, you cannot improve the quality,
> except perhaps by squeezing into a smaller size, so that any
> dotty-ness becomes less noticeable. But then everything becomes
> smaller and harder to see.
>
>
>> using both GIMP 2.6.11 and LibreOffice Draw. From the results I discussed
>> above you can see that I have apparently two conflicting answers to this
>> question: a) Such a .eps file looked fine in GIMP 2.6.11. b) But when it
>> became part of my .html file the letters and mathematical symbols
>> contained in the .eps file were not clear and sharp. On the same Web page
>> I mentioned the poster of answer 14 reported, "Saving
>> the graphic into eps and then converting them to pdf via epstopdf is the
>> way to go with minimal quality loss." I haven't tried that. Easier and
>> better would be for me to directly produce a .pdf file with my scanner
>> program.
>
> Yes.
>
>> But can LaTeX2HTML and LaTeX work directly with pdf files?
>
> No. But you can use Ghostscript to render to .png or .jpg format.
> That is precisely what LaTeX2HTML does anyway, with TeX material.
>
>
>> I did try scanning something on paper to a .pdf file with my scanner and
>> its associated computer program.  After opening that .pdf file in GIMP
>> 2.6.11 I converted the vector format .pdf to another vector format .eps.
>> But unfortunately the quality of the content of that .eps file in the
>> resulting .html file produced via a latex2html command was poor with some
>> letters appearing blurry or even "on the edge" of being doubled, as in
>> double vision. Even so, I think the result was probably somewhat better
>> than converting a .tif file to a .eps file and including it in a .tex
>> file.
>>
>> But the quality of my original, scanned .pdf file was pretty good when
>> viewed in GIMP, except for some thin lines in equal signs. So if I had a
>> nearly lossless way to include the original, scanned .pdf file in a .html
>> file produced by LaTeX2HTML, that could be pretty good. Can one do that?
>> And if so, how could one do that?
>>
>> The Web page http://www.cv.nrao.edu/~abridle/toolmemo/node12.shtml
>> indicates that LaTeX can work with other formats besides .ps and .eps,
>> such as .jpg, but that the coordinates of a bounding box must be inserted,
>> I think, when using \usepackage{graphicx} and an \includegraphics command
>> in a LaTeX file. How does one practically determine the coordinates of
>> the bounding box of a figure he wants to include in a LaTeX document? On
>> the Internet I read that a .jpg file can be directly included in a .tex
>> file when executing pdflatex on it. So one option that looked attractive
>> to me was to include a scanned, .jpg file in a .tex file (Would one have
>> to include the coordinates of the bounding box in the .tex file in this
>> case?); then execute pdflatex on that .tex file. What would happen if one
>> executed a latex2html command on that .tex file? Would the contents of
>> the .jpg file appear in the .html output file?
>>
>> I tried a pdflatex command on one of my .tex files and found that some
>> LaTeX commands, such as \begin{document}, were apparently not recognized.
>> I am relatively ignorant of pdflatex.
>
> Are you sure you used  'pdflatex'  and not  'pdftex' ?
>
> To a user, 'pdflatex' works just like 'latex'.
> If it barfs at \begin{document}  then there is probably a fault
> earlier within your document preamble.
>
>>
>>
>> Pat
>
> Hope this helps,
>
> Ross
>
> ------------------------------------------------------------------------
> Ross Moore                                       ross.moore at mq.edu.au
> Mathematics Department                           office: E7A-419
> Macquarie University                             tel: +61 (0)2 9850 8955
> Sydney, Australia  2109                          fax: +61 (0)2 9850 8114
> ------------------------------------------------------------------------
>
>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Throwaway11.tex
Type: application/octet-stream
Size: 849 bytes
Desc: not available
URL: <http://tug.org/pipermail/latex2html/attachments/20110624/a48e81ca/attachment-0007.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: images.aux
Type: application/octet-stream
Size: 588 bytes
Desc: not available
URL: <http://tug.org/pipermail/latex2html/attachments/20110624/a48e81ca/attachment-0008.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: images.log
Type: application/octet-stream
Size: 14270 bytes
Desc: not available
URL: <http://tug.org/pipermail/latex2html/attachments/20110624/a48e81ca/attachment-0009.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: images.pl
Type: application/octet-stream
Size: 979 bytes
Desc: not available
URL: <http://tug.org/pipermail/latex2html/attachments/20110624/a48e81ca/attachment-0010.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: images.tex
Type: application/octet-stream
Size: 7409 bytes
Desc: not available
URL: <http://tug.org/pipermail/latex2html/attachments/20110624/a48e81ca/attachment-0011.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: img1.png
Type: image/png
Size: 22249 bytes
Desc: not available
URL: <http://tug.org/pipermail/latex2html/attachments/20110624/a48e81ca/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: img2.png
Type: image/png
Size: 29002 bytes
Desc: not available
URL: <http://tug.org/pipermail/latex2html/attachments/20110624/a48e81ca/attachment-0003.png>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://tug.org/pipermail/latex2html/attachments/20110624/a48e81ca/attachment-0002.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: labels.pl
Type: application/octet-stream
Size: 160 bytes
Desc: not available
URL: <http://tug.org/pipermail/latex2html/attachments/20110624/a48e81ca/attachment-0012.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Throwaway11.css
Type: text/css
Size: 891 bytes
Desc: not available
URL: <http://tug.org/pipermail/latex2html/attachments/20110624/a48e81ca/attachment-0001.css>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://tug.org/pipermail/latex2html/attachments/20110624/a48e81ca/attachment-0003.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: WARNINGS.dat
Type: application/octet-stream
Size: 90 bytes
Desc: not available
URL: <http://tug.org/pipermail/latex2html/attachments/20110624/a48e81ca/attachment-0013.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: TestFigTif.eps
Type: application/postscript
Size: 333868 bytes
Desc: not available
URL: <http://tug.org/pipermail/latex2html/attachments/20110624/a48e81ca/attachment-0002.eps>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: TestFig.pdf
Type: application/pdf
Size: 44132 bytes
Desc: not available
URL: <http://tug.org/pipermail/latex2html/attachments/20110624/a48e81ca/attachment-0001.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: TestFig.tif
Type: image/tiff
Size: 2920698 bytes
Desc: not available
URL: <http://tug.org/pipermail/latex2html/attachments/20110624/a48e81ca/attachment-0001.tif>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: TestFigPDF.eps
Type: application/postscript
Size: 2573004 bytes
Desc: not available
URL: <http://tug.org/pipermail/latex2html/attachments/20110624/a48e81ca/attachment-0003.eps>