[l2h] An Apparent Byte Size Limit for a Portable Network Graphics (.png) Image File Containing Simplified Chinese Characters Produced by LaTeX2HTML From a .tex File Containing LaTeX and Chinese/Japanese/Korean (CJK) for LaTeX Commands

Pat Somerville l_pat_s at hotmail.com
Mon Aug 2 19:55:22 CEST 2010


Thank you, Professor Moore, for kindly taking the time to respond to me.  In my .tex file I now have CJK segments, each beginning with \begin{CJK}{UTF8}{gbsn} and ending with \end{CJK}, that are small enough to avoid the problem of a too-tall or too-large .png file.  Assuming you are correct about some .png images being too tall for a page, the corresponding problematic .png size appears to have been between 65.7 kilobytes (KB) and 75.1 KB.  By now a number of my output files from failed LaTeX and LaTeX2HTML runs may have been deleted.  But from a failed execution of a command of the form "latex2html...... MyFile.tex", in a folder with a corresponding name of the form MyFile, it seems that I had a zero-byte .png file numbered before a .png file with a size of 107 KB or 94 KB.  That is consistent with what you expected.  For the benefit of other readers of this e-mail letter: you thought the 0-byte .png file could occur when the .png file after it would be too tall to fit on a page of output.

I doubt that I have ever encountered a case in my use of LaTeX2HTML in which a mathematical expression in a .png file would have been too tall to fit on a page of, say, .dvi output from LaTeX.  It seems to me that a .html output file from an execution of a latex2html command ought to be just one Web page long.  If so, this is a curious thing for me.  That is, I would expect a .png file to always be shorter in height than the entire .html output file from LaTeX2HTML that uses the .png file; and I think the "Bad file descriptor" errors for the problematic .png files were generated by LaTeX2HTML 1.70 instead of LaTeX 2e.

I realize that my understandings of the operations of LaTeX and LaTeX2HTML are limited.  From running the two programs on a file of the form MyFile.tex in a terminal program as a root user I recall that LaTeX can produce a file of the form MyFile.dvi with multiple pages of output.  That design of separating the output into multiple pages is understandable when one wants the MyFile.dvi output file to be printed onto sheets of paper.

Basic questions:

1.  I often execute a command of the form "latex MyFile.tex" once or twice before executing a command of the form "latex2html.....MyFile.tex".  In this way sometimes I could be made aware of some LaTeX command errors in my file of the form MyFile.tex.  But is first running LaTeX like that absolutely necessary before running LaTeX2HTML?

2a.  How is it that LaTeX2HTML could "think" in terms of multiple pages when the .html output file appears to me to be just one, long Web page?
2b.  Does LaTeX2HTML rely on the page separations generated by LaTeX in producing a file of the form MyFile.dvi?

Okay, now I return to the problem of some .png image files containing simplified Chinese characters, .png images which are too tall for a page, assuming you are correct.  I am not sure I have ever encountered this problem for a .png file for a single mathematical expression containing only numbers, mathematical symbols, and/or just a few Greek letters and/or English words.  So the design of putting each mathematical expression or sometimes one Greek letter in one .png file is a good one because it usually avoids this problem.  Apparently the simplified Chinese characters are packaged in groups in .png files with one CJK segment per .png file.  I now see two possible ways in which the problem of too-tall, .png images containing Chinese characters could be avoided by a change in the design of some software:

I.  Make LaTeX2HTML always "think" of MyFile.html as one long page; and make it "think" of the length of that page as including all of the .png files the .html file uses.  Managed in this way, the problem of a .png file being too tall or too large in byte size should never occur, because the .html file should always be at least as "tall" (really, at least as long) as, and contain at least as many bytes as, any .png file used by the .html file.

II.  Have LaTeX2HTML assign each different simplified Chinese character used in the file MyFile.tex to its own .png file.  This would be similar to the strategy used by LaTeX2HTML for each mathematical expression or isolated Greek letter.  I guess that using a font size for a Chinese character taller than a page of corresponding MyFile.dvi output would either seldom occur or may not even be possible if such a large font size does not exist.

Meanwhile, if needed, I could in principle continue to break long {CJK} segments in a .tex file into shorter ones to avoid the problem of a .png file that is too tall.  Again, thanks for writing to me, Professor Moore.  Oh yes, some more good news: although the messages I sent to two different e-mail addresses attempting to subscribe to a CJK users group failed to be delivered, an e-mail letter I sent to a different CJK e-mail address for the purpose of discussing this problem drew what appears to have been an automatically generated response informing me that I could receive a future response.

Pat         

From: Ross Moore 
Sent: Saturday, July 31, 2010 6:35 PM
To: Pat Somerville 
Cc: <cjk at ffii.org> ; <latex2html at tug.org> 
Subject: Re: [l2h] An Apparent Byte Size Limit for a Portable Network Graphics (.png) Image File Containing Simplified Chinese Characters Produced by LaTeX2HTML From a .tex File Containing LaTeX and Chinese/Japanese/Korean (CJK) for LaTeX Commands


Hello Pat,




On 01/08/2010, at 8:02 AM, "Pat Somerville" <l_pat_s at hotmail.com> wrote:
 

  The Apparent Byte Size Limit for a Portable Network Graphics (.png) File

  However, in the file of the form MyFile.tex, apparently when the set of LaTeX commands and text between the commands \begin{CJK}{UTF8}{gbsn} and \end{CJK} was too extensive, one of the ensuing messages after entering the command of the form "latex2html........ MyFile.tex" was "Bad file descriptor" in attempting to generate the .png image.  That .png image was listed in the so-generated folder with a corresponding name of the form MyFile; but it had a size of 0 bytes.--So,


Usually when an image of size 0 bytes is produced, it is because the image was too tall for the page size being used in the images.tex LaTeX job. This causes TeX to output an extra blank page before the oversize image. It is this blank page which becomes the bad image.


You should check the images.log file to see whether this has happened. In particular the number of pages output will be greater than the expected number of images. The console log of the LaTeX2HTML job indicates how many images are to be created, so you can check this without looking into images.tex itself.
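As a rough illustration of that check, the sketch below reads the page count from the "Output written on ..." line that TeX writes near the end of a log file and compares it with the number of images expected. The function names and the sample log line are hypothetical, made up for this example; the expected image count still has to be read by hand from the LaTeX2HTML console output.

```python
import re


def pages_written(log_text):
    """Return the page count from TeX's 'Output written on ...' line, or None."""
    match = re.search(r"Output written on .+? \((\d+) pages?", log_text)
    return int(match.group(1)) if match else None


def extra_pages(log_text, expected_images):
    """Pages beyond the expected image count; a positive value suggests blank pages."""
    pages = pages_written(log_text)
    if pages is None:
        return None
    return pages - expected_images


# Hypothetical log line; in practice, read the text of images.log instead.
sample_log = "Output written on images.dvi (13 pages, 48216 bytes).\n"
print(extra_pages(sample_log, 12))  # 1
```

A positive result only hints that an extra (possibly blank) page was output; it does not say which image was oversize.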


I doubt very much that you have reached any limit on the size of a PNG file.
There are many other possible causes of the failure to rule out before appealing to such a physical limitation.


  of course, it either wasn't displayed or else a blank for it was displayed when the file of the form MyFile.html was opened in the Konqueror Web browser.  From experience the limiting size of the so-generated, yet displayable, not-empty, .png file had to have been somewhat larger than 60 kilobytes, based on the largest .png file size I recall seeing in this context without the file-size problem.  Such a large size is in stark contrast to 4.6 kilobytes, the largest size I saw for a .png image of a mathematical expression generated by LaTeX2HTML 1.70 from a .tex file containing mathematical expressions and possibly one or more Greek letters, but no simplified Chinese characters.

  A "Workaround" Solution

  By breaking the single, long, \begin{CJK}{UTF8}{gbsn}, \end{CJK} segment into several, shorter, such segments, such that no LaTeX2HTML-generated .png file had a size larger than the apparent byte limit of somewhat greater than 60 kilobytes, the .html file produced by LaTeX2HTML could contain the designed, simplified Chinese characters and mathematical content.  Then each {CJK} segment of the .tex file corresponded to one .png file.


This could certainly work, to keep the vertical height to smaller chunks.
Then each of these images would be made correctly. 
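The splitting could in principle be automated. The following is only a minimal sketch, assuming the CJK environments are not nested and that it is safe to close and reopen the environment at every blank-line paragraph break (which will not hold if a command or group spans paragraphs). The function name split_cjk_segments and the parameter max_paragraphs are hypothetical, and the sample text is invented for the example.

```python
import re

BEGIN = r"\begin{CJK}{UTF8}{gbsn}"
END = r"\end{CJK}"


def split_cjk_segments(tex, max_paragraphs=2):
    """Rewrap each CJK environment so it holds at most max_paragraphs paragraphs."""
    pattern = re.compile(re.escape(BEGIN) + r"(.*?)" + re.escape(END), re.DOTALL)

    def rewrap(match):
        # Paragraphs are taken to be separated by one or more blank lines.
        paragraphs = [p.strip() for p in re.split(r"\n\s*\n", match.group(1)) if p.strip()]
        if not paragraphs:
            return match.group(0)  # leave an empty environment untouched
        chunks = [paragraphs[i:i + max_paragraphs]
                  for i in range(0, len(paragraphs), max_paragraphs)]
        return "\n\n".join(
            BEGIN + "\n" + "\n\n".join(chunk) + "\n" + END for chunk in chunks
        )

    return pattern.sub(rewrap, tex)


# Invented sample: one environment holding three short paragraphs.
sample = BEGIN + "\n第一段\n\n第二段\n\n第三段\n" + END
print(split_cjk_segments(sample, max_paragraphs=1).count(BEGIN))  # 3
```

How small max_paragraphs needs to be depends on how tall each resulting image turns out, so the output should still be checked against images.log after a LaTeX2HTML run.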




  The above solution is much preferred over the alternative solution of breaking the long, say MyFile.tex file into files of the form MyFileA.tex, MyFileB.tex, MyFileC.tex, etc.; executing latex2html commands of the forms "latex2html......MyFileA.tex",  "latex2html......MyFileB.tex",  "latex2html......MyFileC.tex", etc., so-producing output files of the respective forms MyFileA.html, MyFileB.html, MyFileC.html, etc.; and finally appending each of those files in the order of MyFileA.html, MyFileB.html, MyFileC.html, etc., to make one long .html document.  The undesirable feature of this alternative solution is that, say, equation number 1 and the image file name img1.png could conceivably appear for each of the files of the forms MyFileA.html, MyFileB.html, MyFileC.html, etc.  So if all of the multiple img1.png files were placed in the same directory, there would likely be mistakes or problems when img1.png would be referenced by one of the .html files.  No, for one project each of the equations and .png files should have its own unique number.  And that can be arranged automatically by LaTeX2HTML by using the first solution, in which the long CJK segment in the original .tex file is broken into several CJK segments, as discussed in the first paragraph of this section.

  Unknown Origin of the Apparent .png-File Size Limitation

  Since the origin in the computer code of the apparent .png file-size limitation is unknown to me, even whether it is within the CJK for LaTeX or LaTeX2HTML code, I hope I will be able to send this e-mail letter to both the LaTeX2HTML and CJK users groups; however, so far e-mail letters sent to two e-mail addresses posted for joining the CJK users group have been returned to me as undeliverable.  Please advise me on where to make a change in one of the computer codes to accommodate a .png file size larger than the current apparent limit, which appears to be somewhat over 60 kilobytes.  Thanks in advance for help with where to make a change in some computer code to overcome this apparent limitation.


I think you are looking in the wrong place for the problem that you encountered.



  Pat  






Hope this helps.


        Ross
