[tex4ht] validity of tex4ht HTML-code output

Johannes Wilm mail at johanneswilm.org
Thu Jul 28 11:37:26 CEST 2011

On Thu, Jul 28, 2011 at 2:10 AM, Ulrike Fischer <news3 at nililand.de> wrote:

> Am Thu, 28 Jul 2011 01:24:04 -0700 schrieb Johannes Wilm:
> >>> What I wonder though is what the state of the HTML that is being output
> >>> really is. It seems to me specifically that:
> >>> a. almost none of the <p>-tags are closed
> >> Use the xhtml-Option I mentioned earlier.
> > Ah, yes that's probably what I should have done to start out with. Now when I
> > switch html for xhtml, it somehow breaks my SVG-fixing script. I didn't know
> > that the html-option would produce partially invalid HTML, as it seems.
> It doesn't produce invalid html, it produces html 4.01:
> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> Which is different to xhtml, which you would get with the xhtml option
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
> >>> b. an element that is used a lot are "tspans" which the W3C validation
> >>> claims to not have heard about.
> http://www.w3.org/TR/SVG/text.html#TSpanElement
> > I guess everything is currently changing, like
> > bibtex -> biblatex
> bibtex is an application and biblatex a package.
Yes, I am aware of that. I translated some biblatex stuff a while back.
Still, it is a change that needs to be implemented in various places, such as
LyX, tex4ht, and any bibtex-database administration program.

>> The transition is more
>> bibtex + natbib or bibtex + jurabib or bibtex + some bib-style ->
>> bibtex + biblatex or biber + biblatex.
>> > pdftex -> luatex
>> > 8 bit -> utf-8
>> > PDF -> EPUB
>> these are formats for very different purposes. pdf is a page
>> description format, while in ebook/web-formats text can be easily
>> refloated.
I know. It's still an issue though that needs to be dealt with and is
therefore a cause of problems with getting everything to work together.

>> > PNG -> SVG
>> PNG is a bitmap format, SVG a vector format. How would you convert a
>> photo to a vector format?
I am just saying that this is a period when some things which previously were
in PNG will henceforth be in SVG. The issues with that are that the TikZ ->
SVG conversion doesn't work perfectly yet, and apparently, although SVG is part
of the EPUB specification, most EPUB readers have bad or no SVG renderers built
in. This is yet another cause of issues. I ended up writing a bunch of scripts
that would fix the SVG files, and in the end I had them convert the SVGs to
PNGs and change the corresponding parts in the HTML files. Not really all
that smooth.
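
For illustration, the last step (changing the corresponding parts in the
HTML files) could look roughly like the sketch below. This is not the
actual script; the function name and the regular expression are
hypothetical, it only handles plain src= and data= attributes, and the
SVG -> PNG conversion itself is assumed to have been done already by
some external tool.

```python
import re

# Hypothetical pattern: an attribute such as src="fig1.svg" or data='a/b.svg'.
# Group 1 is the attribute name plus '=', group 2 the quote character,
# group 3 the path without the .svg extension.
SVG_REF = re.compile(
    r"""(\b(?:src|data)\s*=\s*)(["'])([^"']+?)\.svg\2""",
    re.IGNORECASE,
)


def point_svg_refs_to_png(html: str) -> str:
    """Rewrite references to .svg files so that they point to .png files.

    Assumes a PNG with the same base name already exists next to each SVG.
    """
    def swap_extension(match: re.Match) -> str:
        attr, quote, path = match.group(1), match.group(2), match.group(3)
        return f"{attr}{quote}{path}.png{quote}"

    return SVG_REF.sub(swap_extension, html)


if __name__ == "__main__":
    sample = '<p><img src="figure1.svg" alt="plot"></p>'
    print(point_svg_refs_to_png(sample))
```

A regular expression is enough for a sketch like this, but it leaves
other attributes untouched (for example a type="image/svg+xml" on an
object element), so real output would need more care.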

>> > Kile -> LyX
>> I will certainly never use LyX, and I don't know any TeX/LaTeX/ConTeXt
>> expert who considers such a step.
Yeah, well, I wrote in Kile for ages. Yet I need to communicate with the
world around me. Professors cannot really be trained to use anything other
than Word. Proofreaders (mine at least) I have managed to move onto
LyX/Dropbox, and that was a challenge. When editing a student journal
together with others, I can either choose (as I have in the past) to have all
the other editors send me the articles in Word format, which I then convert
to LaTeX by hand, whereupon for about 4 months I receive about 15 emails a
day with some 20+ instructions in each, of the type "On page 382 in the very
last paragraph, there is a comma missing. You'll see it when you read the

Or, alternatively, I can force everybody else to write and maintain
everything in LyX in a Dropbox folder, with me holding one central
master file (in LaTeX) and my monster-script to auto-build the various
output formats. This is what I plan on doing from now on. I can live with
the fact that I don't have as much control as I used to with Kile. LyX has
really only recently become easy enough to install and usable enough for
novices that I find it safe to try this.

Anyways, it doesn't really matter. I was just rationalizing how the current
state of the one and only state-of-the-art open source publishing software
is what it is.

Johannes Wilm
tel: +1 (520) 399 8880