[texhax] lwarp package — Native LaTeX to HTML conversion

Shubho Roy shubho.roy85 at gmail.com
Mon Mar 21 06:30:57 CET 2016

Can this package be on GitHub?

On Mon, Mar 21, 2016 at 7:58 AM, Deyan Ginev <d.ginev at jacobs-university.de> wrote:

> Dear all,
> Since I noticed there is a brief tex -> html discussion going on, let me
> add one more perspective here. I'm a contributor to the LaTeXML
> conversion project, which I am guessing is the one implied by the
> reference to "Perl" in Uwe's last email.
> On 20.03.2016 20:08, Uwe Lueck wrote:
> >> This is a LaTeX package which causes LaTeX to directly generate HTML tags,
> >> using pdftotext and a few other utilities to convert the resulting PDF file
> >> into HTML files.
> > The approach is interesting, yet if you convert LaTeX to PDF
> > and the result to HTML, the meaning of "direct" forbids calling
> > this a "direct" generation of HTML. It is just as "direct" as
> > the late Eitan Gurari's tex4ht.
> Another point to sneak in here is that, now that libraries such as "pdf.js"
> exist and browsers have native PDF previews via the HTML5 canvas,
> having a TeX-near mapping from PDF to HTML may have little added value.
> In a way it is only worth the effort if it produces a "better" HTML5
> document, where "better" is parametric in the goal the author is trying
> to achieve (e.g. eBooks have different requirements than web sites or,
> more modern still, interactive exercise sheets).
> > I have never used tex4ht, but my impression is that this is the
> > most promising way to get HTML from LaTeX. So you should tell
> > what lwarp offers that tex4ht doesn't.
> Well, that measure of being "promising" depends on the goal you're
> trying to achieve...
> > Now as to "native": There are LaTeX-to-HTML converters that use
> > Perl or things like this (Pandoc). For those I could accept
> > calling them "direct" conversion, but not "native", as they use
> > external software for the conversion rather than the LaTeX
> > typesetting system.
> This seems to be a fair distinction.
> >  A problem with this approach is that the
> > author's custom macros cannot be processed.
> Well, that's not entirely true. It's definitely true for early stage
> reimplementations (such as pandoc), which are quite limited in coverage.
> LaTeXML on the other hand already supports an impressive subset of TeX.
> A brief illustration could be the classic xii.tex example [1].
> It is certainly not complete in coverage yet, with about 61% of
> arXiv.org's academic sources converting without issues [2]. But it's
> also definitely not a toy project at this stage.
> That being said, I find lwarp.sty to be a fascinating project, and the
> "Alternatives" section in its manual is a rather on-point overview of
> the current solution attempts [3]. Looking at the implementation
> details, it seems that much of the inevitable customization is again
> present - in order to resolve the "impedance mismatch" between the
> printed page and the hypertext webpage, various high level constructs
> need to be mapped correctly over to the HTML, before TeX gets to have
> its way with them. (I had some more thoughts in that vein in an old blog
> post of mine [4]).
> I think each conversion process has done significant work in that
> direction, and I find myself wishing that we could reuse and exchange
> our bindings more effectively between projects.
> > I should not advertise my blog package in the morehype bundle
> >
> >     http://ctan.org/pkg/morehype
> >
> > (at present) but the original posting provokes the question
> > of what "native" conversion of LaTeX to HTML could be:
> > With blog.sty, the source code is actually parsed by LaTeX
> > (I consider it so perverse to parse LaTeX source code
> >  by non-TeX software), and it "directly" \writes HTML,
> > the TeX macros expand to HTML.
> Right, there is a clear difference of perspective there. My view is that
> LaTeX's markup should need no further extensions to generate (semantic,
> high quality) HTML5, as the alternative increases the already high
> learning curve, and makes a highly technical writing experience even
> more involved. For me, having to reeducate the entirety of
> the latex authoring world to write "web-friendly" latex is a worse
> approach than silently offering a solution under the hood. And that is
> something I really appreciate in lwarp's current vision, as Brian seems
> to be sheltering the authors as much as possible.
> I think there is an interesting design problem there, and I am always
> curious to see the trade-offs that each of these projects decides on.
> I'm curious to learn more about the differences between lwarp and
> tex4ht, the amount of work that it took to get to the current stage, and
> the estimated difficulty for adding support for new packages. Looking at
> lwarp's support list, I see that TikZ -> SVG support is already
> operational. Is that building on the tex4ht driver for SVG directly?
> Wishing everyone a nice week ahead,
> Deyan
> [1] xii.tex example in the latexml showcase
> http://latexml.mathweb.org/editor?demo=xii
> [2] arXMLiv conversion status
> http://cortex.mathweb.org/corpus/arXMLiv/tex_to_html
> Note: the 61% figure here is missing a baseline, since it's unclear how
> many of these sources run error-free with a stock pdflatex installation.
> Apologies for the hanging number.
> [3] lwarp 0.12 manual
> http://bdtechconcepts.com/portfolio/lwarp_v0_12.pdf
> [4] "LaTeX is Dead (long live LaTeX)" blog post
> http://prodg.org/blog/latex_today/2015-02-20/LaTeX%20is%20Dead,%20Long%20Live%20LaTeX
> >
> > Cheers,
> >
> >     Uwe.
> >
> > _______________________________________________
> > TeX FAQ: http://www.tex.ac.uk/faq
> > Mailing list archives: http://tug.org/pipermail/texhax/
> > More links: http://tug.org/begin.html
> >
> > Automated subscription management: http://tug.org/mailman/listinfo/texhax
> > Human mailing list managers: postmaster at tug.org

Shubho Roy
National Institute of Public Finance and Policy
18/2 Satsang Vihar Marg,
Special Institutional Area,
New Delhi 110067.
[Near JNU East Gate]

Mobile:- +91-9716479606
Location:- http://goo.gl/ICCjh