[tex4ht] [bug #340] Math issues in the ODF export

Michal Hoftich michal.h21 at gmail.com
Wed Nov 23 22:52:10 CET 2016


Hi Bill,

> 
> For MathML I think it's better -- and sometimes more
> inter-operable -- simply to insert the full MathML namespace
> via the xmlns attribute in each <math> element and avoid
> prefixing altogether.  Of course, if inside a <math> element
> you want to insert something that is not MathML, that (a) is
> probably going to cause a problem somewhere and (b) would
> require either prefixing or inserting an xmlns attribute on
> each external subelement.
> 
> With HTML5 (as text/html) using prefixes will cause
> breakage.  Using an xmlns attribute on a <math> element in
> HTML5 is unnecessary but should go as unrecognized and be
> ignored by web browsers.  It is helpful when one wants the
> code to work for both the text/html and the
> application/xhtml+xml serializations of HTML5.  I'm guessing
> that this technique would be good for xtpipes.
> 

The prefixes are used only in the OpenOffice output (and probably in other
XML-based formats, I guess), not in HTML. I think that Eitan used
them because everything in the main XML file of an ODF document is
prefixed. But each math instance is saved in a standalone XML file and
included as a picture from the main document, so it seems that prefixing
is not necessary. At least in the examples I've found, the prefixes
weren't used.
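
To illustrate the difference between the prefixed and the unprefixed form,
here is a minimal Python sketch. It is not what tex4ht or xtpipes does; the
function name unprefix_mathml and the sample input are made up for the
illustration. It only uses the standard xml.etree.ElementTree module to
re-serialize a prefixed MathML fragment with the namespace declared once via
the xmlns attribute.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def unprefix_mathml(xml_text):
    """Re-serialize MathML so that it uses a default namespace, not a prefix.

    Hypothetical helper for illustration only. Note that
    register_namespace changes module-wide state in ElementTree.
    """
    # An empty prefix makes the serializer emit xmlns="..." on the root
    # element and plain element names everywhere else.
    ET.register_namespace("", MATHML_NS)
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


# Made-up sample input using a "math:" prefix on every element.
prefixed = (
    '<math:math xmlns:math="http://www.w3.org/1998/Math/MathML">'
    "<math:mi>x</math:mi>"
    "</math:math>"
)
print(unprefix_mathml(prefixed))
```

The element names and the namespace stay the same; only the serialization
changes, so a namespace-aware consumer should treat both forms alike.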

Unfortunately, we still don't have HTML5 output. The basic structure
shouldn't be that hard to support, but what about the new semantic and
accessibility attributes? Some interesting ideas are contained in the
Scholarly HTML format [1], although it is a little too prescriptive for my
taste. For example, an annotation in TeX format is required for every math
instance. This is doable in tex4ht, but not easy.
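
For reference, a TeX annotation in MathML is normally carried by a
<semantics> element with an <annotation> child. The Python sketch below shows
only that target structure. It assumes the TeX source is already available as
a string, which is exactly the hard part in tex4ht. The function name
add_tex_annotation is made up, and the "application/x-tex" encoding value is
the commonly used convention rather than something taken from this thread.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def add_tex_annotation(mathml_text, tex_source):
    """Wrap the content of a <math> element in <semantics> plus a TeX annotation.

    Hypothetical helper for illustration only.
    """
    ET.register_namespace("", MATHML_NS)
    math = ET.fromstring(mathml_text)

    # Detach the existing presentation markup from <math>.
    children = list(math)
    for child in children:
        math.remove(child)

    semantics = ET.SubElement(math, f"{{{MATHML_NS}}}semantics")
    # The first child of <semantics> must be a single element, so several
    # siblings are grouped in an <mrow>.
    if len(children) == 1:
        semantics.append(children[0])
    else:
        row = ET.SubElement(semantics, f"{{{MATHML_NS}}}mrow")
        row.extend(children)

    annotation = ET.SubElement(
        semantics,
        f"{{{MATHML_NS}}}annotation",
        {"encoding": "application/x-tex"},
    )
    annotation.text = tex_source
    return ET.tostring(math, encoding="unicode")


# Made-up sample: MathML for x squared and its assumed TeX source.
sample = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<msup><mi>x</mi><mn>2</mn></msup>"
    "</math>"
)
print(add_tex_annotation(sample, "x^2"))
```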


> 
> And, alas, failing to include presentation mathml in the
> html namespace was a way to make support of mathml in web
> browsers have secondary importance.  There are problems in
> browser handling of HTML5 that result from having MathML be
> external.  For example, a comma following an inline <math>
> element may wind up on the following line, whereas a comma
> following an (inline) <em> element will not.
> 

Sure, it seems that no one who creates reading applications likes
MathML. Some issues in browsers can at least be fixed using MathJax,
but this is not an option in office suites.

MathML support is really tragic in EPUB 3 readers, even though MathML is
part of the standard. Those that support it do so using MathJax [2]. This was
painfully slow the last time I tried one of these applications on my phone.

One interesting idea is to use MathJax to convert MathML to HTML; it
uses some CSS tricks to display the math correctly. MathJax is now
supported in Node.js [3], with a CLI tool for such conversion. I've played
with it a bit and created a library for make4ht [4]. It can be used with
the following make4ht build file:


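---------------
-- load the mathnode library [4]
local mathjax_node = require "mathnode"

-- add an htlatex run to the build sequence
Make:htlatex{}

-- run mathnode on every generated file whose name ends with "html";
-- the font directory and the font format are both "woff"
local format = "woff"
Make:match("html$", mathjax_node, {fontdir = format, fontformat = format})
---------------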

It assumes that the MathJax fonts in "woff" format are stored in the "woff"
directory. You can see a sample result here [5].

I was also able to compile the previous example to an EPUB file, which
could be displayed in readers with modern CSS support. I also tried to
convert that EPUB file to the Kindle format, but it didn't work, as
expected.


Best regards,
Michal


[1] https://w3c.github.io/scholarly-html/
[2] http://docs.mathjax.org/en/latest/misc/epub.html
[3] https://github.com/mathjax/MathJax-node
[4] https://github.com/michal-h21/make4ht/blob/master/mathnode.lua
[5] http://michal-h21.github.io/mathjaxsample/sample.html


