[tex4ht] Bug identified: tex4ht to .odt (also identifying bugs in MiKTeX's setup)

吳聰敏 ntut019 at ntu.edu.tw
Sat Jun 4 05:21:28 CEST 2016

MWE (test.tex):

Hello World!

Some math: $\alpha$ and $\beta$


1. htlatex test "xhtml,ooffice" "ooffice/! -cmozhtf" "-coo -cvalidate"
2. Use LibreOffice to open the test.odt.
   Save as .docx if you want to read it with Microsoft Word.

Running the first command will produce a test.odt, but if you open the file with
LibreOffice, you will see an error.

Before describing where this error comes from, I will first discuss some bugs in
MiKTeX's current setup of tex4ht. The first set of bugs concerns how subdirectories are specified.

Open tex4ht.env (in MiKTeX's subdirectory) with an editor and make two replacements:

Replace "%%~/texmf-dist/tex4ht/bin/tex4ht.jar" with "c:\PROGRA~2\MIKTEX~1.9\tex4ht\bin\tex4ht.jar"
Replace "%%~/texmf-dist/tex4ht/xtpipes/" with "c:\PROGRA~2\MIKTEX~1.9\tex4ht\xtpipes\"

Each string occurs 12 times, and every occurrence must be replaced.
This changes the directory format from TeX Live's to MiKTeX's.
(If you use the 64-bit version of MiKTeX, "c:\PROGRA~2\MIKTEX ..." should be "c:\PROGRA~1\MIKTEX ...")
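
The two replacements can also be scripted rather than done by hand. The following is only a sketch in Python: the function name to_miktex_paths and its root parameter are hypothetical, and the default root is the 32-bit path quoted above, so adjust it to your own installation. It returns the rewritten text together with the number of occurrences it found for each string, which can be compared against the 12 instances expected.

```python
def to_miktex_paths(env_text, root="c:\\PROGRA~2\\MIKTEX~1.9"):
    """Rewrite the two TeX Live style paths in tex4ht.env text to MiKTeX style.

    `root` is an assumption: the 32-bit MiKTeX location quoted in the post.
    Returns (new_text, counts), where counts maps each old string to the
    number of occurrences that were replaced.
    """
    replacements = {
        "%%~/texmf-dist/tex4ht/bin/tex4ht.jar": root + "\\tex4ht\\bin\\tex4ht.jar",
        "%%~/texmf-dist/tex4ht/xtpipes/": root + "\\tex4ht\\xtpipes\\",
    }
    counts = {}
    for old, new in replacements.items():
        counts[old] = env_text.count(old)
        env_text = env_text.replace(old, new)
    return env_text, counts
```

To use it, read tex4ht.env into a string, pass it to to_miktex_paths, and write the result back. Keeping a backup copy of the original file first is advisable.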

There is another bug in the current MiKTeX system (2016.6):
the first command line above needs a space after "xhtml,":

1. htlatex test "xhtml, ooffice" "ooffice/! -cmozhtf" "-coo -cvalidate"

If you don't leave a space, MiKTeX's latex gets confused and fails to run.

Now back to the bug in the tex4ht system itself.

After running the first command line, you will get test.odt.
Open the file with LibreOffice and you will see an error message telling you
that something is wrong in styles.xml (line 94, column 128).
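
The reported location can be cross-checked without opening LibreOffice. The sketch below uses only Python's standard library; the function name xml_error_position is hypothetical. It reads styles.xml straight out of the .odt archive and returns the position of the first XML parse error, if any. The column it reports may be counted differently from LibreOffice's, so treat it as approximate.

```python
import xml.etree.ElementTree as ET
import zipfile


def xml_error_position(odt_file, member="styles.xml"):
    """Return (line, column) of the first XML parse error in `member`,
    or None if that member parses as well-formed XML."""
    with zipfile.ZipFile(odt_file) as archive:
        data = archive.read(member)
    try:
        ET.fromstring(data)
    except ET.ParseError as err:
        # Position as reported by the underlying expat parser.
        return err.position
    return None
```

For example, xml_error_position("test.odt") should return None once the file has been repaired.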

test.odt is a zip file. When I unzip test.odt and open styles.xml,
my editor gives the following message:

"Some characters were lost during the conversion"

I go to line 94 and check the end of the line:

text:bullet-char="? >

I guess it should be something like:  text:bullet-char="." >

Anyway, following the example of line 90 of styles.xml, I change it to

text:bullet-char="-" >

After I save it back, test.odt opens in LibreOffice without error.
I can also save the file as test.docx and read it with Microsoft Word.
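
The manual repair above can be sketched as a small Python script. This is an illustration, not part of tex4ht: the names patch_styles and patch_odt are hypothetical, and the pattern assumes that the damaged bullet-char value consists only of non-ASCII bytes, possibly with its closing quote missing. A value that is valid UTF-8 and properly quoted is left alone. The archive members are copied in their original order with their original compression settings, since an .odt package expects its mimetype entry to come first and uncompressed.

```python
import re
import zipfile

# A bullet-char attribute whose value is made only of non-ASCII bytes,
# with or without its closing quote (assumed shape of the damage).
BULLET = re.compile(rb'text:bullet-char="([^\x00-\x7f]+)("?)')


def _fix(match):
    value, quote = match.group(1), match.group(2)
    try:
        value.decode("utf-8")
        valid = bool(quote)  # valid only if the closing quote is present too
    except UnicodeDecodeError:
        valid = False
    # Keep good values; replace damaged ones with the "-" used in the post.
    return match.group(0) if valid else b'text:bullet-char="-"'


def patch_styles(xml_bytes):
    """Replace damaged text:bullet-char values in raw styles.xml bytes."""
    return BULLET.sub(_fix, xml_bytes)


def patch_odt(src, dst):
    """Copy the archive `src` to `dst`, repairing styles.xml on the way."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "styles.xml":
                data = patch_styles(data)
            # Passing the ZipInfo keeps each member's name and compression type.
            zout.writestr(item, data)
```

A call such as patch_odt("test.odt", "test-fixed.odt") writes the repaired copy to a new file (the output name is just an example) and leaves the original untouched.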

Hope this helps.

Tsong-Min Wu
National Taiwan University