[texhax] ODT to TeX

Pierre MacKay pierre.mackay at comcast.net
Tue Jul 8 03:33:45 CEST 2008


Bear with me before deciding that this is hopelessly off-topic. 

I believe, and certainly hope, that there is a strong future for 
conversion of OpenOffice ODT format into TeX input source.  Alas, it is 
not easy, and I am struggling with my first attempts.  An OpenOffice ODT 
file is simply a zip archive of XML files and some other things, and it 
unzips easily, producing the needed files styles.xml and content.xml, 
which can then be transformed with XSLT. 
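
For anyone who wants to script the unzipping step, here is a minimal 
sketch in Python using only the standard zipfile module.  The function 
name extract_odt_parts and the file name in the usage note are 
hypothetical, chosen for illustration; it assumes the archive keeps 
styles.xml and content.xml at the top level.

```python
import zipfile


def extract_odt_parts(odt_path, out_dir, wanted=("styles.xml", "content.xml")):
    """Extract only the named members from an ODT (zip) archive.

    Returns the list of members actually found and extracted, in the
    order given by `wanted`. Everything else in the archive is ignored.
    """
    extracted = []
    with zipfile.ZipFile(odt_path) as odt:
        present = set(odt.namelist())
        for name in wanted:
            if name in present:
                # extract() creates out_dir if it does not exist yet
                odt.extract(name, out_dir)
                extracted.append(name)
    return extracted
```

Calling extract_odt_parts("paper.odt", "unpacked") would leave the two 
XML files in the directory unpacked, ready for the XSLT step.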

I have an extremely crude, partially functional operation that has 
already saved me on several occasions.  I take the increasingly strange 
Word documents I am sent and run them through OpenOffice 2, so as to 
escape the hopelessly flaky RTF translations altogether. 

Is anyone else who reads TeXhax doing the same thing?  Any such 
converter has to be open-ended and to produce vanilla TeX, as well as 
other things like LaTeX.  The last thing we should be trying to do is imitate 
word-processor formatting.  (I receive MSWord documents that look like 
generic graduate-school papers, double-spaced for the copy-editor, and 
with all sorts of horrors such as notes with explicit note number 
references that have to be eliminated. The journals I set look nothing 
like that.)

OpenOffice stylesheets are not simple.  You do not get anything obvious 
like <italic>. . .</italic> setting off type styles.  ("Italic" is 
buried as an attribute of a child node of a font declaration node.)  But 
ODT is reliable and stable, and XSLT ought to be able to do everything 
we need. 
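
To make the indirection concrete, here is a rough sketch in Python (not 
XSLT) of the lookup a stylesheet has to perform.  It assumes the usual 
ODF layout, in which a style:style element carries a 
style:text-properties child whose fo:font-style attribute is "italic", 
and text:span elements refer to that style by name.  The function names 
are hypothetical.  It ignores styles inherited through a parent style 
and does not escape TeX special characters.

```python
import xml.etree.ElementTree as ET

# ODF namespace URIs
NS = {
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "fo": "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}
SPAN = "{%s}span" % NS["text"]
SPAN_STYLE = "{%s}style-name" % NS["text"]


def italic_style_names(root):
    """Collect the names of styles whose text properties say italic."""
    names = set()
    for style in root.iter("{%s}style" % NS["style"]):
        props = style.find("style:text-properties", NS)
        if props is not None and props.get("{%s}font-style" % NS["fo"]) == "italic":
            names.add(style.get("{%s}name" % NS["style"]))
    return names


def to_tex(elem, italics):
    """Flatten an element to text, wrapping italic spans in {\\it ...}."""
    out = [elem.text or ""]
    for child in elem:
        inner = to_tex(child, italics)
        if child.tag == SPAN and child.get(SPAN_STYLE) in italics:
            inner = "{\\it " + inner + "}"
        out.append(inner)
        out.append(child.tail or "")  # text following the child element
    return "".join(out)
```

A span whose style name is in the italic set comes out as {\it ...}; 
spans with any other style are passed through as bare text, which is 
where further rules (bold, small caps, notes) would have to be added.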

If anyone knows of an open forum such as TeXhax, where someone like 
myself could ask naive questions about XSL stylesheets, that would help 
too. 

My usual approach is:
xsltproc  -o minio_tr.tex  minio_tr.xsl styles.xml content.xml
and I have updated libxml and libxslt today.

I am ready (although with some embarrassment) to send my working XSL 
files to anyone who wants them.

Pierre MacKay



