[texhax] ODT to TeX
Pierre MacKay
pierre.mackay at comcast.net
Tue Jul 8 03:33:45 CEST 2008
Bear with me before deciding that this is hopelessly off-topic.
I believe, and certainly hope, that there is a strong future for
conversion of OpenOffice ODT format into TeX input source. Alas, it is
not easy, and I am struggling with my first attempts. An ODT file is
simply a zip archive of XML files and some other things, and it unzips
easily, producing the needed files styles.xml and content.xml, which are
then subject to transformation using XSLT.
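A minimal sketch of the unzipping step, in Python rather than a
command-line unzip. The function name extract_odt_xml is hypothetical;
the only thing assumed about the archive is that it contains members
named styles.xml and content.xml, as described above.

```python
import zipfile


def extract_odt_xml(odt, names=("styles.xml", "content.xml")):
    """Return {member name: raw bytes} for the requested members of an ODT.

    `odt` may be a path or an open binary file object; an ODT file is an
    ordinary zip archive, so the standard zipfile module can read it.
    """
    with zipfile.ZipFile(odt) as archive:
        # archive.read raises KeyError if a requested member is missing.
        return {name: archive.read(name) for name in names}
```

The returned bytes can be written to disk for xsltproc or handed
straight to an XML parser.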
I have an extremely crude, partially functional, operation that has
already saved me on several occasions. I take the increasingly strange
Word documents I am sent and run them through OpenOffice2, so as to
escape the hopelessly flaky RTF translations altogether.
Is anyone else who reads TeXhax doing the same thing? Any such converter
has to be open-ended and able to produce vanilla TeX as well as other
things like LaTeX. The last thing we should be trying to do is imitate
word-processor formatting. (I receive MSWord documents that look like
generic graduate-school papers, double-spaced for the copy-editor, and
with all sorts of horrors such as notes with explicit note number
references that have to be eliminated. The journals I set look nothing
like that.)
OpenOffice stylesheets are not simple. You do not get anything obvious
like <italic>...</italic> setting off type styles. ("Italic" is buried
as an attribute of a child node of a style declaration node.) But
ODT is reliable and stable, and XSLT ought to be able to do everything
we need.
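To make the "buried italic" concrete, here is a rough sketch in Python
(not XSLT) of the lookup involved. It assumes the usual ODF layout, in
which a style:style element has a style:text-properties child carrying
fo:font-style="italic", and text:span elements refer to the style by
name. The function names are hypothetical, style inheritance is ignored,
and TeX special characters are not escaped.

```python
import xml.etree.ElementTree as ET

# ODF namespace URIs for the three prefixes used below.
NS = {
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "fo": "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0",
}


def q(prefix, local):
    """Build the '{namespace}local' form that ElementTree uses for names."""
    return "{%s}%s" % (NS[prefix], local)


def italic_style_names(root):
    """Collect the names of styles whose text properties say italic."""
    names = set()
    for style in root.iter(q("style", "style")):
        props = style.find(q("style", "text-properties"))
        if props is not None and props.get(q("fo", "font-style")) == "italic":
            names.add(style.get(q("style", "name")))
    return names


def to_plain_tex(elem, italic):
    """Flatten an element to text, wrapping italic spans in a plain TeX group."""
    out = [elem.text or ""]
    for child in elem:
        inner = to_plain_tex(child, italic)
        is_span = child.tag == q("text", "span")
        if is_span and child.get(q("text", "style-name")) in italic:
            inner = "{\\it " + inner + "\\/}"
        out.append(inner)
        out.append(child.tail or "")
    return "".join(out)
```

An XSLT stylesheet would do the same job with a key or a lookup into the
style declarations; the point is only that the span itself carries a
style name, not the word "italic".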
If anyone knows of an open forum like TeXhax where someone like me
could ask naive questions about XSL stylesheets, that would help too.
My usual approach is:

    xsltproc -o minio_tr.tex minio_tr.xsl styles.xml content.xml

and I have updated libxml and libxslt today.
I am ready (although with some embarrassment) to send my working xsl
files to anyone who wants them.
Pierre MacKay