[OS X TeX] Scripting

Joshua Kuritzky joshua at kuritzky.com
Sun Jun 30 04:37:39 CEST 2002



Adrian:

If you decide not to go the rtf2latex route, consider saving the Word 
document as a Web Page and then using something like Perl to parse the 
resulting .html file. The latest versions of Word for both Mac and 
Windows have been designed to use .doc and .html files somewhat 
interchangeably. This is from the Word v.X online help:

> Save entire file into HTML: This option saves all of the document's 
> properties into the HTML file. Use this option if you have Word 
> specific elements in your document, such as comments, header or footer 
> information, or document properties that you want to maintain but won't 
> appear when the document is viewed in a Web browser. When you select 
> this option, Word retains all of the special elements and attributes 
> contained in the file.

In other words, if you save the Word file as an .html file and make sure 
that you select "Save entire file into HTML", then all of the 
Word-specific attributes will be embedded in the .html file (through 
what appears to be a lot of complex-looking XML).

From this point, you'll need to figure out which tags you want 
to look for and devise a script to convert them to LaTeX. Not the 
simplest task, but by no means impossible.
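
To give a rough idea of what such a script might look like, here is a 
minimal sketch. It is written in Python rather than Perl, purely for 
illustration, and it rests on assumptions I have not checked against 
real Word output: that italic and bold runs show up as <i>/<em> and 
<b>/<strong> elements, and that everything inside <head>, <style>, and 
<script> can be thrown away. Word may well express some formatting 
through style attributes instead, which this does not handle. The names 
TAG_TO_LATEX, WordHtmlToLatex, and html_to_latex are made up for the 
example.

```python
from html.parser import HTMLParser

# Assumed mapping from HTML tags to LaTeX commands; extend as needed.
TAG_TO_LATEX = {
    "i": "\\textit",
    "em": "\\textit",
    "b": "\\textbf",
    "strong": "\\textbf",
}

# Characters that LaTeX treats specially in ordinary body text.
LATEX_ESCAPES = {
    "\\": "\\textbackslash{}",
    "&": "\\&", "%": "\\%", "$": "\\$", "#": "\\#",
    "_": "\\_", "{": "\\{", "}": "\\}",
    "~": "\\textasciitilde{}", "^": "\\textasciicircum{}",
}


def escape_latex(text):
    """Escape LaTeX special characters one character at a time."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


class WordHtmlToLatex(HTMLParser):
    """Hypothetical converter: keeps text, maps a few tags to LaTeX."""

    # Elements whose contents are dropped entirely (an assumption).
    SKIP = {"head", "style", "script"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in TAG_TO_LATEX:
            self.out.append(TAG_TO_LATEX[tag] + "{")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            if tag in TAG_TO_LATEX:
                self.out.append("}")
            elif tag == "p":
                # End of a paragraph becomes a blank line in LaTeX.
                self.out.append("\n\n")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(escape_latex(data))


def html_to_latex(html):
    """Convert an HTML string to a LaTeX fragment."""
    parser = WordHtmlToLatex()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()
```

You would read the saved .html file into a string, pass it to 
html_to_latex(), and paste the result into the body of a LaTeX 
document. Comments in the HTML are ignored by the parser; anything else 
you want to keep (footnotes, headings, and so on) would need its own 
entry or handler.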

Good luck,
-Joshua


On Saturday, June 29, 2002, at 12:38  AM, Adrian Heathcote wrote:

> Hi Folks
>
> I want to write a script which will take a Word file and replace some 
> of the most common character formatting in that file with LaTeX 
> commands. So, for example, if a string of characters is italicized in 
> Word, the script will replace it with that string surrounded by 
> \textit{}. And so on for some other bits of formatting that will not 
> convert with find/replace in Word.
>
> Does anyone have any suggestions, for a scripting novice, of which 
> scripting language would best accomplish this? Should I learn Perl, 
> AppleScript or what?
>
> Thanks in advance
>
> Adrian Heathcote
>
>
> -----------------------------------------------------------------
> Threaded list archives can be found at:
> <http://www.masda.vxu.se/~pku/MacOSX_TeX/>
> -----------------------------------------------------------------
> To UNSUBSCRIBE, send email to <info at email.esm.psu.edu> with
> "unsubscribe macosx-tex" (no quotes) in the body.
> For additional HELP, send email to <info at email.esm.psu.edu> with
> "help" (no quotes) in the body.
> -----------------------------------------------------------------
>
--
Joshua Kuritzky
joshua at kuritzky.com

