# [OS X TeX] Preparing large non-tex text for use in latex

Eric van der Oord eric.vanderoord at gmail.com
Tue Jul 5 08:07:23 CEST 2011

```If you use TeXShop, you can run the following AppleScript macro to apply sed to the selected text of a TeXShop document:

--applescript
tell application "System Events"
tell process "TeXShop"
end tell
end tell

do shell script " pbpaste > texfmr ;   sed -i 1 -f ~/Library/conversion/foo.txt texfmr ; pbcopy < texfmr ; rm texfmr ; rm texfmr1"

tell application "System Events"
tell process "TeXShop"
end tell
end tell

--

~/Library/conversion/foo.txt contains the sed instructions.

texfmr is an auxiliary file.

In this script the menu item names are in French.

Eric

On 5 Jul 2011, at 01:43, Michael Sharpe wrote:

>
> On Jun 30, 2011, at 2:33 PM, Bobby Cheren wrote:
>
>> I am working on writing a casebook with a law professor. This involves lots and lots of text in the form of cases and articles. This process reveals how achingly painful it is to prep text for use in TeX by adjusting quotation marks, adding \ in front of $ and &, and replacing section symbols with \S \. Has anyone built a utility for cleaning/preparing text? I wrote the following sed code that seems to get the job done:
>>
>> s/‘/'/g
>> s/'\([a-z]\)/`\1/g
>> s/'\([A-Z]\)/`\1/g
>>
>> s/“/"/g
>> s/”/"/g
>> s/"/''/g
>> s/''\([a-z]\)/``\1/g
>> s/''\([A-Z]\)/``\1/g
>>
>> s/```/``\\,`/g
>> s/'''/'\\,''/g
>>
>> s/§/ \\S \\ /g
>> s/&/\\&/g
>> s/\$/\\$/g
>>
>> I downloaded the texhelpers sed gui and set this to the default command. The process now requires I create a file, put it in a sed input folder, and then retrieve the text from the file created in the sed output folder. Not exactly a clean process.
>>
>> Any thoughts out there on this issue? The ability to batch prepare text would make tasks like LaTeX-ing public domain books and law cases easy.
>>
>
> It's very hard to get such conversions right, even in a good majority of cases, so you have to be prepared to read the output carefully, no matter what you use for conversion. Have you looked at pandoc? It's a freeware format conversion tool written in Haskell, which you need to install (that's not hard) and then install pandoc---directions for that part are at
>
> http://johnmacfarlane.net/pandoc/
>
> You tell it to convert from markdown format (which includes plain text) and output to latex. It does not convert the section symbol to \S, but that's simple. It does do a good job with quotes of both types, but only seems to recognize two linefeeds as the end of a paragraph and translates a single linefeed to \\.
>
> Of course, keeping track of the sources and referencing them properly is quite a job. A good start might be an Automator workflow to batch convert all *.txt files in a folder to .tex and constructing suitable \input lines in the clipboard.
>
> Michael

```
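The sed rules and the clipboard pipeline discussed in this thread can be combined into one small shell filter. What follows is only a sketch under stated assumptions, not the script either poster used: the function name `tex_prep` is made up for illustration, the section sign is rendered as `\S{}` rather than the ` \S \ ` form in the quoted rules, and a straight quote counts as an opening quote only at the start of a line or after whitespace or an opening parenthesis, so that contractions such as "isn't" keep their apostrophe. The thin-space rules for directly nested quotes are left out, and, as noted in the thread, the output still needs to be proofread.

```shell
#!/bin/sh
# tex_prep (hypothetical name): read plain text on stdin and write
# LaTeX-ready text on stdout. A sketch only; proofread the result.
tex_prep() {
    rules=$(mktemp) || return 1
    # The quoted 'EOF' keeps the shell from expanding anything in the rules.
    cat > "$rules" <<'EOF'
s/“/``/g
s/”/''/g
s/‘/`/g
s/’/'/g
s/^'/`/
s/\([[:space:]("]\)'/\1`/g
s/^"/``/
s/\([[:space:](]\)"/\1``/g
s/"/''/g
s/§/\\S{}/g
s/[&$]/\\&/g
EOF
    sed -f "$rules"      # sed reads the function's stdin
    rc=$?
    rm -f "$rules"
    return $rc
}

# On a Mac, the clipboard round trip from the macro above would then be:
#   pbpaste | tex_prep | pbcopy
```

Keeping the rules in a temporary file mirrors the `sed -f` approach of the macro while avoiding shell quoting problems with backticks and apostrophes. Because the function is a plain stdin-to-stdout filter, it needs no auxiliary file for the text itself and can also be applied to whole files, for example `tex_prep < case.txt > case.tex`.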