[tex4ht] wheezy: how to get correct paragraph tags?

D. R. Evans doc.evans at gmail.com
Thu Aug 14 19:14:17 CEST 2014


I don't recall ever running into this problem when running *buntu, but I'm
certainly encountering it in wheezy, and after a couple of days working on it,
I still haven't been able to work around it.

I have a plain TeX file, five.tex, that compiles correctly with pdftex.

My notes say that in the past, the first step in creating an epub file was to run:
 httex five.tex "xhtml,html4.4ht,unicode.4ht,mathml.4ht"

When I do that, there are no obvious errors, and five.html is created. So far
so good, but when I look inside five.html, I see, for example:

----

class="ec-lmr-10x-x-105">The wind blows itself out shortly after dawn; the
rain lasts a little longer:</span>
<span
class="ec-lmr-10x-x-105">when I finally talk myself into getting out of bed at
half past seven, it is</span>
<span
class="ec-lmr-10x-x-105">still falling, lightly now, a barely audible patter,
no longer the thudding,</span>
<span
class="ec-lmr-10x-x-105">wind-driven sile of the night.</span>
<span
class="ec-lmr-10x-x-105">I draw back the curtain. The sky is gray, but the
clouds are high. After</span>
<span
class="ec-lmr-10x-x-105">fifty years on the island, I can read the harbingers
as well as do the animals.</span>
<span
class="ec-lmr-10x-x-105">The rain will stop soon, in an hour or two, by
mid-morning at the latest.</span>
<span
class="ec-lmr-10x-x-105">The thought cheers me.</span>
<span
class="ec-lmr-10x-x-105">It is early October, and last night’s storm was
the first of the season.</span>
<span
class="ec-lmr-10x-x-105">Anyone who has been thinking of making a late-season
visit to the island</span>
<span
class="ec-lmr-10x-x-105">should be adequately discouraged now.
Yesterday’s trickle of daytrippers</span>
<span
class="ec-lmr-10x-x-105">will be the last until March or April. Winter on the
island has</span>
<span
class="ec-lmr-10x-x-105">begun.</span>

----

The TeX source code that generates this is:

----

\noindent The wind blows itself out shortly after dawn; the rain lasts a
little longer: when I finally talk myself into getting out of bed at half
past seven, it is still falling, lightly now, a barely audible patter, no
longer the thudding, wind-driven sile of the night.

I draw back the curtain. The sky is gray, but the clouds are high. After
fifty years on the island, I can read the harbingers as well as do the
animals. The rain will stop soon, in an hour or two, by mid-morning at the
latest. The thought cheers me.

It is early October, and last night's storm was the first of the season.
Anyone who has been thinking of making a late-season visit to the island
should be adequately discouraged now. Yesterday's trickle of daytrippers
will be the last until March or April. Winter on the island has begun.

----

The problem is that the paragraphing information seems to have been completely
lost.
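
A quick way to confirm the symptom on any generated file is to count the
opening <p> and <span> tags. What follows is only a minimal sketch, assuming
Python 3 is available; the names TagCounter, count_tags and report are made up
for illustration and are not part of tex4ht. It uses only the standard-library
html.parser module and says nothing about why the tags are missing.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count opening tags by name (hypothetical helper, for illustration)."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        self.counts[tag] = self.counts.get(tag, 0) + 1


def count_tags(html_text):
    """Return a dict mapping tag name to the number of opening tags seen."""
    parser = TagCounter()
    parser.feed(html_text)
    parser.close()
    return parser.counts


def report(path):
    """Read an HTML file and return (number of <p>, number of <span>)."""
    # errors="replace" is an assumption: it avoids a crash if the file
    # is not valid UTF-8, at the cost of mangling those characters.
    with open(path, encoding="utf-8", errors="replace") as fh:
        counts = count_tags(fh.read())
    return counts.get("p", 0), counts.get("span", 0)
```

Calling report("five.html") on a file like the excerpt above would be expected
to give zero for the <p> count while the <span> count is large; a file with
intact paragraphing should give a non-zero <p> count.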

Compare this to a (different) book generated last year under Kubuntu, where the
HTML file looked like this:

----

</p><!--l. 12302--><p class="indent" ><span
class="ec-lmr-10x-x-105">As we walked towards the restaurant, there was a
distinct hint of</span>
<span
class="ec-lmr-10x-x-105">autumn in the air. My mind went back to the dales.
The farmers would be</span>
<span
class="ec-lmr-10x-x-105">on the lookout for winter up there, ready for the
onslaught of the</span>
<span
class="ec-lmr-10x-x-105">coming cold. That is one thing about living in Oxford
for which I</span>
<span
class="ec-lmr-10x-x-105">am grateful: real, wind-driven, bone-numbing cold is
unknown</span>
<span
class="ec-lmr-10x-x-105">here, and snow is rare. Still, I miss the freedom of
the dales and</span>
<span
class="ec-lmr-10x-x-105">fells.</span>
</p><!--l. 12309--><p class="indent" ><span
class="ec-lmr-10x-x-105">We passed Somerville and arrived at the entrance of
</span><span
class="ec-lmro-12x-x-87">Brown’s </span><span
class="ec-lmr-10x-x-105">shortly</span>
<span
class="ec-lmr-10x-x-105">before twenty past seven. There was no queue yet, and
I wanted to look at</span>
<span
class="ec-lmr-10x-x-105">the the menu outside the restaurant.</span>
</p><!--l. 12313--><p class="indent" ><span
class="ec-lmr-10x-x-105">“We can do that inside,” insisted
Jonathan, and it was just as well he</span>
<span
class="ec-lmr-10x-x-105">did, for at least two other couples would have
sneaked in front of us had we</span>
<span
class="ec-lmr-10x-x-105">dallied outside.</span>
</p>

----

As you can see, this has valid <p></p> tags to indicate the paragraphs. My
notes don't indicate that I did anything special to make this happen.

So either my notes are missing an important step, or wheezy is behaving
differently from Kubuntu. In either case, my question is (obviously): what do
I need to do in order to get correct paragraph tags in the resultant HTML file
after running httex?

If any of this isn't clear, please let me know and I'll try to explain further.

  Doc

PS I'm sure it doesn't make a difference, but this is 64-bit wheezy, with only
standard TeX-related packages from the official Debian repositories.

-- 
Web:  http://www.sff.net/people/N7DR
