[tex4ht] wheezy: how to get correct paragraph tags?

Michal Hoftich michal.h21 at gmail.com
Thu Aug 14 20:02:34 CEST 2014


Hi,

for tex4ht, it is important to use \document and \enddocument
somewhere in the document, so that the hooks for inserting the HTML head
and the closing tags for <body> and <html> can be called. See this post on
CVR's blog for a working plain TeX sample (unfortunately, the backslashes
were stripped from the post):

http://www.cvr.cc/?p=752


In fact, you don't have to use amstex; it is enough to define these
macros yourself, even as (almost) empty ones. I created a simple TeX
file, plain-4ht.tex:

-----------------
\def\documentstyle#1{}
\documentstyle{tex4ht}
\csname tex4ht\endcsname
\def\document{}
\def\enddocument{\csname bye\endcsname}
-----------------

and modified your file:

-----------------
\input plain-4ht.tex
\document
\noindent The wind blows itself out shortly after dawn; the rain lasts a
little longer: when I finally talk myself into getting out of bed at half
past seven, it is still falling, lightly now, a barely audible patter, no
longer the thudding, wind-driven sile of the night.

I draw back the curtain. The sky is gray, but the clouds are high. After
fifty years on the island, I can read the harbingers as well as do the
animals. The rain will stop soon, in an hour or two, by mid-morning at the
latest. The thought cheers me.

It is early October, and last night's storm was the first of the season.
Anyone who has been thinking of making a late-season visit to the island
should be adequately discouraged now. Yesterday's trickle of daytrippers
will be the last until March or April. Winter on the island has begun.
\enddocument
-------------------

It now compiles correctly with the command

     httex five "xhtml,html4.4ht,unicode.4ht,mathml.4ht"

--------------------

<?xml version="1.0" encoding="iso-8859-1" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!--http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd-->
<html xmlns="http://www.w3.org/1999/xhtml"
>
<head>

    <title>plains.html</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<meta name="generator" content="TeX4ht (http://www.tug.org/tex4ht/)" />
<meta name="originator" content="TeX4ht (http://www.tug.org/tex4ht/)" />
<!-- xhtml,html4.4ht,unicode.4ht,mathml.4ht,html -->
<meta name="src" content="five.tex" />
<meta name="date" content="2014-08-14 19:57:00" />
<link rel="stylesheet" type="text/css" href="five.css" />
</head><body
>
<!--l. 6--><p class="noindent" >The wind blows itself out shortly
after dawn; the rain lasts a little longer: when I finally talk myself
into getting out
of bed at half past seven, it is still falling, lightly now, a barely
audible patter, no longer the thudding, wind-driven
sile of the night.
</p><!--l. 11--><p class="indent" >    I draw back the curtain. The
sky is gray, but the clouds are high. After fifty years on the island,
I can read the
harbingers as well as do the animals. The rain will stop soon, in an
hour or two, by mid-morning at the latest. The
thought cheers me.
</p><!--l. 16--><p class="indent" >    It is early October, and last
night’s storm was the first of the season. Anyone who has been
thinking of making
a late-season visit to the island should be adequately discouraged
now. Yesterday’s trickle of daytrippers will be the
last until March or April. Winter on the island has begun.

</p>

</body></html>
------------------
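
If you want a quick check that the paragraph tags survived, a small shell
sketch like the one below counts the opening <p> tags in the generated
file. The helper name count_paragraphs is only an illustration and is not
part of tex4ht; it assumes a grep that supports -o (GNU and BSD grep do).

```shell
# count_paragraphs is a hypothetical helper, not part of tex4ht.
# It prints the number of opening <p ...> tags found in an HTML file.
count_paragraphs() {
  # grep -o prints every match on its own line, so wc -l counts matches
  # rather than matching lines; tr strips the padding some wc versions add.
  # The pattern '<p[ >]' skips closing </p> tags and tags such as <pre>.
  grep -o '<p[ >]' "$1" | wc -l | tr -d ' '
}
```

Running "count_paragraphs five.html" on the output shown above should
report 3, one per paragraph; a result of 0 would point to the problem
described in the original message.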


Best regards,
Michal

2014-08-14 19:14 GMT+02:00 D. R. Evans <doc.evans at gmail.com>:
> I don't recall ever running into this question when running *buntu, but I'm
> certainly encountering it in wheezy, and after a couple of days working on it,
> still haven't been able to work around it.
>
> I have a plain TeX file, five.tex, that compiles correctly with pdftex.
>
> My notes say that in the past, the first step in creating an epub file was to run:
>  httex five.tex "xhtml,html4.4ht,unicode.4ht,mathml.4ht"
>
> When I do that, there are no obvious errors, and five.html is created. So far
> so good, but when I look inside five.html, I see, for example:
>
> ----
>
> class="ec-lmr-10x-x-105">The wind blows itself out shortly after dawn; the
> rain lasts a little longer:</span>
> <span
> class="ec-lmr-10x-x-105">when I finally talk myself into getting out of bed at
> half past seven, it is</span>
> <span
> class="ec-lmr-10x-x-105">still falling, lightly now, a barely audible patter,
> no longer the thudding,</span>
> <span
> class="ec-lmr-10x-x-105">wind-driven sile of the night.</span>
> <span
> class="ec-lmr-10x-x-105">I draw back the curtain. The sky is gray, but the
> clouds are high. After</span>
> <span
> class="ec-lmr-10x-x-105">fifty years on the island, I can read the harbingers
> as well as do the animals.</span>
> <span
> class="ec-lmr-10x-x-105">The rain will stop soon, in an hour or two, by
> mid-morning at the latest.</span>
> <span
> class="ec-lmr-10x-x-105">The thought cheers me.</span>
> <span
> class="ec-lmr-10x-x-105">It is early October, and last night’s storm was
> the first of the season.</span>
> <span
> class="ec-lmr-10x-x-105">Anyone who has been thinking of making a late-season
> visit to the island</span>
> <span
> class="ec-lmr-10x-x-105">should be adequately discouraged now.
> Yesterday’s trickle of daytrippers</span>
> <span
> class="ec-lmr-10x-x-105">will be the last until March or April. Winter on the
> island has</span>
> <span
> class="ec-lmr-10x-x-105">begun.</span>
>
> ----
>
> The TeX source code that generates this is:
>
> ----
>
> \noindent The wind blows itself out shortly after dawn; the rain lasts a
> little longer: when I finally talk myself into getting out of bed at half
> past seven, it is still falling, lightly now, a barely audible patter, no
> longer the thudding, wind-driven sile of the night.
>
> I draw back the curtain. The sky is gray, but the clouds are high. After
> fifty years on the island, I can read the harbingers as well as do the
> animals. The rain will stop soon, in an hour or two, by mid-morning at the
> latest. The thought cheers me.
>
> It is early October, and last night's storm was the first of the season.
> Anyone who has been thinking of making a late-season visit to the island
> should be adequately discouraged now. Yesterday's trickle of daytrippers
> will be the last until March or April. Winter on the island has begun.
>
> ----
>
> The problem is that the paragraphing information seems to have been completely
> lost.
>
> If I compare this to a (different) book generated last year under Kubuntu, the
> html file looked like this:
>
> ----
>
> </p><!--l. 12302--><p class="indent" ><span
> class="ec-lmr-10x-x-105">As we walked towards the restaurant, there was a
> distinct hint of</span>
> <span
> class="ec-lmr-10x-x-105">autumn in the air. My mind went back to the dales.
> The farmers would be</span>
> <span
> class="ec-lmr-10x-x-105">on the lookout for winter up there, ready for the
> onslaught of the</span>
> <span
> class="ec-lmr-10x-x-105">coming cold. That is one thing about living in Oxford
> for which I</span>
> <span
> class="ec-lmr-10x-x-105">am grateful: real, wind-driven, bone-numbing cold is
> unknown</span>
> <span
> class="ec-lmr-10x-x-105">here, and snow is rare. Still, I miss the freedom of
> the dales and</span>
> <span
> class="ec-lmr-10x-x-105">fells.</span>
> </p><!--l. 12309--><p class="indent" ><span
> class="ec-lmr-10x-x-105">We passed Somerville and arrived at the entrance of
> </span><span
> class="ec-lmro-12x-x-87">Brown’s </span><span
> class="ec-lmr-10x-x-105">shortly</span>
> <span
> class="ec-lmr-10x-x-105">before twenty past seven. There was no queue yet, and
> I wanted to look at</span>
> <span
> class="ec-lmr-10x-x-105">the the menu outside the restaurant.</span>
> </p><!--l. 12313--><p class="indent" ><span
> class="ec-lmr-10x-x-105">“We can do that inside,” insisted
> Jonathan, and it was just as well he</span>
> <span
> class="ec-lmr-10x-x-105">did, for at least two other couples would have
> sneaked in front of us had we</span>
> <span
> class="ec-lmr-10x-x-105">dallied outside.</span>
> </p>
>
> ----
>
> As you can see, this has valid <p></p> tags to indicate the paragraphs. My
> notes don't indicate that I did anything special to make this happen.
>
> So either my notes are missing an important step, or wheezy is behaving
> differently from Kubuntu. In either case, my question is (obviously): what do
> I need to do in order to get correct paragraph tags in the resultant HTML file
> after running httex?
>
> If any of this isn't clear, please let me know and I'll try to explain further.
>
>   Doc
>
> PS I'm sure it doesn't make a difference, but this is 64-bit wheezy, with only
> standard TeX-related stuff from the official debian repositories.
>
> --
> Web:  http://www.sff.net/people/N7DR
>

