[l2h] questions about making perl style files and defining sectioning commands

Ross Moore ross@ics.mq.edu.au
Tue, 30 Jul 2002 19:46:00 +1000 (EST)


Hello Rita,

> 
>    Dear latex2html creators,
> 
>     I'm very happy with your program. And I have some questions:
> 
> 
>   - I used a \newcommand{} to define a command to be a sectioning
>     command, like:
> 
>              \newcommand{\bar}[1]{\section*{#1}...more code} 

 \bar  is already known to (La)TeX, as an accent in math-mode.
Redefining it is sure to cause problems; in LaTeX itself,
\newcommand refuses to define a name that already exists and
reports a 'Command \bar already defined' error.

However, accents are tricky beasts for LaTeX2HTML,
perhaps requiring special entity representations in the final document;
so they are treated in a special way.
Such a redefinition is bound to cause problems, since it will not
automatically remove  'bar'  from the list of accents.

When the macro expansion contains a sectioning command, this will further
compound the difficulties.



>     And latex2html could not cope with it. I think this is because
>     first all newcommands are read, then splitting on sections is
>     performed and then newcommands are processed, so for my sectioning
>     command it is too late. ( But I'm not sure, I even had to learn perl 
>     to read latex2html)
>
>     How could I solve this?

Try choosing a different name for your macro.
Perhaps then your coding will work better.

> 
>   - I would like to define some of my latex style files in perl for latex2html.
>     As examples I checked what was already there, e.g.:
> 
>         sub do_cmd_H{join('',"H", $_[0]);}  
> 
>     Why does this command get as its argument the whole rest of the textpartition?        

It is not the string itself, but a pointer to it.
The subroutine de-references this pointer and extracts the (LaTeX) arguments
that it needs.

This way knowledge of the LaTeX command resides in just one place;
that is, the  &do_cmd_<name>  subroutine.

To do it differently would require both a subroutine and extra book-keeping
about how many arguments to expect and whether there are optional arguments
and other types of patterns. In short, it would be much harder to extend
LaTeX2HTML for packages than with the current coding.
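
As a minimal sketch of that idea (not code from the LaTeX2HTML distribution):
the handler below receives the text that follows the command, strips off the
one argument it needs, and returns its HTML followed by whatever text remains.
The macro name  \myemph , the markers in $O and $C, and this simplified
$next_pair_rx are stand-ins assumed for illustration; the real patterns in
latex2html are more involved.

```perl
use strict;
use warnings;

# Assumed, simplified stand-ins for the internal brace markers.
my ($O, $C) = ('<#', '#>');

# Matches one marked argument at the start of the string: <#n#> ... <#n#>
my $next_pair_rx = qr/^\s*\Q$O\E(\d+)\Q$C\E(.*?)\Q$O\E\1\Q$C\E/s;

# Hypothetical handler for a one-argument macro \myemph{...}.
sub do_cmd_myemph {
    my ($rest) = @_;              # text following the command
    my $arg = '';
    if ($rest =~ s/$next_pair_rx//) {
        $arg = $2;                # contents of the first argument
    }
    # Return the HTML for this command plus the untouched remainder.
    return join('', '<EM>', $arg, '</EM>', $rest);
}

print do_cmd_myemph("${O}1${C}hello${O}1${C} world"), "\n";
# prints: <EM>hello</EM> world
```

Only the handler knows that  \myemph  takes exactly one argument; the caller
just hands over the following text.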

As it is, some commands *do* require extra book-keeping,
because the output can depend on a number of diverse factors.
It is desirable to keep these to a minimum.


>     For me it would be easier if commands would just get all their real LaTeX
>     arguments and in the perl command definition I would only have to define a
>     command in terms of more basic perl commands.

It would be nice if it were that simple...

>     I would think that with some greppes of $next_pair_rx until there is none more
>     at every command this would easily be accomplished. But I don't know anything
>     about speed or ...? 
>     I'm curious as to what is the reason and would like some suggestions on how 
>     to easily program latex2html style files.

 ...but HTML is much stricter than TeX when it comes to tagging rules;
that is, concerning what can be nested inside what other tags.
TeX has vertical/horizontal modes, whereas HTML has 'block-level'
and 'text-level' modes.

In LaTeX, a font-switch or color-command is orthogonal to the modes,
whereas in HTML the corresponding tags have a definite mode, and must
always be correctly nested.  So the LaTeX coding cannot be assumed
to be 'clean' with respect to HTML's rules.

Thus it is necessary to keep track of what tags are open
for the current portion of the document source.
Sometimes the order of tags needs to be altered, so that everything
is properly nested. More frequently, several tags need to be closed
then reopened; e.g. for new paragraphs caused by blank lines.
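
The close-then-reopen step can be sketched roughly as follows. This is only an
illustration under assumed names ( paragraph_break  and the tag list are
hypothetical), not the routine LaTeX2HTML itself uses:

```perl
use strict;
use warnings;

# Given the text-level tags currently open (outermost first), build the
# HTML for a paragraph break: close them innermost-first, end the
# paragraph, start a new one, then reopen them in the original order.
sub paragraph_break {
    my @open   = @_;
    my $close  = join '', map { "</$_>" } reverse @open;
    my $reopen = join '', map { "<$_>" } @open;
    return "$close</P>\n<P>$reopen";
}

print paragraph_break('B', 'I'), "\n";
# prints: </I></B></P>
#         <P><B><I>
```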



When LaTeX2HTML encounters a control-sequence, such as  \textbf ,
it calls a subroutine  &process_command  with 'textbf' as parameter.
This does some work then ultimately calls  &do_cmd_textbf  with two
arguments, the first of which is a (pointer to) the following text string. 

The 2nd argument is a (pointer to) some tagging history, that may need to
be adjusted. Most macros do *not* need this history ...
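
A toy version of this dispatch, to show the shape of the call only. The real
&process_command  does a good deal more; the bodies below and the list
@open_tags  are assumptions made for the sketch:

```perl
use strict;
use warnings;

my @open_tags;    # assumed stand-in for the tagging history

# Toy handler: reports how many tags are open, then returns the text.
sub do_cmd_textbf {
    my ($str, $ot) = @_;
    $ot = \@open_tags unless defined $ot;
    return '[bold; ' . scalar(@$ot) . ' tags open]' . $str;
}

# Look up a handler by the control-sequence name and call it with the
# following text and a reference to the tagging history.
sub process_command {
    my ($name, $after) = @_;
    my $handler = main->can("do_cmd_$name");
    return defined $handler
        ? $handler->($after, \@open_tags)
        : "\\$name$after";        # unknown command: leave it as it was
}

print process_command('textbf', '{x}'), "\n";
# prints: [bold; 0 tags open]{x}
```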

> 
>   - A question about perl:
> 
>       join('', &do_cmd_arabic("${O}0${C}chapter${O}0$C"), ".", @_[0]) }
> 
>     why is @_[0] used and not $_[0] ? 
>     I understand that the former would be an array partition of 1, but why?    
>    
>   - An other question about perl:
> 
>        local($_) = @_;    
> 
>     What does this do?  Isn't the left side a scalar, which could cause loss 
>     of arguments if the argumentlist is longer than 1 ?   

   ...yes indeed; that is intentional for the subroutines corresponding to macros
that need to make no adjustments to the current list of open tags.
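
A small stand-alone illustration of what the parentheses do (the subroutine
names here are made up for the example): with  local($_) = @_;  the left side
is a one-element list, so $_ receives the first argument and the rest are
dropped; without the parentheses the assignment is in scalar context and $_
receives the number of arguments instead.

```perl
use strict;
use warnings;

sub first_arg_only {
    local ($_) = @_;    # list assignment: $_ gets $_[0], the rest is dropped
    return $_;
}

sub arg_count {
    local $_ = @_;      # scalar assignment: $_ gets the number of arguments
    return $_;
}

print first_arg_only('some text', ['B', 'I']), "\n";   # prints: some text
print arg_count('some text', ['B', 'I']), "\n";        # prints: 2
```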


Other subroutines use both arguments; e.g.

sub do_cmd_rm { # clean
    my ($str, $ot) = @_;
    $ot = $open_tags_R unless(defined $ot);
    # ... (remainder of the subroutine omitted here)
}


>     why is @_[0] used and not $_[0] ?
>     I understand that the former would be an array partition of 1, but why?


For the special '_' name these give the same thing.
LaTeX2HTML contains coding written by various authors, who have used
different styles, and wrote when different versions of Perl were current.
Thus there is not complete consistency throughout the scripts.
It's pretty much a case of *if it ain't broke, don't fix it*.
($_[0] is the correct syntax for a single element; @_[0] is a one-element
array slice, which still works but draws a warning under  perl -w .)


> 
> 
>                              Thanks very much for any help!

Hope this helps,

	Ross Moore

    
>                                                           Rita Bijlsma
> 
> 
> -- 
>   Rita Bijlsma   BijR@oce.nl   tel: (31) 77 359 4797   loc: 3G.62.3 R&D
>   Océ-Technologies B.V. Venlo, The Netherlands      http://www.oce.com/
> 