[l2h] lowercase_tags does attributes imperfectly

Stephen Gildea gildea@stop.mail-abuse.org
Fri, 15 Nov 2002 20:47:48 -0500


The regular expression in &lowercase_tags handles neither multi-word
attribute values nor an attribute that follows a minimized attribute.
For example, it turns this:

<link rel="ToC" title="Lessons Table of Contents." href="./" ONE TWO>

into this:

<link rel="ToC" title="Lessons table of Contents." href="./" one TWO>

Note that the word "Table" inside the title value has been incorrectly
lowercased, while the attribute TWO has not been lowercased at all.
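
To reproduce this outside latex2html, something like the following
stand-alone sketch should do.  The sub name lowercase_tags_old and the
"defined" guard on the outer $2 are my additions so the script runs
cleanly under strict and warnings; the inner substitution is the
unpatched one.

```perl
use strict;
use warnings;

# Stand-alone copy of the unpatched substitution.  The name
# lowercase_tags_old is hypothetical, chosen for this demonstration.
sub lowercase_tags_old {
    my ($tags, $attribs);
    $_[0] =~ s!<(/?\w+)( [^>]*)?>!
        $tags = $1; $attribs = defined($2) ? $2 : '';
        $attribs =~ s/ ([\w\d-]+)(=| |$)/' '.lc($1).$2/eg;
        join('', '<', lc($tags), $attribs, '>')!eg;
}

my $html = '<link rel="ToC" title="Lessons Table of Contents." href="./" ONE TWO>';
lowercase_tags_old($html);
print "$html\n";
```

Both symptoms come from the same place: the pattern matches any
space-delimited word, so " Table " inside the quoted value qualifies,
and it consumes the trailing space, so the word right after a match
(TWO after ONE) no longer has a leading space to match on.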


The following diff against v2002-2-1 (1.70) corrects this problem.
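
As a rough stand-alone check of the lowercase_tags hunk in the diff,
here is a sketch using the new substitution.  The sub name
lowercase_tags_new and the "defined" guards are my additions (the
second guard only silences a warning when a minimized attribute leaves
the inner $2 undefined); they are not part of the patch itself.

```perl
use strict;
use warnings;

# Stand-alone copy of the patched substitution.  The name
# lowercase_tags_new is hypothetical, chosen for this demonstration.
sub lowercase_tags_new {
    my ($tags, $attribs);
    $_[0] =~ s!<(/?\w+)( [^>]*)?>!
        $tags = $1; $attribs = defined($2) ? $2 : '';
        $attribs =~ s/ ([\w\d-]+)(=([^\"][^ ]*|\"[^\"]*\"))?/' '.lc($1).(defined($2) ? $2 : '')/eg;
        join('', '<', lc($tags), $attribs, '>')!eg;
}

my $link = '<link rel="ToC" title="Lessons Table of Contents." href="./" ONE TWO>';
lowercase_tags_new($link);
print "$link\n";

my $table = '<TABLE WIDTH="90%" CELLSPACING=15>';
lowercase_tags_new($table);
print "$table\n";
```

The new pattern consumes the attribute name together with its whole
value, quoted or unquoted, so words inside a quoted value are never
candidates for a match, and no trailing space is consumed, so the
attribute after a minimized one is still found.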

Running lowercase_tags over the document is a bit of a kludge; the
output should use lowercase tags by default.  This diff takes another
(small) step in that direction by changing some of the generated tags
(mostly those in the head) to lowercase.

Finally, this diff puts quotes around a table attribute value
(CELLSPACING=15) that was missing them.


--- latex2html-2002-2-1/latex2html.pin	Fri Aug 23 01:15:01 2002
+++ latex2html.pin	Fri Nov 15 11:16:18 2002
@@ -7125,18 +7125,18 @@ sub make_head_and_body {
 	, "* revised and updated by:  Marcus Hennecke, Ross Moore, Herb Swan"
 	, "* with significant contributions from:"
 	, "  Jens Lippmann, Marek Rouchal, Martin Wilck and others"
-	    . " -->\n<HTML>\n<HEAD>\n<TITLE>".$title."</TITLE>"
+	    . " -->\n<html>\n<head>\n<title>".$title."</title>"
 	, &meta_information($title)
 	,  ($CHARSET && $HTML_VERSION ge "2.1" ? 
-	      "<META HTTP-EQUIV=\"Content-Type\" CONTENT=\"text/html; charset=$this_charset\">" 
+	      "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=$this_charset\">" 
 	      : "" )
 	, $LATEX2HTML_META
-	, ($BASE ? "<BASE HREF=\"$BASE\">" : "" )
+	, ($BASE ? "<base href=\"$BASE\">" : "" )
 	, $STYLESHEET_CASCADE
-	, ($STYLESHEET ? "<LINK REL=\"STYLESHEET\" HREF=\"$STYLESHEET\">" : '' )
+	, ($STYLESHEET ? "<link rel=\"STYLESHEET\" href=\"$STYLESHEET\">" : '' )
 	, $more_links_mark
-	, "</HEAD>" , ($before_body? $before_body : '')
-	, "<BODY $body>", '');
+	, "</head>" , ($before_body? $before_body : '')
+	, "<body $body>", '');
 }
 
 
@@ -7235,7 +7235,7 @@ sub clear_styleID {
 
 sub make_address { 
     local($addr) = &make_real_address(@_);
-    $addr .= "\n</BODY>\n</HTML>\n";
+    $addr .= "\n</body>\n</html>\n";
     &lowercase_tags($addr) if $LOWER_CASE_TAGS;
     $addr;
 }
@@ -7528,7 +7528,7 @@ sub lowercase_tags {
     my ($tags,$attribs);
     $_[0] =~ s!<(/?\w+)( [^>]*)?>!
 	$tags = $1; $attribs = $2;
-	$attribs =~ s/ ([\w\d-]+)(=| |$)/' '.lc($1).$2/eg;
+	$attribs =~ s/ ([\w\d-]+)(=([^\"][^ ]*|\"[^\"]*\"))?/' '.lc($1).$2/eg;
 	join('', '<', lc($tags) , $attribs , '>')!eg;
 }
 
@@ -12464,7 +12464,7 @@ sub make_multipleauthors_title {
     local ($t_title,$auth_cnt) = ('',0);
     if ($MULTIPLE_AUTHOR_TABLE) {
 	$t_title = '<TABLE' .($USING_STYLES? ' CLASS="author_info_table"' : '')
-		.' WIDTH="90%" ALIGN="CENTER" CELLSPACING=15>'
+		.' WIDTH="90%" ALIGN="CENTER" CELLSPACING="15">'
 		."\n<TR VALIGN=\"top\">";
     }
     foreach $t_author (@authors) {
@@ -15521,8 +15521,8 @@ sub initialise {
     $delim = '%:%';		# Delimits items of sectioning information
 				# stored in a string
 
-    $LATEX2HTML_META = '<META NAME="Generator" CONTENT="LaTeX2HTML v'.$TEX2HTMLV_SHORT.'">'
-	. "\n<META HTTP-EQUIV=\"Content-Style-Type\" CONTENT=\"text/css\">"
+    $LATEX2HTML_META = '<meta name="Generator" content="LaTeX2HTML v'.$TEX2HTMLV_SHORT.'">'
+	. "\n<meta http-equiv=\"Content-Style-Type\" content=\"text/css\">"
 	      unless ($LATEX2HTML_META);
 
     $TeXname = (($HTML_VERSION ge "3.0")? "T<SMALL>E</SMALL>X" : "TeX");