[l2h] Debian Bug#355728: latex2html: please use typographic quotation marks
David Nebauer
davidnebauer at switch.com.au
Sat Jul 29 15:47:53 CEST 2006
I've been testing the advice given by Ross Moore regarding curly
quotation marks (2006/04/18) and "passing through" raw unicode (2003/07/06):
-------------------------------------------------------------------------
Curly quotation marks (2006/04/18):
===================================
$USE_CURLY_QUOTES = 1;
set this in an initialization file.
Also set the following
$USE_UTF=1;
OR
execute the job with options such as:
latex2html -split 0 -html_version 4.0,latin1,unicode,utf8 myfile.tex
4.0 = satisfy HTML 4.0 recommendations (4.1 might work for HTML 4.01)
latin1 = input encoding
unicode = use unicode code-points in the output
utf8 = use byte-sequences, rather than entity numbers (or names)
whenever appropriate.
Raw unicode (2003/07/06):
=========================
You may need to specify on the commandline something like:
latex2html -html_version 4.0,unicode ...other-options... <filename>
or
latex2html -html_version 4.0,unicode,utf8 ......
or even
latex2html -html_version 4.0,unicode,unicode ......
Basically, the problem will be that you do *not* want LaTeX2HTML
to assign special meaning to upper-8-bit codes and translate them
into something else.
-------------------------------------------------------------------------
In my testing I had three goals:
1. Output single quote marks as curly characters,
2. Output double quote marks as curly characters, and
3. Output raw unicode as unicode, e.g., —äß (em dash, a umlaut and
scharfes s).
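
One way to check the three goals mechanically is to scan the generated
HTML for the characters in question. The sketch below is only an
illustration: the function names has_any and check_goals are made up
here, and it assumes that a curly quote shows up either as the literal
UTF-8 character or as a named or decimal entity, and that the test
document contains the sample string —äß.

```shell
#!/bin/sh
# Hypothetical helpers, not part of latex2html.

has_any() {
    # $1 is the file; the remaining arguments are fixed strings to look for.
    file="$1"; shift
    for pat in "$@"; do
        if grep -q -F -e "$pat" "$file"; then return 0; fi
    done
    return 1
}

check_goals() {
    html="$1"
    # Goal 1: curly single quotes (U+2018, U+2019), raw or as entities.
    if has_any "$html" '‘' '’' '&lsquo;' '&rsquo;' '&#8216;' '&#8217;'; then
        echo "single: curly"
    else
        echo "single: plain"
    fi
    # Goal 2: curly double quotes (U+201C, U+201D), raw or as entities.
    if has_any "$html" '“' '”' '&ldquo;' '&rdquo;' '&#8220;' '&#8221;'; then
        echo "double: curly"
    else
        echo "double: plain"
    fi
    # Goal 3: the sample string must survive byte for byte.
    if has_any "$html" '—äß'; then
        echo "raw unicode: intact"
    else
        echo "raw unicode: altered"
    fi
}
```

Calling check_goals with the path of the generated HTML file prints one
line per goal. Hexadecimal entities are not covered, so a "plain" result
is worth confirming by eye.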
Here are the results of my testing (display in monospace to align columns):
   initialisation file        html-version options     single  double  raw
   variable(s)                                         quotes  quotes  unicode
   -------------------------  -----------------------  ------  ------  -------
1.                                                     `'      ``''    rubbish
2. USE_CURLY_QUOTES                                    `'      “”      rubbish
3. USE_CURLY_QUOTES USE_UTF                            ** ERROR **
4. USE_CURLY_QUOTES           latin1,unicode,utf8      `'      “”      rubbish
5. USE_CURLY_QUOTES           latin1,unicode,unicode   `'      “”      unicode
6. USE_CURLY_QUOTES USE_UTF   latin1,unicode,utf8      `'      “”      rubbish
7. USE_CURLY_QUOTES USE_UTF   latin1,unicode,unicode   ** ERROR **
8. USE_UTF                                             ** ERROR **
9. USE_UTF                    latin1,unicode,utf8      `'      ``''    rubbish
A. USE_UTF                    latin1,unicode,unicode   ** ERROR **
B.                            latin1,unicode,utf8      `'      ``''    rubbish
C.                            latin1,unicode,unicode   `'      ``''    unicode
* Runs that errored terminated prematurely with the message: "Undefined
subroutine &main::convert_to_utf8 called at /usr/bin/latex2html line
7462." The latex2html version is '2002-2-1 (1.71)'.
In case this email's encoding gets screwed up in transmission, the runs
that resulted in curly double quotes were 2, 4, 5 and 6.
Some observations/conclusions:
- No method gave curly single quotes.
- The only method that output curly double quotes was the init file
variable "USE_CURLY_QUOTES".
- The only method that output raw unicode was html_version options
"unicode,unicode".
- The init file variable USE_UTF caused a fatal error unless 'utf8' was
included as a 'html_version' option.
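
For anyone wanting to repeat run 5, the only combination above that gave
both curly double quotes and raw unicode, a rough sketch follows. The
file names curly-init.pl and myfile.tex are placeholders, the leading
"4.0" in the html_version list is taken from the quoted advice rather
than from the table, and the init file is passed explicitly with
-init_file instead of relying on a default location.

```shell
#!/bin/sh
# Sketch of run 5: USE_CURLY_QUOTES in an init file plus unicode,unicode.
# init_file and tex_file are placeholder names, not taken from the runs above.
init_file="curly-init.pl"
tex_file="myfile.tex"

# The init file is Perl; it ends with a true value so that loading succeeds.
# The quoted 'EOF' keeps the shell from expanding $USE_CURLY_QUOTES.
cat > "$init_file" <<'EOF'
$USE_CURLY_QUOTES = 1;
1;
EOF

# Only run the conversion when latex2html and the input file are present.
if command -v latex2html >/dev/null 2>&1 && [ -f "$tex_file" ]; then
    latex2html -init_file "$init_file" -split 0 \
        -html_version 4.0,latin1,unicode,unicode "$tex_file"
else
    echo "latex2html or $tex_file not found; wrote $init_file only"
fi
```

USE_UTF is deliberately left out of the init file, since in the runs
above it led to the convert_to_utf8 error whenever 'utf8' was not among
the html_version options.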
I'm curious to know two things. Firstly, is there a way to get curly
single quote output from latex2html? Secondly, I couldn't find
documentation anywhere on USE_CURLY_QUOTES and USE_UTF after checking
the manual, perldoc, man and info files. Are there any other such
undocumented variables and, if so, where can I read up on them?
Regards,
David.