[l2h] Re: \H{} diaresis not working in l2h

Ross Moore ross at ics.mq.edu.au
Sat Feb 1 14:22:19 CET 2003


Hi Aaron,

> &#337; (ő) in Unicode is the Hungarian-umlauted "o".
> 
> I've basically abandoned Latin-1.... PlanetMath uses UTF-8 Unicode.  All
> modern browsers support this (and the page can force the encoding both in
> the http header and with META tags, taking care of the case where the
> user sets another charset as default).

In that case, you should be using the command-line switches:

 latex2html -html_version 4.0,latin1,unicode,utf8,.....
                                     ^^^^^^^^^^^^

or have the $HTML_OPTIONS variable set to include these.

While you are thinking about this topic, please test this.
These extensions have been available for a long time now,
but I've not seen many examples of web-pages using them.

 
> In the last 3 years, I'd say the installed base of software has shifted to
> being able to handle Unicode, so I would definitely advise making it
> supported and default.

It would be nice to think that most of your audience is keeping up.
Hopefully that is true, at least in the more affluent countries.

> This page is a useful resource:
> 
>  http://www.unicode.org/charts/
> 
> The offending "o" is on the "Latin Extended-A" chart.

With the 'latin1,unicode' options as above
 (actually the 'latin1' should be redundant)
LaTeX2HTML should catch the \H{o} and replace it by the numeric
entity &#337; (ő).

With 'utf8' as well (or $USE_UTF8 = 1; )
this &#337; should then be replaced by a 2-byte sequence, later in
the processing.
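
As a rough illustration of that last step (not LaTeX2HTML's own code, just
a Python sketch), the following shows what the conversion from the numeric
entity to UTF-8 bytes should amount to for this character. The helper names
entity_to_utf8 and fits_latin1 are made up for the example; the only facts
assumed are that ő is U+0151, i.e. decimal 337.

```python
import html
import unicodedata


def entity_to_utf8(entity: str) -> bytes:
    """Decode an HTML character reference and return its UTF-8 bytes."""
    return html.unescape(entity).encode("utf-8")


def fits_latin1(char: str) -> bool:
    """Report whether a character can be encoded in ISO-8859-1 (Latin-1)."""
    try:
        char.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    entity = "&#337;"                  # what \H{o} should become with 'unicode'
    char = html.unescape(entity)       # the single character U+0151
    print(unicodedata.name(char))      # official Unicode name of the character
    print(entity_to_utf8(entity))      # the 2-byte sequence expected with 'utf8'
    print(fits_latin1(char))           # Latin-1 has no slot for this letter
```

If the generated HTML is served as UTF-8 and the bytes C5 91 appear where
the \H{o} was, then the 'utf8' option has done its job; if &#337; is still
there, only the 'unicode' part has taken effect.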


If you cannot get this to work, then report back to me with an example
(and a URL to the bad results).


Hope this helps,

	Ross

> Aaron
> 
> On Sat, Feb 01, 2003 at 12:15:07PM +1100, Ross Moore wrote:
> > 
> > Hi Aaron,
> > 
> > > Hi Ross, do you know if anything can be done about this issue:
> > > 
> > >  http://bugs.planetmath.org/cgi-bin/bugzilla/show_bug.cgi?id=101
> > 
> > \H{<letter>}  specifies the Hungarian umlaut.
> > 
> > So far as I know, there are no special entity names or numbers
> > for characters using this accent --- certainly not in ISO-8859-1
> > but maybe there is something in another encoding.
> > 
> > In TeX, the only way to support this (that I know of) is by making
> > an image of the required accented character.
> > In LaTeX2HTML, you need to set $ACCENT_IMAGES to obtain this;
> > e.g. $ACCENT_IMAGES = 'textrm';
> > so that you get a roman font in the image.
> > ($ACCENT_IMAGES = 'textit';  would give italicized accented chars.)
> > 
> >  
> > > I assumed at first that it was a bug, but maybe there's a reason it
> > > can't be done (is there any HTML entity support for it?)
> > 
> > Are there Unicode code points for letters with Hungarian umlauts?
> > If so, I've never been advised of what these are.
> > 
> > In any case, do browsers support these code-points?
> > (If not, then there is no point in using them, at this stage.)
> > 
> > 
> > I've not readdressed this problem for ~3 years.
> > Perhaps in that time new possibilities have arisen.
> > If so, please inform me of what these are.
> > 
> > 
> > Hope this helps,
> > 
> > 	Ross Moore
> > 
> > > Aaron
> > 


