[l2h] How to change 'charset'?

Ross Moore <ross@ics.mq.edu.au>
Sat, 4 Nov 2000 09:06:02 +1100 (EST)


Inel wrote:

> I am using v99.2b8 now.
> But I can't change the charset to 'iso-2202-kr' (EUC-KR).
> ($CHARSET = 'iso-2202-kr'; in .latex2html-init)
> 
> What should I do?
> Thanks in advance.

That is a very timely question.
Simply setting the $CHARSET variable is not enough.

The reason is that LaTeX2HTML expects to translate
special characters and ligature combinations into the requested
character set; doing so requires loading a file that contains
appropriate Perl subroutines to perform these conversions.
When no such file is available, $CHARSET reverts to
the standard iso-8859-1.

However, it *is* possible to disable that mechanism, allowing
an alternative charset to be used without any support for special
characters or ligatures. Any non-ASCII content is then placed into
the HTML files just "as it comes", since LaTeX2HTML has no rules
for checking or replacing it.

So far I've used LaTeX2HTML this way with only one test file,
using a charset of iso-8859-7 (for Greek).

It was achieved with the following settings in .latex2html-init:

########  for  .latex2html-init ###########

$CHARSET='iso-8859-7';
$charset='iso-8859-7';

# do not convert raw characters to parameter entities:
sub replace_strange_accents { }

########  end of  .latex2html-init ###########


How well this works with other character sets is completely unknown to me.
Please give it a try, and report the results to this email list.
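
For the Korean case in the original question, the same settings could
be adapted along the following lines. This is an untested sketch only:
the charset label 'euc-kr' is an assumption (it is the registered name
for EUC-KR, whereas the 'iso-2202-kr' in the question looks like a typo
for the different encoding ISO-2022-KR), and nothing here has been
checked against LaTeX2HTML with Korean input.

```perl
########  for  .latex2html-init (untested sketch) ###########

# ASSUMPTION: 'euc-kr' is the charset label wanted in the HTML output;
# substitute whichever label matches the encoding of your source files.
$CHARSET='euc-kr';
$charset='euc-kr';

# do not convert raw characters to parameter entities:
sub replace_strange_accents { }

########  end of  .latex2html-init ###########
```

As with the Greek test, non-ASCII content would be passed through
unchanged, so the LaTeX source would presumably need to be in the same
encoding as the charset declared here.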

> Best regards,
> inel

Hope this helps,

	Ross Moore