[l2h] The L2H 2002 Cannot deal CJK document correctly!

Werner LEMBERG wl@gnu.org
Wed, 24 Apr 2002 11:15:57 +0200 (CEST)


> The fix is easy, but first a question.
> Your example HTML files correctly have  charset = text/big5 .
> Where is this done in the processing, or do you do it yourself
> after LaTeX2HTML has finished ?
> 
> By simply inserting 2 lines into  CJK.perl  the problem
> is fixed, and this charset is set automatically:
> 
> 
> 	package main;
> 
> 	$charset = 'big5'; 	## insert these 2 lines
> 	$CHARSET = 'big5';	##
> 
>         sub pre_pre_process {
>         ...
> 	...

The encoding must be deduced from the second argument of the CJK
environment!  For example

  \begin{CJK*}{Bg5}{...}

should create a `big5' charset tag.
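
A minimal sketch of how this could look in Perl, not tested inside
LaTeX2HTML itself.  The function name charset_from_cjk_env and the table
%cjk_to_charset are hypothetical, and the table holds only the mappings
that seem safe; the remaining encodings would have to be checked first.

```perl
use strict;
use warnings;

# Hypothetical mapping from CJK encoding names to HTML charset names.
# Assumption: only entries that seem safe are listed; extend after checking.
my %cjk_to_charset = (
    'Bg5'    => 'big5',
    'GBK'    => 'gbk',
    'SJIS'   => 'shift_jis',
    'UTF8'   => 'utf-8',
    'EUC-JP' => 'euc-jp',
    'EUC-TW' => 'euc-tw',
);

# Return the charset for the first CJK or CJK* environment in $text,
# or nothing if there is no such environment or its encoding is unknown.
sub charset_from_cjk_env {
    my ($text) = @_;
    # Skip the optional [font encoding] argument, then capture the encoding.
    if ($text =~ /\\begin\s*\{CJK\*?\}\s*(?:\[[^\]]*\])?\s*\{([^}]*)\}/) {
        return $cjk_to_charset{$1};
    }
    return;
}

print charset_from_cjk_env('\begin{CJK*}{Bg5}{...}'), "\n";   # prints "big5"
```

The result could then be assigned to $charset and $CHARSET instead of
hard-coding 'big5' as in the quoted fix.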

Here is a list of the character sets/encodings used:

  Bg5     Big5
  Bg5+    Big5+
  GB      GB 2312-1980
  GBt     GB/T 12345-1990
  GBK
  JIS     JIS X 0208:1997
  SJIS
  KS      KS X 1001:1992
  UTF8
  EUC-TW
  EUC-JP

[I don't know which of them are allowed in HTML files.]

Since CJK can use more than a single encoding in a file, while an HTML
file can declare only one charset, you should also add a guard that
rejects a second encoding (with an error message like: `Only one
encoding allowed in a single document').
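
Such a guard might look roughly like the sketch below.  The names
note_cjk_encoding and $seen_encoding are hypothetical; I assume the
function is called once for each CJK environment with its encoding
argument, and where exactly that happens in CJK.perl is left open.

```perl
use strict;
use warnings;

my $seen_encoding;    # first CJK encoding met in the document, if any

# Hypothetical hook: call once per CJK environment with its encoding.
# Dies if the encoding differs from the one seen first.
sub note_cjk_encoding {
    my ($enc) = @_;
    if (defined $seen_encoding && $seen_encoding ne $enc) {
        die "Only one encoding allowed in a single document "
          . "(found $enc after $seen_encoding)\n";
    }
    $seen_encoding = $enc;
    return $enc;
}

note_cjk_encoding('Bg5');            # first encoding is remembered
note_cjk_encoding('Bg5');            # the same encoding again is fine
eval { note_cjk_encoding('GB') };    # a different encoding is rejected
print $@;                            # shows the error message
```

Repeating the same encoding is accepted on purpose, since a document may
open several CJK environments that all use one encoding.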


    Werner