[tex4ht] unicode and lualatex

Ulrike Fischer news3 at nililand.de
Sat Jul 23 11:46:52 CEST 2011

An embedded and charset-unspecified text was scrubbed...
Name: warning1.txt
URL: <http://tug.org/pipermail/tex4ht/attachments/20110723/a5aed21d/attachment.txt>
On Fri, 22 Jul 2011 21:47:32 -0700, Johannes Wilm wrote:

> Hi,
> On the attached test file I tried to run
>
> dvilualatex unicode.tex
> dvilualatex unicode.tex
> dvilualatex unicode.tex
> tex4ht -f/unicode.tex -cunihtf -utf8
>
> I cannot figure out how the characters are encoded in the output, but it
> doesn't seem to be utf8. The output is attached.

Your main problem has nothing to do with tex4ht. While luatex can
handle utf8 *input* natively, it has problems outputting
non-ASCII chars without fontspec and "unicode fonts" on the output side.

Your document is using OT1-encoded fonts (which have only 128 characters),
so your non-ASCII chars end up in nothingness. With
\usepackage[T1]{fontenc} the result will be better, but quite a lot of chars
will still be wrong (e.g. the German ?).

In normal latex the inputenc/fontenc-combo manages the
input-output-translation, but you can't use inputenc with luatex.
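
To illustrate the fontspec route mentioned above, here is a minimal
sketch of a document that loads a Unicode font instead of relying on
inputenc/fontenc. It is not the original test file: the sample text is
made up, and it assumes the file is saved as UTF-8 and compiled with
lualatex in PDF mode. Whether the same preamble suits the DVI route
that tex4ht needs is not something this sketch establishes.

```latex
% Minimal sketch, not the original test file.
% Assumptions: file saved as UTF-8, compiled with lualatex (PDF mode),
% fontspec installed.
\documentclass{article}
\usepackage{fontspec} % loads Unicode (OpenType) fonts; Latin Modern by default
\begin{document}
German: ä ö ü ß

Danish: æ ø å
\end{document}
```

With fontspec the characters are looked up by their Unicode code points
in the font, so no 128- or 256-slot font encoding stands in the way.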

Your best bet is something like this:

% -*- mode: TeX -*- -*- coding: UTF-8 -*-


The following characters should be converted to Unicode:

Spanish: ??????????
German: ??????
Danish: ??????


On MiKTeX this gave me a UTF-8 encoded HTML file with this command

xhluatex.bat unicode "html,charset=utf-8" "-cunihtf -utf8"

and the attached xhluatex.bat (I hope the .bat file comes through). I had
to change it and my tex4ht.env compared to the standard versions; see
the discussion from some weeks ago in the mailing list archive.

I'm not quite sure yet about the correct "html" option, but it
doesn't matter for utf8 tests.

Ulrike Fischer 
