[tex4ht] unicode and lualatex

Johannes Wilm mail at johanneswilm.org
Sun Jul 24 07:21:59 CEST 2011


Hey again,

OK, I think I found a quick fix that "really works" on Linux. On Ubuntu you
need the package uni2ascii, which provides both uni2ascii and ascii2uni.

    uni2ascii -a E unicode.tex > unicode-a.tex
    dvilualatex unicode-a.tex
    tex4ht -f/unicode-a.tex
    t4ht -f/unicode-a.tex
    ascii2uni -a E unicode-a.html | ascii2uni -a H > unicode.html

This will produce a UTF-8-encoded output file. The second decoding step is
for characters that have been added as HTML entities by tex4ht. If we are
using UTF-8 characters anyway, we may as well go all the way. If we want an
ASCII file instead (for older browsers?), we can instead do:

    ascii2uni -a E unicode-a.html | uni2ascii -a H > unicode.html

The very first line of the script has to be run for each input file if
there is more than one. This assumes that Unicode characters are used only
in the body text and not in the LaTeX markup. It also assumes that nowhere
in the text is there a substring that consists of a capital U followed by
four hexadecimal digits.
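
That last assumption can be checked before converting anything. Here is a
rough sketch using plain grep; the function name find_escape_collisions is
just something I made up for illustration, and it matches lowercase hex
digits as well in order to be on the safe side. Any line it prints contains
text that the decoding step would probably mistake for an escaped character:

```shell
# Hypothetical helper, not part of uni2ascii or tex4ht.
# Prints every line (prefixed with its line number) that contains a
# capital U followed by four hexadecimal digits. Returns non-zero when
# no such line exists, i.e. when the assumption holds.
find_escape_collisions() {
    grep -nE 'U[0-9A-Fa-f]{4}' "$@"
}
```

Run it on the original source, for example "find_escape_collisions
unicode.tex", before the uni2ascii step. If it prints nothing, the round
trip should be safe as far as this assumption goes.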


With those exceptions, I believe this should take care of all Unicode
characters.

-- 
Johannes Wilm
http://www.johanneswilm.org
tel: +1 (520) 399 8880