# [tex4ht] unicode and lualatex

Johannes Wilm mail at johanneswilm.org
Sat Jul 23 12:25:03 CEST 2011

On Sat, Jul 23, 2011 at 2:46 AM, Ulrike Fischer <news3 at nililand.de> wrote:

> WARNING: This e-mail has been altered by the NFIT virus/spamfilter.  Please
> see below for a record of the changes made.
> . In case of problems consider contacting the sender or
> postmaster at nfit.au.dk
>
> -------Change report:
>
> An attachment named xhluatex.bat was removed from this document as it
> constituted a security hazard.  If you require this document, please
> contact
> the sender and arrange an alternate means of receiving it.
>

Could you send me the attachment off-list or paste the contents into an
email?

>
> Your main problem has nothing to do with tex4ht. While luatex can
> handle utf8 *input* natively, it has trouble outputting
> non-ASCII chars without fontspec and "unicode fonts" on the output
> side.
>
> Your document is using OT1-encoded fonts (which have 128 characters)
> and so your non-ASCII chars are ending in nothingness. With
> \usepackage[T1]{fontenc} the result will be better but quite a lot of chars
> will be wrong (e.g. the German ß)
>
>
Oh, I thought I could use at least the first 256 characters. 128 is a bit
limited for sure.

btw -- would it then make sense to auto-replace the characters in question
before and after the transition? I am thinking of:

    cp unicode.tex /tmp
    cd /tmp
    rpl "ü" "ue5394" unicode.tex
    dvilualatex...
    ....
    rpl "ue5394" "ü" unicode.html

where 5394 is just an arbitrary number, chosen so that I don't catch other
instances of "ue" when converting back. Hyphenation isn't applied, so it seems
that this would work, right?
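
To make the idea concrete, here is a minimal sketch of the same round trip in Python rather than rpl. Everything in it is an assumption for illustration: the function names, the table, and the extra entries for ö, ä and ß are made up, and only the "ue5394" token comes from the commands above. It refuses to run if a token already occurs in the input, since the substitution could not be undone reliably in that case.

```python
# Hypothetical placeholder table; only "ue5394" comes from the message,
# the other entries are illustrative assumptions.
PLACEHOLDERS = {
    "ü": "ue5394",
    "ö": "oe5394",
    "ä": "ae5394",
    "ß": "ss5394",
}


def to_placeholders(text, table=PLACEHOLDERS):
    """Replace each non-ASCII character with its ASCII token (before the TeX run)."""
    for token in table.values():
        if token in text:
            # The round trip would be ambiguous, so stop instead of guessing.
            raise ValueError(f"token {token!r} already occurs in the input")
    for char, token in table.items():
        text = text.replace(char, token)
    return text


def from_placeholders(text, table=PLACEHOLDERS):
    """Restore the original characters (after the HTML has been produced)."""
    for char, token in table.items():
        text = text.replace(token, char)
    return text


if __name__ == "__main__":
    source = "für Maße"
    ascii_only = to_placeholders(source)
    print(ascii_only)
    print(from_placeholders(ascii_only))
```

This only shows the string substitution itself. Whether the tokens pass through dvilualatex and tex4ht unchanged (no hyphenation, ligatures or other rewriting) is exactly the open question above and is not something the sketch can verify.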

--
Johannes Wilm
http://www.johanneswilm.org
tel: +1 (520) 399 8880