[tex4ht] [bug #242] Spurious semicolon produced by \textregistered command
michal.h21 at gmail.com
Thu Jan 22 21:51:52 CET 2015
2015-01-22 19:21 GMT+01:00 Karl Berry <karl at freefriends.org>:
> characters which should be replaced with entities in \url commands.
> I think that's right.
> this declaration is used only when the `url-il2-pl` command line option is
> used, and not all special characters are declared. now, the problem is
> with lualatex: as it is a unicode engine, it reports an invalid utf-8
> sequence even if it doesn't use the url-encoders at all.
> as it is unlikely that anybody still uses the latin2 encoding and special
> characters in urls at the same time, and given that the list of these
> escaped special characters isn't comprehensive anyway, maybe we should
> take that away? because it causes compile errors every time tex4ht is
> used with lualatex.
> Oh. Clearly we need to solve it somehow, but I don't much like the idea
> of getting rid of functionality, even something as obscure and probably
> little-used as this. Plenty of people still use Latin N encodings, and
> there is an active TeX community in Poland -- I surmise that's who asked
> for that option in the first place.
OK. Clearly someone needed it, as it is the only configuration provided
for any input encoding for the url-encoder.
> LuaTeX can certainly read files in any encoding, including plain bytes,
> not just UTF-8. I'm afraid I don't have any recipes at hand, though
> it seems like it should be doable.
It is possible to use a LuaTeX callback to convert the file being read
to UTF-8 on the fly. I did that when I tried to use callbacks to write
HTML directly from LuaLaTeX:
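A minimal sketch of that technique, not the original code: it registers
LuaTeX's `process_input_buffer` callback, which receives each input line
before TeX sees it. The name `latin2_to_utf8` and the mapping table are
made up for illustration, the table covers only the Polish letters of
ISO-8859-2, and `utf8.char` assumes a LuaTeX built on Lua 5.3 or later.

```lua
-- Partial ISO-8859-2 -> Unicode table (Polish letters only; illustrative).
local latin2 = {
  [0xA1] = 0x0104, [0xA3] = 0x0141, [0xA6] = 0x015A, -- Ą Ł Ś
  [0xAC] = 0x0179, [0xAF] = 0x017B, [0xB1] = 0x0105, -- Ź Ż ą
  [0xB3] = 0x0142, [0xB6] = 0x015B, [0xBC] = 0x017A, -- ł ś ź
  [0xBF] = 0x017C, [0xC6] = 0x0106, [0xCA] = 0x0118, -- ż Ć Ę
  [0xD1] = 0x0143, [0xD3] = 0x00D3, [0xE6] = 0x0107, -- Ń Ó ć
  [0xEA] = 0x0119, [0xF1] = 0x0144, [0xF3] = 0x00F3, -- ę ń ó
}

-- Convert one input line from Latin-2 bytes to UTF-8.
-- High bytes missing from the table become U+FFFD (replacement character).
local function latin2_to_utf8(line)
  return (line:gsub("[\128-\255]", function(c)
    return utf8.char(latin2[c:byte()] or 0xFFFD)
  end))
end

-- The `callback` table exists only inside LuaTeX, so guard the registration.
if callback then
  callback.register("process_input_buffer", latin2_to_utf8)
end
```

The callback would have to be active only while the Latin-2 file is
being read, since it would mangle input that is already UTF-8.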
> But a simpler idea comes to mind: how about replacing the problematic
> characters with TeX's ^^xx notation? I'm not sure if the conversions
> will happen at the right time, given that \url is changing everything
> around anyway, but we can wait and see if anyone notices. At least it
> would go through at the input level and is one step beyond just deleting it.
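The conversion to ^^xx notation itself can be done mechanically. A small
sketch in Lua, with the hypothetical helper name `to_caret_hex`: it
rewrites every byte above 127 as `^^` followed by two lowercase hex
digits, which is the form TeX expects. Whether \url then handles the
result at the right time is a separate question and is not tested here.

```lua
-- Rewrite every non-ASCII byte of a string in TeX's ^^xx notation.
-- TeX requires the two hex digits to be lowercase, hence %02x.
local function to_caret_hex(s)
  return (s:gsub("[\128-\255]", function(c)
    return string.format("^^%02x", c:byte())
  end))
end

-- Example: byte 0xB1 (a-ogonek in Latin-2) becomes the ASCII text ^^b1.
print(to_caret_hex("\177"))
```

The output is pure ASCII, so it is read the same way by 8-bit engines
and by LuaTeX.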
> Another idea is to move that chunk of input to a separate file, which
> only gets read when that option is in effect.
That might perhaps be the best solution?