[tex-live] packages with characters > 127

Zdenek Wagner zdenek.wagner at gmail.com
Thu Dec 31 14:20:52 CET 2009

2009/12/31 Elie Roux <elie.roux at telecom-bretagne.eu>:
> Manuel Pégourié-Gonnard wrote:
>> David Kastrup wrote:
>>> I think that it is reasonable to have the _LaTeX_ format variant relying
>>> on LuaTeX to switch its input to byte-transparent by default, and have a
>>> specific inputenc package (with a different name) for LuaTeX revert this
>>> setting.
>>> Then the standard inputenc (and all files) will work just as well or badly
>>> as they do on other engines (namely using active characters and other
>>> contraptions), and a LuaTeX-specific inputencl or whatever is free to
>>> work better.
>> I like the idea. I'd like to review, improve and test luainputenc a bit
>> more,
>> and then discuss this idea on latex-l.
> I don't like the idea (if I understand it correctly) unless you rename the
> format: people who want to use the native utf8 shouldn't need to
> \usepackage{unactivatelatextrick}, it should be the default as it is for the
> engine, like under XeTeX... For me everything is fine as it is now, and will
> be even better once almost all packages use ^^xx instead of 8-bit
> characters in their code.
^^xx is an 8-bit representation written in 7 bits. If you do not know
the encoding of the file, you cannot interpret ^^xx properly. If
its \catcode is set to 15, you get an error. If its \catcode is 14,
you lose that character and everything up to the end of the line. If
the character is in the body of a macro, you get strange errors when
the macro is expanded. Only LICR is safe (unless you redefine \v, \'
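
To illustrate, here is a minimal sketch. It assumes an 8-bit engine such
as pdfLaTeX with the standard inputenc package; the byte 0xE8 is only an
example, picked because latin1 and latin2 disagree about it.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[latin1]{inputenc}
\begin{document}

% ^^e8 is just the byte 0xE8; it carries no encoding information.
^^e8 % read as latin1: e with grave accent

\inputencoding{latin2}
^^e8 % the same byte read as latin2: c with caron

% LICR form: c with caron whatever the input encoding is
% (as long as nobody has redefined \v).
\v{c}

% With \catcode 14 the byte starts a comment, so the rest of the
% line is lost without any warning.
\begingroup
\catcode"E8=14 % make byte 0xE8 a comment character
abc^^e8 this text is silently discarded
\endgroup

% With \catcode 15 the byte is invalid. Uncommenting the next two
% lines stops the run with "Text line contains an invalid character".
% \catcode"E8=15
% ^^e8

\end{document}
```

Under a Unicode engine the same ^^e8 is read as the code point U+00E8, so
the result differs yet again.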

Encoding metadata will not help either. It should not only be machine
readable but also format and engine independent. Input is handled
differently by TCX tables, by the inputenc package, and by encTeX. A
TCX table can define just one encoding, so you cannot read files in
different encodings during a single TeX run. The inputenc package also
has weak points. If you load a package with one option and later load
the same package with another option, LaTeX issues an error. Thus if a
.sty file requires the inputenc package with one encoding and the user
needs another encoding for the document, you get a conflict.
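
A minimal sketch of that conflict follows. The package name mypkg is
hypothetical and is generated by the example itself; the choice of
latin2 and utf8 is only for illustration.

```latex
% Write a small hypothetical package that hard-wires its own encoding.
\begin{filecontents}{mypkg.sty}
\ProvidesPackage{mypkg}
\RequirePackage[latin2]{inputenc}
\end{filecontents}

\documentclass{article}
\usepackage{mypkg}           % loads inputenc with option latin2
\usepackage[utf8]{inputenc}  % LaTeX Error: Option clash for package inputenc.
\begin{document}
text
\end{document}
```

The document author has done nothing unusual here. The clash comes
entirely from the encoding chosen inside the package.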

> The idea (on another branch of the thread) to make package writers do a
> \MyFileIsEncodedIn{} is, I think, useless as most new packages are UTF-8,
> and old packages won't want to use it... it may fix one very old package or
> two, but it's not worth the trouble.
> --
> Elie

Zdeněk Wagner
