[XeTeX] handling malformed UTF-8 input
Peter Dyballa
Peter_Dyballa at Web.DE
Thu Feb 21 12:32:13 CET 2008
On 21.02.2008 at 11:12, Jonathan Kew wrote:
> What do others think about this -- should "invalid UTF-8 byte
> sequence" be an error rather than a warning and fallback?
My first impulse is to say: make it an error starting with TeX Live 2010!
Until then XeTeX should behave in a more compatible mode and emit only
warnings.
But in the end one process or another will fail anyway, as already
reported, so there is no real compatibility mode in which XeTeX can work.
And since it might otherwise produce output that appears to work but is
faulty (10% of the code interpreted as meaningless bytes?), reporting an
error and stopping is the more sensible choice.
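To make the two options concrete, here is a minimal sketch in Python (not XeTeX's actual code). The function name `decode_line` and the choice of U+FFFD as the fallback replacement are assumptions for illustration only; the thread does not say how XeTeX's fallback is implemented.

```python
import warnings


def decode_line(raw: bytes, strict: bool) -> str:
    """Decode one input line as UTF-8.

    strict=True  -> an invalid byte sequence is an error (processing stops).
    strict=False -> warn and fall back; here the fallback replaces each
                    undecodable byte with U+FFFD (an assumed policy).
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        if strict:
            raise
        warnings.warn(
            f"invalid UTF-8 byte sequence at offset {exc.start}; "
            "replacing with U+FFFD"
        )
        return raw.decode("utf-8", errors="replace")
```

With `strict=False` a Latin-1 encoded comment still gets through, but its non-ASCII characters are silently turned into something else, which is exactly the "works but is faulty" case.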
IMO it's not that bad for a support file for some non-English language
to contain comments in that language, outside 7-bit US-ASCII. Those who
use the supported language will be able to read and understand the
comments. The trouble comes from the TeX Live setup, which loads a dozen
or more languages by default *and* allows the use of problematic
characters. Both issues need to change from XeTeX's point of view. It
would be better if XeTeX cleaned known problematic files of their
irritating comment lines before building a FMT file or the like. The TeX
code inside cannot be faulty ...
This could be a kind of compatibility mode until TeX Live 2010 is
released.
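The cleaning step could look roughly like the following Python sketch. It is only an illustration of the idea, not something XeTeX or TeX Live provides: the name `strip_non_ascii_comments` is hypothetical, and it assumes that only whole-line comments (lines whose first non-blank character is `%`) need cleaning. Such lines are replaced by a bare `%` so that line numbers in later error messages stay the same.

```python
def strip_non_ascii_comments(data: bytes) -> bytes:
    """Blank out whole-line TeX comments that contain bytes >= 0x80.

    Works on raw bytes, so the file's encoding does not need to be known.
    Lines that contain TeX code are never modified.
    """
    cleaned = []
    for line in data.splitlines(keepends=True):
        body = line.rstrip(b"\r\n")
        ending = line[len(body):]  # preserve the original line ending
        is_comment = body.lstrip(b" \t").startswith(b"%")
        has_high_byte = any(byte >= 0x80 for byte in body)
        if is_comment and has_high_byte:
            cleaned.append(b"%" + ending)
        else:
            cleaned.append(line)
    return b"".join(cleaned)
```

For example, `strip_non_ascii_comments(b"% f\xfcr Deutsch\n\\patterns{a1b}\n")` keeps the `\patterns` line untouched and reduces the comment line to `%`.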
--
Greetings
Pete
A lot of us are working harder than we want, at things we don't like
to do. Why? ...In order to afford the sort of existence we don't care
to live.
– Bradford Angier