[XeTeX] handling malformed UTF-8 input

Mike Maxwell maxwell at umiacs.umd.edu
Sat Feb 23 03:09:16 CET 2008


Ross Moore wrote:
> If there was to be malformed data in the name field,
> this should *not* cause correctly formed UTF8 data in the
> subsequent address field to be displayed in a "bytes" mode.

Can you reliably recover from such an error in UTF-8 data?  That is,
suppose there is a malformed byte where you expect the first byte of a
UTF-8 character.  How do you know where the next UTF-8 character (which
may itself be correct or incorrect) should begin?
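
To make the question concrete: UTF-8 continuation bytes always have the
bit pattern 10xxxxxx, and no lead byte does, so one plausible strategy is
to skip forward from the bad byte until a non-continuation byte turns up
and try decoding again from there.  Below is a minimal Python sketch of
that idea.  The function names are hypothetical, and nothing here is
meant to describe what XeTeX actually does; it only assumes that emitting
U+FFFD for each bad stretch is acceptable.

```python
def is_continuation(byte: int) -> bool:
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    return (byte & 0xC0) == 0x80


def resync(data: bytes, bad_pos: int) -> int:
    """Return the index of the first byte after bad_pos that could begin
    a UTF-8 character (or len(data) if there is none)."""
    i = bad_pos + 1
    while i < len(data) and is_continuation(data[i]):
        i += 1
    return i


def decode_with_recovery(data: bytes) -> str:
    """Decode data as UTF-8, emitting U+FFFD for each malformed stretch
    and resuming at the next byte that could start a character."""
    out = []
    i = 0
    while i < len(data):
        try:
            out.append(data[i:].decode("utf-8"))
            break
        except UnicodeDecodeError as err:
            # err.start is relative to the slice data[i:].
            bad = i + err.start
            out.append(data[i:bad].decode("utf-8"))  # valid prefix
            out.append("\ufffd")
            i = resync(data, bad)
    return "".join(out)
```

The byte found by resync() is only a candidate: it may itself start a
malformed sequence, in which case the loop just reports another error and
resynchronizes again.  In the name/address case quoted above, that would
confine the damage to the bad stretch and leave the following well-formed
text decoded normally.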
-- 
    Mike Maxwell
    What good is a universe without somebody around to look at it?
    --Robert Dicke, Princeton physicist

