2014-07-05 15:29 GMT+02:00 Klaus Ethgen <Klaus+texlivelist at ethgen.ch>:
> Hi,
> Am Fr den  4. Jul 2014 um  9:42 schrieb Robin Fairbairns:
>> no, it's correct: iso 8859-1 has no "forbidden" octets (it does, iirc,
>> have some unassigned ones)
>> whereas
>> utf-8 rejects some octets in some contexts, since it's generating a
>> 32-bit glyph from 8-bit input.  (it's complicated.  honest.)
> See the output from the following line:
>   perl -MEncode -e 'for (my $i=0; $i < 256; $i++){printf "%d:\t%s\t%s\t0x%s\t0x%s\n", $i, chr($i), encode("UTF-8", decode("iso-8859-1", chr($i))), unpack("H*", chr($i)), unpack("H*", encode("UTF-8", decode("iso-8859-1", chr($i))));}' | less
> I use less because not all characters are kind to the terminal. On a full
> 8-bit terminal this will print the Latin-1 character, the character after
> conversion to UTF-8, and the hex value of both. On a UTF-8 terminal it
> will most likely not work (or perl will tell you that there is something
> wrong), or it will simply show nothing in the first column, as the
> terminal cannot display all octets that can occur in Latin-1.
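The point under discussion can be checked with a few lines of code. What follows is only a rough sketch in Python rather than the Perl quoted above, and the helper names latin1_to_utf8 and is_valid_utf8 are made up for this illustration. It relies on two facts: Python's iso-8859-1 codec accepts every octet from 0 to 255, and UTF-8 encodes each code point as a sequence of one to four octets, so a lone octet of 0x80 or above is never a complete sequence.

```python
# Sketch: every octet decodes as ISO 8859-1, but a lone high octet is not valid UTF-8.
# The helper names below are hypothetical, chosen only for this illustration.

def latin1_to_utf8(byte_value: int) -> bytes:
    """Decode one octet as ISO 8859-1 and re-encode the character as UTF-8."""
    return bytes([byte_value]).decode("iso-8859-1").encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the octets form well-formed UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    # A few sample octets: ASCII, the last ASCII value, and three high octets.
    for i in (0x41, 0x7F, 0x80, 0xE9, 0xFF):
        raw = bytes([i])
        converted = latin1_to_utf8(i)
        print(f"{i}:\t0x{raw.hex()}\t0x{converted.hex()}\t"
              f"raw octet valid as UTF-8: {is_valid_utf8(raw)}")
```

Octets below 0x80 come out unchanged, while each octet from 0x80 upwards becomes a two-octet UTF-8 sequence. That is why the raw Latin-1 octet cannot be shown as-is on a UTF-8 terminal, while its converted form can.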
Please read and try to understand the Unicode standard and the UTF-*
transformation formats. Your complaint is the same as complaining
that you cannot unscrew a nail with a screwdriver but can pull it
out with pliers, and that screwdrivers are therefore absolutely
useless and should never be produced.

Zdeněk Wagner

