[tex-live] German characters stopped PDF generating

Mon Oct 12 12:41:12 CEST 2009

Robin Fairbairns writes:
 > Steven Woody <narkewoody at gmail.com> wrote:
 > 
 > > I am not sure whether this is a pdftex problem or a doxygen problem, so I
 > > am posting to both lists and hope you can understand.
 > > 
 > > I was using Doxygen + DOT to generate a PDF document for my C code.
 > > Some of my C code has comments in German, so when I ran 'pdflatex' on
 > > the Doxygen-generated tex source file 'refman.tex', I received the
 > > following error:
 > > 
 > > <use d0/d5e/UPF__Appl_8h_8747a96c8ad4cf8687580c3158d11d9a_icgraph.pdf> [3840 <.
 > > /d0/d5e/UPF__Appl_8h_72b8343e0ba329905a174c6644d1e272_cgraph.pdf> <./d0/d5e/UPF
 > > __Appl_8h_8747a96c8ad4cf8687580c3158d11d9a_cgraph.pdf>]
 > > 
 > > ! Package inputenc Error: Unicode char \u8:ón:C not set up for use with LaTeX.
 > > 
 > > See the inputenc package documentation for explanation.
 > > Type  H <return>  for immediate help.
 > >  ...
 > > 
 > > l.504 ...--------------------------- Descipción: C
 > >                                                   hequea la validez de una t...
 > > 
 > > ?
 > 
 > at first sight, it looks like a problem in inputenc (perhaps mis-globbing
 > a unicode character), but it's quite impossible to tell for sure without
 > some code one might actually exercise.

well, at first sight it looks to me like an encoding problem on Steve's
machine rather than a bug anywhere.

what you see is that

 ó 

is interpreted as a UTF8 character because the document has

\usepackage[utf8]{inputenc}

set in the preamble (a guess here, but that is what must be the case). However,
the document must have been encoded in some 8-bit code page and not in UTF-8 at
all.

Which code page I have no idea, but one that has ó (\'o) in position octal 363
(hex F3); ISO 8859-1 (Latin-1) is one such code page.

as a result: LaTeX sees octal 363, which in UTF-8 is the start byte of a
four-byte sequence. It then picks up the next three 8-bit chars to see which
UTF-8 character it has (picking up "n:C"; the space between : and C is ignored
as it can't appear in a UTF-8 char sequence) and then finds that this doesn't
form a valid UTF-8 char it knows about (not surprising really if the document
is not in UTF-8), and that gives you the error message:

 Unicode char \u8:ón:C not set up 
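
The byte-level reasoning can be checked outside of LaTeX. The following is
only a sketch, and it assumes the source file was saved as Latin-1
(ISO 8859-1), which the thread does not confirm; the helper name
utf8_sequence_length is made up for this illustration. Under Latin-1, ó is
the single byte hex F3 (octal 363), and a UTF-8 decoder treats that byte as
the start of a four-byte sequence:

```python
# Assumption: the file was saved as Latin-1; the thread does not confirm this.
raw = "ón: C".encode("latin-1")  # the bytes LaTeX would read from the file

lead = raw[0]
print(oct(lead), hex(lead))  # position of ó in Latin-1


def utf8_sequence_length(lead_byte: int) -> int:
    """Length of the UTF-8 sequence announced by its first byte (hypothetical helper)."""
    if lead_byte < 0x80:
        return 1  # plain ASCII
    if 0xC0 <= lead_byte < 0xE0:
        return 2
    if 0xE0 <= lead_byte < 0xF0:
        return 3
    if 0xF0 <= lead_byte < 0xF8:
        return 4
    raise ValueError("not a UTF-8 start byte")


# How many bytes a UTF-8 reader expects after seeing the first byte.
expected_length = utf8_sequence_length(lead)
print(expected_length)

# A strict UTF-8 decoder rejects the data, because "n" is not a continuation byte.
try:
    raw.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
print(decoded_ok)
```

A strict decoder stops at the first bad byte, whereas inputenc collects the
following characters first and only then reports the combination as unknown.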

so the error is that Steve's document contains

\usepackage[utf8]{inputenc}

when his platform is not saving files as utf8
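
One way to bring the files in line with the utf8 option is to re-encode them.
A minimal sketch, assuming the originals are Latin-1 (a guess, not something
the thread confirms); the function name reencode_to_utf8 and the sample file
name are hypothetical:

```python
import tempfile
from pathlib import Path


def reencode_to_utf8(path: Path, source_encoding: str = "latin-1") -> bool:
    """Rewrite path as UTF-8 unless it already decodes as UTF-8.

    Returns True if the file was rewritten. source_encoding is an
    assumption about how the file was originally saved.
    """
    data = path.read_bytes()
    try:
        data.decode("utf-8")
        return False  # already valid UTF-8 (pure ASCII also ends up here)
    except UnicodeDecodeError:
        pass
    path.write_bytes(data.decode(source_encoding).encode("utf-8"))
    return True


# Small demonstration on a throwaway file containing the byte from the log.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.h"
    sample.write_bytes(b"/* Descipci\xf3n: C */")
    first = reencode_to_utf8(sample)   # file is converted
    second = reencode_to_utf8(sample)  # nothing left to do
    result = sample.read_bytes()
```

The other direction would be to leave the files alone and declare their real
encoding to inputenc instead of utf8.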

the fact that other documents may have worked in the past is probably due to
using English language (i.e. nothing other than ASCII chars - which are
typically all in the same position in different code pages, including UTF-8).
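
A quick way to see this, as a sketch using a few encodings picked purely for
illustration (the thread does not say which one was in use):

```python
# ASCII-only text produces identical bytes in all of these encodings.
encodings = ["ascii", "latin-1", "cp1252", "utf-8"]
ascii_bytes = {enc: "refman.tex".encode(enc) for enc in encodings}
same_for_ascii = len(set(ascii_bytes.values())) == 1
print(same_for_ascii)

# An accented character does not: one byte in the 8-bit code pages,
# two bytes in UTF-8.
accented = {enc: "ó".encode(enc) for enc in ["latin-1", "cp1252", "utf-8"]}
for enc, data in accented.items():
    print(enc, data.hex())
```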

by the way, the comments aren't in German at all, I can tell you that :-) they
look like Spanish to me.

frank