[luatex] Problem with latin modern fonts

Heiko Oberdiek heiko.oberdiek at googlemail.com
Tue Aug 23 11:45:52 CEST 2011


On Tue, Aug 23, 2011 at 10:16:59AM +0200, Ulrike Fischer wrote:

> If I run the following document with lualatex (or xelatex) and then
> copy and paste the text from the pdf (in adobe reader 8.1.3)
> everything is fine. But if I uncomment the line with
> ^^^^006f^^^^0308 (o + dieresis) then the pdf still looks fine but
> all accents disappear when I try to copy the text. I get this:
> 
> aous AOU e e
> o.
> 
> Only characters on the same page as the ^^^^006f^^^^0308 are
> affected.
> 
> It looks like a latin modern bug for me (the effect disappears with
> arial). Can someone confirm it? (I'm using the latin modern font in
> miktex, version seems to be 2.004 from 2009)
> 
> \documentclass{book}
> \usepackage{fontspec}
> %\setmainfont{Arial}
> \begin{document}
> 
> äöüß ÄÖÜ é è 
> 
> %^^^^006f^^^^0308
> 
> \end{document}

Using TL 2011:
  LuaTeX beta-0.70.1-2011061416 (rev 4277)
  Latin Modern 2.004

I get a correct first line, but an "o" without dieresis at the end:

AR7/Linux:
  äöüß ÄÖÜ é è
  o
pdftotext 3.00:
  1 äöüß ÄÖÜ é è o
ps2ascii (ghostscript 9.04):
  *** Warning: composite font characters dumped without decoding.
  The output is garbled as well, because the /ToUnicode entry is
  not evaluated.

In the PDF file (compiled with \pagestyle{empty}) I found:
  /F16 9.96264 Tf 1 0 0 1 121.813 708.054 Tm
  [<00A001BA0243013D>-333<009F>28<01B90242>-333<00FB>-333<0115>]TJ
  1 0 0 1 121.813 696.099 Tm [<005100EE>]TJ

And the mappings:
  11 beginbfchar
  <0051> <006F>
  <009F> <00C4>
  <00A0> <00E4>
  <00EE> <0308>
  <00FB> <00E9>
  <0115> <00E8>
  <013D> <00DF>
  <01B9> <00D6>
  <01BA> <00F6>
  <0242> <00DC>
  <0243> <00FC>
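
To illustrate how the two pieces fit together, the text can be
recovered by hand from the TJ strings and the bfchar entries above.
This is only a minimal Python sketch: it assumes that every glyph code
is two bytes wide and that a kern of -200 or less stands for a word
space (a heuristic, real extractors use their own rules). The names
TO_UNICODE and decode_tj are made up for this example.

```python
import re

# bfchar entries copied from the mappings above: glyph code -> Unicode
TO_UNICODE = {
    0x0051: 0x006F, 0x009F: 0x00C4, 0x00A0: 0x00E4, 0x00EE: 0x0308,
    0x00FB: 0x00E9, 0x0115: 0x00E8, 0x013D: 0x00DF, 0x01B9: 0x00D6,
    0x01BA: 0x00F6, 0x0242: 0x00DC, 0x0243: 0x00FC,
}

def decode_tj(tj_array, kern_space=-200):
    """Decode a TJ array; kerns <= kern_space are emitted as a space."""
    out = []
    pattern = r"<([0-9A-Fa-f]+)>|(-?\d+(?:\.\d+)?)"
    for hexstr, kern in re.findall(pattern, tj_array):
        if hexstr:
            # split the hex string into two-byte glyph codes
            for i in range(0, len(hexstr), 4):
                out.append(chr(TO_UNICODE[int(hexstr[i:i + 4], 16)]))
        elif float(kern) <= kern_space:
            out.append(" ")
    return "".join(out)

# the two TJ arrays from the page, written without line breaks
line1 = decode_tj("[<00A001BA0243013D>-333<009F>28<01B90242>"
                  "-333<00FB>-333<0115>]")
line2 = decode_tj("[<005100EE>]")
```

With these mappings the first line decodes to the expected text,
while the second one gives "o" followed by the combining character
U+0308, that is, two code points instead of the single U+00F6.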

* Why are the glyph positions moved in such a way that every program
  that does not evaluate /ToUnicode fails with senseless output?

* The latter composition "o+dieresis" is not replaced by the
  glyph "odieresis". In the documentation of LuaTeX I found:
| 2.3 UNICODE text support
| ...
| Normalization of the Unicode input can be handled by a macro package
| during callback processing (this will be explained in section 4.1.2).

  This does not seem to be done by fontspec. Can it be enabled/supported
  by fontspec, or does another package exist for Unicode normalization?
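
  For reference, the normalization meant here is the canonical
  composition (NFC) of the Unicode standard. A minimal sketch with
  Python's standard unicodedata module, only to show the expected
  mapping; it says nothing about how a LuaTeX callback would have to
  implement it:

```python
import unicodedata

# o + combining dieresis, as in the commented line of the test file
decomposed = "\u006f\u0308"

# NFC composes the pair into the single code point U+00F6 (odieresis)
composed = unicodedata.normalize("NFC", decomposed)

# NFD goes the other way and splits U+00F6 into the pair again
roundtrip = unicodedata.normalize("NFD", composed)
```

  After NFC the input would consist of U+00F6 alone, so the font
  could use its precomposed "odieresis" glyph.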

Yours sincerely
  Heiko Oberdiek

