[luatex] Problem with latin modern fonts
Heiko Oberdiek
heiko.oberdiek at googlemail.com
Tue Aug 23 11:45:52 CEST 2011
On Tue, Aug 23, 2011 at 10:16:59AM +0200, Ulrike Fischer wrote:
> If I run the following document with lualatex (or xelatex) and then
> copy and paste the text from the pdf (in adobe reader 8.1.3)
> everything is fine. But if I uncomment the line with
> ^^^^006f^^^^0308 (o + dieresis) then the pdf still looks fine but
> all accents disappear when I try to copy the text. I get this:
>
> aous AOU e e
> o.
>
> Only characters on the same page as the ^^^^006f^^^^0308 are
> affected.
>
> It looks like a latin modern bug to me (the effect disappears with
> arial). Can someone confirm it? (I'm using the latin modern font in
> miktex, version seems to be 2.004 from 2009)
>
> \documentclass{book}
> \usepackage{fontspec}
> %\setmainfont{Arial}
> \begin{document}
>
> äöüß ÄÖÜ é è
>
> %^^^^006f^^^^0308
>
> \end{document}
Using TL 2011:
LuaTeX beta-0.70.1-2011061416 (rev 4277)
Latin Modern 2.004
I get a correct first line, but an "o" without dieresis at the end:
AR7/Linux:
äöüß ÄÖÜ é è
o
pdftotext 3.00:
1 äöüß ÄÖÜ é è o
ps2ascii (ghostscript 9.04):
*** Warning: composite font characters dumped without decoding.
And the output is garbled, because the entry /ToUnicode is not
evaluated.
In the PDF file I found (with \pagestyle{empty}):
/F16 9.96264 Tf 1 0 0 1 121.813 708.054 Tm
[<00A001BA0243013D>-333<009F>28<01B90242>-333<00FB>-333<0115>]TJ
1 0 0 1 121.813 696.099 Tm [<005100EE>]TJ
And the mappings:
11 beginbfchar
<0051> <006F>
<009F> <00C4>
<00A0> <00E4>
<00EE> <0308>
<00FB> <00E9>
<0115> <00E8>
<013D> <00DF>
<01B9> <00D6>
<01BA> <00F6>
<0242> <00DC>
<0243> <00FC>
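To make the relation between the two excerpts concrete, here is a minimal Python sketch that decodes the TJ arrays above with the bfchar entries. It is only an illustration, not how any PDF reader is implemented: the helper names (parse_bfchar, decode_tj) are hypothetical, and treating a kern of -200 or less as a word space is my assumption, chosen because the gaps in the excerpt are -333.

```python
import re

# ToUnicode entries copied from the bfchar block above (glyph code -> Unicode).
BFCHAR = """
<0051> <006F>
<009F> <00C4>
<00A0> <00E4>
<00EE> <0308>
<00FB> <00E9>
<0115> <00E8>
<013D> <00DF>
<01B9> <00D6>
<01BA> <00F6>
<0242> <00DC>
<0243> <00FC>
"""

# The two TJ arrays from the page content stream (first one joined on one line).
LINE1 = "[<00A001BA0243013D>-333<009F>28<01B90242>-333<00FB>-333<0115>]"
LINE2 = "[<005100EE>]"


def parse_bfchar(text):
    """Return a dict mapping each 16-bit glyph code to its Unicode character."""
    pairs = re.findall(r"<([0-9A-Fa-f]{4})>\s+<([0-9A-Fa-f]{4})>", text)
    return {int(src, 16): chr(int(dst, 16)) for src, dst in pairs}


def decode_tj(array, cmap, space_kern=-200):
    """Decode a TJ array: hex strings hold 2-byte glyph codes, numbers are kerns.

    space_kern is an assumed heuristic threshold: a kern at or below it is
    emitted as a space, smaller adjustments (such as the 28 above) are ignored.
    """
    out = []
    for hexstr, kern in re.findall(r"<([0-9A-Fa-f]+)>|(-?\d+(?:\.\d+)?)", array):
        if hexstr:
            for i in range(0, len(hexstr), 4):
                code = int(hexstr[i:i + 4], 16)
                out.append(cmap.get(code, "\ufffd"))  # U+FFFD if unmapped
        elif float(kern) <= space_kern:
            out.append(" ")
    return "".join(out)


CMAP = parse_bfchar(BFCHAR)
print(decode_tj(LINE1, CMAP))
print([hex(ord(c)) for c in decode_tj(LINE2, CMAP)])
```

With the mapping applied, the first array yields the accented line, and the second yields two separate code points, U+006F followed by U+0308. A program that skips /ToUnicode sees only the raw glyph codes (0x0051, 0x00EE, ...), which carry no meaning outside this font subset.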
* Why are the glyph positions moved such that every program that
doesn't look into /ToUnicode fails with senseless output?
* The latter composition "o+dieresis" is not replaced by the
glyph "odieresis". In the documentation of LuaTeX I found:
| 2.3 UNICODE text support
| ...
| Normalization of the Unicode input can be handled by a macro package
| during callback processing (this will be explained in section 4.1.2).
This does not seem to be done by fontspec. Can this be enabled/supported by
fontspec, or does another package exist for Unicode normalization?
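For reference, the normalization meant here is canonical composition (NFC), which would turn "o" + U+0308 into the single character U+00F6 before font lookup. The short Python sketch below only illustrates that effect using the standard unicodedata module; it is not a LuaTeX callback and says nothing about how a macro package would hook it in. The helper name codepoints is hypothetical.

```python
import unicodedata


def codepoints(s):
    """Format a string as a list of U+XXXX code point labels."""
    return ["U+%04X" % ord(c) for c in s]


decomposed = "o\u0308"  # the sequence from the test document: o + combining dieresis
composed = unicodedata.normalize("NFC", decomposed)

print(codepoints(decomposed))  # ['U+006F', 'U+0308']
print(codepoints(composed))    # ['U+00F6']
```

After NFC the input is one code point, so the font's "odieresis" glyph could be selected directly instead of "o" followed by a separate combining accent.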
Yours sincerely
Heiko Oberdiek