# [pdftex] pdftex - Encoding for metafont PK fonts

Ross Moore ross.moore at mq.edu.au
Sun Jun 26 00:56:48 CEST 2016

Hello Pali, and Paul

On Jun 26, 2016, at 7:21 AM, Pali Rohár <pali.rohar at gmail.com> wrote:

Maybe PDF readers assume that the font is not in Latin1 but in
Unicode, since IIRC Unicode positions 128-255 hold the same
characters as the Latin1 encoding.

Unicode character U+00E8 is for sure 'è'. So I bet this is the reason
why the PDF reader thinks that I selected character 'è' and not 'č'.

For Type 1 PFB fonts (even in IL2 encoding) this is not a problem,
because a unified glyph name is stored for each character and
there is a standard conversion table from glyph name to Unicode
character.

So probably PK fonts have no conversion table from 8-bit
character to Unicode character, and so something (pdftex? the PDF
reader?) assumes either Latin1 or Unicode.

What you need is a CMAP resource, which is referenced from the font's
dictionary in the PDF (as its /ToUnicode entry), not stored in the font itself.

This is what is done with PFA and PFB fonts, and others.
So I don’t see why you cannot also do this with PK fonts.

The question then becomes “who or what creates the CMAP?”.

pdfTeX has a primitive  \pdfgentounicode  which if set to 1 (or higher)
causes an attempt to create the CMAP internally, based upon glyph names
and using a standard list of font character names.
Extra info can be provided by the  \pdfglyphtounicode  primitive,
as you have encountered in an earlier posting.
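
As a minimal sketch of this route (it only helps when the font's glyphs carry real names, e.g. through an encoding file; the extra entry for `ccaron` is purely an illustration of the syntax and is normally already covered by the standard list), the two primitives would typically be used like this:

```tex
% Sketch only: enable pdfTeX's internal ToUnicode generation.
\input glyphtounicode      % standard glyph-name -> Unicode list
\pdfgentounicode=1         % build ToUnicode CMaps from glyph names
% Extra or overriding entries: glyph name, then Unicode value in hex.
\pdfglyphtounicode{ccaron}{010D}  % U+010D, c with caron
```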

But Metafont-produced pk-fonts tend to use lazy generic names for
characters, such as /a1, /a2, /a3, etc.
This can imply non-uniqueness across several fonts, so is likely to be
unsuitable if you need to provide CMAP resources for several fonts
using the same character names.

The alternative is to construct the full CMAP resource externally,
as a text file. The contents of this file are then loaded into the PDF
as a stream object (with \pdfobj), and pdfTeX’s  \pdffontattr  primitive
creates the correct dictionary entry referring to that stream.

For details on how this can be done in TeX coding, consult the package
cmap.sty.
Look at files such as  ot1.cmap, t1.cmap, t2a.cmap etc. for the structure
of the kind of data file that is needed. These encode the Unicode
mapping of the numbered character slots in a font.
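
To make the shape of such a file concrete, here is a small sketch modelled on the layout of the cmap.sty data files. The file name `il2demo.cmap`, the CMap name `TeX-IL2demo-0` and the choice of `csr10` as an IL2-encoded font are assumptions made for illustration; a real file would list every slot of the font, not just two.

```tex
% Sketch only: file name, CMap name and font are illustrative assumptions.
\begin{filecontents*}{il2demo.cmap}
%!PS-Adobe-3.0 Resource-CMap
%%DocumentNeededResources: ProcSet (CIDInit)
%%IncludeResource: ProcSet (CIDInit)
%%BeginResource: CMap (TeX-IL2demo-0)
%%Title: (TeX-IL2demo-0 TeX IL2demo 0)
%%Version: 1.000
%%EndComments
/CIDInit /ProcSet findresource begin
12 dict begin
begincmap
/CIDSystemInfo
<< /Registry (TeX) /Ordering (IL2demo) /Supplement 0 >> def
/CMapName /TeX-IL2demo-0 def
/CMapType 2 def
1 begincodespacerange
<00> <FF>
endcodespacerange
2 beginbfchar
<E8> <010D>
<F8> <0159>
endbfchar
endcmap
CMapName currentdict /CMap defineresource pop
end
end
%%EndResource
%%EOF
\end{filecontents*}
\documentclass{article}
\begin{document}
% Load the CMap file into the PDF as a stream object ...
\immediate\pdfobj stream file {il2demo.cmap}
% ... and point the font's /ToUnicode entry at that object.
\font\demofont=csr10 % assumed: any IL2-encoded font on your system
\pdffontattr\demofont{/ToUnicode \the\pdflastobj\space 0 R}
{\demofont \char"E8 \char"F8}
\end{document}
```

Each `beginbfchar` line maps one font slot (in hex) to a Unicode value, so slot E8 is reported as U+010D ('č') instead of being guessed as 'è'.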

Also it may be helpful to examine the coding that I’ve included below,
for attaching a CMAP resource to Xy-pic directional fonts.
This assumes that LaTeX can find a private file called  xyd.cmap.
Another primitive,  \pdfnobuiltintounicode,  disables the attempt to create
the CMAP internally for a given font.

Yes, I agree that this is likely the case.  I do know that in PK
fonts, there is only a character (or no character) for each of the
positions 0-255, with no character names or additional coding
information.

When specifying a Type 1 PFB font, it has to be added to the pdftex font
map file (primitive \pdfmapfile), and the map line allows an encoding
vector file to be specified. That file contains a glyph name for each
character position 0-255, and the pdftex primitive \pdfglyphtounicode then
maps glyph names to Unicode characters. So for PFB fonts it is possible to
do that mapping from positions 0-255 to Unicode.

As outlined above.
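
For reference, the Type 1 route described in the quoted paragraph could be sketched roughly as follows. All font, encoding and file names here are hypothetical placeholders, not files that exist in any distribution.

```tex
% Sketch only: every name below is a hypothetical placeholder.
% Map line fields: TFM name, PostScript font name, optional special,
% then the encoding vector file and the font file.
\pdfmapline{+demofont DemoFont-Regular "DemoEncoding ReEncodeFont" <demo.enc <demofont.pfb}
% demo.enc assigns a glyph name to each slot 0-255; the glyph names
% are then translated to Unicode through the glyph-to-Unicode table.
\input glyphtounicode
\pdfgentounicode=1
```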

But it is a pity that it is not possible to specify that enc file also for
PK fonts generated by MetaFont. Or is it somehow possible?

I see no reason why not, but could easily be wrong.
But I must admit that I’ve not tried it with a PK font.

Font outlines have been the preferred technology for ~20 years
or more, so I’ve not had the need with bit-mapped fonts.

In detail my question is: how do I tell pdftex the encoding of a PK font
(generated from MetaFont)?

For information on virtual fonts:  use Google.

With the above detailed description, are you sure that virtual fonts
could do this Unicode mapping?

Are virtual fonts not again only 8-bit (as opposed to glyph names
and Unicode)?

Yes, virtual fonts are only 8 bits.  There are things called "omega
virtual fonts" which I think allow for larger-numbered characters,
but I don't know whether pdftex supports them.  I think that luatex
does.

LuaTeX is Unicode-based and it is possible to create a virtual font which
remaps Latin2 to Unicode (yesterday I tried that). But my question is about
pdftex right now.

Try what I suggest above.

% Supply CMAP files for Xy-pic's arrowhead fonts
% otherwise an Accessibility check fails for encoding of arrow tips.
%
\def\TPDF@xyd@encoding{xyd}
% NOTE: the braces around the two \IfFileExists branches were lost in the
% archived copy; the branch structure below is a reconstruction.
\def\TPDF@support@xyarrows{%
 \IfFileExists{\TPDF@xyd@encoding.cmap}%
  {% file found: load it as a PDF stream object
   \immediate\pdfobj stream file {\TPDF@xyd@encoding.cmap}\relax
   % define a helper that points a font's /ToUnicode at that object
   \xdef\TPDF@set@cmap@xyd##1{%
    \noexpand\expandafter\pdffontattr\noexpand##1 {/ToUnicode \the\pdflastobj\space 0 R}}%
  }%
  {\TPDF@inhibitload@xyd}% file missing (assumed fallback): helper does nothing
}
\def\TPDF@inhibitload@xyd{\gdef\TPDF@set@cmap@xyd##1{}}

% standard Xy tips
\def\TPDF@xyd@cmap@xy{%
 \pdfnobuiltintounicode\xyatipfont
 \TPDF@set@cmap@xyd{\xyatipfont}%
 \pdfnobuiltintounicode\xybtipfont
 \TPDF@set@cmap@xyd{\xybtipfont}%
}
% CM-style Xy tips
\def\TPDF@xyd@cmap@cm{%
 \pdfnobuiltintounicode\xy@@atfont
 \TPDF@set@cmap@xyd{\xy@@atfont}%
 \pdfnobuiltintounicode\xy@@btfont
 \TPDF@set@cmap@xyd{\xy@@btfont}%
}
% rebind the \UseTips  macro
\def\TPDF@UseTips{%
 \LTX@UseTips
 \TPDF@xyd@cmap@cm
}

\AtBeginDocument{%
 \@ifpackageloaded{xy}{% activate CMaps for Xy-pic arrows
  \TPDF@support@xyarrows
  \TPDF@xyd@cmap@xy
  \let\LTX@UseTips\UseTips
  \let\UseTips\TPDF@UseTips
 }{}%
}

--
Pali Rohár
pali.rohar at gmail.com

Hope this helps,

Ross

Dr Ross Moore

Mathematics Dept | Level 2, S2.638 AHH
Macquarie University, NSW 2109, Australia

