[pdftex] ToUnicode CMap

Han The Thanh thanh at informatics.muni.cz
Thu Feb 22 11:27:02 CET 2001


> As is known, there are two aspects of the language problem in PDF
> and PS documents: displaying the text and determining its content.
> The former is solved in pdftex, dvips and dvipdfm by embedding the
> needed fonts, but the latter is still open for Cyrillic and many
> other languages.
> 
> To solve the latter problem, one should embed a ToUnicode CMap with
> the font (PDF Reference 1.3, second edition, section 5.9). This CMap
> maps the codes of the font's glyphs to Unicode values; it is easy to
> build once those relations are known.
> 
> It seems necessary to embed the ToUnicode CMap together with the
> embedded font program. The data for generating the ToUnicode CMap
> can be taken from a separate file containing code-to-code relations.
> This file could be associated with the corresponding font in the
> font map, just like an encoding file. Will this be possible in a
> future pdftex?

This feature was requested by a Czech colleague some time ago. I provided
a hook for this purpose called \pdffontattr, which can be used to append
further entries to the font dictionary. I have to admit that I haven't
used it myself yet; it was simply the quickest hook I could provide for
him to get his work done. I am sure you know the PDF spec pretty well, so
it should not be a problem to create the ToUnicode CMap using the existing
pdftex primitives. You can also contact the person who needed this feature
by email: koala at informatics.muni.cz
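
For illustration, here is a minimal sketch of how the CMap text itself
could be generated from a table of code-to-Unicode relations. It is not
part of pdftex: the function name make_tounicode_cmap, the CMap name and
the sample codes are assumptions made for the example, and it only handles
single-byte fonts with targets in the Basic Multilingual Plane.

```python
def make_tounicode_cmap(mapping, name="Adobe-Identity-UCS"):
    """Build ToUnicode CMap text for a single-byte font.

    mapping: dict of {glyph code (0..255): Unicode code point (0..0xFFFF)}.
    The function and its parameters are hypothetical, for illustration only.
    """
    for code, cp in mapping.items():
        if not 0 <= code <= 0xFF:
            raise ValueError(f"glyph code out of single-byte range: {code}")
        if not 0 <= cp <= 0xFFFF:
            raise ValueError(f"code point outside the BMP: {cp:#x}")

    lines = [
        "/CIDInit /ProcSet findresource begin",
        "12 dict begin",
        "begincmap",
        "/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def",
        f"/CMapName /{name} def",
        "/CMapType 2 def",
        "1 begincodespacerange",
        "<00> <FF>",
        "endcodespacerange",
    ]

    # A single bfchar block may hold at most 100 entries, so split the table.
    items = sorted(mapping.items())
    for start in range(0, len(items), 100):
        chunk = items[start:start + 100]
        lines.append(f"{len(chunk)} beginbfchar")
        lines.extend(f"<{code:02X}> <{cp:04X}>" for code, cp in chunk)
        lines.append("endbfchar")

    lines += [
        "endcmap",
        "CMapName currentdict /CMap defineresource pop",
        "end",
        "end",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative relations only: three Cyrillic letters at assumed codes.
    sample = {0xC0: 0x0410, 0xC1: 0x0411, 0xE0: 0x0430}
    print(make_tounicode_cmap(sample))
```

The resulting text would then have to be placed in the PDF as a stream
object and referenced from the font dictionary as a /ToUnicode entry, which
is the kind of addition \pdffontattr is meant for.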

Regards,
Thanh


