[pdftex] ToUnicode CMap

A.V. Kuznetsov kuzn at htsc.mephi.ru
Thu Feb 22 11:39:52 CET 2001


As is well known, the language problem in PDF and PS documents has
two aspects: displaying the text and determining its content. The
former is solved successfully in pdftex, dvips and dvipdfm by
embedding the needed fonts, but the latter is still open for
Cyrillic and many other languages.

To solve the latter problem, one should embed a ToUnicode CMap with
the font (PDF Reference, second edition, PDF 1.3, section 5.9).
This CMap maps the codes of the font's glyphs to Unicode values,
and it is easy to generate once these relations are known.
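
A minimal sketch in Python of how such a CMap could be generated for
a single-byte font, once the code-to-Unicode relations are known.
The function name make_tounicode_cmap, the CMap name Custom-UCS and
the use of Python's cp1251 codec as a stand-in for a Cyrillic font
encoding are assumptions made for illustration only; this is not
something pdftex does.

```python
def make_tounicode_cmap(mapping, name="Custom-UCS"):
    """Build ToUnicode CMap text from {font_code: unicode_code_point}.

    Assumes a single-byte font, so every font code must be 0..255.
    The function and CMap names are hypothetical.
    """
    lines = [
        "/CIDInit /ProcSet findresource begin",
        "12 dict begin",
        "begincmap",
        "/CIDSystemInfo",
        "<< /Registry (Adobe)",
        "/Ordering (UCS)",
        "/Supplement 0",
        ">> def",
        f"/CMapName /{name} def",
        "/CMapType 2 def",
        "1 begincodespacerange",
        "<00> <FF>",
        "endcodespacerange",
    ]
    items = sorted(mapping.items())
    for code, _ in items:
        if not 0 <= code <= 0xFF:
            raise ValueError(f"font code out of range: {code}")
    # A single bfchar section may hold at most 100 entries.
    for start in range(0, len(items), 100):
        chunk = items[start:start + 100]
        lines.append(f"{len(chunk)} beginbfchar")
        for code, ucs in chunk:
            # Unicode values are written as UTF-16BE hex digits.
            target = chr(ucs).encode("utf-16-be").hex().upper()
            lines.append(f"<{code:02X}> <{target}>")
        lines.append("endbfchar")
    lines += [
        "endcmap",
        "CMapName currentdict /CMap defineresource pop",
        "end",
        "end",
    ]
    return "\n".join(lines) + "\n"


# Illustration only: take the Cyrillic letters from the cp1251 codec.
cyrillic = {b: ord(bytes([b]).decode("cp1251")) for b in range(0xC0, 0x100)}
cmap_text = make_tounicode_cmap(cyrillic)
```

The resulting text would be stored as a stream and referenced from
the /ToUnicode entry of the font dictionary.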

It seems necessary to embed a ToUnicode CMap alongside each embedded
font program. The data for generating the ToUnicode CMap can be
taken from a separate file containing the code-to-code relations.
This file could be associated with the corresponding font in the
font map, in the same way as an encoding file. Would this be
possible in a future version of pdftex?
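
To make the idea of a separate code-to-code file concrete, here is a
rough sketch in Python of reading one. The file format is purely
hypothetical (one pair of hexadecimal numbers per line, font code
first and Unicode value second, with % starting a comment), and the
name parse_code_table is likewise an assumption; pdftex defines no
such format.

```python
def parse_code_table(text):
    """Parse a hypothetical code-to-Unicode table into a dict.

    Each non-empty line holds two hex numbers: font code, Unicode value.
    Anything after '%' on a line is treated as a comment.
    """
    table = {}
    for lineno, raw in enumerate(text.splitlines(), 1):
        line = raw.split("%", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"line {lineno}: expected two fields")
        code = int(fields[0], 16)
        ucs = int(fields[1], 16)
        if not 0 <= code <= 0xFF:
            raise ValueError(f"line {lineno}: font code out of range")
        table[code] = ucs
    return table


sample = """\
% font code   Unicode value
C0 0410   % CYRILLIC CAPITAL LETTER A
E0 0430   % CYRILLIC SMALL LETTER A
"""
table = parse_code_table(sample)
```

A table read this way would supply the code-to-Unicode pairs from
which the ToUnicode CMap is generated.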

A.Kuznetsov




