[luatex] luatex - font encoding for type 1 fonts

Sun Jun 26 16:59:53 CEST 2016

On Sunday 26 June 2016 16:29:14 Hans Hagen wrote:
> On 6/26/2016 4:13 PM, Pali Rohár wrote:
> > On Sunday 26 June 2016 13:25:11 Pali Rohár wrote:
> >>   dup 34 /quotedblright put
> >>   dup 92 /quotedblleft put
> >>   dup 254 /quotedblbase put
> >>   dup 255 /csquotedblright put
> > 
> > Hans, this reminds me: how do I print the above characters with your font
> > loader? In pdftex it is done by:
> >   \chardef\clqq=254\sfcode254=0
> >   \chardef\crqq=255\sfcode255=0
> >   \def\uv#1{\clqq#1\crqq}
> >   
> >   \chardef\elqq=92
> >   \chardef\erqq=34
> >   \def\qq#1{\elqq#1\erqq}
> >   
> >   \uv{text in czech quotes}
> >   \qq{text in english quotes}
> > 
> > Czech and Slovak texts use a different style of quoting, and
> > CSFonts are prepared for it. The right Czech quote is very similar to
> > the left English quote, and for this reason a lot of fonts have only
> > one glyph for both the right Czech and the left English quote (under
> > the name quotedblleft).
> > 
> > But CSFonts distinguish between the two characters and have two very
> > similar (but slightly different) glyphs. The left English quote is
> > under the name quotedblleft and the right Czech one under csquotedblright.
> > 
> > The glyph name csquotedblright is custom and non-standard, so it is not
> > in any glyphtounicode mapping list. There is also no dedicated
> > Unicode character for it, and U+201C should be used.
> > 
> > Now I'm thinking that luatex must have a problem if there are two
> > characters which need to be mapped to U+201C...
> > 
> > But it should be enough if the user is able to define their own macro \uv
> > that works correctly (no need to type Unicode quotes in the input tex
> > file). How to do that?
> > 
> > Also I think that your font loader could have problems with the glyph
> > name "csquotedblright" as it is not a standard one...
> 
> these become private unicodes :
> 
> fonts           > tfm loading > glyph 'althyphen' in font 'csr10'
> with encoding 'csr' gets unicode U+F0000
> fonts           > tfm loading > glyph 'csquotedblright' in font
> 'csr10' with encoding 'csr' gets unicode U+F0001
> fonts           > tfm loading > glyph 'polishlcross' in font 'csr10'
> with encoding 'csr' gets unicode U+F0002

How is the order of these three characters determined? Is it guaranteed
that the character with glyph name "althyphen" is always U+F0000?

> so, you need to use
> 
> \chardef\crqq="F0001

Nice, thank you!
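
For the record, a minimal sketch of how my pdftex macros from above might
look with this loader. Only "F0001 comes from the log; that slot may not be
stable (see my question above), and the other three code points are my
assumption that the standard glyphs quotedblbase, quotedblleft and
quotedblright end up on their usual Unicode slots. I have not verified this:

```tex
% Sketch only: every code point except "F0001 is an assumption.
\chardef\clqq="201E  \sfcode"201E=0   % quotedblbase (assumed U+201E)
\chardef\crqq="F0001 \sfcode"F0001=0  % csquotedblright, private slot from the log
\def\uv#1{\clqq#1\crqq}

\chardef\elqq="201C   % quotedblleft (assumed U+201C)
\chardef\erqq="201D   % quotedblright (assumed U+201D)
\def\qq#1{\elqq#1\erqq}

\uv{text in czech quotes}
\qq{text in english quotes}
```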

Maybe it would be better if these characters could also be accessed by
glyph name, not only by Unicode number...

Another question: is it possible to have this character mapped to U+201C
in the generated PDF (only for selecting text in a PDF reader)? In
pdftex it can be achieved by:

  \pdfglyphtounicode{csquotedblright}{201C}
  \pdfgentounicode=1

which generates a CMAP table where the glyph name csquotedblright is mapped
to U+201C. Btw, a CMAP table allows mapping more than one glyph to the same
Unicode character...
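
To illustrate that last point, a small pdftex sketch in which two glyph
names share one code point. The quotedblleft line is added here only for
illustration; standard glyphtounicode lists normally provide it already:

```tex
% pdftex only: both glyph names are mapped to U+201C for text extraction
\pdfgentounicode=1
\pdfglyphtounicode{quotedblleft}{201C}
\pdfglyphtounicode{csquotedblright}{201C}
```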

-- 
Pali Rohár
pali.rohar at gmail.com