[pdftex] Re: [Cjk] Conflict between pinyin and pifont? (About CJKbookmarks)
Edward G.J. Lee
edt1023 at ms17.hinet.net
Mon Feb 13 05:28:49 CET 2006
[Cc'd to the pdfTeX list as well. Feel free to change the Subject line if needed.]
On Thu, Feb 09, 2006, Heiko Oberdiek wrote:
> On Thu, Feb 09, 2006 at 11:12:16PM +0800, Edward G.J. Lee wrote:
>
> > In the CJK UTF8 environment, when the CJKbookmarks and unicode
> > options of hyperref are used, it leaves the UTF-8 encoded
> > characters alone.
>
> No, you cannot use option unicode then. If there are mixed
> bookmarks, you can use \hypersetup to switch the behaviour
> of hyperref. However option unicode must also be added to
> \usepackage to load the support macros, e.g.:
>
> \usepackage[unicode]{hyperref}
>
> \hypersetup{unicode=false,CJKbookmarks=true}
With this setup, the PDF outline loses the 0xFEFF BOM, and the strings
come out as UTF-8 bytes in hexadecimal if I use \texorpdfstring.
http://edt1023.sayya.org/tex/tmp/utf8bks0.tar.gz
I have to set `unicode=true,CJKbookmarks=true' to preserve the
octal-escaped UTF-16BE strings and to have 0xFEFF (octal \376\377)
inserted automatically.
http://edt1023.sayya.org/tex/tmp/utf8bks1.tar.gz
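To illustrate the target form of an outline string described above, here is a minimal Python sketch, not taken from hyperref or CJK. The function name pdf_outline_octal is hypothetical. It assumes every byte is written as a three-digit octal escape; real hyperref output may leave some printable bytes unescaped.

```python
def pdf_outline_octal(text: str) -> str:
    """Return text as an octal-escaped UTF-16BE PDF string body.

    Hypothetical helper for illustration only: the BOM 0xFEFF is
    prepended, and every byte is escaped as a backslash followed by
    three octal digits.
    """
    # BOM (0xFE 0xFF) followed by the big-endian UTF-16 code units.
    data = b"\xfe\xff" + text.encode("utf-16-be")
    return "".join(f"\\{byte:03o}" for byte in data)


if __name__ == "__main__":
    # U+4E2D is 0x4E 0x2D in UTF-16BE, i.e. octal 116 055.
    print(pdf_outline_octal("\u4e2d"))
```

An empty title still yields the BOM alone (\376\377), which is what marks the string as UTF-16BE for the PDF viewer.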
If we don't use \texorpdfstring, the UTF-8 characters go into the PDF
outline as UTF-8 bytes in hexadecimal, which is of course wrong.
> > But is it possible to change the PDF outlines' encoding to UTF-16BE
> > via hyperref or CJK itself?
>
> I am not a CJK expert, something for Werner.
> The problem will be the recodings; I don't think anyone wants to
> implement something like
> &Encode::from_to($char, "Big5", "UCS-2");
> at TeX macro level.
> hyperref offers two hooks where the outline strings can be
> manipulated:
> * \pdfstringdefPreHook: This hook is used before the
> string is expanded and is mainly used for redefinitions;
> I recommend using the following wrapper to add something
> to the hook:
> \pdfstringdefDisableCommands{%
> \def\nastyMacro{nice contents}%
> }%
Thanks for the hint.
But I don't think I can write a TeX macro to convert the encoding
to UTF-16BE [yet]. :)
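For comparison, outside TeX the recoding mentioned above is short. The following Python sketch is only an assumption about how one might mirror the quoted Perl call with Python's built-in codecs; the function name big5_to_utf16be is hypothetical. For characters in the Basic Multilingual Plane, big-endian UCS-2 and UTF-16BE produce the same bytes.

```python
def big5_to_utf16be(raw: bytes) -> bytes:
    """Recode a Big5 byte string to UTF-16BE (no BOM added).

    Hypothetical helper for illustration only; it relies on the
    "big5" and "utf-16-be" codecs shipped with Python.
    """
    return raw.decode("big5").encode("utf-16-be")


if __name__ == "__main__":
    sample = "\u4e2d\u6587".encode("big5")  # Big5 input bytes
    print(big5_to_utf16be(sample).hex())
```

The difficulty discussed in this thread is doing the same table lookup at the TeX macro level, not the conversion as such.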
> * \pdfstringdefPostHook#1: #1 contains the macro with the
> expanded bookmark string. Thus the bookmark string
> can be postprocessed.
>
> Also you can make feature requests for encoding conversions
> to the projects pdfTeX and/or ExTeX.
Actually, pdfTeX should support copy, search and paste of CJK
characters in the PDF (just as dvipdfmx does) and [maybe] CJK PDF
outlines (I'm not sure whether the encoding conversions should be
built into pdfTeX).
I also compiled the same document with pdflatex:
http://edt1023.sayya.org/tex/tmp/utf8bks2.tar.gz
As you can see, copy, search and paste do not work on CJK characters,
even with the Asian version of acroread. Also, it uses Type 1 fonts
rather than compact Type 1, so the file is larger than what
dvipdfm[x] or dvips/ps2pdf produce; the difference is significant for
CJK documents.
Edward