[pdftex] Re: [Cjk] Conflict between pinyin and pifont?(About CJKbookmarks)

Edward G.J. Lee edt1023 at ms17.hinet.net
Mon Feb 13 05:28:49 CET 2006

[Cc'ed to the pdfTeX list as well. Feel free to change the Subject line if needed.]

On Thu, Feb 09, 2006, Heiko Oberdiek wrote:
> On Thu, Feb 09, 2006 at 11:12:16PM +0800, Edward G.J. Lee wrote:
> >   In a CJK UTF8 environment, with the CJKbookmarks and unicode options
> >   of hyperref, the UTF-8 encoded characters are left alone.
> No, you cannot use option unicode then. If there are mixed
> bookmarks, you can use \hypersetup to switch the behaviour
> of hyperref. However option unicode must also be added to
> \usepackage to load the support macros, e.g.:
> \usepackage[unicode]{hyperref}
> \hypersetup{unicode=false,CJKbookmarks=true}

  In this situation the PDF outline loses the 0xFEFF BOM, and the
  string comes out as UTF-8 hexadecimal if I use \texorpdfstring.


  I have to set `unicode=true,CJKbookmarks=true' to preserve the octal
  UTF-16BE string and have 0xFEFF (octal \376\377) inserted automatically.


  If we don't use \texorpdfstring, the UTF-8 characters end up in the
  PDF outline as UTF-8 hexadecimal, which is of course wrong.
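
  To make the byte values concrete, here is a small worked conversion.
  Only the BOM value 0xFEFF and its octal form \376\377 come from the
  discussion above; the sample character U+4E2D is my own illustrative
  choice and is not taken from the test document. The snippet assumes
  amsmath is loaded for align* and \text.

```latex
% Worked arithmetic only; the sample character U+4E2D is an assumption.
\begin{align*}
\text{BOM U+FEFF:}\quad
  \mathrm{FE}_{16} &= 254 = 3\cdot 64 + 7\cdot 8 + 6 = 376_{8}, &
  \mathrm{FF}_{16} &= 255 = 3\cdot 64 + 7\cdot 8 + 7 = 377_{8} \\
\text{UTF-16BE of U+4E2D:}\quad
  \mathrm{4E}_{16} &= 78 = 1\cdot 64 + 1\cdot 8 + 6 = 116_{8}, &
  \mathrm{2D}_{16} &= 45 = 0\cdot 64 + 5\cdot 8 + 5 = 055_{8}
\end{align*}
```

  So a correct outline string for this single character would be
  \376\377\116\055, whereas its UTF-8 form is the three bytes E4 B8 AD.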

> >   But is it possible to change the PDF outlines' encoding to UTF-16BE
> >   via hyperref or CJK itself?
> I am not a CJK expert, something for Werner.
> The problem will be the recodings; I don't think anyone wants to
> implement something like
>         &Encode::from_to($char, "Big5", "UCS-2");
> at TeX macro level.
> hyperref offers two hooks where the outline strings can be
> manipulated:
> * \pdfstringdefPreHook: This hook is used before the
>   string is expanded and is mainly used for redefinitions;
>   I recommend to use the following wrapper to add something
>   to the hook:
>     \pdfstringdefDisableCommands{%
>       \def\nastyMacro{nice contents}%
>     }%

  Thanks for the hint.

  But I don't think I can write a TeX macro to convert the encoding
  to UTF-16BE [yet]. :)

> * \pdfstringdefPostHook#1: #1 contains the macro with the
>   expanded bookmark string. Thus the bookmark string
>   can be postprocessed.
> Also you can make feature requests for encoding conversions
> to the projects pdfTeX and/or ExTeX.
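
  As a rough sketch of the second hook: the following only logs each
  expanded bookmark string and does no recoding at all. It assumes that
  \pdfstringdefPostHook can be redefined with \renewcommand once
  hyperref is loaded, and that #1 is a macro holding the string, as
  described above. The log message text and the test heading are made
  up for illustration.

```latex
\documentclass{article}
\usepackage{hyperref}

% Sketch only: inspect the expanded bookmark string, do not change it.
% #1 is assumed to be the macro that holds the string, so \meaning
% prints its contents to the terminal and the log file.
\renewcommand*{\pdfstringdefPostHook}[1]{%
  \typeout{Bookmark string: \meaning#1}%
}

\begin{document}
\section{A test heading}
Some text.
\end{document}
```

  A real recoding step would have to rewrite the contents of #1 at this
  point, which is exactly the part that is hard to do at macro level.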

  Actually, pdfTeX should handle copy, search and paste of CJK
  characters in PDF (just as dvipdfmx does) and [maybe] CJK PDF outlines
  (I'm not sure whether the encoding conversions should be built into
  pdfTeX).

  I also used pdflatex to compile the same document.


  As you can see, copy, search and paste do not work on CJK characters,
  even with the Asian version of acroread. It also uses Type 1 fonts
  rather than compact Type 1, so the file is larger than what
  dvipdfm[x]/dvips/ps2pdf produce, which is significant for a CJK
  document.

