[luatex] UTF-16 with pdfe.getstring()

Hans Hagen j.hagen at xs4all.nl
Thu Mar 18 02:18:03 CET 2021


On 3/17/2021 5:30 PM, Andreas Matthias wrote:
> I'm having a hard time with pdfe.getstring(). What am I supposed to do if it
> returns a UTF-16 encoded string? How do I convert it to UTF-8?
> 
> Here is what I'm actually trying to do: I'm reading the /Contents of a
> Text-Annotation with pdfe.getstring(). The returned string happens to be
> UTF-16 encoded. Now I want to use this string to create a pdf_annot
> whatsit. Of course this doesn't work:
> 
> This is LuaTeX, Version 1.13.0 (TeX Live 2021/dev)
>   restricted system commands enabled.
> (./test.tex
> ! String contains an invalid utf-8 sequence.
> l.17 }
> 
> I've attached an example to replicate this issue.
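   -- no conversion needed: pass the UTF-16 bytes on as a PDF hex string
   contents = annot.Contents
   local t = { }
   for c in string.gmatch(contents,".") do
       -- each byte becomes two uppercase hex digits
       t[#t+1] = string.format("%02X",string.byte(c))
   end
   contents = table.concat(t)
   -- <...> delimits a hex string, so only ASCII reaches the whatsit data
   local str = '/Subtype/Text/Contents <' .. contents .. '>'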
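
If you really do need UTF-8 on the Lua side (the literal question above), the
following is a minimal sketch rather than anything built into LuaTeX. The
helper name utf16_to_utf8 is made up for this example. It assumes Lua 5.3 or
later (for utf8.char) and big-endian input, optionally preceded by the FE FF
byte order mark that PDF text strings use; the FF FE branch is an extra
assumption for little-endian input. A lone surrogate or a trailing odd byte is
not treated as an error here.

```lua
-- Hypothetical helper: decode a UTF-16 byte string into UTF-8.
-- Assumes Lua 5.3+ (utf8.char) and big-endian data unless an FF FE BOM is found.
local function utf16_to_utf8(s)
    local big_endian, i = true, 1
    local b1, b2 = string.byte(s, 1, 2)
    if b1 == 0xFE and b2 == 0xFF then
        i = 3                          -- skip the big-endian BOM
    elseif b1 == 0xFF and b2 == 0xFE then
        big_endian, i = false, 3       -- little-endian BOM (assumption, not usual in PDF)
    end
    -- read one 16-bit code unit starting at byte position p
    local function unit(p)
        local hi, lo = string.byte(s, p, p + 1)
        if not big_endian then hi, lo = lo, hi end
        return hi * 256 + lo
    end
    local t, n = { }, #s
    while i + 1 <= n do
        local u = unit(i)
        i = i + 2
        -- combine a high surrogate with a following low surrogate
        if u >= 0xD800 and u <= 0xDBFF and i + 1 <= n then
            local v = unit(i)
            if v >= 0xDC00 and v <= 0xDFFF then
                u = 0x10000 + (u - 0xD800) * 0x400 + (v - 0xDC00)
                i = i + 2
            end
        end
        t[#t+1] = utf8.char(u)
    end
    return table.concat(t)
end
```

With this, something like utf16_to_utf8(annot.Contents) would give a string
that TeX accepts, but for writing the annotation back the hex string route
above avoids the round trip altogether.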


-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------

