[XeTeX] printing of characters above "FFFF with \string \meaning (and potentially \Uchar)

David Carlisle d.p.carlisle at gmail.com
Thu Apr 23 15:07:51 CEST 2015


Last year I asked about the possibility of adding \Uchar copied from luatex.

http://tug.org/pipermail/xetex/2014-May/025260.html

Bruno suggested a possible implementation, and I finally got round to
trying that, adjusted for the sources as in the texlive 2015 pretest
tree (diff attached).

This seems to work fine for characters up to "FFFF
but fails for non-BMP characters above that.

See the attached xetexuchar.tex file and the log produced by
luatex and (patched) xetex.

The patch just uses the same print_char routine as \string, so I thought
I'd test that too.
See the file nonbmp.tex (which can be used with a non-patched xetex).

As the attached logs show, this works with luatex: \string on U+1D538
produces a single character. With xetex it produces two (presumably the
UTF-16 surrogate pair, although I didn't check that).
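
For reference, a minimal sketch of the surrogate pair one would expect if
that guess is right. This is plain UTF-16 arithmetic, not anything taken
from the xetex sources or the attached logs, and the function name
utf16_surrogates is just an illustrative choice:

```python
def utf16_surrogates(code_point):
    """Return the (high, low) UTF-16 surrogate pair for a non-BMP code point."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000       # 20-bit value to split in two
    high = 0xD800 + (offset >> 10)      # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits -> low surrogate
    return high, low


high, low = utf16_surrogates(0x1D538)
print(f"U+1D538 -> {high:04X} {low:04X}")  # prints "U+1D538 -> D835 DD38"

# Cross-check against Python's own UTF-16 encoder.
assert chr(0x1D538).encode("utf-16-be") == bytes.fromhex(f"{high:04X}{low:04X}")
```

So if the two characters in the xetex log are "D835 and "DD38, the
character is being printed as its two UTF-16 code units rather than as
one code point.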

Is my reading of this file correct, and \string and \meaning are turning
U+1D538 into two characters? If so, does anyone have a suggestion for
the best place to attack this in the source?


David
-------------- next part --------------
*** xetex.web~	2015-04-16 20:53:45.000000000 +0100
--- xetex.web	2015-04-23 10:54:21.677243400 +0100
***************
*** 10478,10484 ****
  @d left_margin_kern_code=11
  @d right_margin_kern_code=12
  
! @d etex_convert_codes=right_margin_kern_code+1 {end of \eTeX's command codes}
  @d job_name_code=etex_convert_codes {command code for \.{\\jobname}}
  
  @<Put each...@>=
--- 10478,10486 ----
  @d left_margin_kern_code=11
  @d right_margin_kern_code=12
  
! @d XeTeX_Uchar_code = 13
! 
! @d etex_convert_codes=XeTeX_Uchar_code+1 {end of \eTeX's command codes}
  @d job_name_code=etex_convert_codes {command code for \.{\\jobname}}
  
  @<Put each...@>=
***************
*** 10499,10504 ****
--- 10501,10509 ----
  primitive("rightmarginkern",convert,right_margin_kern_code);@/
  @!@:right_margin_kern_}{\.{\\rightmarginkern} primitive@>
  
+ primitive("Uchar",convert,XeTeX_Uchar_code);@/
+ @!@:XeTeX_Uchar_}{\.{\\Uchar} primitive@>
+ 
  @ @<Cases of |print_cmd_chr|...@>=
  convert: case chr_code of
    number_code: print_esc("number");
***************
*** 10549,10554 ****
--- 10554,10560 ----
    scanner_status:=normal; get_token; scanner_status:=save_scanner_status;
    end;
  font_name_code: scan_font_ident;
+ XeTeX_Uchar_code: scan_int;
  @/@<Cases of `Scan the argument for command |c|'@>@/
  job_name_code: if job_name=0 then open_log_file;
  end {there are no other cases}
***************
*** 10576,10581 ****
--- 10582,10588 ----
      print("pt");
      end;
    end;
+   XeTeX_Uchar_code: print_char(cur_val);
  @/@<Cases of `Print the result of command |c|'@>@/
  job_name_code: print_file_name(job_name, 0, 0);
  end {there are no other cases}
***************
*** 30000,30005 ****
--- 30007,30014 ----
  XeTeX_selector_name_code: print_esc("XeTeXselectorname");
  XeTeX_glyph_name_code: print_esc("XeTeXglyphname");
  
+ XeTeX_Uchar_code: print_esc("Uchar");
+ 
  @ @<Cases of `Scan the argument for command |c|'@>=
  eTeX_revision_code: do_nothing;
  pdf_strcmp_code:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: xetexuchar.tex
Type: application/x-tex
Size: 619 bytes
Desc: not available
URL: <http://tug.org/pipermail/xetex/attachments/20150423/7fc86550/attachment.tex>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: xetexuchar.xetex.log
Type: application/octet-stream
Size: 407 bytes
Desc: not available
URL: <http://tug.org/pipermail/xetex/attachments/20150423/7fc86550/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: xetexuchar.luatex.log
Type: application/octet-stream
Size: 507 bytes
Desc: not available
URL: <http://tug.org/pipermail/xetex/attachments/20150423/7fc86550/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nonbmp.tex
Type: application/x-tex
Size: 170 bytes
Desc: not available
URL: <http://tug.org/pipermail/xetex/attachments/20150423/7fc86550/attachment-0001.tex>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nonbmp.xetex.log
Type: application/octet-stream
Size: 292 bytes
Desc: not available
URL: <http://tug.org/pipermail/xetex/attachments/20150423/7fc86550/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nonbmp.luatex.log
Type: application/octet-stream
Size: 407 bytes
Desc: not available
URL: <http://tug.org/pipermail/xetex/attachments/20150423/7fc86550/attachment-0003.obj>

