[XeTeX] :letterspace oddities

Ross Moore ross.moore at mq.edu.au
Fri Aug 26 23:13:40 CEST 2016


Hi Jonathan, Zdenek, Phil,

On 26/08/2016, at 19:42, "Jonathan Kew" <jfkthame at gmail.com> wrote:

> On 25/8/16 18:02, Philip Taylor wrote:
>> For some time now I have been partially aware of some oddities in the
>> XeTeX implementation of :letterspace, but it was only today that my
>> thoughts crystallised sufficiently for me to attempt to record them on-
>> list :
>> 
>> 1) Search functionality.
>> 
>> For :
>> 
>> \font \errorfont = "Copperplate Gothic Bold:letterspace=8,color=BF0000"
>> scaled 2260
>> 
>> \newbox \errorbox
>> 
>> \setbox \errorbox = \leftline {\errorfont +++ NOT AT TOP OF PAGE +++}


>> Adobe Acrobat 7.1 has no problem locating the string "+++" if the
>> contents of \errorbox end up in the PDF file; however, for
>> 
>> \font \errorfont = "Copperplate Gothic Bold:letterspace=16,color=BF0000"
>> scaled 2260
>> 
>> the same string cannot be found.


> 
> Remember that TeX doesn't treat spaces as "characters" but as glue, which means they don't end up as part of the *text* in the resulting DVI or PDF file; they are merely implied by the positioning of the visible glyphs.
> 
> As a result, consider what Acrobat must be doing: it can "see" the visible glyphs and their positions, but it "sees" no <space> characters separating words. It must be inferring which characters are adjacent in the text stream, and which are separated by spaces, purely from their positions. So when you add a substantial amount of letter-spacing, it seems likely that Acrobat will view the text as being "+ + +" rather than "+++".

Yes, this is a very good way of explaining it.

TeX's omission of actual spaces from the text strings within the PDF output is a double-edged sword.
  On one hand, treating spaces as glue is what allows TeX to produce the high-quality visual appearance that it does;
  on the other hand, it is the reason why normal TeX-produced PDFs do not work with Acrobat's 'Reflow' feature, when this view is requested in the viewer.
(Think about how text in a web browser readjusts to fit a window when the viewing font size is increased, or when the window size is reduced.) 
For many people, especially those with eyesight difficulties, Reflow is extremely important. 

With small screens, as on smartphones and tablets, the lack of reflow within most PDF readers is one of the biggest objections to the use of PDF as a file format, as compared with HTML and XML-based formats, which do allow reflow.
As for the proliferation of PDF 2.0, PDF/UA and Tagged PDF formats generally (e.g., as international standards), TeX will never be properly in the game unless the output is adjusted to include spaces within the output strings, in the font being used for the text.

Note that pdfTeX now has a mode that allows 'fake' spaces to be inserted, based upon the distance between letters, when this is sufficient for it to be reasonably inferred that a space must have been in the original input. But these are in a different font to the surrounding text, and as such are not regarded by Adobe Acrobat/Reader as part of normal text strings, for the purpose of reflow.
Besides, the continual switching of fonts between text and fake spaces adds quite a bit to the total size of the PDF file.

This is one direction that could be explored by the XeTeX and dvipdfmx developers:
develop a method to reinsert spaces into the PDF output, without altering the spacing in the non-reflowed view.


> It's possible that \XeTeXgenerateactualtext=1 would help,

How does this work?
Does it use a heuristic to infer that a space was originally present?
Or does it only work with syllables and special characters?
Can a user provide customized input to the actual-text strings that will not affect typesetting?

> as I think it would annotate the letter-spaced "+++" as a unit with its actual text, allowing Acrobat to find it correctly despite the intervening spaces that *appear* to be present from just looking at the glyphs.

I'd certainly like to see the results of this kind of testing.
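
For anyone wanting to try it, a minimal test file might look something like the sketch below. It simply reuses Phil's font declaration and box from above (so it assumes Copperplate Gothic Bold is installed) with Jonathan's suggested \XeTeXgenerateactualtext=1 added in front. The final \box and \bye are my own additions to make it a complete plain XeTeX file, and I have not checked what Acrobat makes of the result.

```tex
% Sketch only: Phil's example with the setting Jonathan suggested.
% Assumes a XeTeX recent enough to know \XeTeXgenerateactualtext.
\XeTeXgenerateactualtext=1

% The letterspace=16 variant, for which "+++" could not be found.
\font \errorfont = "Copperplate Gothic Bold:letterspace=16,color=BF0000"
scaled 2260

\newbox \errorbox

\setbox \errorbox = \leftline {\errorfont +++ NOT AT TOP OF PAGE +++}

% Assumption: just put the box on the page so that it ends up in the PDF.
\box \errorbox

\bye
```

Searching the resulting PDF for "+++" with and without the first line should show whether the setting makes any difference.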

> 
> JK

Hope this helps.

    Ross



