[pdftex] Latex source and pdf for ligature issue.

Leif Andersen leif at leifandersen.net
Wed Sep 12 20:58:59 CEST 2018


Okay, I just had a colleague try the document on macOS 10.13, and it looks
like it works fine there. So there is certainly some interaction between
Preview (on 10.11) and acmart. What it is, though, I'm not yet sure.


~Leif Andersen

On Wed, Sep 12, 2018 at 2:51 PM, Leif Andersen <leif at leifandersen.net>
wrote:

> Very weird. I have this problem with OS X's Preview, as well as Skim
> (which I believe uses Preview's rendering engine).
>
> When I open the PDF in Firefox I am able to copy/paste just fine, and the
> screen reader works as well.
>
> I should mention that I am using OS X 10.11.
>
> I think other LaTeX styles get away with this by having only the visible
> text use the ligature, while copy/paste yields the ASCII `fi` rather than
> the Unicode `ﬁ` ligature character (U+FB01).
>
>
> ~Leif Andersen
>
> On Mon, Sep 10, 2018 at 6:28 PM, Ross Moore <ross.moore at mq.edu.au> wrote:
>
>> Hi Leif, Boris, and others
>>
>> On 11 Sep 2018, at 1:40 am, Leif Andersen <leif at leifandersen.net> wrote:
>>
>> As requested in github issue:
>> https://github.com/borisveytsman/acmart/issues/309#issuecomment-419690461
>>
>> Here is an example of a PDF where `first` gets read as `rst`.
>>
>> Also note that I'm using the latest version of acmart from the ACM's
>> webpage: https://www.acm.org/publications/proceedings-template
>>
>> ~Leif Andersen
>> <test.pdf><test.tex>
>>
>>
>>
>>
>> Here is your original post suggesting a problem:
>>
>> If you have the word first in a document, a screen reader only sees rst.
>> You also see this if you try to copy/paste the word first from a pdf to
>> a text file.
>>
>> This seems to be unique to acmart, as all of the other document classes
>> I've tried seem to properly have the text first.
>>
>>
>> I tried your example PDF, with the attachments as included above, without
>> any change or recompilation.
>> Copy/paste of the complete text gave ‘first’ as expected,
>> since the /ToUnicode resource correctly maps the fi ligature to the pair
>> of letters ‘f i’.
>> So there is no error in that regard.
>>
>> I did the Copy/Paste using 3 different PDF viewers on a Mac: Adobe
>> Acrobat Pro, Apple's Preview and TeXShop’s Preview.
>> (The latter 2 should give the same results, as they did.)
>>
>> So I have to ask you what software you were using for the Copy/Paste? On
>> what platform?
>>
>> As for screen readers…
>>
>> … there is no uniformity in how they get the stream to be read aloud.
>> You can try using the accsupp package to set /ActualText and/or /Alt
>> for the ligature, or the whole word.
>> (Whole word is probably better, as this would have less effect on
>> hyphenation.)
>> But even then, don’t expect a uniform result.
>> In my experience, you need a fully tagged PDF; that is, tagged for both
>> structure and content, to have
>> much effect on what a screen-reader sees. Even then, it is different with
>> different software.
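
A minimal sketch of the accsupp suggestion above, assuming pdfLaTeX and the standard `article` class rather than acmart. The helper name `\actualword` is hypothetical and not part of accsupp. It wraps a whole word so that /ActualText carries the plain letters; as noted above, viewers and screen readers may still differ in whether they honor it.

```latex
\documentclass{article}
\usepackage{accsupp}

% Hypothetical helper (not provided by accsupp): typeset a word as usual,
% but attach the same letters as /ActualText for text extraction.
\newcommand*{\actualword}[1]{%
  \BeginAccSupp{ActualText={#1}}#1\EndAccSupp{}%
}

\begin{document}
This is the \actualword{first} paragraph.
\end{document}
```

Wrapping the whole word rather than only the ligature follows the point made above about hyphenation.
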
>>
>>
>> For me, Adobe’s Read Out Loud did a fine job, apart from the small
>> print in the footnotes.
>> Presumably the spacing is too narrow to be treated as a word-space, so
>> most of it is spelt out.
>> (Hence the need for \pdfinterwordspaceon!)
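
A minimal sketch of a preamble that enables the \pdfinterwordspaceon primitive mentioned above, assuming a pdfTeX recent enough to provide it (such as the 1.40.18 quoted later in this message) and the standard `article` class. The glyphtounicode lines are an added assumption, commonly used alongside it to generate /ToUnicode mappings; they are not something the thread prescribes.

```latex
\documentclass{article}
\input{glyphtounicode}  % glyph-name to Unicode table shipped with TeX Live
\pdfgentounicode=1      % ask pdfTeX to write /ToUnicode CMaps for its fonts
\pdfinterwordspaceon    % insert real space characters between words
\begin{document}
The first footnote.\footnote{Small print with narrow word spacing.}
\end{document}
```

Whether a given reader then treats the footnote text as separate words would still need checking in each viewer.
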
>>
>> Apple’s VoiceOver was OK, but splits the word into 'firs t', saying
>> "firs tee".
>> So I cannot reproduce the problem you described.
>> I know of no way to affect what VoiceOver actually reads, in such a case
>> of a single syllable word,
>> despite all the options that its Utility provides (e.g. read numbers as
>> words or digits — NB. Adam).
>>
>>
>> Interestingly there is a small problem in the PDF, with regard to fonts.
>> There are 2 different subsets sharing the same name:
>> RBUZNK+LinLibertineT
>>
>>
>> Acrobat Pro’s Preflight lists this as an error — though not a critical
>> one.
>> It doesn’t affect the visual layout, nor should it affect any text
>> extraction, as the subsets are given
>> different PostScript names within the page content stream.
>> Nevertheless the subsets should be given different 6-letter subset
>> prefixes, so this is technically
>> an error by pdfTeX – which is why this message is being copied to
>> pdftex at tug.org.
>>
>> So Leif, I must also ask what version of TeXLive, or pdfTeX, are you
>> using?
>> Including the .log file would have provided this information.
>>
>>
>> Next, I renamed and compiled with TeXShop, using:
>>
>> This is pdfTeX, Version 3.14159265-2.6-1.40.18 (TeX Live 2017) (preloaded
>> format=pdflatex)
>>
>> Again the same subsetting prefix was shared by two font dictionaries.
>> I’ve not updated to TeXLive 2018, so don’t know if this is fixed there
>> already.
>>
>>
>> Hope this helps.
>>
>>    Ross
>>
>>
>> Dr Ross Moore
>>
>> Mathematics Dept | 12 Wally’s Walk, 734
>> Macquarie University, NSW 2109, Australia
>>
>> T: +61 2 9850 8955 | F: +61 2 9850 8114
>> M: +61 407 288 255 | E: ross.moore at mq.edu.au
>>
>> http://www.maths.mq.edu.au
>>
>>
>