Google Disk (was: XeLaTeX to Word/OpenOffice - the state of the art?)

Ross Moore ross.moore at
Sun Mar 17 19:56:49 CET 2019

Hi Andrew,

On 18/03/2019, at 0:18, "Andrew Cunningham" wrote:


It is also dependent on the fonts themselves and the scripts the language is written in.


Depending on the language and script, the only way to ensure accessibility is to include the ActualText attributes for each relevant tag.

Indeed, provided you have supplied tagging at all, as of course should be done.
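
As a rough illustration of attaching ActualText from the LaTeX side, here is a minimal sketch using the accsupp package. It only wraps the content in a marked-content span that carries ActualText; it does not create structure tags, so it is not a substitute for proper tagging. The choice of character (U+03B1) is just an example, and automatic driver detection under XeLaTeX is an assumption you should check against the accsupp documentation.

```latex
\documentclass{article}
\usepackage{accsupp}
\begin{document}
% Assumption: the visible glyph does not map cleanly back to Unicode,
% so the intended text is given explicitly as UTF-16BE hex (U+03B1).
\BeginAccSupp{method=hex,unicode,ActualText=03B1}%
$\alpha$%
\EndAccSupp{}
\end{document}
```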

Considering how complex OpenType fonts can become for some scripts, the simplistic ToUnicode mappings in a PDF can be insufficient.

Yes, but it is better for the CMaps to at least be appropriate, rather than inaccurate or missing altogether, as can be the case. Different software tools get information from different places, so ideally one needs to provide the best values for all those possible places.

And text in a PDF may by WCAG definition be non-textual content.

Presumably you mean, adding descriptive text to graphics that convey meaningful information; e.g. a company logo, and most illustrations.
Of course this should be done too. But this can only be useful if the alternate descriptive text can be found via the structure tagging; hence the need for fully tagged PDF, navigable via that tagging.

And Zdenek's comment emphasises how what might work well in one language setting can be quite insufficient for others. We need to be able to accommodate all things that are helpful.
That is surely what the U (for Universal) means in PDF/UA.



On Sunday, 17 March 2019, Ross Moore wrote:
Hi Karljürgen,

On 17/03/2019, at 1:42, "Karljürgen Feuerherm" wrote:

> Ross,
> Your reply caught my eye, and I am now looking at the pdfx package documentation.
> May I ask, if accessibility is a concern, why a-2b/-2u rather than -ua-1, which seems directly targeted at this?

PDF/UA and PDF/A-1a, -2a, -3a require a fully tagged PDF.
This is a highly non-trivial task, which requires adding much extra material to the document, done almost entirely through \special commands. The pdfx package does not provide this, but is useful for meeting the Metadata and other requirements of these formats.

Abstractly, accessibility is about having sufficient information stored in the PDF for software tools to be able to build and present a description of the content and structure, other than the visual one. The same can be said of software for converting into a different format.

A significant part of this is being able to correctly identify each character in the fonts used within the TeX-produced PDF. Even this is a non-trivial problem, due to TeX's non-standard font encodings and its use of virtual fonts.
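
For comparison, this is roughly how the character-identification problem is usually addressed with pdfLaTeX. It is a sketch only: the primitives below belong to pdfTeX and are not available in XeTeX, and the sample sentence is made up for illustration.

```latex
% pdfLaTeX only: \pdfgentounicode is a pdfTeX primitive, not a XeTeX one.
\documentclass{article}
\usepackage[T1]{fontenc}
\input{glyphtounicode}   % load the glyph-name to Unicode table
\pdfgentounicode=1       % ask pdfTeX to write ToUnicode CMaps for its fonts
\begin{document}
Office: the ``ffi'' ligature should copy and paste as three letters.
\end{document}
```

Copying text out of the resulting PDF, or running a text-extraction tool over it, is a quick way to see whether the mappings are adequate for your fonts.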

> Many thanks,
> K
>> You should use the  pdfx  package and prepare for  PDF/A-2b or -2u.
>> This fixes many of these things that affect conversions, as well as Accessibility and Archivability.
>> It's not fully tagged PDF, but handles many other technical issues.
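
A minimal sketch of that setup, assuming an article-class document. The metadata values are placeholders to replace with your own, and the result should still be checked with a PDF/A validator, since loading the package does not by itself guarantee conformance.

```latex
% Metadata is read from \jobname.xmpdata; all values here are placeholders.
\begin{filecontents*}{\jobname.xmpdata}
\Title{An example document}
\Author{A. N. Author}
\Language{en-AU}
\Keywords{PDF/A\sep accessibility}
\end{filecontents*}

\documentclass{article}
\usepackage[a-2u]{pdfx}   % or a-2b
\begin{document}
Some text.
\end{document}
```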

Hope this helps.


Andrew Cunningham
