[accessibility] Some questions about tagged PDF

Ross Moore ross.moore at mq.edu.au
Sun Dec 11 23:04:50 CET 2016

Hi Jonathan,

On 12 Dec 2016, at 00:53, Jonathan Fine <jfine2358 at gmail.com> wrote:

Hi Ross

Good to talk with you again.  We wrote:

2. Could a suitable tool create a useful HTML or XML document from a tagged PDF?

3. Is there already such a tool?

Yes. Adobe's Acrobat Pro does this already.
It also exports into RTF and Word formats.
So Tagged PDF provides a good solution for submitting TeX PDFs to a journal that only accepts manuscripts done in M$ Word.

This is interesting. If we can produce (good enough) tagged PDF, we
can from this also produce (good enough) HTML, XML and Word documents.
And I believe that from (good enough) XML we ought to be able to
produce (good enough) tagged PDF.

So we are, in part, also talking about round-tripping typesetting, and
LaTeX to XML.

Yes; but really only “kind of”.

XML is really just a meta-format rather than a format in itself,
so the result depends upon just what kind of information you want to have within the XML file.

PDF has a concept of an attribute “owner” (the /O entry of an attribute object),
which seems to govern where this information can be exported.

Attributes with owner /Layout seem to be exported to HTML, but not to XML-1.00.
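
To make the owner idea concrete, here is a rough Python sketch. It does not read a real PDF: the structure element and its attribute objects are modelled as plain dicts, and the attribute values are made up. The owner names (Layout, HTML-4.01, XML-1.00) are standard attribute owners in the PDF specification, but the EXPORT_OWNERS table is only an assumption that mirrors the behaviour observed above, not a description of how Acrobat actually decides what to export.

```python
# Sketch only: attribute objects are modelled as plain dicts, not read from a PDF.
# The element and its attribute values are hypothetical.

# A structure element of type /P carrying two attribute objects,
# each identified by its /O (owner) entry.
paragraph = {
    "S": "P",
    "A": [
        {"O": "Layout", "TextAlign": "Justify", "SpaceBefore": 6},
        {"O": "XML-1.00", "class": "abstract"},
    ],
}


def attributes_for(element, owners):
    """Merge the attribute objects of `element` whose /O owner is in `owners`."""
    merged = {}
    for attr in element.get("A", []):
        if attr["O"] in owners:
            merged.update({k: v for k, v in attr.items() if k != "O"})
    return merged


# Assumed export rules: the HTML export also consults /Layout,
# while the XML export looks only at its own owner.
EXPORT_OWNERS = {
    "html": {"Layout", "HTML-4.01"},
    "xml": {"XML-1.00"},
}

for target, owners in EXPORT_OWNERS.items():
    print(target, attributes_for(paragraph, owners))
```

Under these assumptions the HTML export would see the layout attributes, while the XML export would see only the attributes owned by XML-1.00.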

I need to explore this further as I continue to support more and more LaTeX
environments for Tagged PDF.


Hope this helps


Dr Ross Moore
Mathematics Dept | Level 2, S2.638 AHH
Macquarie University, NSW 2109, Australia

T: +61 2 9850 8955  |  F: +61 2 9850 8114
M: +61 407 288 255  |  E: ross.moore at mq.edu.au

