[pdftex] Tagged PDF Support

Dominik Klein dominik.klein at outlook.com
Fri Jun 19 10:43:13 CEST 2015


Dear all,

Support for tagged PDFs is quite important in some areas, as some publishers and institutions require all published PDFs to be accessible.

It seems there is an experimental version of pdftex that can generate tagged PDFs here:
https://foundry.supelec.fr/scm/viewvc.php/branches/tagged-pdf/?diff_format=c&root=pdftex&pathrev=670

1.) Is there any documentation on the pdftex commands for generating the tagged structure tree? The most important points are probably how to mark content as text, how to mark content as artifacts, and how to attach alternative text to content (e.g. images, formulas). That is, something along the lines of:

\startgroup{Sect}
\beginmarkedcontent{H1}
This is a header
\endmarkedcontent{H1}

\beginmarkedcontent{P}
This is a paragraph
\endmarkedcontent{P}

\beginmarkedcontent{artifact}
\thispagenumber
\endmarkedcontent{artifact}

\beginmarkedcontent{figure}
\includegraphics{ ... }
\setalttext{Textual description of the graphic}
\endmarkedcontent{figure}

\beginmarkedcontent{formula}
$\sqrt{5}$
\setalttext{The square root of five.}
\endmarkedcontent{formula}

\endgroup{Sect}

Once I know these primitives, I can work out how to automatically (with TeX, Python or the like) generate a tagged PDF from a standard LaTeX document.
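
For comparison, here is a minimal plain TeX sketch of what such markup might boil down to at the PDF level, using only the stock \pdfliteral primitive. The macro names (\beginmarked, \endmarked, \beginartifact, \endartifact, \beginalt) and the MCID numbering are made up for illustration and are not the interface of the tagged-pdf branch. The sketch only writes marked-content operators into the page stream; it builds no structure tree, so the output is not yet a tagged PDF.

```tex
% Sketch only: hypothetical helper macros on top of stock pdfTeX.
% They emit PDF marked-content operators (BDC/BMC ... EMC) and nothing else;
% the structure tree (StructTreeRoot, parent tree) is NOT generated.
\pdfoutput=1
\pdfcompresslevel=0 % keep page streams uncompressed so they can be inspected

% #1 = structure type (e.g. P, H1), #2 = marked-content ID chosen by hand
\def\beginmarked#1#2{\pdfliteral page{/#1 <</MCID #2>> BDC}}
\def\endmarked{\pdfliteral page{EMC}}

% Artifacts (page numbers, running heads) carry no MCID
\def\beginartifact{\pdfliteral page{/Artifact BMC}}
\def\endartifact{\pdfliteral page{EMC}}

% Alternative text attached as a property of a Span sequence
\def\beginalt#1{\pdfliteral page{/Span <</Alt (#1)>> BDC}}

\beginmarked{H1}{0}This is a header\endmarked

\beginmarked{P}{1}This is a paragraph\endmarked

\beginartifact\folio\endartifact

\beginalt{The square root of five.}$\sqrt{5}$\endmarked
\bye
```

Running pdftex on this file and opening the uncompressed PDF in a text editor should show the BDC/BMC and EMC operators around the corresponding text. A marked sequence that breaks across pages would not be handled by this sketch, which is one reason dedicated primitives would be preferable.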

2.) Any instructions on how to compile? I tried this (the password anonsvn seems to work):
svn checkout --username anonsvn https://foundry.supelec.fr/svn/pdftex/branches/tagged-pdf
cd tagged-pdf/source
bash build-pdftex.sh

I then get a new pdftex binary. However, I do not know how to generate pdflatex.fmt; pdftex -ini does not seem to work.

3.) Suppose I can finally build the above:
- Which version does this branch stem from? In particular, can I replace the pdflatex binary from TeX Live 2014 with this build? Or rather that of some other TeX Live 201x release?
- How can dummy spaces be achieved? In TeX Live 2014 there seems to be a new command, \pdfgeninterwordspaceon. Is this command available in this branch, and will it ensure that spaces are included in the tagged portions of the content?

Apologies for this lengthy mail and the many questions, but looking at http://tex.stackexchange.com there seems to be quite some interest in accessible PDF documents generated with pdflatex, and I'd really like to move this forward.

cheers

- Dominik
 		 	   		  