[accessibility] [BlindMath] Latex Table Question
Alex Watson
alexander.watson at ucl.ac.uk
Mon May 13 14:04:25 CEST 2024
Hi Ross
If I may broaden the discussion a bit...
On 11/05/2024 01:14, Ross Moore wrote:
>
>> On 10 May 2024, at 7:30 pm, Alex Watson <alexander.watson at ucl.ac.uk>
>> wrote:
>>
>> - but it was not accepted because the maintainer (very reasonably)
>> did not want to introduce custom markup without an accepted practice
>> that would align with future tagged PDF practices etc. that Ulrike et
>> al are developing.
>>
> … since without a fully tagged PDF structure tree, there isn’t any
> other way to tell when you are in a table cell, header or otherwise,
> or even the table itself.
> It has to be done by whatever software is interpreting the LaTeX source.
>
> Although not a complete solution, most tables will be of the type
> where the 1st cell in a row is <TH>
> and the first non-compounded (using \multicolumn) cell in a column is
> similarly a <TH>.
> Using Booktabs, so that \midrule can be the boundary between <THead>
> and <TBody>, is certainly a good idea.
>
> Using these ideas, call them heuristics if you like, you can get a
> long way into producing fully tagged tables,
> whether for Tagged PDF or for HTML.
>
> I gave a talk on precisely this topic at TUG 2022:
> https://www.youtube.com/watch?v=E1oFa3DbyoE&list=PLLt9mKFAx-FaKzET1DNj-wD-g8YG3_r1m&index=17
>
>
> Links to example PDFs, and conversions to HTML, can be found at:
> http://web.science.mq.edu.au/~ross/TaggedPDF/TUG2022/
I saw your talk when it came up in 2022, and the mechanism for this is
very interesting, but I think (and I wonder if you agree) that some kind
of interface will eventually be required, no matter how clever the
heuristics.
For example, it is very hard to distinguish between tables where both
the first row and first column contain headers (as in all the tables in
your 'real world' examples) and tables where only the first row contains
headers. While there might be additional heuristics (e.g. a boldface
first column), there will always be a lot of potential for ambiguity.
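To make the ambiguity concrete, here is a minimal Python sketch of a "first row and first column are headers" heuristic. It is not Ross's implementation: the function name guess_headers, the list-of-rows representation and the sample tables are all assumptions made up for illustration. The point is only that two tables of the same shape get the same guess, even though the guess is wrong for one of them.

```python
def guess_headers(rows):
    """Return a grid of booleans: True where the heuristic guesses a header.

    Heuristic assumed for this sketch: every cell in the first row, and the
    first cell of every later row, is treated as a header cell.
    """
    return [
        [r == 0 or c == 0 for c in range(len(row))]
        for r, row in enumerate(rows)
    ]


# First row AND first column hold headers: the heuristic guesses correctly.
by_city = [
    ["City", "2022", "2023"],
    ["London", "10", "12"],
    ["Paris", "8", "9"],
]

# Only the first row holds headers: the times in column one are plain data,
# but the heuristic still marks them as headers.
readings = [
    ["Time", "Sensor", "Reading"],
    ["09:00", "A", "1.2"],
    ["09:05", "B", "1.4"],
]

# The two guesses are identical, so the source alone cannot separate the cases.
same_guess = guess_headers(by_city) == guess_headers(readings)
```

Nothing in the cell contents distinguishes the two cases here, so any fix has to come from extra information supplied by the author.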
Given that progress on tagged PDF is likely to be ongoing for several
years, it would be nice to have some kind of interface for document
authors to indicate table heading cells. If there were a stable
interface, maintainers of HTML translation packages (tex4ht, lwarp,
LaTeXML, etc.) could implement this as a core part of their system,
rather than relying on ad hoc solutions like the custom config.cfg that
Michal offered in this thread. Until this happens, most TeX-generated
HTML in the wild will simply not have accessible tables.
Have the tagged PDF team given any thought as to what this interface (or
a minimal functional subset of it) might look like, and whether it could
be made public in advance of the corresponding work on tagging?
For instance, a very minimal solution might be as I suggested: provide a
macro to explicitly tag header cells, and if it appears in a table, then
abandon the heuristics and just follow the author's explicit requests.
This would at least work for HTML production (though I don't know if it
would be sufficient for generating a valid tagged PDF table). A more
advanced interface might include explicitly selecting a heuristic, and so on.
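As a rough sketch of how a converter might treat such a macro, the Python below renders a table to HTML and switches the heuristics off as soon as any cell is explicitly marked. Everything here is hypothetical: \thcell is not an existing LaTeX command, and to_html and the list-of-rows input are invented for illustration rather than taken from tex4ht, lwarp or LaTeXML.

```python
import html
import re

# Hypothetical marker macro: \thcell{...} is NOT an existing LaTeX command.
TH_MACRO = re.compile(r"^\\thcell\{(.*)\}$")


def to_html(rows):
    """Render rows (lists of cell source strings) as an HTML table.

    If any cell uses the marker macro, only marked cells become <th>.
    Otherwise fall back to the first-row / first-column heuristic.
    """
    explicit = any(TH_MACRO.match(cell) for row in rows for cell in row)
    out = ["<table>"]
    for r, row in enumerate(rows):
        out.append("<tr>")
        for c, cell in enumerate(row):
            marked = TH_MACRO.match(cell)
            if explicit:
                is_header = marked is not None
            else:
                is_header = r == 0 or c == 0
            text = marked.group(1) if marked else cell
            tag = "th" if is_header else "td"
            out.append(f"<{tag}>{html.escape(text)}</{tag}>")
        out.append("</tr>")
    out.append("</table>")
    return "\n".join(out)


# The author marks only the first row, so the times stay as data cells.
marked_table = to_html([
    [r"\thcell{Time}", r"\thcell{Reading}"],
    ["09:00", "1.2"],
])

# No markers at all: the heuristic applies and "09:00" becomes a header.
unmarked_table = to_html([
    ["Time", "Reading"],
    ["09:00", "1.2"],
])
```

The all-or-nothing switch keeps the rule easy to explain to authors: once you mark one header cell in a table, you are responsible for marking all of them.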
Best wishes,
Alex