<div dir="ltr">Simon,<br><div><div class="gmail_extra"><br><div class="gmail_quote">On 23 February 2016 at 14:12, Simon Cozens <span dir="ltr"><<a href="mailto:simon@simon-cozens.org" target="_blank">simon@simon-cozens.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 23/02/2016 13:54, Andrew Cunningham wrote:<br>

> PDF/UA for instance leaves the question deliberately ambiguous.<br>

> ActualText is the way to make the content accessible, but developers<br>

> creating tools for PDF do not actually have to process the ActualText.<br>

<br>

</span>Yeah. (Sorry to keep banging the drum but) I've just done some tests<br>

with SILE, which includes some support for tagged/accessible PDFs. Even<br>

when the ActualText includes the correct Devanagari, I am still seeing<br>

the same problems with cut-and-paste. I'm not sure what needs to be done<br>

to get it right.<div dir="ltr"><div><div dir="ltr"><br></div></div></div></blockquote><div><br></div><div>In terms of SILE ... supporting generation of other formats like XPS as an alternative to PDF is probably the only way forward for complex script languages.<br><br></div><div>If SILE is tagging the PDFs and adding ActualText attributes, then it is doing everything it should be doing. The problems are with the PDF specification itself: what it was originally designed to be (a pre-print format based on the PostScript language) and the limitations placed on it by the developers of the spec.<br><br></div><div>Andrew <br></div></div>

</div></div></div>