Google Disk (was: XeLaTeX to Word/OpenOffice - the state of the art?)

Zdenek Wagner zdenek.wagner at gmail.com
Sun Mar 17 16:36:22 CET 2019


On Sun, 17 Mar 2019 at 14:18, Andrew Cunningham
<lang.support at gmail.com> wrote:
>
> Ross,
>
> It also depends on the fonts themselves and on the scripts the language is written in. Depending on the language and script, the only way to ensure accessibility is to include ActualText attributes for each relevant tag. Considering how complex OpenType fonts can become for some scripts, the simplistic ToUnicode mappings in a PDF can be insufficient. And text in a PDF may, by the WCAG definition, be non-textual content.
>
>
Yes, this is particularly true for Indic scripts derived from Brahmi
because the glyph order is not the same as the codepoint order. XeTeX
can already generate ActualText by setting

\XeTeXgenerateactualtext 1
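
As a minimal sketch of where this setting goes, here is a complete
XeLaTeX file. The fontspec setup and the font name "Noto Serif
Devanagari" are assumptions for illustration only; substitute any
installed OpenType font that covers the script. In the sample word the
vowel sign i is drawn to the left of its consonant although it follows
the consonant in codepoint order, which is the kind of reordering a
simple glyph-to-Unicode mapping cannot undo.

```latex
% Compile with xelatex (the primitive needs a reasonably recent XeTeX).
% Assumption: the font "Noto Serif Devanagari" is installed.
\documentclass{article}
\usepackage{fontspec}
\newfontfamily\devanagarifont[Script=Devanagari]{Noto Serif Devanagari}

% Ask XeTeX to wrap each word in /ActualText carrying the original
% Unicode string, so that text extraction does not rely on glyph order.
\XeTeXgenerateactualtext=1

\begin{document}
{\devanagarifont हिन्दी}
\end{document}
```

Copying the word out of the resulting PDF with and without the setting
is a quick way to see whether it makes a difference for a given font
and viewer.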


Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz
>
> On Sunday, 17 March 2019, Ross Moore <ross.moore at mq.edu.au> wrote:
>>
>> Hi Karljürgen,
>>
>> On 17/03/2019, at 1:42, "Karljürgen Feuerherm" <kfeuerherm at kfeuerherm.ca> wrote:
>>
>> > Ross,
>> >
>> > Your reply caught my eye, and I am now looking at the pdfx package documentation.
>> >
>> > May I ask, if accessibility is a concern, why a-2b/-2u rather than -ua-1, which seems directly targeted at this?
>>
>> PDF/UA and PDF/A-1a, -2a and -3a require a fully tagged PDF.
>> Producing one is a highly non-trivial task, which requires adding a great deal of extra material to the document, done almost entirely through \special commands. The pdfx package does not provide this, but is useful for meeting the Metadata and other requirements of these formats.
>>
>> Abstractly, accessibility is about having sufficient information stored in the PDF for software tools to be able to build and present a description of the content and structure, other than the visual one. The same can be said of software for converting into a different format.
>>
>> A significant part of this is being able to correctly identify each character in the fonts used within the TeX-produced PDF. Even this is a non-trivial problem, due to TeX's non-standard font encodings and its virtual font technique.
>>
>> >
>> > Many thanks,
>> >
>> > K
>> >
>> >> You should use the pdfx package and prepare for PDF/A-2b or -2u.
>> >> This fixes many of the things that affect conversions, as well as Accessibility and Archivability.
>> >>
>> >> It's not fully tagged PDF, but handles many other technical issues.
>> >>
>>
>>
>> Hope this helps.
>>
>> Ross
>>
>
>
> --
> Andrew Cunningham
> lang.support at gmail.com
>
>
>


