[luatex] CIDSet in PDF/A documents

luigi scarso luigi.scarso at gmail.com
Wed Jun 24 08:57:35 CEST 2015


On Wed, Jun 24, 2015 at 12:46 AM, Reinhard Kotucha <reinhard.kotucha at gmx.de>
wrote:

> On 2015-06-23 at 06:07:13 +0200, luigi scarso wrote:
>
>  > > Hi,
>  > > at
>  > >
>  > >   http://tracker.luatex.org/view.php?id=434
>  > >
>  > > Hans wrote
>  > >
>  > >  > pdf/a demands a cidset but we will forget about this till we find a
>  > >  > proper example
>  > >  >
>  > >  > there is still some reported problem with some stream objects not
>  > >  > properly being formatted cf. pdf/a but it is not clear what is
>  > >  > going on there
>  > >
>  > > This message is now five years old.  Any new perceptions?
>  > >
>  > > I'm using a TrueType font (CharisSIL) and one of five PDF/A validators
>  > > complains about a bad CIDSet.
>  > >
>  > > I created a small PDF file which only contains the string "abc",
>  > > extracted the TTF from the PDF file, and disassembled it with TTX.
>  > > Then I assembled the CIDSet manually according to the instructions
>  > > given in the PDF/A-1 specification (ISO 19005-1).  I've got the same
>  > > result as LuaTeX, hence it's unclear to me what's going wrong.
>  > >
>  > > The PDFtron
>  > >
>  > >   https://www.pdftron.com/
>  > >
>  > > validator says
>  > >
>  > >   <Error Code="e_PDFA356" Message="CIDSet in subset font is incomplete"
>  > >   Refs="95, 101"/>
>  > >
>  > > I tend to believe that the validator is wrong.  On the other hand
>  > > PDFtron offers software which creates PDF/A files and I can't imagine
>  > > that their validator complains about their own products.
>  > >
>  > > Did anybody investigate?  The nasty thing is that PDF/A is for long
>  > > term preservation and any file we create today has to comply with the
>  > > standard unconditionally.  And for us TeX users, the fact that there
>  > > are zillions of invalid PDF/A files around just because old versions
>  > > of the Acrobat preflight tool ignored most errors, is not an excuse.
>  > > We should do better.
>  > >
>  > > From the results of my own investigations I deduce that LuaTeX
>  > > provides a standard compliant CIDSet.  Maybe different people
>  > > interpret the standard in a different way.  But it would be nice to
>  > > know whether somebody else investigated this issue already.
>  > >
>  > > Regards,
>  > >   Reinhard
>  > >
>  > >
>  > which pdf/a ?
>
> PDF/A-1b
>
>  > https://pdfbox.apache.org/
>  >
>  > says
>  > """
>  > Preflight
>  > Validate PDF files against the PDF/A-1b standard.
>  > """
>
> PDFbox was one of the validators I used.  It didn't complain.
>
> There is probably a newer release.  Somebody said on a mailing list
> that he submitted a patch which adds object numbers to error
> messages.  Sounds very reasonable and useful.
>
> BTW, I validated both validators before.
>
>   http://ms25.no-ip.info/pdfa/validate-pdftron.html
>   http://ms25.no-ip.info/pdfa/validate-pdfbox.html
>
> I also used the online validators
>
>   http://www.validatepdfa.com/online.htm
>
> and
>
>   http://www.pdf-tools.com/pdf/validate-pdfa-online.aspx
>
> Finally I asked Ross Moore to check my file with Acrobat Pro v.11.
>
> Only PDFtron complained about an incomplete CIDSet.
>

I think we can take Acrobat Pro v.11 as the reference, so PDFtron
could be wrong. You can contact the PDFtron team and show them the
problem, together with the report from Acrobat Pro v.11. It would be
nice to hear their answer.


Validating a PDF/A-1b file is quite a complicated task, so in real life
you want to validate with the "best" validator and cross your fingers.
Acrobat Pro is one of the best validators around, so if it says that a
file is OK, then you have strong reasons to say that it is OK.
In my opinion, PDF/A is a great thing, but the lack of free and solid
PDF/A-1a validators still limits its adoption.
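
For anyone who wants to repeat the manual check described in the quoted
message, here is a minimal sketch of how a CIDSet bit table can be
assembled. It assumes the layout given in the PDF Reference: one bit per
CID, high-order bit first, so the most significant bit of the first byte
stands for CID 0. The function names and the sample CIDs are
hypothetical and are not taken from LuaTeX or from any validator. Which
CIDs a validator expects to see listed is exactly the open question
here, so the sketch only shows the encoding, not a compliance check.

```python
def build_cidset(cids):
    """Return a CIDSet bit table (bytes) for an iterable of CIDs.

    Assumed layout: bit for CID n lives in byte n // 8, counted from
    the high-order bit (0x80 is CID 0, 0x40 is CID 1, and so on).
    """
    table = bytearray()
    for cid in set(cids):
        if cid < 0:
            raise ValueError("CIDs must be non-negative")
        byte_index, bit_index = divmod(cid, 8)
        # Grow the table with zero bytes until the needed byte exists.
        if byte_index >= len(table):
            table.extend(bytes(byte_index + 1 - len(table)))
        table[byte_index] |= 0x80 >> bit_index
    return bytes(table)


def cids_in(cidset):
    """Decode a CIDSet bit table back into a sorted list of CIDs."""
    return [
        index * 8 + bit
        for index, byte in enumerate(cidset)
        for bit in range(8)
        if byte & (0x80 >> bit)
    ]


if __name__ == "__main__":
    # Hypothetical subset: CIDs 0, 1, 2 and 9.
    print(build_cidset([0, 1, 2, 9]).hex())  # prints e040
```

Decoding the CIDSet stream of a real file with cids_in and comparing
the result with the glyphs actually present in the embedded subset is
one way to see what a complaining validator might be objecting to.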

-- 
luigi