[texworks] Backward compatibility of the Qt bindings in poppler

Albert Astals Cid aacid at kde.org
Sat Jun 12 00:20:20 CEST 2021


El divendres, 11 de juny de 2021, a les 9:09:23 (CEST), Oliver Sander va escriure:
> Hi Stefan,
> 
> thanks for the feedback!
> 
> > std::unique_ptr<Poppler::Page> page;
> > {
> >      page = Poppler::Document::load("1.pdf")->page(0);
> > }
> 
> I see the problem.  On the other, this code snippet was buggy with the
> old interface as well.  Previously, the problem was a leaking Document
> object; now it is a use-after-free which is likely to lead to crashes.
> 
> Regarding your suggested solutions: My understanding of the poppler design
> is too limited for an opinion here.  That is something that Albert has to
> decide.  I would like to mention option 3), however, which is:
> "Keep everything as it is, but document the behavior better".

yeah, the initial code is wrong, you're not supposed to do that, previously it leaked, now it crashes, that is an actual improvement since you can see that the code is wrong in a much more effective way :)

> 
> As for the Q_DISABLE_COPY questions, that's for Albert, too.

The Q_DISABLE_COPY is there to signal that you can't copy a Page object, if you did it would not do "what you wanted to do", you would complain that you called Page::addAnnotation in one Page, and the annotation did not show up in its copied one.

Sure you can call Poppler::Document::page multiple times and get the same page, but that is again, you doing stuff wrong, with C++ there's only so much things a library developer can do from preventing people to shoot themselves on the feet.

I have not a lot of interest in making the "users of poppler are doing things wrong let's protect them from themselves" use case work. It complicates code a lot. But yes, gitlab welcomes documentation patches too :)

Cheers,
  Albert

> 
> Viele Grüße,
> Oliver
> 
> > 
> > The resulting code compiles, but is prone to segfault as the Poppler::Document goes out of scope and is destroyed at the } while the Poppler::Page lives on. As a result, some calls such as page->pageSizeF() still succeed, while others crash, such as page->label() or page->renderToImage(). The root cause, of course, is that the page seems to keep a pointer to the document to which it belongs, but this document is destroyed prematurely without the page noticing. Presumably, this problem isn't new, i.e. delete'ing a Poppler::Document* before all of its Poppler::Page's in previous versions had the same effect. But now, with std::unqiue_ptr, it's easier than ever that this could happen "accidentally" (BTW: it also doesn't seem to be documented which methods require Poppler::Document-access and which ones don't).
> > 
> > So far, I basically had two ideas how this could potentially be resolved:
> > 1) Make Poppler::Page* be owned by Poppler::Document instead of the calling code (i.e., make Poppler::Document::page return raw pointers again instead of std::unique_ptr and "cache" the unique_ptr somewhere in Poppler::Document). The reasoning behind this would be that pages "belong" to a document, and they should be destroyed when the document is destructed.
> > 2) Make the underlying data needed for all methods of Poppler::Document 
> and Poppler::Page to work a std::shared_ptr (or similar). I.e., on the surface, decouple Document and Pages (they can be created and destroyed more or less independently - there are no method calls between them or pointers from one to the other), but have a "hidden state" that lives on until 
> the very last potential user (Document or Page) is destroyed.
> > 
> > To me, 1) seems to make slightly more sense philosophically, as pages belong to their document and I don't see a good (fundamental) reason ATM for allowing pages to "outlive" their document. At the same time, 1) seems 
> a lot more cumbersome; it drastically changes ownership rules; it would require page caching in Poppler::Document.
> > 
> > On a somewhat related note, I noticed that Poppler::Page is marked as Q_DISABLE_COPY(Page), which philosophically sort of makes sense to me, but 
> seems to be trivially circumvented by calling Poppler::Document::page() multiple times as this always constructs a "new Page()" and returns a std::unique_ptr. So it would seem to me that the object can be copied quite easily (on the surface, no extensive deep cloning of meta-data seems required), it's just disabled.
> > 
> > Please forgive my rambling, philosophical lifetime issues are not really my strong suite. But hopefully, you can give some comments and insights 
> into these matters, that maybe also help me figure out how to best approach this in TeXworks' multi-threaded wrapper code.
> > 
> > Best,
> > Stefan
> > 
> > PS: I haven't looked into the other instances of unique_ptr, such as Links, Annotations, Fonts, ... yet.
> 
> 







More information about the texworks mailing list.