Does anyone here actually work with Zotero or some other entity that is working on a similar problem?

Mike Marchywka marchywka at hotmail.com
Sun Jun 20 13:56:02 CEST 2021


For anyone interested, to complete this investigation and to motivate further
research into BibTeX aids for authors, I took out some of the hacks in my code so that it produces the possible
BibTeX entries associated with the link

https://www.nature.com/articles/s41429-021-00430-5.pdf

for which the Zotero web form still returns a failure message.

My code "TooBib" run with the link on the clipboard,

 echo -e "clip xxxx all" | ../toobib.out -legacy   2> aaaa

returned 18 BibTeX entries from different strategies (the "all" modifier to the "clip" command
makes it return everything). Since I label all my hits with information for later analysis of the
method (and even re-fetch if better info exists), I can look at the search strategies.
In this case, the original guess to use the nature domain method worked,
as did the generic mutate strategy of trying a related URL (which is not surprising,
since the nature handler just does that lol). These results are redundant, but that
is useful for development (guessnature is a dedicated handler for a URL with the nature domain;
the other handlers are generic and use the things implied by their names, such as metadata, DOI,
or ld+json entries):

 cat xxxx | grep mjmhandler
% mjmhandler: toobib guessnature<-handledoi
% mjmhandler: toobib guessnature<-handledoixml
% mjmhandler: toobib guessnature<-handlegsmeta(html)
% mjmhandler: toobib guessnature<-handlegsmeta(scraper)
% mjmhandler: toobib guessnature<-handleldjson2
% mjmhandler: toobib guessnature<-handleadhochtml<-citation
% mjmhandler: toobib guessnature<-handleadhochtml<-DC
% mjmhandler: toobib guessnature<-handleadhochtml<-og
% mjmhandler: toobib handledoi
% mjmhandler: toobib handlepdf (pdftotext)
% mjmhandler: toobib handlemutate<-handledoi
% mjmhandler: toobib handlemutate<-handledoixml
% mjmhandler: toobib handlemutate<-handlegsmeta(html)
% mjmhandler: toobib handlemutate<-handlegsmeta(scraper)
% mjmhandler: toobib handlemutate<-handleldjson2
% mjmhandler: toobib handlemutate<-handleadhochtml<-citation
% mjmhandler: toobib handlemutate<-handleadhochtml<-DC
% mjmhandler: toobib handlemutate<-handleadhochtml<-og
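
To compare strategies across a run, the label lines above can be grouped by their first element. This is a minimal Python sketch and not part of TooBib; it assumes that each "<-" separates an outer strategy from the handler it invoked, and the function name group_handlers is hypothetical.

```python
from collections import defaultdict

PREFIX = "% mjmhandler: toobib "

def group_handlers(lines):
    """Group '% mjmhandler:' label lines by their outermost strategy.

    Assumption: in 'a<-b<-c', 'a' is the top-level strategy and the
    remaining parts are the handlers it delegated to.
    """
    groups = defaultdict(list)
    for line in lines:
        if not line.startswith(PREFIX):
            continue  # ignore BibTeX fields and other comment lines
        chain = line[len(PREFIX):].strip().split("<-")
        groups[chain[0]].append("<-".join(chain[1:]) or "(direct)")
    return dict(groups)

# A few label lines copied from the listing above
sample = [
    "% mjmhandler: toobib guessnature<-handledoi",
    "% mjmhandler: toobib guessnature<-handleldjson2",
    "% mjmhandler: toobib handledoi",
    "% mjmhandler: toobib handlemutate<-handleadhochtml<-citation",
]

for strategy, inner in group_handlers(sample).items():
    print(strategy, inner)
```

Grouping this way makes the redundancy visible: every inner handler listed under guessnature should also appear under handlemutate.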


The "citation" entry is just fine but the ldjson2 continues to
provide additional useful information. 

% mjmhandler: toobib handlemutate<-handleadhochtml<-citation
% date 2021-06-20:07:41:35 Sun Jun 20 07:41:35 EDT 2021
% srcurl: https://www.nature.com/articles/s41429-021-00430-5 https://www.nature.com/articles/s41429-021-00430-5.pdf
% citeurl: https://www.nature.com/articles/s41429-021-00430-5
@article{mechanismsactionAsiyaKamberZaidi,
article_type = {Review Article},
author = {Asiya Kamber Zaidi and Puya Dehgani-Mobaraki},
author_institution = {Member, Association Naso Sano Onlus, Umbria Regional Registry of volunteer activities, Corciano, Italy and Mahatma Gandhi Memorial Medical College, Indore, India and President, Association Naso Sano Onlus, Umbria Regional Registry of volunteer activities, Corciano, Italy},
doi = {10.1038/s41429-021-00430-5},
firstpage = {1},
fulltext_html_url = {https://www.nature.com/articles/s41429-021-00430-5},
fulltext_world_readable = {},
issn = {1881-1469},
journal = {The Journal of Antibiotics},
journal_abbrev = {J Antibiot},
journal_title = {The Journal of Antibiotics},
language = {en},
lastpage = {13},
online_date = {2021/06/15},
pdf_url = {https://www.nature.com/articles/s41429-021-00430-5.pdf},
publisher = {Nature Publishing Group},
reference = {available deleted for space},
title = {The mechanisms of action of Ivermectin against SARS-CoV-2: An evidence-based clinical review article},
url={https://www.nature.com/articles/s41429-021-00430-5.pdf},
srcurl={https://www.nature.com/articles/s41429-021-00430-5.pdf},
xsrcurl={https://www.nature.com/articles/s41429-021-00430-5},
citeurl={https://www.nature.com/articles/s41429-021-00430-5}

}
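
For readers unfamiliar with this kind of entry, here is a rough Python sketch of how fields like these can be collected from a page's meta tags. It is not TooBib's code. It assumes the "citation" handler reads meta tags whose names start with "citation_" (the convention used for Google Scholar indexing); the sample HTML, the class name CitationMeta, and the function to_bibtex are all made up for illustration.

```python
from html.parser import HTMLParser

class CitationMeta(HTMLParser):
    """Collect <meta name="citation_*" content="..."> values."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        content = attr.get("content")
        if name.startswith("citation_") and content is not None:
            key = name[len("citation_"):]
            # Repeated tags (e.g. several authors) accumulate in order
            self.fields.setdefault(key, []).append(content)

def to_bibtex(key, fields):
    """Render collected fields as one BibTeX entry, joining repeats with 'and'."""
    lines = [f"@article{{{key},"]
    for name in sorted(fields):
        lines.append(f"{name} = {{{' and '.join(fields[name])}}},")
    lines.append("}")
    return "\n".join(lines)

# Invented sample markup, not copied from any publisher page
html = """
<meta name="citation_title" content="Example title">
<meta name="citation_author" content="First Author">
<meta name="citation_author" content="Second Author">
<meta name="citation_doi" content="10.0000/example">
"""

parser = CitationMeta()
parser.feed(html)
entry = to_bibtex("example", parser.fields)
print(entry)
```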






________________________________________
From: texhax <texhax-bounces+marchywka=hotmail.com at tug.org> on behalf of Mike Marchywka <marchywka at hotmail.com>
Sent: Thursday, June 17, 2021 9:06 PM
To: texhax at tug.org
Subject: Does anyone here actually work with Zotero or some other entity that working on similar problem?

This is probably getting a little off topic for this list so if anyone can suggest another
one that would be great. But I'm still left wondering how people manage to
create a bibliography now lol.

Nature is now failing on Zotero,

https://www.nature.com/articles/s41429-021-00430-5.pdf

and, as I mentioned before, they used to return RIS but appear to have abandoned that.
They appear, however, to have converted to the other more common approaches.
When I run my code and request all possibilities, I get 26, of which a few
are really valid (I'm hacking up my code so this is expected right now).
It appears they too have adopted ld+json. I have not seen any indication that Zotero
makes use of this when it exists.
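
As an illustration of what using ld+json involves, the following Python sketch pulls every script block of type application/ld+json out of a page and parses it. It uses only the standard library; the sample HTML is invented and the class name LdJsonExtractor is hypothetical. Real publisher pages may nest the article data differently.

```python
import json
from html.parser import HTMLParser

class LdJsonExtractor(HTMLParser):
    """Collect parsed JSON from <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_ldjson = False
        self._buf = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_ldjson = True
            self._buf = []

    def handle_data(self, data):
        if self._in_ldjson:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_ldjson:
            self._in_ldjson = False
            try:
                self.blocks.append(json.loads("".join(self._buf)))
            except json.JSONDecodeError:
                pass  # dirty markup: skip blocks that are not valid JSON

# Invented sample page, not copied from any publisher
html = """<html><head>
<script type="application/ld+json">
{"@type": "ScholarlyArticle", "headline": "Example title",
 "author": [{"@type": "Person", "name": "A. Author"}]}
</script>
</head><body></body></html>"""

extractor = LdJsonExtractor()
extractor.feed(html)
print(extractor.blocks[0]["headline"])
```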

So, I guess I would just comment that Zotero does not seem immune to the problems
I was worried about, and a development strategy to deal with publisher
changes would be helpful. In my case, if the code were not hacked up, it
would have realized the nature domain handler was failing and then tried other
generic approaches, ultimately returning something useful along with diagnostic
information.
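
The fallback behaviour described above can be sketched roughly as follows, in Python. This is not TooBib's implementation: the handler functions are placeholders that do no network access, and all names (guess_nature, handle_ldjson, handle_meta, fetch_entry) are hypothetical.

```python
from urllib.parse import urlparse

def guess_nature(url):
    # Stand-in for a domain-specific handler that has stopped working
    raise RuntimeError("publisher changed its export format")

def handle_ldjson(url):
    # Stand-in for a generic handler; a real one would fetch and parse the page
    return {"title": "placeholder", "url": url}

def handle_meta(url):
    # Second generic stand-in, only reached if the earlier handlers fail
    return {"title": "placeholder", "url": url}

DOMAIN_HANDLERS = {"www.nature.com": guess_nature}
GENERIC_HANDLERS = [handle_ldjson, handle_meta]

def fetch_entry(url):
    """Try the domain handler first, then generic ones; record what happened."""
    handlers = []
    specific = DOMAIN_HANDLERS.get(urlparse(url).netloc)
    if specific is not None:
        handlers.append(specific)
    handlers.extend(GENERIC_HANDLERS)

    diagnostics = []
    for handler in handlers:
        try:
            entry = handler(url)
        except Exception as exc:
            diagnostics.append(f"{handler.__name__}: failed ({exc})")
            continue
        diagnostics.append(f"{handler.__name__}: ok")
        return entry, diagnostics
    return None, diagnostics

entry, diagnostics = fetch_entry(
    "https://www.nature.com/articles/s41429-021-00430-5.pdf"
)
print(entry)
print(diagnostics)
```

Keeping the diagnostics alongside the result means a broken publisher-specific handler shows up in the output instead of silently producing nothing.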

The "hacking" right now is due to a more accurate hierarchical parsing system.
The HTML is so dirty that I have found at least one case where sed and grep worked
but a real parser failed; in most cases, though, the real parser will be more robust,
and some pieces work well with JSON and HTML trees.

It may be, however, that ld+json is gaining fairly uniform acceptance and that
it may be the solution for a long time.

Thanks.







