METS derived bibtex from UFL pdf PhD Thesis

Mike Marchywka marchywka at
Tue Mar 22 11:30:51 CET 2022

Here is a good example of a pdf which is difficult to cite
manually or with the Zotero web form,

There is no obvious way AFAICT to find a link or identifier.
In this case, UFL has a link that can be derived from the

For many sites there are obvious transforms to try, but this one
is a quirky, domain-specific one that could change on a whim.
Support for something like this as a one-off is kind of debatable
unless you can get a lot of them, find a pattern, and have
a large enough usage to justify maintenance.

However, once you get to that page there are a lot of identifiers
but AFAICT no DOI, just METS/MARC descriptors. Generally
when encountering a non-bibtex source I try to find
a way to capture everything that is there but Zotero
seems to just find things it "knows." Often there is
a pattern and adding sensible bibtex fields is not
hard. However, in this case for the METS file I tried
to just pick out a few entries and came up with
the below.
Besides the info itself, it tries to document the strategy
and any modifications it made. 
This took several days to create but if I find more
METS files or quirky URL transforms they should be 
easy to add. Sure, there is probably code somewhere and
plenty of translation junk for XML but it probably
would have taken a while to find :)   

 % mjmhandler: toobib guessufl<-handlexmlformats
% date 2022-03-22:06:07:18 Tue Mar 22 06:07:18 EDT 2022
% srcurl:
% citeurl:
X_TooBib = {date: 1954},
X_TooBib = {year: 1954,  infield_fix_dates },
X_TooBib = {publisher: ReWriteParse be.get(s)= be.get(dest)=},
X_TooBib = {journal: ReWriteParse be.get(s)= be.get(dest)=},
X_TooBib = {urldate: FixBeKvp s= cmd=date "+%Y-%m-%d" d=2022-03-22 dn=urldate},
X_TooBib = {author: Hertz , Joel John},
aleph = {024250310},
author = {Hertz , Joel John},
author_orig = {Hertz, Joel John},
committee = {Becker, Charles H. and Husa, William J. and Johnson, Carl H. and Stearns, Thomas W. and Lauter, Werner M.},
copyright = {Copyright Joel John Hertz. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.},
date = {1954},
date_orig = {1954},
discipline = {Pharmacy},
institution = {University of Florida},
language = {English},
level = {Doctorate},
notis = {AEK7303},
oclc = {25743313},
school = {University of Florida},
sobekcm = {AA00004956_00001},
title = {stability and solubility study of riboflavin and some derivatives},
type = {thesis},
updated_date = {2020-08-13T12:53:58Z},
urldate = {2022-03-22},
year = {1954},
final_assembly ={ TooBib handler handlexmlformats ( mets )},
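
For anyone who wants to try the same idea, here is a minimal Python sketch
of picking a few entries out of a METS file and emitting bibtex-style fields.
This is not the TooBib code. It assumes the METS file wraps a MODS record
(the two namespace URIs are the standard ones), the sample XML is a made-up,
heavily trimmed fragment filled with values from the entry above rather than
the actual UFL file, and the names mets_to_fields and to_bibtex are
hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard METS and MODS namespace URIs.
NS = {
    "mets": "http://www.loc.gov/METS/",
    "mods": "http://www.loc.gov/mods/v3",
}

# Hypothetical, trimmed METS fragment for illustration only;
# a real file has many more sections and descriptors.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:mods="http://www.loc.gov/mods/v3">
  <mets:dmdSec ID="DMD1">
    <mets:mdWrap MDTYPE="MODS">
      <mets:xmlData>
        <mods:mods>
          <mods:titleInfo>
            <mods:title>stability and solubility study of riboflavin and some derivatives</mods:title>
          </mods:titleInfo>
          <mods:name type="personal">
            <mods:namePart>Hertz, Joel John</mods:namePart>
          </mods:name>
          <mods:originInfo>
            <mods:dateIssued>1954</mods:dateIssued>
          </mods:originInfo>
          <mods:identifier type="OCLC">25743313</mods:identifier>
        </mods:mods>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
</mets:mets>
"""


def mets_to_fields(xml_text):
    """Pick a few MODS entries out of a METS document (assumed layout)."""
    root = ET.fromstring(xml_text)
    mods = root.find(".//mods:mods", NS)
    if mods is None:
        return {}
    fields = {}
    title = mods.findtext("mods:titleInfo/mods:title", default="", namespaces=NS)
    if title.strip():
        fields["title"] = title.strip()
    # Join multiple names the way bibtex expects: "A and B".
    names = [n.text.strip() for n in mods.findall("mods:name/mods:namePart", NS)
             if n.text and n.text.strip()]
    if names:
        fields["author"] = " and ".join(names)
    date = mods.findtext("mods:originInfo/mods:dateIssued", default="", namespaces=NS).strip()
    if date:
        fields["date"] = date
        if date[:4].isdigit():
            fields["year"] = date[:4]
    # Keep typed identifiers under their own lowercased field names.
    for ident in mods.findall("mods:identifier", NS):
        kind = ident.get("type")
        if kind and ident.text and ident.text.strip():
            fields[kind.lower()] = ident.text.strip()
    return fields


def to_bibtex(key, fields, entry_type="phdthesis"):
    """Format a dict of fields as one bibtex entry, fields sorted by name."""
    lines = ["@%s{%s," % (entry_type, key)]
    for name in sorted(fields):
        lines.append("%s = {%s}," % (name, fields[name]))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_bibtex("hertz1954", mets_to_fields(SAMPLE)))
```

Unknown identifier types are kept under their own names rather than dropped,
in the spirit of capturing everything that is there; a real METS file would
need more paths and some care with repeated or nested elements.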



mike marchywka
306 charles cox
canton GA 30115
USA, Earth 
marchywka at
ORCID: 0000-0001-9237-455X

More information about the texhax mailing list.