www type bibtex entries - generating bibtex for webpages + prior theme.

Peter Flynn peter at silmaril.ie
Sun Sep 15 23:49:47 CEST 2019


On 15/09/2019 13:56, Mike Marchywka wrote:
[...]
>> I'm not clear what "explicit" XML is (as opposed to what?)
> 
> Anything that is XML but called something different, mostly things
> ending in ML :)

Ah. OK, thanks. They're actually all XML; XML is the name of the 
standard, not an actual vocabulary of tags like HTML or TEI. But yes, 
they do tend to end in ML.

> Bad editing, I meant that if you hit that link and then look at the content,
> there is another link featured on the right of the page that looks like this,
> 
> Source document
> E590001-007.xml

My misunderstanding, sorry. And you're right, it's badly phrased. That 
is the source document from which the web page is made (in real time). 
It never occurred to me that it also implies that it is the only source 
of the document. I need to edit that and make it clearer. Thanks for 
the prompt.

> Thanks, that is exactly what I needed for this site but I'm not sure
> how you could have easily found that- they have one button "share" things
> on the other page but you have to dig up the citation.

The target community for these pages is the Early Irish scholar, and 
they know that certain things are in certain places (CELT has been going 
since 1991) so I think we have been lazy in not getting the pages up to 
date.

>> I'm not clear what "flow" means in this context. 
>
> I had to pick a word for the style- if you try to read it you can't just sit
> down and read it you have all the "XML junk" to read around. 

That's called "markup" and it's what makes TEI useful. XML was NEVER 
designed to be readable like an unmarked text.

> See below but the latex-like syntax does not imply specific
> presentation of the info it just is better visually organized even
> before typesetting into a specific rendition.

Exactly. You'll see that the XML text has no mention of any rendition at 
all. LaTeX isn't better organized per se, it just uses lighter markup. 
This is why humanities texts are stored in XML, and if you need a PDF or 
a web page, you use XSLT to create a LaTeX or HTML file — or whatever 
the format du jour is (currently Markdown).

> I understand all of that and mostly just object on the "human readability."

Horses for courses, I think. Personally, I find the XML perfectly 
readable, but then I've been dealing with it for many decades. I also 
find LaTeX perfectly readable, which many users don't.

> [...] AFAICT provide similar capabilities with varying human
> readability. 

If there were any demand for a LaTeX version, we would provide one. So 
far, no-one has asked.

> There is no reason that a latex-like document needs to have any
> formatting stuff- all those commands can be logical rather than "what
> it looks like" and you can choose rendering algorithms when
> displaying.

Mostly, yes. It's pretty trivial to write XSLT to convert the XML to

\documentclass{book}
\usepackage{celt}
\begin{document}
\title{A Briefe description of Ireland: made in this yeere.
	  1589. By Robert Payne}
\author{Robert Payne}
\maketitle
\begin{quotation}
Let not the reportes of those that haue spent all
their owne and what they could by any meanes get
from others in England, discourage you from
Irela\textit{n}d, although they and such others by
bad dealinges haue wrought a generall discredite to
all English men, in that countrie which are to the
Irishe vnknowen.\par
\end{quotation}
\end{document}

(although harder if you want all the preamble). But even LaTeX has a lot 
of excise, and many users object just as much to backslashes and curly 
braces as they do to pointy brackets.
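
For illustration only, here is a rough sketch of that kind of conversion,
written in Python with the standard library's ElementTree rather than in
XSLT. The sample fragment is hypothetical and heavily simplified (a real
TEI file has a full header and many more element types), and the mapping
of <ex> to \textit is an assumption based on the LaTeX excerpt above. The
celt package is left out so that the output needs nothing beyond the
book class.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified TEI-like fragment (not a real CELT file).
SAMPLE = """<text>
  <titlePart>Example title</titlePart>
  <docAuthor>Robert Payne</docAuthor>
  <p>discourage you from Irela<ex>n</ex>d, although they</p>
</text>"""

# Characters that LaTeX treats specially, with their escaped forms.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}", "&": r"\&", "%": r"\%", "$": r"\$",
    "#": r"\#", "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def escape(text):
    """Escape LaTeX special characters; None becomes an empty string."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in (text or ""))


def inline(elem):
    """Render mixed content: the element's text, its children, their tails."""
    out = [escape(elem.text)]
    for child in elem:
        if child.tag == "ex":
            # Assumption: editorial expansions are set in italics.
            out.append(r"\textit{" + inline(child) + "}")
        else:
            # Unknown elements: keep their text, drop the markup.
            out.append(inline(child))
        out.append(escape(child.tail))
    return "".join(out)


def to_latex(xml_source):
    """Turn the simplified fragment into a small LaTeX document."""
    root = ET.fromstring(xml_source)
    lines = [r"\documentclass{book}", r"\begin{document}"]
    title = root.find("titlePart")
    author = root.find("docAuthor")
    if title is not None:
        lines.append(r"\title{" + inline(title) + "}")
    if author is not None:
        lines.append(r"\author{" + inline(author) + "}")
    lines.append(r"\maketitle")
    for p in root.iter("p"):
        lines.append(inline(p) + r"\par")
    lines.append(r"\end{document}")
    return "\n".join(lines)


print(to_latex(SAMPLE))
```

The markup carries no rendition, so the choice of \textit lives entirely
in the converter and could be swapped for any other command.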

>> Use biblatex for formatting, not BiBTeX, because the older formats tend not
> ok, I have to see what is involved as I migrated recently not sure I looked
> at bib details.

Depends if you want to use one of the many standard citation/reference 
formats, or roll your own.

> I guess this is kind of open yet. Although in the second link it is 
> funny they mention "plain" style my earlier latex was so old I wrote
> a plainurl bst  that included a url lol.

We had to do this before biblatex.
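
With biblatex, a web page can be recorded as an @online entry with url
and urldate fields, so no custom .bst is needed. As a minimal sketch of
generating such an entry (the function name, citation key, title, and
URL below are all placeholders, not taken from any real page), in
Python:

```python
from datetime import date


def online_entry(key, title, url, accessed, author=None, year=None):
    """Format a biblatex @online entry; author and year are optional."""
    fields = []
    if author:
        fields.append(("author", author))
    fields.append(("title", title))
    if year:
        fields.append(("year", str(year)))
    fields.append(("url", url))
    # biblatex expects urldate in ISO form (YYYY-MM-DD).
    fields.append(("urldate", accessed.isoformat()))
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@online{{{key},\n{body}\n}}"


# Placeholder values for illustration only.
print(online_entry(
    key="examplepage",
    title="Example page title",
    url="https://example.org/page.html",
    accessed=date(2019, 9, 15),
))
```

The sketch does no escaping of the title, so characters such as & or %
would still need attention before the entry is usable. Leaving out
author simply omits that field.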

> Well, the bibtex you found looks nice but it also seems like a
> research task just to find it. I guess if it was on the same page as
> the share features( some journals have a cite button near the
> shares) that would be easier but at least it exists.

I also wrote a bash script to do this once. It tries the whois database 
for the site owner, but that's not much use now that owners have gone 
all shy and are hiding their identities behind their registrar. But if you 
screenscrape what Google returns from a search for the company name 
(insert spaces and add "Inc") it's not hard to get the location.
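
As an illustration of the whois step only (this is not the bash script
mentioned above), a minimal Python sketch that pulls the registrant
organisation out of whois output and gives up when the value looks
redacted. The "Registrant Organization:" label and the list of redaction
hints are assumptions; field names vary between registries. Fetching the
text, for example by running the whois command, is left out, and the
sample records are invented.

```python
import re

# Assumed markers of a privacy-shielded registration.
REDACTION_HINTS = ("redacted", "privacy", "proxy", "whoisguard")

# Assumed field label; accepts both "Organization" and "Organisation".
ORG_PATTERN = re.compile(
    r"^[ \t]*Registrant Organi[sz]ation:[ \t]*(\S.*?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def registrant_org(whois_text):
    """Return the registrant organisation, or None if absent or hidden."""
    match = ORG_PATTERN.search(whois_text)
    if not match:
        return None
    org = match.group(1).strip()
    if any(hint in org.lower() for hint in REDACTION_HINTS):
        return None
    return org


# Invented sample records for illustration.
VISIBLE = """Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Registrant Organization: Example Widgets Inc
Registrant Country: IE
"""

HIDDEN = """Domain Name: EXAMPLE.ORG
Registrant Organization: REDACTED FOR PRIVACY
"""

print(registrant_org(VISIBLE))
print(registrant_org(HIDDEN))
```

When this returns None, some other source for the owner is needed, which
is where the search-based fallback described above comes in.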

> I guess it is kind of an almost irrelevant point but I was curious 
> about authors- both intended content and where to scrape. Probably
> any reader who wanted to look would just hit the link and not care
> how it was written. Ultimately the point of the bibliography is
> documentation and aid to reader.

Right, and if you have a specialist URI reference format, you might or 
might not want to omit the author; that's your choice.

> Yeah I get that feeling too I guess links or shares are most of the
> publicity.

I think that's about it.

Peter

