[tug-summer-of-code] Project ideas

Kittipat Virochsiri kittipat at thisbluepla.net
Sun Mar 8 17:16:02 CET 2009


On Sat, Mar 7, 2009 at 10:29 PM, Peter Flynn <peter at silmaril.ie> wrote:

> Karl Berry wrote:
> > Hi Kittipat,
> >
> > You wrote:
> >
>>> I'd like to participate in GSoC as a student. I saw the Dublin Core
>>> idea on the GSoC idea list page and I'm quite interested. However,
>>> there's a point in the key deliverables that I couldn't
>>> understand. Specifically, what is the importance of "methods for
>>> package authors to declare new metadata element sets and
>>> vocabularies"? It should be no problem if "an implementation of the
>>> Dublin Core Abstract Model in TeX" has already been done, isn't it?
>
> As I understand this, it hinges on the word "new". Kittipat is quite right,
> if the DCAM is implemented in [La]TeX, then using it isn't the problem. It's
> when a new element set or vocabulary comes along that we need the ability to
> adapt to it without reinventing the wheel. The huge increase in usage of
> Z39.88 because of Zotero is one example.
>
> There are lots of abstract models for metadata, from the relatively simple
> bibliographic level (e.g. BibTeX) up to the humungously complex one in TEI
> (implemented for standalone metadata in EAD). I think the point is meant to
> be, we should be able to adapt as the demand and the usage changes, without
> having to rewrite everything.
>
> I'm just back from the IUISC (www.iuisc.ie) and one of the perennial hot
> topics was the reuse of metadata. But with 100 librarians in a room, and you
> ask what model you should use, you get >100 answers :-)
>
> YMMV
>
> ///Peter
>
Then, implementing RDF should provide a more generic solution. There would
be one set of instructions for declaring namespaces, resources, properties,
statements, etc., which would store the data in token lists for later
processing. Another set of instructions would read the data back and export
it into the target format. Together these would provide a fairly generic
framework for defining and exporting metadata, so adding support for any
metadata model that can be expressed in RDF becomes just a matter of
declaring its elements. On top of that, a less verbose interface can be
defined for end users.
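
Just to make that concrete, a very rough sketch of the storage and export
side in plain LaTeX could look like the following (all macro names and the
output format are made up here, not a worked-out design):

  % store statements as \do{subject}{predicate}{object} triples
  % in a token list
  \newtoks\rdfstatements

  \newcommand\rdfstatement[3]{%
    \rdfstatements=\expandafter{\the\rdfstatements \do{#1}{#2}{#3}}%
  }

  % export: replay the token list with \do defined to write one line
  % per triple
  \newwrite\rdfout
  \newcommand\rdfexport[1]{%
    \immediate\openout\rdfout=#1\relax
    \def\do##1##2##3{\immediate\write\rdfout{<##1> <##2> "##3" .}}%
    \the\rdfstatements
    \immediate\closeout\rdfout
  }

A document would then call something like
\rdfstatement{http://www.example.com/document}{http://purl.org/dc/terms/title}{Example}
and \rdfexport{\jobname.rdf} near \end{document}; the N-Triples-like lines
are only a placeholder for whatever serialization the exporter is supposed
to produce.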

For example, the steps for defining DC terms would look like:
\rdfnamespace{dcterms:}{http://purl.org/dc/terms/}
\rdfproperties{dctitle}{dcterms:title} % the first argument is an alias
                                       % for use when declaring statements
...
Defining a statement would then look like:
\rdfstatement{http://www.example.com/document}{dctitle}{Example}
And there could be shortcuts for end users, like:
\dctitle{Example}
or, more generically:
\metatitle{Example}
which would map to the corresponding term defined in the vocabulary set.
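
To make the shortcut layer a bit more concrete, it could be wired onto
\rdfstatement roughly like this (again only a sketch: \documenturi is a
made-up macro assumed to hold the document's URI, and in the real interface
\rdfstatement itself would probably accept the alias directly):

  % prefix -> URI table; expanding prefixed names into full URIs is
  % left out of this sketch
  \newcommand\rdfnamespace[2]{%
    \expandafter\def\csname rdf.ns.#1\endcsname{#2}%
  }

  % alias -> term table, filled by \rdfproperties{alias}{term}
  \newcommand\rdfproperties[2]{%
    \expandafter\def\csname rdf.prop.#1\endcsname{#2}%
  }

  % end-user shortcuts: the alias is only resolved when the statements
  % are exported
  \newcommand\dctitle[1]{%
    \rdfstatement{\documenturi}{\csname rdf.prop.dctitle\endcsname}{#1}%
  }
  \newcommand\metatitle[1]{\dctitle{#1}}

With \def\documenturi{http://www.example.com/document} in the preamble,
\metatitle{Example} would then record a dcterms:title statement about the
document without the user ever touching the RDF layer.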

Introducing Z39.88 into the system is, in fact, just a matter of adding more
statements. However, to make it more convenient for end users, commands like
\metatitle would have to be modified so that they also generate the Z39.88
metadata automatically. The ability to map one metadata statement to
multiple metadata models is, of course, subject to the compatibility of the
term definitions.
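
As an illustration, one user-level command feeding two models could be as
simple as this (rft.title is only meant as an example of a Z39.88 KEV key;
a real mapping would have to follow the ContextObject format definitions):

  % a second store, for Z39.88 (OpenURL ContextObject) key/value pairs
  \newtoks\openurlpairs
  \newcommand\openurlpair[2]{%
    \openurlpairs=\expandafter{\the\openurlpairs \do{#1}{#2}}%
  }

  % the end-user command now records the title in both models
  \renewcommand\metatitle[1]{%
    \dctitle{#1}%                 Dublin Core statement, as before
    \openurlpair{rft.title}{#1}%  Z39.88 key/value pair
  }

A Z39.88 exporter would then define \do to serialize the collected pairs
into a ContextObject query string, in the same way the RDF exporter replays
its statement list.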

The limitation of this approach is that the metadata model must be
expressible in RDF. A more generic solution would be to implement an XML
library at the base level, but I think that would be too complicated.

Kittipat