May | 2007 | Metalogger

May 9, 2007

Resource types in repositories

Filed under: Metadata,Repositories — Neil Godfrey @ 6:09 am

http://www.bl.uk/services/bibliographic/meeting.html

I’ve had an interesting exercise recently: setting out the different “resource type” thesauri used as defaults by the Eprints, DSpace, Fez and ARROW-VITAL repositories, and comparing these with those used by the Dublin Core Metadata Initiative (DCMI), LOC’s MARC and MODS, and the “research outputs” listed by NZ’s PBRF and the UK’s RAE, and even with a real-life online experience of Canada’s CARL harvester. The CARL harvester experience summed up the lack of consistency in this search parameter by picking up “yes” and “no” as data values entered into repository resource type fields! What follows is really just thinking “aloud” and nothing more . . . .

The most consistently rational “resource types” list, I finally decided, was also the most useless one for repositories: DCMI. Yeh right, great conclusion, Neil. DCMI of course has this beautiful catch-all resource type “text” to cover everything from articles and books to patents and working papers. But that is where the problem lies for repositories. “Text” is a type of resource just as “musical recording” is a type of resource. But while “musical recording” has a quite narrow range of possible subtypes, “text” covers musical scores as well as up to twenty subtypes across the default types used within the repositories listed above.

The breakdown of DCMI’s “text” type into subtypes in repositories appears to me to be essentially a breakdown by “for whom different texts are directly written”. I don’t mean the audiences for whom they are ultimately written, but those who must approve, collate and publish the text: book publishers, book editors, conferences, journals, academic faculties, the repository itself, government or private business organizations and such. Some may blink at my including the repository itself in that list, but at least one repository I know did respond to an academic’s request to publish a discussion paper he had written that had not been published elsewhere. He wanted a central place of reference for it for his national and international peers, and the repository seemed like the logical choice.

Out of these we have the following default resource types:

resource types doc

My second thoughts after listing the RAE and PBRF “research output” types were that whatever repositories do should not hang on a government reporting list that will surely be no more stable than the names of government departments themselves. Repository metadata needs to be able to cover whatever changing needs there may be, but not be an extension in any way of, in this case, the RQF.

Is this a problem? Does there need to be uniformity? Well, if things are “working well enough” now then clearly there does not “need to be uniformity”. But these are early days, and repositories are all about ‘_bility’ buzz words like sustainability, interoperability, extensibility. Every institution has its own local types as well as others it shares with the rest of the academic community, and different terminologies for the same things. But obviously the fewer competing terms there are for the same thing, the easier it’s going to be to negotiate searches.
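To make the searching problem a little more concrete, here is a rough sketch in Python of what a harvester-side crosswalk might look like. It is only an illustration under my own assumptions: the local labels in `CROSSWALK` are made-up examples rather than any repository’s actual defaults, and `normalise_type` and `summarise` are hypothetical helper names. The only thing taken from a standard is the list of twelve DCMI Type Vocabulary terms.

```python
# The twelve terms of the DCMI Type Vocabulary.
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}

# Hypothetical crosswalk: local label (lower-cased) -> DCMI type.
# These local labels are illustrative only, not real repository defaults.
CROSSWALK = {
    "article": "Text",
    "journal article": "Text",
    "book": "Text",
    "book chapter": "Text",
    "conference paper": "Text",
    "thesis": "Text",
    "working paper": "Text",
    "patent": "Text",
    "musical recording": "Sound",
    "dataset": "Dataset",
}


def normalise_type(raw):
    """Return (dcmi_type, cleaned_label); dcmi_type is None if unmapped."""
    # Lower-case and collapse whitespace so "Journal  Article" matches.
    label = " ".join(raw.strip().lower().split())
    return CROSSWALK.get(label), label


def summarise(values):
    """Group harvested type values by DCMI type and collect the strays."""
    mapped = {}
    unmapped = []
    for value in values:
        dcmi_type, label = normalise_type(value)
        if dcmi_type is None:
            unmapped.append(value)  # e.g. "yes" / "no" typed into the field
        else:
            mapped.setdefault(dcmi_type, []).append(label)
    return mapped, unmapped


if __name__ == "__main__":
    harvested = ["Journal Article", " thesis ", "yes", "no", "Musical Recording"]
    mapped, unmapped = summarise(harvested)
    print(mapped)
    print(unmapped)
```

Notice how almost everything collapses into “Text”. The crosswalk makes the values comparable across repositories, but it also throws away exactly the subtype distinctions that repository users want to search on, which is the dilemma described above.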

And what about creative works? Where do they fit in? Should they? What of academics who produce scripts, artworks, poetry, that may have no copyright barrier to being made public? They may be produced by scholars but would not normally be classified as “scholarly works” in the usual sense of the term. And they are part of “the intellectual output of the institution”, an oft-stated raison d’être of repositories.

What do others think of the following thesaurus as a (gradual) more encompassing replacement of the above?

But before I finished the above I learned of RDA and DCMI getting together to fit the library into the semantic web. Sounds like this will mean there will emerge a new DCMI-RDA standard set of terms that will presumably be usable for resource types, element names and the rest.

Is this a time to follow best practice and do nothing till the best practice emerges? Attempt to coordinate thoughts and monitor happenings that will affect the larger scene? And/or work towards a temporary set of solutions that will work till the final standard emerges?
