Metalogger

November 20, 2007

More MODS advantages over MARC

Filed under: MARC,MODS — Neil Godfrey @ 1:34 pm

I have recently had opportunities to work with MODS in creating templates for research and publications data external to the university repository environment.

Multiple note fields can be added each with its own customized display label in MODS. One can add multiple 500 general note fields in MARC but MARC does not recognize a $i display text to differentiate them.

Example:

<note displayLabel="Objectives" />
<note displayLabel="Background" />
<note displayLabel="Methodology" />
<note displayLabel="Progress" />
<note displayLabel="Implication" />
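For anyone generating such notes from a script, here is a minimal sketch in Python using the standard library's ElementTree. The function name, the placeholder texts and the bare (un-namespaced) <mods> wrapper are my own assumptions for illustration; a real record would declare the MODS namespace.

```python
import xml.etree.ElementTree as ET


def build_notes(sections):
    """Return a <mods> element holding one <note> per (label, text) pair.

    Hypothetical helper: the MODS namespace is omitted to keep the sketch short.
    """
    mods = ET.Element("mods")
    for label, text in sections:
        # Each note carries its own displayLabel attribute.
        note = ET.SubElement(mods, "note", {"displayLabel": label})
        note.text = text
    return mods


# Placeholder content, not real project data.
sections = [
    ("Objectives", "Placeholder objectives text"),
    ("Background", "Placeholder background text"),
    ("Methodology", "Placeholder methodology text"),
]
mods = build_notes(sections)
print(ET.tostring(mods, encoding="unicode"))
```

Because the label travels with each note, a display stylesheet can pick out a particular note by its displayLabel rather than by its position.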

In the same way displayLabels can be used to distinguish quite different “title” types — e.g. the titles of research program and subprogram as opposed to the title of specific research activity.

And multiple affiliations and addresses can be nested with each personal name in MODS. MARC allows only one — NR (not repeatable) — $u subfield for an affiliation or address in each 100 or 700 field.

<name type="personal">
<namePart />
<affiliation>$researcher['affiliation']</affiliation>
<affiliation>$researcher['address']</affiliation>
</name>
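The same template can be sketched in Python. This is only an illustration: the dictionary keys ("name", "affiliations") and the sample values are hypothetical, and the MODS namespace is again left out for brevity.

```python
import xml.etree.ElementTree as ET


def build_personal_name(researcher):
    """Return a MODS-style <name type="personal"> with repeated <affiliation>s.

    `researcher` is a hypothetical dict with a "name" string and an
    "affiliations" list (institution, postal address, and so on).
    """
    name = ET.Element("name", {"type": "personal"})
    ET.SubElement(name, "namePart").text = researcher["name"]
    for value in researcher["affiliations"]:
        # <affiliation> is simply repeated, once per affiliation or address.
        ET.SubElement(name, "affiliation").text = value
    return name


# Made-up sample data for illustration only.
researcher = {
    "name": "Example, Researcher",
    "affiliations": ["Example University", "1 Example Street, Exampletown"],
}
name = build_personal_name(researcher)
print(ET.tostring(name, encoding="unicode"))
```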

Those are TWO HUGE advantages as anyone trying to squeeze nontraditional data into digital archives will appreciate.


September 16, 2007

I still do like MODS

Filed under: MODS — Neil Godfrey @ 9:41 am

My recent post titled "some criticisms of MODS" was a response to encountering some pretty heavy criticisms of it at a recent conference. The criticism did take me by surprise, but at the same time it sat oddly with a number of conference presentations demonstrating that use of MODS was more widespread than I had realized. (I have discussed previously the Texas Digital Library’s use of MODS for its Thesis and Dissertation collection, and on another blog my own reasons for recommending its use where possible in repositories.)

I have since tried to find support for the specific criticisms without success. The technical issues seemed more strengths than weaknesses, as Jenn Riley has also helpfully explained re the name options in MODS.

My comments on the wide range of dates were meant not as a criticism but as another plus for MODS in the context of digital resources in repositories.

The Minerva project also recommends MODS (note especially slide #43) for a collection of digital resources. A Minerva MODS record for a web site can be seen here.

In support of the benefits of MODS already alluded to above, MODS elements seem particularly suited for the sorts of digital repositories I’ve been working with: (more…)

September 6, 2007

some criticisms of MODS

Filed under: MODS,Repositories — Neil Godfrey @ 3:59 am

Although MODS has been widely adopted in metadata application profiles and as a schema in its own right, as both a companion and an alternative to MARC, it has had its critics. Some criticisms were raised in informal discussions at the recent DC-2007 Conference in Singapore, and some were repeated in ad lib comments from the podium at a metadata seminar. Some of these criticisms (with a few of my comments tossed in) are: (more…)

June 26, 2007

MODS for Theses

Filed under: E-Theses and ETD conference,MODS — Neil Godfrey @ 2:57 am

I hope to discuss this more fully in a later post but am making available here a MODS application profile for theses in repositories.

Thanks to Adam Mikeal (from the Texas Digital Library consortium) for forwarding me this, though it is no doubt also online elsewhere.

MODS application profile for theses

June 8, 2007

MODS and MARC — what losses are there in crosswalking? do they matter?

Filed under: MARC,MODS,Repositories — Neil Godfrey @ 12:04 am

Is there any real disadvantage with using MODS as opposed to MARC for repository data storage?

Yes, there is some loss of data if one walks from MARC to MODS. But why and when would one ever want to make that journey? More importantly, what exactly is lost? The thought of losing data sounds fraught with horrendous potential to cataloguers, so it pays to see exactly what is lost and then to recognize that the question is not whether there is loss of data but whether it matters in the context of institutional repositories.

Check the title data as shown on the mapping guide at http://www.loc.gov/standards/mods/mods-mapping.html

245 $a$f$g$k maps to <title> under <titleInfo>
245 $b maps to <subTitle>
245 $n (and $f$g$k following $n) maps to <partNumber>
245 $p (and $f$g$k following $p) maps to <partName>
245 ind2 is not 0 maps to <nonSort>

So the granularity lost here is what is found in $f, $g and $k, which are:

$f Inclusive dates (NR)

The time period during which the entire content of the described materials was created.

$g Bulk dates (NR)

The time period during which the bulk of the content of the described materials was created.

$k Form (R)

A term that is descriptive of the form of the described materials, determined by an examination of their physical character, the subject of their intellectual content, or the order of information within them.

How often are those used in academic libraries? What will fall apart if they are combined in the one field with MODS?
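To make the loss concrete, here is a rough Python sketch of the 245 mapping listed above. It is my own illustration under simplifying assumptions, not the Library of Congress stylesheet: subfields arrive as a hypothetical list of (code, value) pairs in record order, ISBD punctuation is left untouched, and the MODS namespace is omitted.

```python
import xml.etree.ElementTree as ET

# Where each 245 subfield lands; $f, $g and $k have no element of their own.
TARGET_FOR = {"a": "title", "b": "subTitle", "n": "partNumber", "p": "partName"}
ORDER = ["title", "subTitle", "partNumber", "partName"]


def crosswalk_245(subfields, ind2=0):
    """Map 245 subfields to a MODS-style <titleInfo> element (sketch only)."""
    parts = {key: [] for key in ORDER}
    current = "title"  # $f/$g/$k go to <title> until a $n or $p is seen
    for code, value in subfields:
        if code in TARGET_FOR:
            target = TARGET_FOR[code]
            if code in ("n", "p"):
                current = target  # later $f/$g/$k follow this part
        elif code in ("f", "g", "k"):
            target = current
        else:
            continue  # other subfields are ignored in this sketch
        parts[target].append(value)

    title_info = ET.Element("titleInfo")
    title = " ".join(parts["title"])
    if ind2 > 0:
        # The second indicator counts the leading non-filing characters.
        ET.SubElement(title_info, "nonSort").text = title[:ind2]
        title = title[ind2:]
    for key in ORDER:
        text = title if key == "title" else " ".join(parts[key])
        if text:
            ET.SubElement(title_info, key).text = text
    return title_info


example = crosswalk_245(
    [("a", "The research diaries,"), ("f", "1950-1975,"), ("g", "bulk 1960-1970")],
    ind2=4,
)
print(ET.tostring(example, encoding="unicode"))
```

In the made-up example the inclusive and bulk dates survive as text inside <title>; what disappears is only the markup that said which piece was which.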

What about the personal author field? (I’m not including the corporate author field because I do not yet know how the equivalent of a 110 would be entered into a repository whose purpose is to archive the works of academics. If the repository is to showcase the work of its academics, what room is there for a corporately authored document? Libraries store books by conferences and corporate bodies. Repository databases store the individual works of each author.)

100, 700 maps to <name> with type="personal"
100 maps to <role><roleTerm> with type="text" (use text "creator" if desired, to maintain indication of "main entry")

100, 700 $a$q maps to <namePart>
100, 700 $d maps to <namePart> with type="date"
100, 700 $b$c maps to <namePart> with type="termsOfAddress"
100, 700 $e maps to <role><roleTerm> with type="text"
100, 700 $4 maps to <role><roleTerm> with type="code"
100, 700 $u maps to <affiliation> under <name>

This means we lose the first MARC indicator that defines whether the personal name is to be a Forename (0), Surname (1) or Family Name (3). Is that going to be a problem in the repository’s archive?

It also means that the fuller form of the name ($q) is not going to be demarcated from the initial entry of the name. If this is really an issue I am sure a stylesheet can be written to recognize the brackets surrounding that name and its granularity will still be preserved anyway.
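As a quick illustration of that idea, written in Python rather than as a stylesheet, the fuller form can be pulled back out of a combined <namePart> value by looking for the trailing brackets. The function name and the sample name are hypothetical, and the pattern assumes the fuller form is the last bracketed group in the string.

```python
import re

# A name, then "(fuller form)", then optional trailing punctuation.
FULLER_FORM = re.compile(r"^(.*?)\s*\(([^()]*)\)[\s,.]*$")


def split_fuller_form(name_part):
    """Return (name, fuller_form); fuller_form is None when no brackets are found."""
    match = FULLER_FORM.match(name_part)
    if match is None:
        return name_part, None
    return match.group(1), match.group(2)


print(split_fuller_form("Example, A. B. (Alpha Beta),"))
print(split_fuller_form("Example, Alpha"))
```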

$b for a name’s numeration is melded with $c, the titles and other words associated with the name. The $b only applies when the entry is a Forename anyway, which is not granularized either. I don’t personally see a problem with placing III’s and Sir’s in the one data entry field.
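The personal name mapping can be sketched the same way. Again this is only my illustration of the table above, assuming subfields come in as hypothetical (code, value) pairs; it ignores the first indicator entirely, which is exactly the loss being discussed, and it leaves out the MODS namespace.

```python
import xml.etree.ElementTree as ET


def crosswalk_personal_name(subfields, main_entry=False):
    """Map 100/700 subfields to a MODS-style <name type="personal"> (sketch only)."""
    name = ET.Element("name", {"type": "personal"})

    # $a and $q share one untyped <namePart>; $b and $c share termsOfAddress.
    plain = [value for code, value in subfields if code in ("a", "q")]
    terms = [value for code, value in subfields if code in ("b", "c")]
    if plain:
        ET.SubElement(name, "namePart").text = " ".join(plain)
    if terms:
        ET.SubElement(name, "namePart", {"type": "termsOfAddress"}).text = " ".join(terms)

    for code, value in subfields:
        if code == "d":
            ET.SubElement(name, "namePart", {"type": "date"}).text = value
        elif code == "u":
            ET.SubElement(name, "affiliation").text = value
        elif code in ("e", "4"):
            role = ET.SubElement(name, "role")
            kind = "text" if code == "e" else "code"
            ET.SubElement(role, "roleTerm", {"type": kind}).text = value

    if main_entry:
        # Optional "creator" role keeps a trace of the 100 main entry.
        role = ET.SubElement(name, "role")
        ET.SubElement(role, "roleTerm", {"type": "text"}).text = "creator"
    return name


pope = crosswalk_personal_name(
    [("a", "John Paul"), ("b", "II,"), ("c", "Pope,"), ("d", "1920-2005")],
    main_entry=True,
)
print(ET.tostring(pope, encoding="unicode"))
```

The numeration and the title end up side by side in one termsOfAddress value, which is the "III’s and Sir’s in the one field" situation described above.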

No time to make a complete comparison. Maybe in a future post I can explore more. But title and main author are key elements, and I don’t see any real issue with “lossed” data in using MODS in place of MARC for these.

November 19, 2006

MODS (originally posted October 18 2006)

Filed under: MODS — Neil Godfrey @ 10:23 pm

I’d love to hear from anyone in library-land who has used/heard of/rejected/embraced/toyed with MODS (Metadata Object Description Schema) as the next evolutionary step from MARC.

I asked this question more locally about a year ago, when I first started seriously to inform myself about metadata, and got either blank or ‘don’t touch that’ looks. Since then the MODS question has hit me again and I am wondering what I don’t know about it, given that it is not more broadly known and used. From my reading it appears to be a brilliant alternative to MARC in the new world of things like interoperability, repositories, electronic databases…. It looks damn easy to understand (no MARC tags for the uninitiated to navigate) and maps well to anything from the simplest Dublin Core to the most complex ONIX.

One negative comment I did hear was that it cannot catch all MARC data but when I wrote up a table comparing all the fields one could want in a repository I could not see any problem that way at all: the only thing it loses is the excess MARC bits and pieces that are simply not relevant to repositories and the sharing and searching of collections in the new info world. Things like the 246 indicator that codes a varying title as being a “spine” title — what’s a spine in an electronic resource anyway?
Thoughts? Would love feedback since am planning on soon applying all I’ve read about it to the real test world and seeing what happens.