Metalogger

September 6, 2007

some criticisms of MODS

Filed under: MODS,Repositories — Neil Godfrey @ 3:59 am

Although MODS has been widely adopted in metadata application profiles and as a schema in its own right, as both a companion and an alternative to MARC, it has had its critics. Some criticisms were raised in informal discussions at the recent DC-2007 Conference in Singapore, and some were repeated in ad lib comments from the podium during a metadata seminar. Some of these criticisms (with a few of my comments tossed in) are:

1. In a MODS entry the family and given names are separated. This is not the case in MARC or DC. This is also inconsistent with MARC name authorities. How well used are the equivalent authorities for MODS, i.e. MADS? What difficulties might be entailed for maintaining this form of the name? Any? On the other hand, would not the breakdown of family and given names give computers more flexibility in manipulating layouts and presentations of the names in different contexts?

‹mods:name type=”personal”›

‹mods:namePart type=”family”› Cushman ‹/mods:namePart›

‹mods:namePart type=”given”› Charles Weever ‹/mods:namePart›

‹/mods:name›

Compare DC:

‹dc:creator›Cushman, Charles Weever ‹/dc:creator›

Compare MARC:

100 1_ $a Cushman, Charles W., $d 1896-1972
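As a rough illustration of the flexibility question raised in point 1, here is a minimal Python sketch that reads the separated name parts and renders them in several layouts. It assumes the record is well-formed XML (plain angle brackets and straight quotes) in the MODS v3 namespace; the helper names `name_parts`, `inverted`, `direct` and `initials` are hypothetical and not part of MODS or any library.

```python
import xml.etree.ElementTree as ET

# MODS v3 namespace, bound to the "mods" prefix used in the examples above.
MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

RECORD = """
<mods:name xmlns:mods="http://www.loc.gov/mods/v3" type="personal">
  <mods:namePart type="family">Cushman</mods:namePart>
  <mods:namePart type="given">Charles Weever</mods:namePart>
</mods:name>
"""


def name_parts(name_el):
    """Return namePart values keyed by their type attribute."""
    parts = {}
    for part in name_el.findall("mods:namePart", MODS_NS):
        # Untyped parts are stored under the (hypothetical) key "untyped".
        parts[part.get("type", "untyped")] = (part.text or "").strip()
    return parts


def inverted(parts):
    """Family name first, as in a MARC or DC heading."""
    return f"{parts['family']}, {parts['given']}"


def direct(parts):
    """Given names first, as in a display byline."""
    return f"{parts['given']} {parts['family']}"


def initials(parts):
    """Family name plus initials, as in many citation styles."""
    inits = " ".join(f"{word[0]}." for word in parts["given"].split())
    return f"{parts['family']}, {inits}"


parts = name_parts(ET.fromstring(RECORD))
print(inverted(parts))  # Cushman, Charles Weever
print(direct(parts))    # Charles Weever Cushman
print(initials(parts))  # Cushman, C. W.
```

Producing the same three layouts from the single DC or MARC string would require parsing on the comma, which is less reliable for names that do not follow the "family, given" pattern.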

2. If one person has multiple affiliations it may not be clear in MODS whether the record refers to one person with more than one affiliation or to more than one person with one affiliation each. I wonder how serious an issue this would be, however, especially in comparison with MARC. In MARC the affiliation $u subfield is not repeatable within each name field. There is additional scope for sponsors etc. How often have we ever wanted to assign multiple affiliations to authors? But I am not at all clear exactly how, and at what level, multiple affiliations appearing within each ‹mods:name› node are potentially confusing anyway.

3. MODS separates the personal name value and that name’s role value from each other, too.

‹mods:name type=”personal”›

‹mods:namePart type=”family”› Cushman ‹/mods:namePart›

‹mods:namePart type=”given”› Charles Weever ‹/mods:namePart›

‹mods:namePart type=”date”› 1896-1972 ‹/mods:namePart›

‹mods:role›

‹mods:roleTerm authority=”marcrelator” type=”text”› photographer ‹/mods:roleTerm›

‹mods:roleTerm authority=”marcrelator” type=”code”› pht ‹/mods:roleTerm›

‹/mods:role›

‹/mods:name›

Compare the DC MARC relator refinement:

‹marcrel:pht› Cushman, Charles Weever ‹/marcrel:pht›

DC enables one to say that a specific name is the photographer. The criticism is that this specificity is lost in MODS. Again I would like others to explain exactly how any specificity is lost in MODS. I am not denying that it is; I am still waiting to be informed.

Though it has been said that these above “bad” features of MODS make for potential difficulties in crosswalks and interoperability, I do not see how that would be the case. Again, feedback welcome.
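To make the crosswalk question concrete, here is a minimal Python sketch that maps the MODS name in point 3 to the DC relator form shown alongside it. It is one possible mapping under stated assumptions, not an official crosswalk: the function name `crosswalk_names` and the fallback to `dc:creator` when no role code is present are my own choices, and the record is assumed to be well-formed XML in the MODS v3 namespace.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

RECORD = """
<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:name type="personal">
    <mods:namePart type="family">Cushman</mods:namePart>
    <mods:namePart type="given">Charles Weever</mods:namePart>
    <mods:namePart type="date">1896-1972</mods:namePart>
    <mods:role>
      <mods:roleTerm authority="marcrelator" type="text">photographer</mods:roleTerm>
      <mods:roleTerm authority="marcrelator" type="code">pht</mods:roleTerm>
    </mods:role>
  </mods:name>
</mods:mods>
"""


def crosswalk_names(mods_root):
    """Map each mods:name to (target_element, value) pairs."""
    pairs = []
    for name in mods_root.findall("mods:name", MODS_NS):
        parts = {
            part.get("type", "untyped"): (part.text or "").strip()
            for part in name.findall("mods:namePart", MODS_NS)
        }
        if "family" in parts and "given" in parts:
            display = f"{parts['family']}, {parts['given']}"
        else:
            # Less granular records keep the whole name in one untyped part.
            display = parts.get("untyped", "")
        # Only roleTerms nested inside THIS name element are collected.
        codes = [
            term.text.strip()
            for term in name.findall(
                "mods:role/mods:roleTerm[@type='code']", MODS_NS
            )
            if term.text
        ]
        if codes:
            pairs.extend((f"marcrel:{code}", display) for code in codes)
        else:
            # Assumption: with no role code, fall back to plain dc:creator.
            pairs.append(("dc:creator", display))
    return pairs


print(crosswalk_names(ET.fromstring(RECORD)))
# [('marcrel:pht', 'Cushman, Charles Weever')]
```

Because the role lookup starts from each name element, a role in one name cannot be picked up for another name in the same record.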

Criticisms in Context?

Not knowing the detailed history the following thoughts are speculative.

Sally McCallum in a 2004 article An Introduction to the Metadata Object Description Schema (MODS) speaks of early MODS drafts containing “mixed content” (mixing sub-elements with content — e.g. ‹title›‹nonSort›The‹/nonSort›world of learning‹/title›) which “XML developers are reluctant to use . . . . because of the possibility of ambiguity and difficulty in referencing the content for processing. . . . ” (p.87). But these concerns were met by the MODS developers eventually deciding against that approach. In the case of the example given here they enveloped the individual elements (title, nonSort) — tagging each within the envelope ‹titleInfo›. I don’t understand how this still allows for ambiguity. Is some of the current criticism a legacy of past issues?
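A minimal Python sketch of how the enveloped form can be processed, assuming the current MODS v3 structure in which ‹nonSort› and ‹title› sit side by side inside ‹titleInfo›. The function name `titles` is hypothetical, and the record is written as well-formed XML so that it parses.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

RECORD = """
<mods:titleInfo xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:nonSort>The </mods:nonSort>
  <mods:title>world of learning</mods:title>
</mods:titleInfo>
"""


def titles(title_info):
    """Return (display_title, sort_title) from a titleInfo element."""
    non_sort = title_info.findtext(
        "mods:nonSort", default="", namespaces=MODS_NS
    ).strip()
    title = title_info.findtext(
        "mods:title", default="", namespaces=MODS_NS
    ).strip()
    # Display keeps the leading article; sorting ignores it.
    display = f"{non_sort} {title}".strip()
    return display, title


print(titles(ET.fromstring(RECORD)))
# ('The world of learning', 'world of learning')
```

Each piece of text belongs to exactly one element, so neither value has to be recovered from text interleaved with child elements, as it would with mixed content.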

There’s another context too — one in which I would have no argument at all — discussed below.

The other side

Sally McCallum’s article is worth reading for an overview of the “good” things about MODS, by the way, including how it can fit in with the 4 FRBR categories and how to preserve data when working on round trip transformation with MARC21.

Another is Rebecca Guenther’s Using the Metadata Object Description (MODS) for resource description: guidelines and applications.

MODS is supported by LC and is widely adopted as a schema in its own right and as parts of widely used application profiles.

Comparing RDF

On the other hand, Bruce D’Arcus on his darcusblog (see Blogroll in right margin) has some pretty heavy views of MODS in his post Plugging into frbr – killing marc which he also posted to a MODS query of mine on LINT (again see Blogroll in right margin).

But if objections to MODS are made with a view to arguing for the superiority of RDF then I don’t see any argument at all. Where there’s an option between using MODS or RDF then of course one would be mad not to opt for RDF.

My interest in MODS is as a schema within the context of metadata management within specific academic (library) digital repository solutions (VITAL, Fez) — alongside those of MARC or VRA etc. It does have advantages over MARC in this context — but MARC also has its own advantages.

RDF is coming and will hopefully bring repositories into the semantic web before too long. But that’s another subject distinct from the context in which I have been addressing the issue.


Note for rubric partners: The above is an edited version of a discussion written up on the (nonpublic) rubric partner wiki



4 Comments

  1. I wasn’t at DC 2007, so I can’t shed any light on MODS conversations that happened there, but I think I can provide some insight on a few issues you raised in this post.

    Re: the name granularity issue: While MODS *allows* you to separate, for example, family names from given names, it doesn’t *require* you to do so. The granularity you put in your MODS records can vary based on your needs and your data sources. So both of the following are perfectly valid MODS:

    (this would follow MARC authority practice)

    ‹mods:namePart›Cushman, Charles Weever‹/mods:namePart›
    ‹mods:namePart type=”date”› 1896-1972‹/mods:namePart›

    (this if you mapped from the less granular DC and didn’t try to parse out the dates)

    ‹mods:namePart›Cushman, Charles Weever, 1896-1972‹/mods:namePart›

    It’s my opinion that the MODS designers made a good choice not limiting name granularity to MARC practice, but also not mandating any one specific granularity.

    Re: connecting the role to the name: The encoding you cite does in fact connect the role explicitly to that one person. (I should know, you’re citing my records.) Note that role is a *child* of name, meaning that role applies to that name and no other. If one person has multiple roles on a resource, you can repeat role within a single name. (Or repeat the name, but that’s probably not good practice.) This connection by virtue of hierarchy is a fundamental principle of XML. The relationship is not stated as explicitly as it would be in RDF, but this is just the way XML works. If (and again, I wasn’t there, so I can’t say that it is) this is the criticism that was made of MODS at the DC 2007 Conference, it was based on a misunderstanding of the encoding.

    The same principle applies to the affiliation issue you mention. Two people would each have their own affiliation subelement of name (even if they had the same affiliation), one person with two affiliations would have one name element with two affiliation subelements. The encoding makes a clear distinction between the two.

    Comment by Jenn Riley — September 9, 2007 @ 10:33 pm

  2. It’s been a while since I looked very closely at MODS, and I noticed a while ago that I had prepared some MODS templates without separating out the given and family names, so I was beginning to wonder if I had gone a bit off the rails. Thanks for the timely clarification.

    Would you like to expand on why you think it was a good idea for MODS not to mandate the granularity level? Do you mean to imply that it facilitated interoperability with MARC and DC while also enabling a match with other schemas that do separate the given and family names? (Given the way other XML applications like FOAF etc. seem to habitually parse out the first and last names, it seems sensible that MODS allows this option.)

    One other nifty MODS feature is its ability to handle about 70% of the different date types in the JISC’s Functional Requirements:

    The JISC functional requirements list:

    Date available
    Date of modification of a copy
    Date of formal publication
    Date copyrighted
    Date created
    Validity periods
    Dates of submission to / acceptance by publisher/conference etc.
    Dates of submission of theses/dissertations
    Date captured – could be used for digitized versions
    Date of deposit [out of scope – administrative metadata]

    The MODS elements:

    dateIssued
    dateCreated
    dateCaptured
    dateValid
    dateModified
    copyrightDate
    dateOther

    I don’t recall from my cataloguing days ever having to enter up to ten dates for any catalogue entry!

    Comment by neilgodfrey — September 11, 2007 @ 8:34 am

  3. Re: why I personally think the name granularity flexibility is a good idea… Yes, you’ve guessed at one rationale behind my opinion, that it’s good to plan for interoperability with other major metadata formats. I think requiring one specific granularity would be a deal-killer for many applications that might otherwise use MODS. I like to think we can demonstrate the benefits of storing names in more granular formats, and let people gravitate in that direction over time. I think it would be less effective to mandate a specific granularity and expect everyone to re-process legacy metadata.

    Re: dates: just because all those date elements are there doesn’t mean you have to enter them all! It’s kind of odd that MODS made these separate elements rather than a type attribute on a more generic date element (it reflects a different approach than they took with, for example, names), but nonetheless, I don’t think we should interpret the presence of all of these as an instruction to provide them. We should rather look at the dates we *do* have, and just put them in the appropriate element.

    Comment by Jenn Riley — September 15, 2007 @ 5:23 pm

  4. Thanks, Jenn.

    Re the dates, no no, I didn’t mean to suggest using them all is a requirement — but those options do sit very nicely with so many of the date options that are wanted in repositories for scholarly works. They are in fact very useful to have as part of MODS. (Not all repositories currently allow for data storage in MODS but some do and one hears rumours of more on the way.)

    (My wry comment about comparisons with my cataloguing days was just a wry comment.)

    Comment by neilgodfrey — September 15, 2007 @ 9:36 pm

