Is there any real disadvantage with using MODS as opposed to MARC for repository data storage?
Yes, there is some loss of data if one tries to walk from MARC to MODS. But why and when would one want to ever make that journey? But more importantly, what loss exactly does occur? The thought of losing data sounds fraught with horrendous potential to cataloguers so it pays to see exactly what is lost and then decide that the question is not whether there is loss of data but whether it matters — in the context of institutional repositories.
Check the title data as shown on the mapping guide at http://www.loc.gov/standards/mods/mods-mapping.html
245 $a$f$g$k maps to <title> <titleInfo
245 $b maps to <subTitle>
245 $n (and $f$g$k following $n) maps to <partNumber>
245 $p (and $f$g$k following $p) maps to <partName>
245 ind2 is not 0 maps to <nonSort>
So the granularity lost here is what is found in $f $g $k
$f Inclusive dates (NR)
The time period during which the entire content of the described materials was created.
$g Bulk dates (NR)
The time period during which the bulk of the content of the described materials was created.
$k Form (R)
A term that is descriptive of the form of the described materials, determined by an examination of their physical character, the subject of their intellectual content, or the order of information within them.
How often are those used in academic libraries? What will fall apart if they are combined in the one field with MODS?
What about the personal author field? (I’m not including the corporate author field because I do not yet know how the equivalent of a 110 would be entered into a repository that is for the purpose of archiving the works of academics. If the repository is to showcase the work of its academics, what room is there for a corporately authored document? Libraries store books by conferences and corporate bodies. Repository databases store the individual works of each author.
100, 700 <name> maps to type=”personal”
100 maps to <role><roleTerm> type=”text”
use text “creator” if desired, to maintain indication of “main entry”
100, 700 $a$q maps to <namePart>
100, 700 $d maps to <namePart> with type=”date”
100, 700 $b$c maps to <namePart> with type=”termsOfAddress”
100, 700 $e maps to <role><roleTerm> with type=”text”
100, 700 $4 maps to <role><roleTerm> with type=”code”
100, 700 $u maps to <affiliation> under <name>
This means we lose the first MARC indicator that defines whether the personal name is to be a Forename (0), Surname (1) or Family Name (3). Is that going to be a problem in the repository’s archive?
It also means that the fuller form of the name ($q) is not going to be demarcated from the initial entry of the name. If this is really an issue I am sure a stylesheet can be written to recognize the brackets surrounding that name and its granularity will still be preserved anyway.
$b for a name’s numeration is melded with $c, the titles and other words associated with the name. The $b only applies when the entry is a Forename anyway, which is not granularized either. I don’t personally see a problem with placing III’s and Sir’s in the one data entry field.
No time to make a complete comparison. Maybe a future post I can explore more. But title and main author are key elements and I don’t see any real issue in “lossed” data over using MODS in place of MARC with these.