May 23, 2008

Meta-reflections 4: team management issues

Filed under: Repositories — Neil Godfrey @ 2:24 pm

Till I resume paid work I’m taking time to go over some of the metadata things I learned and did in my last job here. It feels like I’m reflecting on a past life in the Ark, especially since at the same time I’m attempting to catch up with reading what’s what in new developments and projects. I’m in a schizoid zone — a luxury that comes with reviewing one’s past work for job applications and attempting to keep up with new developments for job applications.

Reflecting here on a critical ingredient in the success of a project team like RUBRIC. It worked best when there were regular weekly meetings where the current immediate goal would be clarified, and each member of the team would leave with a specific set of tasks toward that end, and also aware of each other team member’s tasks. Sitting in the same room and being able to talk directly with each other was almost as good as Instant Messaging. This enabled real team-work: we’d consult with each other to ensure that the metadata issues were effectively applied within the requirements of the IT issues and vice versa. And the immediate goal we were working on was the next stage of assisting repository managers to implement their repository.

That sort of communication continued, but circumstances happen and the weather sometimes changes. It is only in hindsight I can see that some misunderstandings and tensions between repository managers and the central team, normal at times in any collaborative effort, might have been avoided had that starting model of the weekly meetings continued. New repository managers are grappling with a wide variety of issues, from technical to business to public relations and more. When they raise an issue with a member of a technical team they may be seeking an answer to something that has broader ramifications than the technical issue itself. It would seem logical that the chances of the team being able to meet their real needs would be enhanced by having all members who are working on some aspect of each manager’s work share the manager’s issues with each other. Without this, the tech team and metadata person seemed (in hindsight now) to be playing catch-up with each other. Cooperation was increasingly a matter of resolving post hoc requests. Each doing one’s own thing inevitably led to some collisions up ahead that needed further resolving.

Maybe this is a biased view of some of the slight wonkiness that seemed to enter relationships between the central team and some of the different university partners. The model on which the project began was similar to the model I had introduced when I had a chance to be the acting head of a small cataloguing team. (Not that the early RUBRIC model had anything to do with that.) It meant regular meetings that called for the involvement and contributions of all members, in tandem with re-evaluation of the immediate priorities, and in between, constant team referral back to these priorities and the related tasks at hand. (Thinking back in the Ark still, I’d like to write up a detailed description of that too.)

The need for cooperation and team work in a pioneering field, especially one that calls on expertise from widely varying sources, should be a truism, but managing to achieve it does take disciplined effort. A spinoff is good morale all round. But that’s a banal truism too. But it’s nice when it happens just the same.

May 16, 2008

Reflecting on falling through the cracks, and segmented leadership in the Australian repository scene

Filed under: Harvesting,Repositories — Neil Godfrey @ 1:01 am

My last post recollecting the time it took to learn about the difference between a DCMI rule and an OAI-PMH rule for the meaning of dc:identifier — a difference that only made sense in the context of the politics of what repositories are about — in hindsight looks very embarrassing. It’s obvious when you know.

But it was not obvious to everyone I spoke to who is closely tied up with the DCMI community. And asking strangers via email questions about something complex which one is learning from scratch can be fraught with the cloudiness of not quite understanding what the real issue is, and therefore how to frame the question, and the two parties not quite understanding the frames of reference of each other.

The answer only finally became obvious after several face to face encounters at a conference, and then finally finding “the key person” — a harvester woman! — to talk to, with pen and paper and lots of doodling diagrams. Till then, some who were specialists in their particular area were saying the conflict ought not to exist, and that it needed to be fixed. I was beginning to think I knew more of the issues and questions than the veterans, if not the answers. But it was really only a question of finding which one of the scores of people in the room to single out for this particular question. A question relating to DC did not necessarily mean that a DC specialist theorist would know how to answer it.

Lesson: in something as complex and new as repositories and their related activities such as harvesting, we cannot rely on the normal channels of communication and learning that work for well-established protocols and systems, as we have with normal library functions. I found massive background reading was essential, and even then there turned out to be gaps that were only filled by direct personal exchanges.

It’s a team sport, with all players needing to share their experiences and issues, and to get together often (not exclusively virtually either) to plan and discuss what they are doing, hassling over, etc. Then the simple and obvious things really are.

But that’s hardly the optimum way of operating — it’s too easy for one to fall through cracks along the way and wait to be picked up and dusted off.

That was when I was involved with simple “first generation” repositories. Deposit an object, retrieve an object, with all the preservation and authentication bits in between.

There were other issues too even at this basic level. Some harvesters complained that the data they were picking up from repositories included a lot of “noise”. Sometimes a maverick repository would use a DC element for data unrelated to its real purpose. In other cases multiple terms would be used to describe the one type of resource (e.g. periodical, newspaper, journal). And in other cases there would be too many of the same DC elements coming through (e.g. Date) without any obvious differentiation (e.g. date published, date copyrighted, date awarded, etc).

None of those was or is insuperable. Why not (relatively) simply set up a program that will enable the harvester to streamline the data it receives, so that the known common alternatives (e.g. periodical, magazine) are all dumped under the resource type “journal” or whatever the desired appropriate standard is? Or in the case of multiple undifferentiated DC elements like dc:date, would it be too difficult for a specialist harvester to take the initiative and introduce a slightly modified DC schema (a DC application profile, possibly one already in successful use elsewhere) for, say, theses? There are other work-arounds, but a business-case / cost-benefit study should help assess the best alternative for the long term.
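As a rough illustration of the first idea, a harvester-side clean-up step might look something like the sketch below. The synonym table, the record layout (a dict of DC element names to lists of strings) and the function name are all made up for the example; a real harvester would work from whatever vocabulary its community had agreed on.

```python
# Hypothetical synonym table: variant terms seen in harvested dc:type values,
# mapped to the one label the harvester has chosen to standardise on.
TYPE_SYNONYMS = {
    "periodical": "journal",
    "magazine": "journal",
    "newspaper": "journal",
    "journal": "journal",
}


def normalise_types(record):
    """Return a copy of a harvested record with its dc:type values cleaned up.

    `record` is assumed to be a dict mapping DC element names to lists of
    strings, e.g. {"type": ["Periodical"], "title": ["..."]}.
    Unknown terms are passed through (trimmed) so nothing is silently lost.
    """
    cleaned = dict(record)
    if "type" not in record:
        return cleaned
    labels = []
    for value in record["type"]:
        key = value.strip().lower()
        label = TYPE_SYNONYMS.get(key, value.strip())
        if label not in labels:  # collapse duplicates created by the mapping
            labels.append(label)
    cleaned["type"] = labels
    return cleaned


if __name__ == "__main__":
    harvested = {"title": ["Weekly News"], "type": ["Periodical", " magazine ", "Thesis"]}
    print(normalise_types(harvested)["type"])
```

The point of passing unknown terms through untouched is that the harvester tidies up only what it recognises, and leaves the rest visible for someone to review later.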

One reason they have not been resolved here until now may be, I think, that Australia has lacked a coordinating or leadership body relating to these areas. Ad hoc team-work has its limits. Australia has had a number of bodies (ARROW, the National Library’s Discovery Service, the Australasian Digital Thesis Program, and other libraries that relate to only one or two of these) working on their own remits without a real coordinating vision.

Each of these bodies grew up like Topsy. There is now MACAR, and that body is looking at recommending metadata standards for repositories. What it is working on is important. But it does not have the resources to meet often enough and to make its presence felt strongly enough, or to address comprehensively enough the key issues affecting all stakeholders, to be seen as a coordinating leader providing the vision and programs needed to smooth out the issues each separate body feels are part of the way things must be for the foreseeable future. Leadership in Australian libraries has traditionally come from the National Library. What I missed when learning of the multi-faceted issues of repositories and metadata was something like a National Library coordinating leadership in this area. Such a nationally recognized body (or one with clear sponsorship by the National Library) might have had the means to lead in smoothing out the respective issues faced by each discrete part of the repository-harvesting picture.

But now there are other developments on the horizon that appear to have the potential to augment the very purposes and functionings of repositories. Till now Australian repositories have mainly been storage bins for single objects, sometimes multi-part or multi-file objects. They are often promoted as vehicles to showcase an institution’s (and an individual academic’s) scholarly output. But the next stage may be to use repositories as tools for research, with the needs of end users being the main rationale.

Going beyond first generation repositories: in scholarly communities a single work can consist of many parts, such as a text discussion, datasets relating to the text, and specialized types of images that are not only illustration but the very source and object of analysis. The sort of idea now being worked out is that of a user being able to draw out a representation of such an image from one repository and compare it with data harvested from another research repository in a single operation.

Developers are currently testing ways to harvest not just the representations of single pdf or jpeg objects from repositories, but to harvest, say, URIs assigned to selected parts of different objects across a number of repositories. In simplest terms, it may be possible, for example, to “harvest” or “create” a complete journal edition from the multiple journal articles scattered across a range of repositories. Okay, why? But think through the possibilities once it is understood that’s the sort of capability we want to establish.
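To make the journal example a little more concrete, here is a minimal sketch of the grouping step only. It assumes each harvested article record already carries a URI plus journal, volume, issue and starting-page fields (in practice that is a big assumption), and the field names, the example.org URIs and the function name are all invented for illustration. It is not OAI-ORE itself, just the idea of collecting scattered URIs into one ordered aggregation.

```python
from collections import defaultdict


def assemble_issues(records):
    """Group article URIs by (journal, volume, issue).

    `records` is an iterable of dicts with the hypothetical keys
    "uri", "journal", "volume", "issue" and "start_page".
    Returns a dict mapping each issue key to a list of article URIs
    ordered by starting page.
    """
    grouped = defaultdict(list)
    for rec in records:
        key = (rec["journal"], rec["volume"], rec["issue"])
        grouped[key].append((rec["start_page"], rec["uri"]))
    # Sort each issue's articles by starting page, then keep only the URIs.
    return {key: [uri for _, uri in sorted(items)] for key, items in grouped.items()}


if __name__ == "__main__":
    harvested = [
        {"uri": "http://repo-a.example.org/obj/12", "journal": "J", "volume": 3, "issue": 2, "start_page": 45},
        {"uri": "http://repo-b.example.org/obj/7", "journal": "J", "volume": 3, "issue": 2, "start_page": 1},
        {"uri": "http://repo-a.example.org/obj/99", "journal": "J", "volume": 4, "issue": 1, "start_page": 1},
    ]
    for issue, uris in assemble_issues(harvested).items():
        print(issue, uris)
```

The hard part, of course, is not the grouping but getting repositories to expose consistent enough metadata for the grouping key to mean anything.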

The implications are vast. Different types of repositories from a range of institutions need to be part of a framework that has the sort of consensus that will make this possible. That means the technological infrastructure, the institutional support for each repository, agreement on the standards and policies that will support this, and the growth of a research community program that will facilitate the use of all this.

The spadework for some of this has now begun with the Open Archives Initiative (the body behind OAI-PMH) and its OAI-ORE project. Proposals for an Australian Data Commons are now being tabled. With effort, planning and maturing, these early-day visions and testings will generate their own leadership.

But in the meantime university library repositories have proved how responsive they are willing to be to a national leadership plan and vision. They all focussed on “what to do now” with the immediate future in mind when that was clearly spelled out as RQF. Okay, money and a bit of compulsion were at play there too. But they did not exercise their collective freedom to dig in and protest.

Libraries like authorities, whether AACR2 or the National Library. And being able to confidently adapt an authority to one’s own institutional requirements, without sacrificing anything important, makes some tasks worthwhile. In that sense, the authorities are seen as friendly guides towards the vision, with whom they willingly cooperate.

Till now changes have been happening so fast that there has scarcely been time for an acknowledged leadership in these areas to emerge. Everyone is grappling to learn their areas — and sometimes something can fall through the cracks for a time, waiting to be rescued along the way. The leaderships that do exist are in segmented areas.

May 8, 2008

Back again to Metalogger

Filed under: Uncategorized — Neil Godfrey @ 11:01 pm

It’s been a long time since I’ve updated Metalogger. After the wind up of the original charter of RUBRIC I took a change of pace doing informal tutoring for a couple of months, then a holiday visiting my partner’s home country, Thailand, plus a side-trip to Cambodia, then spent more time hosting guests holidaying back home, and am back now job-hunting. Which means back into the metadata business.

Today I caught up again with the Metadata Advisory Committee for Australian Repositories (MACAR) in a teleconference, and was pleased to discover that 1800 teleconference calls are free on Skype. And yesterday I also caught up with some light reading from the Australian National Data Service Technical Working Group: the proposal for an Australian National Data Service. And a job vacancy in UKOLN has alerted me to more progress on the SWAP front under way there. So my head is beginning to spin with the issues once more after a few months’ clean break.

I have decided that my first project (in between job hunting and looking at MACAR agendas) will be to review what RUBRIC achieved on the metadata front, and where to from here. That means writing up draft installments as I think through what was accomplished and what the current and future needs appear to be.

So I can hereby declare this blog reopened for business after a good holiday and more.