ERCIM News No.33 - April 1998

EU-NSF Working Group on Metadata

by Thomas Baker and Clifford Lynch

The first of two meetings of the EU-NSF Working Group on Metadata was held on 2-3 February 1998 at the Coalition for Networked Information in Washington DC. This is one of five groups funded jointly by the National Science Foundation (through the University of Michigan) and by the European Union (through ERCIM) on strategic issues in technology for digital libraries. The groups' final reports will identify areas appropriate for joint international research collaboration.

Present in Washington were Clifford Lynch (Coalition for Networked Information, Washington DC), Thomas Baker (GMD, Germany; currently a visiting professor at AIT in Bangkok), Rachel Heery (UK Office of Library Networking, Bath), Gene Alloway (University of Michigan, Ann Arbor), Anne-Marie Vercoustre (INRIA, France; currently a visiting researcher at CSIRO in Australia), Jose Borbinha (INESC, Portugal), Howard Besser (University of California, Berkeley), Ole Husby (BIBSYS, Norway; Nordic Metadata Project), Stuart Weibel (Online Computer Library Center, Ohio), Renato Iannella (Distributed Systems Technology Centre, Australia), Carl Lagoze (Cornell University; coordinator of the EU-NSF Working Group on Resource Indexing and Discovery in a Globally Distributed Digital Library), and Shigeo Sugimoto (University of Library and Information Science, Japan). Rose Gombay of International Programs, National Science Foundation, joined us for the opening session.

In its most common sense, 'metadata' is structured data for helping users find and process documents and images. Scientists and librarians work hard to define standard categories for cataloging information in their specialized fields. Our focus was not on the semantics of such systems, but on the architectures, tools, and models needed to manage metadata in the networked environment.

From our initial brainstorming and position papers, we compiled a long list of research issues, then grouped these into clusters. One cluster bears the heading Architectures and Technologies - eg, how to embed or associate metadata with objects, how to format metadata, and how to store or mirror metadata for harvesting, together with related tools and query languages. Another cluster includes Metadata for New Types of Resources other than traditional documents (eg, collections as a whole, multimedia objects, time-based media, and dynamically generated objects) and New Types of Metadata (eg, for administrative uses, authenticity and certification, terms and conditions of use, ratings, and privacy policies). A third cluster calls for Evaluation and Metrics on the effectiveness of metadata - both technically (ie, controlling quality, tracking deployment and usage patterns, and assessing the effectiveness of metadata for retrieval), and economically (ie, costs versus benefits of metadata use).

When we took a straw vote on the relative importance of the clusters, over half of the votes went to the Management of Interoperability cluster. Everyone agreed that much work needs to be done on the technology of registries and crosswalks for mapping, inheriting, and extending metadata schemas of diverse types and in multiple languages. Many members of the group are involved in an effort to deploy the emerging Resource Description Framework of the World Wide Web Consortium for a distributed registry of the Dublin Core, a set of metadata elements for simple resource description available in many of the world's major languages.

A final cluster, Theory and Foundations, defined several areas in need of fundamental research: Formal Models for underpinning registries, crosswalks, and schemas; methodologies for supporting the Evolution of Metadata on the part of diverse, distributed, multilingual user communities; and issues of legal and trans-border policy.

The group plans to complete a first draft of its report in time for a presentation in July. The second and final meeting of the group, at which the draft recommendations will be finalized, will take place on 17-18 September 1998 in Bonn, Germany.

