ERCIM endorses Warwick Metadata Workshop

by Thomas Baker

Now that the World Wide Web has reached a size of 40 to 50 million pages of images and text at perhaps 500,000 locations, we rely increasingly on indexing robots. But these services retrieve more information than one normally can use, and most of it is irrelevant. Descriptive tags, however rudimentary, could help in narrowing searches - if only one could get enough Web publishers, from libraries and government agencies to local organizations and ordinary users, to describe their own materials in a consistent, internationally standard way.

The dual challenges of tagging existing Web documents with descriptive data and designing an extensible framework for metadata in general were the subject of a workshop held from April 1 to 3, 1996, at the University of Warwick. The workshop was organized by the United Kingdom Office for Library and Information Networking (UK) and the Online Computer Library Center (USA), with endorsement from the Joint Information Systems Committee (UK), the National Center for Supercomputing Applications (USA), D-Lib (the digital library forum of the Corporation for National Research Initiatives and the High Performance Computing and Communications program, USA), and ERCIM. The participants included representatives of several national libraries (USA, Britain, the Netherlands, Norway, Finland, and Australia) as well as software companies and universities.

The Warwick workshop achieved consensus on a limited core of description elements and on the basic design of a container architecture for encapsulating these core elements with other sets of metadata. Examples of such metadata, or data about data, include library catalog records, specialized descriptions (eg, for maps), terms and conditions of use, pricing and payment information, and labels for sexual or violent content.

The architects of this so-called Warwick Framework - led by Carl Lagoze of Cornell and Clifford Lynch of the University of California - are seeking to implement the container concept in a variety of ways. The simpler implementations will use existing HTML tags and compound MIME-typed documents and will work on today's World Wide Web with minor extensions. More powerful implementations will use, for example, a CORBA-like distributed object framework. Carl Lagoze says: "We need metadata methods that work today. But it is equally important to design a digital library infrastructure not constrained by existing technology and, in fact, to provide guidance on how that technology should evolve."
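
As a rough illustration of the container idea, the following Python sketch groups several sets of metadata about one resource and writes them out as HTML meta tags, in the spirit of the simpler implementations mentioned above. It is not the Warwick Framework syntax, which the working groups are still defining: the class names Package and Container, the "kind.name" naming scheme for the tags, and the sample values are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Package:
    """One typed set of metadata, e.g. a core description or terms of use."""
    kind: str                 # hypothetical label such as "core" or "terms"
    elements: dict[str, str]  # element name -> value


@dataclass
class Container:
    """Groups the metadata packages that describe a single resource."""
    packages: list[Package] = field(default_factory=list)

    def add(self, package: Package) -> None:
        self.packages.append(package)

    def to_meta_tags(self) -> str:
        """Render every element of every package as one HTML <meta> tag."""
        lines = []
        for package in self.packages:
            for name, value in package.elements.items():
                # The "kind.name" attribute format is an illustrative
                # assumption, not a syntax agreed at the workshop.
                lines.append(
                    f'<meta name="{escape(package.kind)}.{escape(name)}" '
                    f'content="{escape(value)}">'
                )
        return "\n".join(lines)


container = Container()
container.add(Package("core", {
    "title": "ERCIM endorses Warwick Metadata Workshop",
    "author": "Thomas Baker",
}))
# Made-up example value for a second, non-core package.
container.add(Package("terms", {"use": "example terms of use"}))
print(container.to_meta_tags())
```

Keeping each package self-describing through its kind label means that a reader of the container can skip packages it does not understand, which is one way to read the extensibility goal described above.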

The workshop in Warwick followed a workshop held in March 1995 at OCLC in Dublin, Ohio, which resulted in the Dublin Core - a method for describing information resources with thirteen common-sense elements such as author, title, publisher, subject, and language. Ongoing working groups are designing implementations of the Warwick Framework, defining its syntax, and preparing user documentation. For further information, see

