Increasing the Power of Semantic Interoperability for the European Library

by Martin Doerr


With the support of the DELOS Network of Excellence, IFLA and ICOM are merging their core ontologies. This is an important step towards semantic interoperability of metadata schemata across all archives, libraries and museums, and opens new prospects for advanced information integration services in the European Digital Library. The first draft of the combined model will be published in June 2006.

Semantic interoperability of Digital Libraries (DLs) requires compatibility both of the employed Knowledge Organization Systems (KOS; eg classification systems, terminologies and authority files) and of the employed metadata schemata. Currently, the notion and scope of DLs cover not only traditional publications, but also scientific and cultural heritage data. The grand vision is to see all these data integrated so that users are effectively supported in searching for and analyzing data across all domains. Even though the Dublin Core Metadata Element Set is well accepted as a general solution, it fails to describe more complex information assets. These include multimedia and learning objects, and data from characteristic domains such as archaeological finds or observational data from geosciences.

Core ontologies describing the semantics of metadata schemata are the most effective tool to drive global schema and information integration, and provide a more robust, scalable solution than tailored 'cross-walks' between individual schemata. Information and queries are mapped to and from the core ontology, which serves as a virtual global schema and has the capability to integrate complementary information from more restricted schemata. Many scientists question the feasibility of such a global ontology across domains. On the other hand, schemata like Dublin Core reveal the existence of overarching concepts. Ideally, the European Digital Library would be based on one sufficiently expressive core ontology, obtained not by selecting one of the relevant alternatives but by harmonizing and integrating them. The challenge is to explore in practice the limits of harmonizing conceptualizations from relevant domains.
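The idea of a core ontology acting as a virtual global schema can be illustrated with a minimal sketch. All schema, field and concept names below are hypothetical and chosen only for illustration; they are not taken from Dublin Core, the CIDOC CRM or FRBR. The sketch maps records from two restricted schemata to shared core concepts, merges the complementary information, and compares the number of mappings needed with and without a core ontology.

```python
# Hypothetical mappings from two source schemata to shared core concepts.
# Every field and concept name here is an illustrative assumption.
LIBRARY_TO_CORE = {"title": "title", "author": "creator", "pub_year": "creation_date"}
MUSEUM_TO_CORE = {"object_name": "title", "maker": "creator", "found_at": "place"}


def to_core(record, mapping):
    """Translate a source record into core-concept terms, dropping unmapped fields."""
    return {mapping[field]: value for field, value in record.items() if field in mapping}


def integrate(*core_records):
    """Merge core-level records; each concept collects all distinct values."""
    merged = {}
    for rec in core_records:
        for concept, value in rec.items():
            values = merged.setdefault(concept, [])
            if value not in values:
                values.append(value)
    return merged


def mappings_needed(n_schemata):
    """Return (directed pairwise cross-walks, mappings to and from a core ontology)."""
    return n_schemata * (n_schemata - 1), 2 * n_schemata


library_record = {"title": "Amphora", "author": "Unknown", "pub_year": 1900, "shelf": "Z9"}
museum_record = {"object_name": "Amphora", "maker": "Unknown", "found_at": "Crete"}
combined = integrate(
    to_core(library_record, LIBRARY_TO_CORE),
    to_core(museum_record, MUSEUM_TO_CORE),
)
```

In this toy setting the museum record contributes a place that the library schema cannot express, while the library record contributes a date. With ten schemata, directed pairwise cross-walks would require 90 mappings, against 20 via a shared core.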

The CIDOC Conceptual Reference Model (CRM) has been developed since 1996 under the auspices of the Documentation Standards Working Group of the International Committee for Documentation (CIDOC) of the International Council of Museums (ICOM). The work has proceeded on the initiative and with the support of ICS-FORTH, Heraklion, and the CRM is expected to be accepted as an ISO standard (currently ISO/DIS 21127) in 2006. It is a core ontology aiming to integrate cultural heritage information. It already generalizes over most data structures used by highly diverse museum disciplines, archives, and site and monument records. Even the common library format MARC ('MAchine Readable Cataloguing') can be adequately mapped to it. Its innovation is to centre descriptions not around the things, but around the events that connect people, material and immaterial things in space-time. Further, it explicitly describes the discourse on relations between identifiers and the identified, a powerful feature for the integration of information assets.
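As a rough illustration of what an event-centred description means, the following sketch records events as the links between actors, things, places and dates, and then answers a question about a person by traversing those events. The class, field names and sample data are assumptions made for this example only; they do not reproduce the actual CRM classes or properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A hypothetical event record connecting actors and things in space-time."""
    kind: str
    actors: tuple = ()
    things: tuple = ()
    place: str = ""
    date: str = ""


def things_connected_to(actor, events):
    """Return, in sorted order, all things linked to an actor through any event."""
    return sorted({thing for e in events if actor in e.actors for thing in e.things})


def history_of(thing, events):
    """Return (date, kind, place) for every event a thing took part in, by date."""
    return sorted((e.date, e.kind, e.place) for e in events if thing in e.things)


# Invented sample data, for illustration only.
events = [
    Event("production", actors=("potter A",), things=("vase 1",), place="Knossos", date="0500"),
    Event("excavation", actors=("archaeologist B",), things=("vase 1", "coin 7"), place="Knossos", date="1902"),
    Event("acquisition", actors=("museum C",), things=("vase 1",), place="Heraklion", date="1905"),
]
```

Here `history_of("vase 1", events)` yields the production, excavation and acquisition in date order, without the vase record itself having to carry fields for maker, finder or owner.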

Quite independently, the FRBR model ('Functional Requirements for Bibliographic Records') was designed as an entity-relationship model by a study group appointed by the International Federation of Library Associations and Institutions (IFLA) during the period 1991-1997. It was published in 1998. Its innovation is to cluster publications and other items around the notion of a common conceptual origin, the 'Work', in order to support information retrieval. Its focus is domain-independent and can be regarded as the most advanced formulation of library conceptualization.
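The clustering around a 'Work' can be pictured with a small sketch. The record layout (a "label" and an optional "work" key) and the sample records are hypothetical and serve only to show how retrieval benefits once publications share an explicit conceptual origin; this is not the FRBR data model itself.

```python
from collections import defaultdict


def cluster_by_work(records):
    """Group catalogue records under the Work they derive from.

    Each record is a dict with a "label" and an optional, hypothetical
    "work" key naming its common conceptual origin. Records without a
    "work" key are collected under the key None.
    """
    clusters = defaultdict(list)
    for rec in records:
        clusters[rec.get("work")].append(rec["label"])
    return dict(clusters)


# Invented sample records, for illustration only.
records = [
    {"label": "Odyssey, Greek edition", "work": "Odyssey"},
    {"label": "Odyssey, English translation", "work": "Odyssey"},
    {"label": "Odyssey, audio book", "work": "Odyssey"},
    {"label": "Iliad, Greek edition", "work": "Iliad"},
    {"label": "Uncatalogued pamphlet"},
]
clusters = cluster_by_work(records)
```

A search that hits any one edition can then offer the user every other record in the same cluster.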

Initial contacts in 2000 between the two communities eventually led to the formation in 2003 of the International Working Group on FRBR/CIDOC CRM Harmonisation. It is headed by Martin Doerr from ICS-FORTH and Patrick LeBoeuf from BNF Paris, and brings together representatives from both communities. The common goals are to express the IFLA FRBR model with the concepts, ontological methodology and notation conventions provided by the CIDOC CRM, and to merge the two object-oriented models thus obtained. This Working Group is now being supported by the DELOS NoE, and in June 2006 will publish the first complete draft of FRBRoo, ie the object-oriented version of FRBR, harmonized with CIDOC CRM. This formal ontology is intended to capture and represent the underlying semantics of bibliographic information and to facilitate the integration, mediation and interchange of bibliographic and museum information. Its major innovation is a realistic, explicit model of the intellectual creation process (see Figure). Work will continue with modelling information about authority records and performing arts.

Partial model of the intellectual creation process.

The potential impact is high. The domains explicitly covered by the combined models are already immense. Further, the models seem to be applicable to the experimental and observational scientific record for e-science applications. From a methodological perspective, the endeavour demonstrates in practice that viable common conceptual ground can be found even if the initial conceptualizations seem incompatible. Even though this process is intellectually demanding and time-consuming, we hope the tremendous benefits of nearly global models will encourage more integration work on the core-ontology level. A recent practical application of these models is the derivation of the CRM Core Metadata schema, which is compatible with Dublin Core and similar to it in coverage and complexity, but much more powerful. It allows for a minimal description of complex processes and of scientific and archaeological data, and is widely extensible in a consistent way by the CRM-FRBR concepts. CRM Core can be easily used by Digital Libraries.

Links:
IFLA: http://www.ifla.org
ICOM: http://icom.museum
Definition of the CIDOC CRM: http://cidoc.ics.forth.gr
Definition of CRM Core: http://cidoc.ics.forth.gr/working_editions_cidoc.html
Definition of FRBR: http://www.ifla.org/VII/s13/frbr/frbr.htm
DELOS NoE deliverable 5.3.1: http://delos-wp5.ukoln.ac.uk/project-outcomes/SI-in-DLs

Please contact:
Martin Doerr, ICS-FORTH, Greece
Tel: +30 2810 391625
E-mail: martin@ics.forth.gr