A Tool for Converting Bibliographic Records

by Trond Aalberg

The FRBR model for bibliographic information enables libraries to accommodate a broad range of user needs, and is considered to be an important step towards the next generation of library information systems. To support the application of FRBR in current library catalogues, solutions are needed to the problem of interpreting or converting MARC-based information. At the Norwegian University of Science and Technology, we have developed a conversion tool for this purpose.

The Functional Requirements for Bibliographic Records (FRBR) was published by the International Federation of Library Associations and Institutions (IFLA) in 1998 and is widely acknowledged within the library community as an important contribution to the modernization of library cataloguing and information systems. The entity-relationship (ER) model proposed by the FRBR Working Group is a formal conceptualization of the entities, attributes and relationships of concern in bibliographic information. For the end user, the model promises to support a broad range of expectations and needs.

The heart of the FRBR model is a set of entities that represent the key objects of interest to users of bibliographic information. The products of intellectual or artistic endeavour that are named or described in bibliographic records are represented by the entities work, expression, manifestation and item. The entities person and corporate body represent those responsible for the content, production, dissemination or custodianship of the product entities. An additional set includes entities that serve as the subjects of works. For each entity a set of attributes is defined and the model includes an extensive set of possible relationships that may exist between the entities.
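As a rough illustration of this structure, the entities and relationships can be pictured as typed records connected by labelled links. The Python sketch below is not part of the conversion tool described in this article; the class names, the identifiers and the sample data are assumptions made only for the example, while the relationship labels follow the wording of the FRBR report.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entity:
    """One FRBR entity: its type, an identifier and its attributes."""
    entity_type: str                     # "work", "expression", "manifestation", "item", "person", ...
    identifier: str                      # hypothetical local identifier
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, labelled link between two entities."""
    source: str                          # identifier of the source entity
    relation: str                        # label of the relationship
    target: str                          # identifier of the target entity


def targets_of(relationships: List[Relationship], source_id: str, relation: str) -> List[str]:
    """Return the identifiers reached from source_id through the given relation."""
    return [r.target for r in relationships
            if r.source == source_id and r.relation == relation]


# Illustrative data only: one work followed down to a single item.
entities = [
    Entity("work", "w1", {"title": "Peer Gynt"}),
    Entity("expression", "e1", {"language": "nor"}),
    Entity("manifestation", "m1", {"title": "Peer Gynt"}),
    Entity("item", "i1", {"shelfmark": "example"}),
    Entity("person", "p1", {"name": "Ibsen, Henrik"}),
]
relationships = [
    Relationship("w1", "is created by", "p1"),
    Relationship("w1", "is realized through", "e1"),
    Relationship("e1", "is embodied in", "m1"),
    Relationship("m1", "is exemplified by", "i1"),
]

print(targets_of(relationships, "w1", "is realized through"))
```

Keeping relationships as separate records rather than as fields on the entities makes it straightforward to add further relationship types without changing the entity definition.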

Although many projects have explored the use of FRBR in different contexts and some tools exist, there is little support for the systematic processing of all information in all MARC records into a proper representation that directly reflects the entities, attributes and relationships of the FRBR model. Due to a paucity of reusable solutions, researchers beginning work in this area typically need to reinvent the conversion process and write their own interpretation system. The transformation from MARC to FRBR is a complex task that in many ways is different from a simple transformation such as the conversion from MARC to Dublin Core. Entities such as work or person may have duplicate descriptions in numerous records, and to be able to create a consistent set of entities with a proper set of relationships, the process needs to be based on an extensive set of rules and conditions. The final output of the process should be a normalized set of unique entities with a proper set of attributes and relationships. Additionally, the conversion process needs to support solutions to numerous problems and exceptions. These may be caused by inconsistencies and errors resulting from erroneous registrations, data imported from low-quality sources or changes made to the catalogue.
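The normalization step, in which duplicate descriptions of the same entity are collapsed into one, can be sketched as follows. This is a minimal Python example under simplifying assumptions: entities are matched on a normalized name alone, the first value seen for an attribute is kept, and the function names and sample descriptions are hypothetical. A real catalogue would need far more elaborate matching rules than this.

```python
import unicodedata


def normalize(value):
    """Build a comparison key that ignores case, accents and punctuation."""
    decomposed = unicodedata.normalize("NFKD", value)
    kept = [c for c in decomposed if c.isalnum() or c.isspace()]
    return " ".join("".join(kept).lower().split())


def merge_entities(descriptions):
    """Merge (entity_type, name, attributes) descriptions that share a key."""
    merged = {}
    for entity_type, name, attributes in descriptions:
        key = (entity_type, normalize(name))
        entry = merged.setdefault(
            key, {"type": entity_type, "name": name, "attributes": {}, "sources": 0}
        )
        entry["sources"] += 1                             # count contributing records
        for attr, value in attributes.items():
            entry["attributes"].setdefault(attr, value)   # first value wins
    return merged


# Illustrative descriptions as they might be extracted from three records.
descriptions = [
    ("person", "Ibsen, Henrik", {"dates": "1828-1906"}),
    ("person", "IBSEN, HENRIK.", {}),
    ("work", "Peer Gynt", {}),
]

unique = merge_entities(descriptions)
for (entity_type, key), entry in unique.items():
    print(entity_type, key, entry["sources"])
```

The two spellings of the personal name collapse into a single person entity, while the work remains separate. Inconsistent or erroneous registrations are exactly the cases where such a simple key fails, which is why the conversion needs an extensible set of rules and conditions.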

The process of transforming MARC records into a representation based on the FRBR model.

The problem of transforming MARC records into a representation that directly reflects the FRBR model has been investigated by the Norwegian University of Science and Technology (NTNU) in a joint project with the Norwegian library service center BIBSYS and the National Library of Norway. The purpose of the project has been to support and explore the application of the FRBR model to existing MARC-based library catalogues. The project has two major goals: identifying and modelling the various tasks required in a conversion process, and developing a conversion tool that uses rules and conditions to define the transformation from MARC to FRBR.

The tool is based on XML and automatically generates the XSL transformation files used in the conversion. Because the solution is independent of any specific MARC format and set of cataloguing rules, the tool is reusable across catalogues and MARC formats. It takes records in the MarcXchange format as input and produces output in a format that is based on MarcXchange but adds specific attributes for describing the types defined in the FRBR model and elements for describing the relationships between entities.

The conversion tool has been used successfully to transform the 4 million records of the BIBSYS database into an FRBR-ized prototype that is available on the Web. This prototype database is primarily intended to demonstrate the results of the transformation, and can be used to search and navigate the BIBSYS database in the form of FRBR entities and relationships. The conversion performed on this particular catalogue is still far from perfect, and the set of conditions and rules must be extended to handle the exceptions and errors in the initial data. However, the tool enables librarians to specify and test various rules and conditions easily, and the overall result can be evaluated by inspecting or querying the resulting database.
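The idea of driving the conversion from declarative rules and conditions can be illustrated with a small Python sketch. It is not the tool's implementation, which generates XSL transformations and operates on MarcXchange records; here a record is simplified to a dictionary of fields, and the rule format, the function names and the choice of MARC 21 tags (100 for a personal name, 240 for a uniform title, 245 for the title statement) are assumptions made for the example.

```python
def has_no_field(tag):
    """Return a condition that holds when the record lacks the given field."""
    return lambda record: tag not in record


# Each rule maps the subfields of one field to the attributes of one entity type.
# An optional condition decides whether the rule applies to a record at all.
RULES = [
    {"tag": "100", "entity": "person", "attributes": {"a": "name", "d": "dates"}},
    {"tag": "240", "entity": "work", "attributes": {"a": "title"}},
    # Fall back to the title statement for the work when no uniform title exists.
    {"tag": "245", "entity": "work", "attributes": {"a": "title"},
     "condition": has_no_field("240")},
    {"tag": "245", "entity": "manifestation", "attributes": {"a": "title"}},
]


def apply_rules(record, rules):
    """Extract entity descriptions from one record according to the rules."""
    entities = []
    for rule in rules:
        condition = rule.get("condition", lambda rec: True)
        if not condition(record):
            continue
        for subfields in record.get(rule["tag"], []):
            attributes = {name: subfields[code]
                          for code, name in rule["attributes"].items()
                          if code in subfields}
            if attributes:
                entities.append({"type": rule["entity"], **attributes})
    return entities


# Illustrative record: field tag -> list of occurrences, each a subfield mapping.
record = {
    "100": [{"a": "Ibsen, Henrik", "d": "1828-1906"}],
    "245": [{"a": "Peer Gynt"}],
}

for entity in apply_rules(record, RULES):
    print(entity)
```

Because the rules are plain data, a different MARC format or cataloguing practice can be supported by supplying a different rule set rather than by changing the conversion logic, which is the property the article attributes to the tool's rule-based design.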

The application of the FRBR model as a common ground for interchange and integration between libraries fits well with the current focus on cross-domain semantic interoperability in digital libraries. NTNU is participating in the DELOS Network of Excellence (NoE) activity on the development of the FRBRoo ontology. Future activities include adapting the conversion system to produce bibliographic information encoded as RDF using the FRBRoo ontology, for the purpose of cross-domain integration and interoperability using semantic Web technology.

BIBSYS FRBRized prototype: http://november.idi.ntnu.no/frbrized

Please contact:
Trond Aalberg, Norwegian University of Science and Technology, Norway
Tel: +47 7359 7952
E-mail: Trond.Aalberg@idi.ntnu.no