Harmonise: An Ontology-Based Approach for Semantic Interoperability
by Michele Missikoff
One of the services that we expect to find with the advent of the Semantic Web is semantic interoperability. The goal of semantic interoperability is to allow the (seamless) cooperation of two Information Systems that were not initially developed for this purpose. The European Project 'Harmonise' aims at building a technological infrastructure based on a shared ontology, to enhance the cooperation of European SMEs in the tourism sector.
The Semantic Web is expected to enhance the development of semantic interoperability among Information Systems. The aim is to allow Information Systems to cooperate without requiring modifications to their software or data organisation. In the IST European Project 'Harmonise', information interoperability has been addressed with a 'Local As View' (LAV) approach. The innovative aspect of our LAV solution is the use of a shared ontology to build a common view of the business sector in which the cooperation takes place.
The Harmonise Approach to Semantic Interoperability
The advantages of flexibility and openness provided by the Internet in the connection of computer systems are not matched in the connection of software applications. The primary techniques aimed at application interoperability are adapters (typically in Enterprise Application Integration) and exchange formats, such as Electronic Data Interchange (EDI) or Knowledge Interchange Format (KIF). Given the limited success of existing solutions, Harmonise proposes an approach that builds on these techniques but is centred on a domain ontology. To this end, project activities have investigated three main areas.
Interoperability clashes, caused by differences in the conceptual schemas of two applications attempting to cooperate. The possible clashes have been classified into two main groups:
- Lossless clashes, which can be solved with no loss of information. Examples include naming clashes, when the same information is represented by different labels; structural clashes, when information elements are grouped in a different way; and unit clashes, when a scalar value (typically an amount of money, or a distance) is represented with different units of measure.
- Lossy clashes, for which any conceivable transformation (in either direction) causes a loss of information. Typical cases are information represented at different levels of granularity, refinement, or precision. For example, in expressing the distance of a hotel from the airport, one application simply reports 'near airport', while another expresses it in terms of miles, eg 'five miles from airport'. Another example is when one hotel specifies the presence of an 'indoor swimming pool', while another just says 'swimming pool'.
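The two groups of clashes can be illustrated with a minimal Python sketch. It is not taken from Harmonise: the field labels, the 10 km threshold for 'near airport', and all function names are assumptions made only for this example. The lossless transformations have exact inverses, while the lossy one does not.

```python
from typing import Optional

KM_PER_MILE = 1.609344  # international definition of the mile

# Naming clash (lossless): hypothetical label pairs used by two hotel systems.
RENAMES = {"zip": "postalCode", "tel": "phoneNumber"}
INVERSE_RENAMES = {new: old for old, new in RENAMES.items()}


def rename(record: dict, table: dict) -> dict:
    """Relabel the keys of a record; keys not in the table are kept as-is."""
    return {table.get(key, key): value for key, value in record.items()}


# Unit clash (lossless): the conversion can be undone exactly (up to rounding).
def miles_to_km(miles: float) -> float:
    return miles * KM_PER_MILE


def km_to_miles(km: float) -> float:
    return km / KM_PER_MILE


# Granularity clash (lossy): a precise distance collapses into a coarse label.
NEAR_THRESHOLD_KM = 10.0  # assumed cut-off, chosen only for illustration


def distance_to_label(km: float) -> str:
    return "near airport" if km <= NEAR_THRESHOLD_KM else "not near airport"


def label_to_distance(label: str) -> Optional[float]:
    """No inverse exists: the label does not determine a distance."""
    return None
```

Renaming and unit conversion round-trip without loss, whereas once 'five miles from airport' has been reduced to 'near airport' the original figure cannot be recovered from the label.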
Ontology, which represents a common, shareable view of the application domain. This is used to give meaning to the information structures that are to be exchanged between applications. In Harmonise the ontology is based on the Object, Process, Actor Modelling Language (OPAL) representation method, which follows the Frame-Slot-Facet paradigm. It includes constructs such as ISA (with inheritance) and aggregation hierarchies, similarity and various kinds of built-in constraints (such as cardinality constraints, enumeration). Based on OPAL, we have developed the ontology management system, SymOntoX.
Figure: The ontology 'chestnut'.
The concepts in a domain ontology can be seen to be organised according to a (complex) hierarchical structure shaped like a chestnut (see Figure). In the top part (Upper Domain Ontology) we have generic concepts, such as 'process', 'actor', 'event' and 'goal'. In the bottom part (Lower Domain Ontology: LDO) we have elementary concepts, such as 'price', 'streetNumber', 'cost' and 'internetAddress'. Generally, for two cooperating partners, it is relatively easy to reach a consensus on the concepts of these two parts. The difficult section is the middle part - the Application Ontology. Here concepts and definitions depend strongly on the specific application, the kind of problems addressed and the method used to solve them, not to mention the underlying technology (which often contaminates the conceptual model) and the cultural aspects. Typical concepts in this layer are 'invoice', 'customer', 'discount', 'reliableCustomer', 'approval', or more sector-dependent concepts such as 'hotelReception', 'confirmedReservation', 'advancePayment', 'lightMeal' or 'gymTrainer' (in the tourism sector, for example).
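The Frame-Slot-Facet paradigm with ISA inheritance and built-in constraints can be pictured with a small Python sketch. This is not OPAL syntax and does not reflect the SymOntoX implementation; the class names, the slot names and the enumeration values are hypothetical and serve only to show how facets (type, cardinality, enumeration) constrain slots, and how a concept inherits slots along the ISA hierarchy.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class Slot:
    """A slot described by its facets: value type, cardinality, enumeration."""
    value_type: type
    min_card: int = 0
    max_card: Optional[int] = 1          # None means unbounded
    enum: Optional[Tuple[Any, ...]] = None


@dataclass
class Frame:
    """A concept (frame) with an optional ISA parent and its own slots."""
    name: str
    isa: Optional["Frame"] = None
    slots: Dict[str, Slot] = field(default_factory=dict)

    def all_slots(self) -> Dict[str, Slot]:
        """Own slots plus those inherited along the ISA chain."""
        inherited = self.isa.all_slots() if self.isa else {}
        return {**inherited, **self.slots}

    def validate(self, instance: Dict[str, List[Any]]) -> List[str]:
        """Return a list of constraint violations (empty if none)."""
        errors = []
        for slot_name, slot in self.all_slots().items():
            values = instance.get(slot_name, [])
            if len(values) < slot.min_card:
                errors.append(f"{slot_name}: too few values")
            if slot.max_card is not None and len(values) > slot.max_card:
                errors.append(f"{slot_name}: too many values")
            for value in values:
                if not isinstance(value, slot.value_type):
                    errors.append(f"{slot_name}: wrong type")
                elif slot.enum is not None and value not in slot.enum:
                    errors.append(f"{slot_name}: value not in enumeration")
        return errors


# Hypothetical fragment: 'customer' ISA 'actor'.
actor = Frame("actor", slots={"name": Slot(str, min_card=1)})
customer = Frame(
    "customer",
    isa=actor,
    slots={"status": Slot(str, enum=("new", "reliable"))},
)
```

Here `customer` inherits the mandatory `name` slot from `actor`, so an instance without a name, or with a status outside the enumeration, is reported by `validate`.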
Semantic annotation, used to represent the meaning of a local conceptual schema in terms of the concepts available in the ontology. These concepts provide a unique semantic reference for each application wishing to expose an interface, referred to as a Local Conceptual Schema (LCS), for both exporting and importing information. In essence, every piece of information in the LCS is annotated using the ontology content. For a given application concept (represented as a data structure in the LCS), annotation consists of identifying the LDO elements of the reference ontology and using them to define its meaning. These semantic annotations are used, at the intensional level, to generate the mapping between the local conceptual schema and the reference ontology. Mapping rules (among concepts) correspond to transformation rules at the extensional level. Transformations are applied to local data to encode them according to the Harmonise Interoperability Representation (HIR), which is used to actually exchange data.
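The chain from annotation to data exchange can be sketched in Python as follows. This is an illustration under stated assumptions, not the Harmonise implementation: the local field names, the ontology concept names ('hotelName', 'distanceFromAirportKm') and the use of a plain dictionary as a stand-in for HIR are all hypothetical. Each system annotates only its own schema against the shared ontology, which is the essence of the LAV approach; no system needs to know the other's schema.

```python
from typing import Any, Callable, Dict, NamedTuple

KM_PER_MILE = 1.609344


class Annotation(NamedTuple):
    """Links one local field to one ontology concept (intensional level)."""
    concept: str
    to_ontology: Callable[[Any], Any]    # transformation used when exporting
    from_ontology: Callable[[Any], Any]  # transformation used when importing


def same(value: Any) -> Any:
    return value


# Hypothetical annotations of two Local Conceptual Schemas.
SYSTEM_A: Dict[str, Annotation] = {
    "hotel_nm": Annotation("hotelName", same, same),
    "dist_mi": Annotation(
        "distanceFromAirportKm",
        lambda miles: round(miles * KM_PER_MILE, 2),
        lambda km: round(km / KM_PER_MILE, 2),
    ),
}
SYSTEM_B: Dict[str, Annotation] = {
    "name": Annotation("hotelName", same, same),
    "airport_km": Annotation("distanceFromAirportKm", same, same),
}


def export_to_hir(local_record: dict, annotations: Dict[str, Annotation]) -> dict:
    """Apply the transformations (extensional level): local data to ontology terms."""
    return {
        ann.concept: ann.to_ontology(local_record[field_name])
        for field_name, ann in annotations.items()
        if field_name in local_record
    }


def import_from_hir(hir_record: dict, annotations: Dict[str, Annotation]) -> dict:
    """Apply the inverse transformations: ontology terms to local data."""
    return {
        field_name: ann.from_ontology(hir_record[ann.concept])
        for field_name, ann in annotations.items()
        if ann.concept in hir_record
    }


# System A exports a record; system B imports it via the shared vocabulary.
hir = export_to_hir({"hotel_nm": "Roma Inn", "dist_mi": 5.0}, SYSTEM_A)
record_for_b = import_from_hir(hir, SYSTEM_B)
```

The intermediate `hir` record is keyed by ontology terms only, so adding a third system requires one new set of annotations rather than a pairwise adapter for every existing partner.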
The Harmonise approach to semantic interoperability is inherently different from approaches that propose an interchange format, such as KIF. The latter are 'neutral' with respect to the application domain, while HIR allows information to be exchanged by using ontology terms only. Another approach that is in line with our proposal is PIF (Process Interchange Format). The main difference is that PIF proposes a format based on a predefined (limited) process ontology, which is given with the standard. Conversely, HIR is based on a rich domain ontology, the content of which is not provided by the method. It is initially built by the domain experts, and continuously evolves to keep pace with evolving reality. Accordingly, HIR evolves too, since its vocabulary is (a subset of) the ontology.
Michele Missikoff, IASI-CNR
Tel: +39 06 7716 422