Semantic Search in Peer-to-Peer-Based Digital Libraries

by Hao Ding and Ingeborg Torvik Sølvberg


Advances in peer-to-peer overlay networks and Semantic Web technology will have a substantial influence on the design and implementation of future digital libraries. However, it remains unclear how best to combine their advantages in digital library construction. Research in the IF group at the Norwegian University of Science and Technology (NTNU) is evaluating possible solutions to advance developments in this field.

One of the most important features of the digital library of the future will be that it is accessible from anywhere, by anyone and at any time. Achieving this goal requires that the digital library be investigated as an integrated whole rather than as the sum of its individual parts. Peer-to-peer overlay networks and Semantic Web technology show promise for the communication infrastructure and for semantic processing, respectively. However, little work has been done to determine how best to combine these two technologies into a complete solution for digital library construction. NTNU researchers, under the framework of the IKT/WEB-TEK project sponsored by the Research Council of Norway, have developed a semantic search framework for peer-to-peer-based digital libraries.

Our work, as illustrated in Figure 1, has involved comparing and identifying the strengths and weaknesses of both peer-to-peer and Semantic Web technology. Based on our analysis, we concluded that these two fields are complementary, and that there are great advantages to be gained by combining them to conduct semantic searches in a large-scale distributed environment. One major weakness of current peer-to-peer systems is their limited search capability, which stems from the limited expressive power with which they can respond to queries. Ontologies, the semantic tool of the Semantic Web, provide a basis for a shared understanding across a group of individuals, for example by detecting similar concepts among ontologies and integrating multiple ontologies at no cost to end users. By applying ontologies, the search capability of peer-to-peer networks can be strengthened via semantic information processing. An inference engine can also be adapted to achieve more reliable results by reasoning over predefined rules.
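As an illustration of how an ontology can strengthen a search, the following is a minimal Python sketch, not the project's implementation. It assumes a toy concept hierarchy (SUBCONCEPTS), hypothetical metadata records with a "type" field, and the helper names expand_concept and semantic_search; none of these come from the NTNU framework.

```python
from collections import deque

# Hypothetical toy ontology: each concept maps to its direct sub-concepts.
SUBCONCEPTS = {
    "publication": ["article", "thesis"],
    "article": ["journal_article", "conference_paper"],
    "thesis": [],
    "journal_article": [],
    "conference_paper": [],
}

# Hypothetical metadata records held by a peer.
RECORDS = [
    {"title": "A", "type": "conference_paper"},
    {"title": "B", "type": "thesis"},
    {"title": "C", "type": "dataset"},
]


def expand_concept(concept, subconcepts):
    """Return the concept together with all of its transitive sub-concepts."""
    seen = {concept}
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for child in subconcepts.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def semantic_search(records, concept, subconcepts):
    """Return titles of records typed as the queried concept or a sub-concept."""
    wanted = expand_concept(concept, subconcepts)
    return [r["title"] for r in records if r["type"] in wanted]


def keyword_search(records, concept):
    """Baseline for comparison: exact match on the type string only."""
    return [r["title"] for r in records if r["type"] == concept]
```

With these assumed records, a query for "publication" finds records A and B through the hierarchy, whereas the exact-match baseline finds nothing because no record is typed literally as "publication".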

Figure 1. Combining P2P and Semantic Web for Constructing Digital Libraries.

However, while the Semantic Web and ontologies provide us with a mechanism for facilitating semantic information management and processing, they focus more on local and static situations than on distributed and dynamic environments. Because they are innately decentralized, peer-to-peer systems can help exploit the full potential of the Semantic Web's capabilities. In other words, peer-to-peer systems can act as a fundamental platform for searching and sharing distributed information using Semantic Web technology.

In our survey of existing peer-to-peer systems, our project has concentrated mainly on scalability and autonomy. From a technical perspective, digital libraries need a common infrastructure that is highly scalable, customizable and adaptable. To this end, peer-to-peer systems have been suggested as one method for facilitating cooperation among digital libraries and for improving the accessibility of library services. Another critical goal of digital libraries is the sharing of resources with a wider audience. However, many inconsistencies exist across platforms, applications and capabilities. This means that library systems must often sacrifice autonomy to reach agreement with each other, so as to enable better searching and sharing of information. In comparison with client/server architecture, peer-to-peer systems provide a more open architecture by decentralizing the control from servers, allowing nodes (eg digital libraries) to be loosely coupled. As a consequence, system scalability and robustness can be improved with a small overhead in running specific communication protocols on these nodes.

As an intermediate goal, a tentative benchmark has been proposed for selecting an appropriate peer-to-peer network for information searching in various digital library applications. In particular, our project has extended classic super-peer-based networks with load-balancing and self-organizing functionalities, thereby catering for the dynamic situations that characterize digital libraries, such as continuous departures of peers, overload caused by the joining of peers, or even a system catastrophe. Evaluation results are illustrated in Figure 2.

Figure 2. Evaluation Results, from top left:
(a) Self-organizing under a scenario of continuous leaving of peers
(b) Load-balancing under a scenario of continuous joining of peers
(c) Catastrophe recovery.
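The load-balancing and self-organizing behaviour described above can be pictured with a small Python sketch. It is an illustrative simplification under stated assumptions, not the algorithm evaluated in the project: the class SuperPeerNetwork, the fixed per-super-peer capacity, the least-loaded placement policy and the promotion of a joining peer when every super-peer is full are all hypothetical choices made for this example.

```python
class SuperPeerNetwork:
    """Toy super-peer overlay (hypothetical model, for illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed maximum number of clients per super-peer
        self.clients = {}         # super-peer id -> set of attached client ids

    def add_super_peer(self, sp_id):
        """Register a super-peer with no clients attached yet."""
        self.clients.setdefault(sp_id, set())

    def join(self, peer_id):
        """Attach a peer to the least-loaded super-peer that has spare capacity.

        If every super-peer is full, the joining peer is promoted to a
        super-peer itself (an assumed self-organizing policy).
        Returns the id of the super-peer now responsible for the peer.
        """
        candidates = [sp for sp, members in self.clients.items()
                      if len(members) < self.capacity]
        if not candidates:
            self.add_super_peer(peer_id)
            return peer_id
        # Ties are broken by id so that the outcome is deterministic.
        target = min(candidates, key=lambda sp: (len(self.clients[sp]), sp))
        self.clients[target].add(peer_id)
        return target

    def leave_super_peer(self, sp_id):
        """Remove a super-peer and re-attach its orphaned clients."""
        orphans = sorted(self.clients.pop(sp_id, set()))
        for peer in orphans:
            self.join(peer)
```

For example, with a capacity of two and two initial super-peers, four joining peers are spread evenly; a fifth is promoted, and it later absorbs the clients of a departing super-peer.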

In studying the use of Semantic Web technology to enhance search performance in digital libraries, this project investigates not just ontology-enriched metadata searching, but also the use of rules to express more complicated relations that exist among metadata records. We have compared the performance of a super-peer-based digital library by applying searches based on global schemas and ontology mapping. Currently, we are evaluating the potential introduction of rule-based reasoning in all applications.
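To show what a rule over metadata records might look like, here is a minimal Python sketch. The rule itself, the predicate names isPartOf and hasSubject, the triple representation and the function infer_subjects are assumptions chosen for illustration; the article does not specify the project's rule language or inference engine.

```python
def infer_subjects(triples):
    """Apply one hypothetical rule until no new facts appear:

        (a, isPartOf, b) and (b, hasSubject, s)  =>  (a, hasSubject, s)

    `triples` is an iterable of (subject, predicate, object) tuples.
    Returns a set containing the original and the inferred triples.
    """
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshots of the current facts, so the set can grow safely below.
        part_of = [(s, o) for s, p, o in facts if p == "isPartOf"]
        subjects = [(s, o) for s, p, o in facts if p == "hasSubject"]
        for child, parent in part_of:
            for owner, subject in subjects:
                new_fact = (child, "hasSubject", subject)
                if owner == parent and new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts


# Hypothetical metadata: a paper in a proceedings volume in a series.
EXAMPLE_TRIPLES = {
    ("paper1", "isPartOf", "proceedings1"),
    ("proceedings1", "isPartOf", "series1"),
    ("series1", "hasSubject", "digital libraries"),
}
```

Under this assumed rule, a subject search for "digital libraries" would also reach paper1, although only the series record carries that subject explicitly.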

Link:
http://www.idi.ntnu.no/grupper/if/

Please contact:
Hao Ding, NTNU, Norway
Tel: +47 73594168
E-mail: hao.ding@idi.ntnu.no