ERCIM News No.37 - April 1999
EUROgatherer - Personalised Information Gathering System
by Costantino Thanos
The explosive growth of the Internet and the World Wide Web has made an enormous amount of information widely available. Without methods to filter and control the potentially unlimited flow of information from sources to end-users, however, this abundance can easily become a mixed blessing (the well-known information overload syndrome). The EUROgatherer project aims to alleviate this problem by making information available to users in the appropriate form, amount and level of detail, and at the right time.
To this end, the project is designing and implementing a system that provides a personalised information gathering service on the Internet. In particular, the EUROgatherer system will provide functionality:
- to access a variety of information sources
- to create meaningful abstracts of the retrieved documents and classify them appropriately
- to acquire and retain user profiles and act upon one or more goals based on such profiles
- to support a relevance feedback mechanism.
A service-based architecture has been designed to meet the project requirements and is now under development. The EUROgatherer services provided by the system (UserProfiling, Delivery, Pull, Push, and Wrapper services) are visible to the user and accessible either separately or as members of a cluster of services (for example, Pull + Profiling + Delivery or Push + Profiling + Delivery). One service can access another, creating a cluster of interoperable services. Thus, an Internet user can access either a single EUROgatherer service or a composite cluster of services, which should provide a more complete service (for example, a personalised push or pull service).
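The idea of a cluster of interoperable services can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the dictionary-based request format, the in-memory profile store and the toy corpus are hypothetical and are not taken from the EUROgatherer design. It only shows how a Profiling + Pull + Delivery cluster might be formed by letting each service hand its result to the next.

```python
from typing import Callable, Dict

# Assumption: a request travels through the cluster as a plain dictionary.
Request = Dict[str, object]
Service = Callable[[Request], Request]


def profiling_service(request: Request) -> Request:
    """Attach the user's long-term interests (hypothetical in-memory store)."""
    profiles = {"alice": ["digital libraries", "information retrieval"]}
    interests = profiles.get(str(request.get("user")), [])
    return {**request, "interests": interests}


def pull_service(request: Request) -> Request:
    """Collect documents from a toy corpus that mention any interest."""
    corpus = [
        "Advances in information retrieval",
        "Football results",
        "Digital libraries in Europe",
    ]
    interests = request.get("interests", [])
    hits = [doc for doc in corpus
            if any(term in doc.lower() for term in interests)]
    return {**request, "results": hits}


def delivery_service(request: Request) -> Request:
    """Format the results as a numbered plain-text list."""
    results = request.get("results", [])
    lines = [f"{i}. {doc}" for i, doc in enumerate(results, start=1)]
    return {**request, "delivered": "\n".join(lines)}


def make_cluster(*services: Service) -> Service:
    """Compose services so that each one receives the previous one's output."""
    def cluster(request: Request) -> Request:
        for service in services:
            request = service(request)
        return request
    return cluster


# A personalised pull service built as a cluster of three services.
personalised_pull = make_cluster(profiling_service, pull_service, delivery_service)
outcome = personalised_pull({"user": "alice"})
print(outcome["delivered"])
```

Because every service in this sketch accepts and returns the same request structure, a single service and a composite cluster can be called in exactly the same way, which mirrors the separate-or-clustered access described above.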
The openness of the architecture is guaranteed by the fact that all the system components:
- share a common user profile model which complies with the P3P standard (see the Profiling service below)
- share a common category taxonomy
- use standard communication mechanisms/protocols.
The EUROgatherer system is composed of the following services:
- User Service - the main entry point to the system services. It provides the users with a global view of system functionality and supports them when accessing the system components.
- Profiling Service - responsible for the management (storage, maintenance and retrieval) of the user profiles.
- Dispatcher Service - responsible for dispatching information requests to other system services.
- Delivery Service - responsible for delivering the results of information requests to the users according to specified delivery modes.
- Pull Service - responsible for collecting information that meets user needs, expressed either as long-term needs (specified in a user profile) or as short-term needs (ad hoc queries, results on demand).
- Push Service - mainly responsible for filtering an information flow with respect to the users' long-term information needs.
- Wrapper Service - responsible for retrieving HTML pages by querying online Web databases and search engines and transmitting them to the Pull and Push services.
- Gatherer Service - responsible for retrieving HTML pages by spidering the Web and transmitting them to the Pull and Push services.
- News Service - responsible for collecting continuous information flows transmitted by news agencies or Usenet news discussion groups over several transmission means (satellite, Internet, teletext, videotext) and transmitting them to the Push service.
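As a rough illustration of how a push-style service might filter an incoming flow against long-term needs, the following Python sketch classifies each item with a small shared taxonomy and forwards it to every user whose profile lists a matching category. The taxonomy contents, the keyword-based classifier and all names are hypothetical; the article does not describe the classification or filtering techniques actually used.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple

# Hypothetical shared category taxonomy: category name -> indicative keywords.
TAXONOMY: Dict[str, Set[str]] = {
    "finance": {"stock", "market", "bank"},
    "sport": {"football", "match", "league"},
}


def classify(text: str) -> Set[str]:
    """Return the taxonomy categories whose keywords occur in the text."""
    words = set(text.lower().split())
    return {category for category, keywords in TAXONOMY.items()
            if words & keywords}


def push_filter(flow: Iterable[str],
                profiles: Dict[str, Set[str]]) -> Iterator[Tuple[str, str]]:
    """Yield (user, item) pairs for items matching a user's profile categories."""
    for item in flow:
        categories = classify(item)
        for user, wanted in profiles.items():
            if categories & wanted:
                yield user, item


# Assumed example data: an incoming news flow and two user profiles.
flow = ["Stock market falls", "Football league final", "Weather tomorrow"]
profiles = {"alice": {"finance"}, "bob": {"sport"}}
matches = list(push_filter(flow, profiles))
print(matches)
```

Expressing both the profiles and the classifier output in terms of the same category names is what lets the filter compare them directly, which is one practical benefit of a common taxonomy.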
The project adopts an advanced federated approach to the Internet services domain (see the figure). It will build on existing standards and contribute to the development of new ones.
The users of the EUROgatherer system can be divided into two categories:
- Casual users - casual users are generic Internet users who use the EUROgatherer system like any other Web search engine. They have at their disposal the AdHocQuery functionality, provided by the EUROgatherer User service, to issue queries to be executed by the system. This functionality satisfies the short-term information needs of a user.
- Registered users - registered users are users who have specified their long-term information needs through the definition of a profile. This profile is stored in a database managed by the Profiling service. Retrieval of information specified in a user profile in pull/push mode can be triggered either directly by the user or by the Dispatcher service on behalf of the user. Two interactive modes are available: Result on Demand and Scheduled Queries.
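The difference between the two user categories and the two modes can be sketched as follows. This is only an assumed reading of the description above: the Profile fields, the hour-based schedule and the function names are hypothetical, and the substring search stands in for whatever retrieval the real services perform.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

CORPUS = ["Advances in information retrieval", "Digital libraries in Europe"]


@dataclass
class Profile:
    """Hypothetical profile of a registered user."""
    user: str
    query: str                            # long-term information need
    interval_hours: Optional[int] = None  # None: result on demand only
    last_run_hour: Optional[int] = None


def search(query: str) -> List[str]:
    """Toy retrieval: case-insensitive substring match over the corpus."""
    return [doc for doc in CORPUS if query.lower() in doc.lower()]


def ad_hoc_query(query: str) -> List[str]:
    """Casual user: a short-term need, no profile involved."""
    return search(query)


def result_on_demand(profile: Profile) -> List[str]:
    """Registered user triggers retrieval of the profile directly."""
    return search(profile.query)


def run_due_profiles(profiles: List[Profile], now_hour: int) -> Dict[str, List[str]]:
    """Dispatcher acts on behalf of users whose scheduled query is due."""
    results: Dict[str, List[str]] = {}
    for profile in profiles:
        if profile.interval_hours is None:
            continue  # not scheduled; only runs on demand
        due = (profile.last_run_hour is None
               or now_hour - profile.last_run_hour >= profile.interval_hours)
        if due:
            results[profile.user] = search(profile.query)
            profile.last_run_hour = now_hour
    return results


registered = [Profile("alice", "retrieval", interval_hours=24),
              Profile("bob", "libraries")]
first_run = run_due_profiles(registered, now_hour=0)
too_early = run_due_profiles(registered, now_hour=12)
next_day = run_due_profiles(registered, now_hour=24)
```

In this sketch the same search function serves all three entry points; only the trigger differs (a casual query, a registered user's request, or the dispatcher's schedule).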
The EUROgatherer project is funded by the Telematics Information Engineering programme. It is conducted by a consortium composed of the following partners: Italia On Line, Italy; Xerox Research Centre Europe, France; CINET, Spain; Eurospider Information Technology, Switzerland; Consiglio Nazionale delle Ricerche - Istituto di Elaborazione dell'Informazione, Italy; University of Dortmund, Germany; Dublin City University, Ireland.
Costantino Thanos - IEI-CNR
EUROgatherer Project Coordinator
Tel: +39 050 593 492