Session B: Databases, Multilinguality

Group Report


Christos Nikolaou
ICS-FORTH


The database part of the session started with an introduction to database technology by Keith G Jeffery, from the ERCIM member CLRC-RAL. In his presentation, he defined database technology as a technology that provides an INTERNAL representation (model) of the EXTERNAL world of interest. It is concerned primarily with keeping the internal representation consistent with external reality.

Database technology is important because business in much of the world depends on it. Database technology is a CORE TECHNOLOGY with links to: information management / processing, data analysis / statistics, data visualisation / presentation, multimedia and hypermedia, office and document systems, business processes, workflow, and CSCW (computer-supported cooperative work). Modern DB systems depend on an infrastructure of: networks, both LAN (local area network) and WAN (wide area network), client-server computing architecture, skilled data analysis and DB design, and skilled systems development method(s).

In the context of Mediterranean cooperation, database technology can be helpful in developing systems for: cultural and scientific information, tourism, telemedicine, natural resources management, and production engineering.

ERCIM can help in the creation of such systems because of its strong competence in the area of databases. Each institute has a strong team in database technology, with its own national projects backed by participation in EU projects. Each team has links to academia (through teaching / research) and to commerce / industry (through projects and consultancy). In addition, the database groups in the ERCIM institutes formed EDRG, the ERCIM Database Research Group, in 1991.

Integrated Geographical Information Systems (GIS) were then discussed by Poulicos Prastacos, from the ERCIM member Institute of Applied and Computational Mathematics of the Foundation for Research and Technology-Hellas (FORTH). GIS have been used increasingly for visualizing information in a spatial context. Their use has been primarily in the production of digital maps for resources management and land information systems. However, a new trend is emerging in which GIS are combined with mathematical models to form decision support toolkits that can guide policy makers in evaluating the impact of alternative decisions.

Some of these integrated applications can be the subject of Mediterranean cooperation, since the problems they solve are common to most countries. One such application is an integrated system for environmental monitoring with emphasis on water resources; another is a system for urban planning.

An example of a cultural information system was presented by A.C. Kakas, of the Department of Computer Science of the University of Cyprus. The system provides a presentation of one of the major monasteries in Cyprus. The long-term aim is to develop similar systems for other cultural sites and link them together, thus providing a medium suitable for tourism and for other activities such as research into the culture and history of the island.

Multilinguality was introduced by Carol Peters, from the ERCIM member Istituto di Elaborazione della Informazione, Consiglio Nazionale delle Ricerche (CNR). With the recent rapid diffusion of worldwide distributed document bases over the international computer networks, the question of multilingual access and multilingual information retrieval is becoming increasingly relevant.

So far, research and development activities have been concentrated on monolingual environments and, in the large majority of cases, the default language has been English. Although English admittedly tends to play a predominant role in international communications, there are many risks inherent in allowing this predominance to remain unchallenged. The diversity of the world's languages and cultures gives rise to an enormous wealth of knowledge and ideas. It is thus essential to study and develop computational methodologies and tools that allow us to preserve and exploit this heritage rather than helping to destroy it. Ideally, it should be possible for users throughout the world - regardless of their native tongue - to have access to the massive volumes of information of all types - scientific, economic, literary, news, etc. - now available over the networks, and in particular through the Internet and the World Wide Web. Attention must be paid to assisting non-expert users. They must be provided with easy-to-use, flexible tools that help and guide them in the search for knowledge. This is especially important for developing countries, where one of the main keys to progress is education - and the main path to education is access to knowledge.

However, the question of multilingual access is an extremely complex one. Two basic issues are involved: the first is multiple language representation, manipulation, and display, and the second is multilingual search and retrieval. An ERCIM-sponsored project, the SAMOS project, will be addressing both of them. The SAMOS project aims at the development of a networked computer science technical report library. A digital library architecture will be developed which provides Internet access to a distributed, decentralised, multi-format collection of documents and includes a multilingual interface.

The remaining two presentations addressed specific areas of multilinguality. One was Multilingual Electronic Commerce, presented by A. Lehtola, from the ERCIM member VTT in Finland. The other was a presentation by Fathi Debili, of CNRS - idl, on the automatic or interactive construction of dictionaries for bilingual or monolingual expressions. This poses the problem of matching pairs of monolingual or bilingual texts. Experimentation is under way for the following pairs of languages: French-English and French-Arabic. Monolingual experimentation was performed on French texts.