ERCIM News No.37 - April 1999
Cross-Language Information System Evaluation
by Carol Peters
The development of methodologies and measures for system evaluation is an important topic in the information retrieval field. In recent years, the US National Institute of Standards and Technology (NIST) has done a great service to the IR community with the organisation of the Text Retrieval Conference (TREC) series in which various types of IR systems are evaluated by assessing their performance against set tasks (http://trec.nist.gov). Since 1997, TREC has included a track for cross-language information retrieval (CLIR) system evaluation.
The interest shown in the cross-language track since its introduction into the TREC conferences in 1997 (TREC-6) shows the importance of this emerging area. There are many applications where information should be accessible to users regardless of language. In the global information society, situations in which a user is faced with the task of querying a multilingual document collection are becoming increasingly common. Many users have some foreign language knowledge, but their proficiency may not be good enough to formulate queries that appropriately express their information needs. Such users would benefit enormously from being able to enter queries in their native language and still retrieve relevant documents in other languages, even if those documents remain untranslated. Monolingual users, on the other hand, can use translation aids to help them understand their search results. There is thus a growing demand for efficient cross-language query systems; the aim of the CLIR evaluation effort is to stimulate research in this area by providing a forum for the exchange of ideas and communication of results, and to help create the test collections necessary for effective evaluation.
The CLIR track at TREC-8
The main task of this year's CLIR track is to search documents in a multilingual pool containing news documents in four different languages: English, German, French, and Italian. The goal is to retrieve documents from all languages, rather than just a given pair. This is to encourage groups to work with as many languages as possible. For each topic (ie, query), participants submit a list of document identifiers ranked in decreasing order of the estimated relevance of the documents; the list will usually contain documents in all four languages. The main evaluation will be based on this list, although a simplified task will also be considered, using only English and one second language. In addition, there will be a special subtask based on a second document collection from the social science field. The rationale of this subtask is to study CLIR in a vertical domain (ie, social science) where a German/English thesaurus is available.
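The submission format described above (a single list of identifiers per topic, ranked by decreasing estimated relevance across all four languages) can be sketched as follows. This is only an illustration: the document identifiers, relevance scores, and the simple score-based merging strategy are all invented here, and a real system would also have to make scores comparable across languages before merging.

```python
# Illustrative sketch: merge per-language retrieval results into one
# cross-language ranked list, as required by the CLIR track.
# All identifiers and scores below are hypothetical.

def merge_rankings(per_language_results):
    """Pool (doc_id, score) pairs from every language and return the
    document identifiers sorted by decreasing estimated relevance."""
    pooled = [
        (doc_id, score)
        for results in per_language_results.values()
        for doc_id, score in results
    ]
    pooled.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in pooled]

# One topic's results from four hypothetical monolingual runs:
results = {
    "english": [("EN-001", 0.92), ("EN-017", 0.40)],
    "german":  [("DE-105", 0.85)],
    "french":  [("FR-033", 0.61)],
    "italian": [("IT-210", 0.77)],
}

print(merge_rankings(results))
# ['EN-001', 'DE-105', 'IT-210', 'FR-033', 'EN-017']
```

Note that the resulting list mixes documents from all four languages, which is exactly what the main evaluation measures: a system that retrieves well in only one language pair will rank poorly against the full multilingual pool.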
The experience of the first CLIR track showed that it is difficult to produce topics in several languages at a single site. Thus, since TREC-7, a distributed approach to topic creation and results assessment has been adopted. There are currently four sites, each located in an area where one of the topic languages is spoken natively:
- English: NIST, Gaithersburg, MD, USA
- French: University of Zurich, Switzerland
- German: Social Science Information Centre, Bonn/University of Koblenz, Germany
- Italian: IEI-CNR, Pisa, Italy
The track coordinator is Peter Schäuble, EIT, Zurich, Switzerland.
Each site is responsible for creating a certain number of topics and for translating the topics created by the other sites into the local language. In this way, an equivalent set of topics is created for each of the four languages. Sites are also responsible for the evaluation of results against the local language document collection.
Although the CLIR track is coordinated in Europe, the results are presented in the United States, at the TREC conferences in November. It is thus probably not surprising that, so far, participation has been dominated by US groups. At TREC-7, there were nine CLIR participants: six from North America (five from the US and one from Canada) and just three from Europe (one each from France, Switzerland and the Netherlands). This is clearly not representative of the European situation, where multilingual issues are an everyday reality. We very much hope that this imbalance will be at least partially redressed this year at TREC-8.
CLIR System Evaluation in 2000
Together with the TREC coordinators at NIST, we are now looking towards the future. It seems evident that issues regarding the effectiveness of multilingual information retrieval systems will gain in importance in the next few years. We are thus seriously considering the possibility of setting up a European forum specifically for cross-language and multilingual system assessment. In this way, we could extend the scope of the current CLIR evaluation activity to cover other important, related issues, such as cross-language evaluation for multi-modal systems, monolingual evaluation for non-English IR systems, etc. Such a European forum would continue to collaborate with NIST, but all coordination activities would be centred in Europe, including the maintenance of a repository of tools, training data and evaluation suites for use by the scientific community in system evaluation activities.
Martin Braschler - EIT, Zurich
Tel: +41 1 365 3052