Cross-Language Information System Evaluation
by Carol Peters
The development of methodologies and measures for system evaluation
is an important topic in the information retrieval field. In recent
years, the US National Institute of Standards and Technology
(NIST) has done a great service to the IR community with the organisation
of the Text Retrieval Conference (TREC) series in which various
types of IR systems are evaluated by assessing their performance
against set tasks (http://trec.nist.gov). Since 1997, TREC has
included a track for cross-language information retrieval (CLIR)
system evaluation.
The interest that the cross-language track has attracted since its
introduction at TREC-6 in 1997 demonstrates the importance
of this emerging area. There are many applications where information
should be accessible to users regardless of language. In the global
information society, situations in which a user must query a
multilingual document collection are becoming increasingly common.
Many users have some foreign language knowledge, but their proficiency
may not be good enough to formulate queries that appropriately
express their information needs. Such users will benefit enormously
if they can enter their queries in their native language, because
they are able to examine relevant documents in other languages even
if these are untranslated. Monolingual users, on the other hand, can
use translation aids to help them understand their search results.
There is thus a growing demand
for efficient cross-language query systems; the aim of the CLIR
evaluation effort is to stimulate research in this area by providing
a forum for the exchange of ideas and communication of results
and to help create the necessary test collections for effective
evaluation.
The CLIR track at TREC-8
The main task of this year's CLIR track is to search documents
in a multilingual pool containing news documents in four different
languages: English, German, French, and Italian. The goal is to
retrieve documents from all languages, rather than just a given
pair. This is to encourage groups to work with as many languages
as possible. For each topic (ie, query), participants submit a list
of document identifiers ranked in decreasing order of the estimated
relevance of the documents; the list will usually contain
documents in all four languages. The main evaluation will be based
on this list, although a simplified task involving only English and
one other language will also be considered. In addition, there will
be a special subtask that uses a second data collection
from the social science field. The rationale of this subtask is
to study CLIR in a vertical domain (ie, social science) where a
German/English thesaurus is available.
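As an illustration of the main task's output, the following is a minimal sketch of how a group might combine separate per-language result lists into the single ranked list required for each topic. The track does not prescribe any merging method; min-max score normalisation is used here purely as one simple possibility, and the document identifiers, scores and the `limit` parameter are hypothetical.

```python
from typing import Dict, List, Tuple

# One retrieval run for a single topic: (document identifier, raw score).
Run = List[Tuple[str, float]]


def min_max_normalise(run: Run) -> Run:
    """Rescale the scores of one run to the range [0, 1]."""
    if not run:
        return []
    scores = [score for _, score in run]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores are equal, so every document gets the top score.
        return [(doc_id, 1.0) for doc_id, _ in run]
    return [(doc_id, (score - lo) / (hi - lo)) for doc_id, score in run]


def merge_runs(runs_by_language: Dict[str, Run], limit: int = 1000) -> List[str]:
    """Merge per-language runs into one list, best document first.

    Assumption: raw scores from different language collections are not
    directly comparable, so each run is normalised before pooling.
    """
    pooled: Run = []
    for language in sorted(runs_by_language):
        pooled.extend(min_max_normalise(runs_by_language[language]))
    # Decreasing normalised score; ties are broken by identifier.
    pooled.sort(key=lambda pair: (-pair[1], pair[0]))
    return [doc_id for doc_id, _ in pooled[:limit]]


# Hypothetical results for one topic in the four track languages.
example = {
    "english": [("EN-001", 12.0), ("EN-002", 6.0), ("EN-003", 3.0)],
    "german": [("DE-001", 0.9), ("DE-002", 0.5)],
    "french": [("FR-001", 40.0), ("FR-002", 30.0), ("FR-003", 10.0)],
    "italian": [("IT-001", 2.0)],
}
merged = merge_runs(example)
```

In this sketch the top document of each language ends up tied at the head of the merged list, which shows one known weakness of per-run normalisation: it assumes that every collection contains equally relevant material for the topic.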
The experience of the first CLIR track showed that it is difficult
to produce topics in several languages at a single site. Thus,
since TREC-7, a distributed approach to topic creation and results
assessment has been chosen. There are currently four different
sites, each located in an area where one of the topic languages
is spoken natively:
- English: NIST, Gaithersburg, MD, USA
- French: University of Zurich, Switzerland
- German: Social Science Information Centre, Bonn/University of
Koblenz, Germany
- Italian: IEI-CNR, Pisa, Italy
The track coordinator is Peter Schäuble, EIT, Zurich, Switzerland.
Each site is responsible for creating a certain number of topics
and for translating the topics created by the other sites into
the local language. In this way, an equivalent set of topics is
created for each of the four languages. Sites are also responsible
for the evaluation of results against the local language document
collection.
Although the CLIR track is coordinated in Europe, the results
are presented in the United States, at the TREC Conferences in
November. It is thus probably not surprising that, so far, the
participation has been dominated by US groups. At TREC-7, there
were nine CLIR participants: six from North America (five from the
US and one from Canada) and just three from Europe (one each from
France, Switzerland and the Netherlands). This is clearly not representative of the
European situation where multilingual issues are an everyday reality.
We very much hope that this imbalance will be at least partially
redressed this year at TREC-8.
CLIR System Evaluation in 2000
Together with the TREC coordinators at NIST, we are now looking
towards the future. It seems evident that issues regarding multilingual
information retrieval system effectiveness will gain in importance
in the next few years. We are thus seriously considering the
possibility of setting up a European forum specifically for cross-language
and multilingual system assessment. In this way, we could extend
the scope of the current CLIR evaluation activity to cover other
important, related issues, such as cross-language evaluation for
multi-modal systems, monolingual evaluation for non-English IR
systems, etc. Such a European forum would continue to cooperate
with NIST, but all coordination activities would be centred in
Europe, including the maintenance of a repository of tools, training
data and evaluation suites for use by the scientific community
for system evaluation activities.
Please contact:
Martin Braschler - EIT, Zurich
Tel: +41 1 365 3052
E-mail: braschler@eurospider.ch