Cross-Language Evaluation Forum - CLEF 2003
by Carol Peters
The results of the fourth campaign of the Cross-Language Evaluation Forum were presented at a two-day workshop held in Trondheim, Norway, on 21-22 August 2003, immediately following the seventh European Conference on Digital Libraries.
The main objectives of the Cross-Language Evaluation Forum (CLEF) are to stimulate the development of mono- and multilingual information retrieval systems for European languages and to contribute to the building of a research community in the multidisciplinary area of multilingual information access. These objectives are realised through the organisation of annual evaluation campaigns and workshops.
Each year, CLEF offers a series of evaluation tracks designed to test different aspects of mono- and cross-language system performance. The intention is to encourage systems to move from monolingual text retrieval to the implementation of a full multilingual multimedia search service. CLEF 2003 offered eight tracks evaluating the performance of systems for monolingual, multilingual and domain-specific information retrieval, multilingual question answering, and cross-language image and spoken document retrieval.
The main CLEF multilingual corpus now contains well over 1,600,000 news documents in nine languages, including Russian. A secondary collection, GIRT-4, consisting of English and German social science documents, is used to test domain-specific system performance.
CLEF 2003 posed a number of challenges, in particular with respect to the multilingual and bilingual tracks, where the aim was to encourage work on many European languages rather than just those most widely used. There were two distinct multilingual tasks; the more challenging involved retrieving relevant documents from a collection in eight languages (Dutch, English, Finnish, French, German, Italian, Spanish and Swedish) and presenting the results in a single ranked list.
Tasks offered in the bilingual track involved 'unusual' language pairs: Italian - Spanish, German - Italian, French - Dutch, Finnish - German. We were very pleased by the number of groups that attempted these difficult tasks.
Another positive aspect of CLEF 2003 was the number of new tracks offered as pilot experiments. These included mono- and cross-language question answering, and cross-language image and spoken document retrieval. The aim has been to try out new ideas and develop new evaluation methodologies suited to the emerging requirements of both system developers and users with respect to today's digital collections. This year's interactive track included full cross-language search experiments in which the user attempts to find relevant documents using a complete interactive cross-language system that provides assistance in both query formulation and document selection.
Participation in the CLEF 2003 campaign was slightly up with respect to the previous year, with 43 groups submitting results for one or more of the different tracks: 10 from North America, 30 from Europe and 3 from Asia. As in previous years, participants were a good mix of newcomers and veteran groups. Another important continuing trend is the progression of many returning groups to more complex tasks: from monolingual to bilingual, and from bilingual to multilingual.
The campaign culminated in a Workshop held in Trondheim, Norway, on 21-22 August. More than sixty researchers and system developers from academia and industry attended the Workshop to discuss the results of their experiments. In addition to presentations by participants in the CLEF campaign, talks included reports on the activities of the NTCIR evaluation initiative for Asian languages, and on cross-language information retrieval work at Moscow State University. The final session discussed proposals for future evaluation activities within the CLEF framework.
The presentations given at the CLEF Workshops and detailed reports on the experiments of CLEF 2003 and previous years can be found on the CLEF website at http://www.clef-campaign.org/
Carol Peters, ISTI-CNR
Tel: +39 050 3152897