Multilingual Interactive Experiments with Flickr

by Jussi Karlgren, Paul Clough and Julio Gonzalo

The Cross-Language Evaluation Forum (CLEF) in 2006 will feature a track on interactive image retrieval from dynamic target data taken from the popular Flickr photo-sharing service. In the past, interactive tracks at CLEF have addressed applications such as information retrieval and question answering. This year, however, the focus has turned to text-based image retrieval from Flickr.

Information retrieval systems, especially text retrieval systems, have in the last few decades benefited greatly from a fairly strict and strait-laced evaluation scheme, which enables system designers to run tests on versions of their system using a test collection of pre-assessed data. These tests, based on the target notion of topical relevance and on system-oriented evaluation of performance, have served the text retrieval field well. However, system evaluation only addresses some of the bottlenecks in building a successful system.

As a complement, experiments such as iCLEF, the interactive track at CLEF, aim to investigate real-life cross-language searching problems in a realistic scenario, and to give indications of how best to aid users in solving them. This crucially involves developing new evaluation methodologies and new target notions: relevance does not cover all the aspects that make an interactive session successful.

Over the past five years, the CLEF interactive track has studied various cross-language search tasks, including retrieval of documents, answers and annotated images. All tasks involve the user interacting with information systems in a language different from that of the document collection, and have been evaluated using conventional evaluation methodologies. Such evaluation requires a fairly elaborate experimental setup.

This year we have introduced some major changes. We wanted a collection in which the need for cross-language search arises more naturally for average users. We have chosen Flickr, a large-scale, Web-based image database serving a large social network of Web users. It has the potential to offer both challenging and realistic multilingual search tasks for interactive experiments.

We want to use the iCLEF track to explore alternative evaluation methodologies for interactive information access. For this reason, we have decided to fix the search tasks, but to keep the evaluation methodology open. This allows each participant to contribute their own ideas about how to study interactive issues in cross-lingual information access.

Additionally, we will lower the threshold for entry to attract more participants.

This year, the tasks given to participants are:
Topical ad-hoc retrieval over many languages: find pictures of as many different European parliaments as possible.
Creative open-ended retrieval: illustrate a short text on a given topic with five pictures (the text is provided separately to the experiment subjects).
Example-based retrieval: determine the name of the place shown in a given photo.

The majority of Web image searching is text-based, and the success of such an approach often depends on reliably identifying relevant text associated with a particular image. Flickr is an online tool for managing and sharing personal photographs and currently contains over five million freely accessible images. These are available via the Web, and are updated daily by a large number of users. The photos are annotated by authors with freely chosen keywords in a naturally multilingual manner. Most authors use keywords in their native language; some combine keywords in more than one language. This sort of emerging, unsupervised and distributed semantic structure is known as a folksonomy and provides a modelling challenge for traditional knowledge-based retrieval approaches.
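To make the cross-language difficulty concrete, the following minimal Python sketch matches a query term against freely chosen photo tags in several languages. The translation table, the function names and the sample tags are assumptions made for illustration only; they are not part of the track or of Flickr, and a real experiment would need a far larger lexical resource.

```python
# Hypothetical, hand-made translation table for a single query term.
# Illustration only: not provided by the track or by Flickr.
TRANSLATIONS = {
    "parliament": {"parliament", "parlamento", "parlement", "parlament", "riksdag"},
}


def expand_query(term):
    """Return the set of tag variants to look for, lower-cased."""
    key = term.lower()
    return TRANSLATIONS.get(key, {key})


def matches(photo_tags, term):
    """True if any tag on the photo equals one of the query variants."""
    tags = {tag.lower() for tag in photo_tags}
    return bool(tags & expand_query(term))
```

A monolingual search for "parliament" would miss a photo tagged only "Parlamento", whereas `matches(["Madrid", "Parlamento"], "parliament")` succeeds once the term is expanded.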

An example photo from Flickr with multilingual annotations.

Participants will access images and metadata in Flickr through the open API provided by Flickr, and are encouraged to log as many details as possible about every search session. A skeleton questionnaire will be provided to collect some of the evaluation metrics, and we will aim to probe notions related to user satisfaction and confidence:

Satisfaction (all tasks): are you satisfied with how you performed the task?
Completion (creative task, ad-hoc task): did you find sufficient results, or would you have continued if there had not been a time limit? How long would you have continued?
Score (example-based task, ad-hoc task): one point for each relevant image found.
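As an illustration of how a participant might log a search session and derive the score, here is a minimal Python sketch. The class name, field names and event vocabulary ("query", "select") are assumptions chosen for this example; the track does not prescribe a log format, and no Flickr API calls are made.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class SessionLog:
    """Hypothetical log of one subject performing one search task."""
    subject_id: str
    task: str                          # e.g. "ad-hoc", "creative", "example-based"
    events: list = field(default_factory=list)
    satisfied: Optional[bool] = None   # questionnaire answer, filled in afterwards

    def record(self, action, **details):
        # Store each interaction (query, selection, ...) with a timestamp.
        self.events.append({"time": time.time(), "action": action, **details})

    def score(self, relevant_ids):
        # One point for each distinct relevant image the subject selected.
        selected = {e.get("photo_id") for e in self.events if e["action"] == "select"}
        return len(selected & set(relevant_ids))

    def to_json(self):
        # Serialise the whole session for later analysis.
        return json.dumps(asdict(self))
```

Selecting the same image twice counts once in this sketch; whether duplicates should count is a design choice left to each participant.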

This year's workshop, held in Alicante in September 2006, will discuss the efficiency of search strategies, the usefulness of tested methods, and the utility of the projected evaluation methodologies.

Links:
CLEF: http://www.clef-campaign.org
Track home page: http://nlp.uned.es/iCLEF/
Data source: http://www.flickr.com

Please contact:
Jussi Karlgren, SICS, Sweden
Tel: +46 8 633 1500
E-mail: jussi@sics.se

ERCIM News 66
