ERCIM-Led MUSCLE Network of Excellence: Approved and Poised to Embark on Ambitious Four-Year Research and Integration Program
by Eric Pauwels
The European Commission signed the contract for the FP6 Network of Excellence MUSCLE (Multimedia Understanding through Semantics, Computation and Learning) on 23 February, thereby giving this ERCIM-led consortium of 42 scientific groups the final go-ahead to embark on an ambitious four-year research and integration program.
As the Network's expanded acronym indicates, MUSCLE aims to facilitate high-level access to multimedia databases by systematically incorporating machine learning into an integrated approach to multimedia data mining. The original impetus for this initiative stems from the realisation that we urgently need new tools to intelligently index and explore the vast quantities of multimedia documents currently being amassed. As the enormous size of these collections precludes comprehensive human annotation, the only viable alternative is the development of reliable machine perception and understanding, and in particular, the automatic creation of semantically rich metadata that can be used as input for subsequent high-level processing. Indeed, enriching multimedia databases with additional layers of automatically generated semantic metadata, together with the artificial intelligence needed to reason about both the data and the metadata, seems the way forward in mining for complex content, and it is at this level that MUSCLE will focus its main effort. This will enable users to move away from labour-intensive, case-by-case modelling of individual applications, and allow them to take full advantage of generic, adaptive and self-learning solutions that need minimal supervision.
The scientific work has been divided into workpackages (WPs), which collectively constitute the Joint Program of Activities (JPA). Each WP covers a different but complementary component of the overall research strategy. The Single Modality WP groups together all the research that is restricted to a single sensor modality (ie audio, video or speech). This well-established approach is augmented by the work done in the Cross-Modal Integration WP, where the focus is on the performance improvement that can be achieved by combining different but synergistic modalities. For instance, visual interpretation of a sports video can be improved by taking into account the accompanying audio stream (eg crowd cheering). The WP on Machine Learning addresses the possibility of learning data models automatically instead of having to hand-code them. A typical application would be the automatic classification of music into classical or modern based on a number of illustrative examples. The WP on Computation Intensive Methods investigates how sophisticated computational techniques can assist in exploring complicated models or estimating uncertainty. Using numerical simulation to determine parameter confidence intervals is a case in point. Finally, the WP on Human Computer Interfaces looks at the role of such interfaces in the exploration or visualisation of complex datasets, while the Meta-Data Representation WP concerns itself with the internal representation of acquired information.
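To make the remark about simulation-based confidence intervals concrete, the sketch below uses the percentile bootstrap, which is one common resampling technique and is our choice for illustration only; the article does not say which methods the WP will actually use. The function name `bootstrap_ci`, its parameters and the synthetic data are all hypothetical.

```python
import random
import statistics


def bootstrap_ci(sample, estimator=statistics.mean,
                 n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for a parameter estimate.

    Illustrative sketch only: names and defaults are assumptions,
    not part of the MUSCLE programme.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(sample)

    # Simulate many "replicate" data sets by resampling with replacement,
    # and re-estimate the parameter on each of them.
    estimates = sorted(
        estimator(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )

    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the estimates.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Synthetic measurements (an assumption, purely for demonstration).
_rng = random.Random(42)
data = [_rng.gauss(10.0, 2.0) for _ in range(200)]

low, high = bootstrap_ci(data)
print(f"estimate: {statistics.mean(data):.3f}")
print(f"95% bootstrap interval: [{low:.3f}, {high:.3f}]")
```

The appeal of this kind of approach is that it needs no closed-form expression for the estimator's variance: any function of the data can be passed as `estimator` (for example `statistics.median`), at the cost of repeating the computation a few thousand times.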
Two Grand Challenges
To encourage close coordination of effort and durable scientific integration, MUSCLE will set itself two 'Grand Challenges'. These are ambitious research projects that involve the entire spectrum of expertise represented within the consortium and, as such, will act as focal points. The first challenge addresses natural high-level interaction with multimedia databases, that is, querying such databases at a high semantic level. Think Ask Jeeves for multimedia content: one can address a search engine using natural language and it will take appropriate action, or at least ask intelligent, clarifying questions. This is an extremely complicated problem and will involve a wide range of techniques: natural language processing, interfacing technology, learning and inference, merging of different modalities, federation of complex metadata, appropriate representation and interfaces, and so on. The second challenge is more closely related to machine perception and addresses the problem of detecting and recognising humans and their behaviour in videos. At first glance this might seem rather narrow, but it has become clear that robust performance will rely heavily on the integration of various complementary modalities such as vision, audio and speech. Applications are legion: surveillance and intrusion detection, face recognition, registration of emotion or affect, and automatic analysis of sports videos and movies, to name just a few.
The research plans outlined above cover only part of the MUSCLE mission. In addition, strong emphasis will be placed on networking and dissemination, as the European Commission intends Networks of Excellence (NoEs) to be important players in a Europe-wide drive towards durable integration and collaboration. To this end, MUSCLE has planned a number of initiatives. First, there will be an annual post-doctoral fellowship scheme extending and complementing the ERCIM model. As is the case for the latter, applications will be open to talented young researchers from across the globe. The consortium will also set up a Web-based infrastructure to facilitate electronic collaboration between different teams and to support access to multimedia databases for benchmarking or testing purposes. In the same vein, MUSCLE will host a multimedia preprint server, offering authors the opportunity to publish their research results in a media-rich format that does better justice to the content. Input from industrial and commercial parties will be solicited through the setting up of an Application Forum. Finally, in order to maximise its impact on the European and global research scenes, MUSCLE will pool resources with other Networks and Integrated Projects active in the area of Semantic-based Knowledge Systems.
For more information, the reader is invited to visit the MUSCLE Web page.
Eric Pauwels, CWI, Amsterdam