c/o Idiap
Centre du Parc
Rue Marconi 19
Case Postale 592
CH-1920 Martigny

tel. +41 27 721 77 11
fax +41 27 721 77 12


Welcome to IM2

The Swiss National Center of Competence in Research (NCCR) on Interactive Multimodal Information Management (IM2)

IM2 is one of the 20 Swiss National Centres of Competence in Research (NCCR), which aim to boost research and development in several areas considered of strategic importance to the Swiss economy. The National Centres of Competence in Research are a research instrument managed by the Swiss National Science Foundation (SNSF) on behalf of the Federal Authorities. Granted for a maximum duration of 12 years, they are evaluated every year by a review panel and renewed every four years. In December 2009, the SNSF approved the next and last four-year period (2010 - 2013, IM2 Phase III), which started on January 1st, 2010. The success of the NCCRs is measured in terms of research achievements, training of young scientists (PhD students and postdocs), knowledge and technology transfer (including spin-offs), and advancement of women.

Lay summary

IM2 is concerned with the development of natural multimodal interfaces for human-computer interaction. By “multimodal” we mean the different technologies that coordinate natural input modes (such as speech, pen, touch, hand gestures, head and body movements, and eventually physiological sensors) with multimedia system output (such as speech, sounds, and images). Ultimately, these multimodal interfaces should flexibly accommodate a wide range of users, tasks, and environments for which any single mode may not suffice. The ideal interface should primarily be able to deal with more comprehensive and realistic forms of data, including mixed data types (i.e., data from different input modalities such as image and audio).

As part of IM2, we are also focusing on computer-enhanced human-to-human interaction. Indeed, understanding human-human interaction is fundamental to the long-term pursuit of powerful and natural multimodal interfaces for human-computer interaction. In addition to making rich, socially-enhanced analyses of group process ripe for exploitation, our advances in speech, video, and language processing, as well as the tools for working with multimodal data, will improve research and development in many related areas.

The field of multimodal interaction covers a wide range of critical activities and applications, including recognition and interpretation of spoken, written and gestural language, particularly when used to interface with multimedia information systems, and biometric user authentication (protecting information access). As addressed by IM2, management of multimedia information systems is a wide-ranging and important research area that includes not only the multimodal interaction described above, but also multimedia document analysis, indexing, and information retrieval. The development of this technology is necessarily multi-disciplinary, requiring the collaborative contributions of experts in engineering, computer science, and linguistics.

To foster collaboration, and as a particularly interesting application, IM2 focuses mainly on new multimodal technologies to support human interaction in the context of smart meeting rooms and remote meeting assistants. In this context, IM2 aims to enhance the value of multimodal meeting recordings and to make human interaction more effective in real time. These goals will be achieved by developing new tools for computer-supported cooperative work and by designing new ways to search and browse meetings as part of an integrated multimodal group communication, captured from a wide range of devices. Several technology prototypes, able to record meetings and to automatically generate searchable multimedia meeting archives, are now available. Some of the resulting technologies are being exploited by IM2 spin-offs or have been adopted by companies working in various fields of Information and Communication Technology (ICT), including video-conferencing and meeting facilitation.

Keywords: Human-computer interaction, human-to-human communication, automatic speech recognition, computer vision, multimedia indexing, smart meeting room, Integrated Multimodal Processing, Social Signal Processing, Human Centered Design and Evaluation.

The National Centres of Competence in Research (NCCR) are a research instrument of the Swiss National Science Foundation (SNSF).

All relevant information about IM2 is also available on

Last modified 2013-10-24 11:21
