Interactive Multimodal Information eXtraction (IMIX)

The Demonstrator

The IMIX demonstrator is intended to show the status, progress, and results of the research carried out in the programme. All research groups participated and collaborated in building it. The demonstrator serves as a vehicle to show that the modules developed in the individual projects can operate jointly within an end-to-end spoken multimodal question-answering dialogue system.

The IMIX demonstrator is an interactive multimodal question-answering system for information in a particular domain: encyclopaedic medical information (drawing on two Dutch medical encyclopaedias). The system can handle questions more complex than simple factual ones, and it can engage in a simple dialogue with the user. In follow-up questions, the system may attempt to obtain a better understanding of the question, or to reach a better common understanding with the user in order to arrive at an improved question. The answers may consist of noun phrases, sentences, or paragraphs in either text or speech format, or of tables or graphical displays, depending on the (type of) question, the contents of the answer, and the needs or profile of the user.

Two versions of the demonstrator were realized. The first version only answered questions in a single turn; the main aim of this version was to technically integrate basic versions of all modules to form a fully operational end-to-end system. The second version added interactivity and multimodal input. The demonstrator included:

  • A multimodal part focusing on the RSI domain. Here, speech and keyboard/mouse input are combined with multimodal output (speech, text, tables, and graphics).

  • A text-based part focusing on the complete (Spectrum, Merck, and RSI) medical domain.