 MUMS

Mobile Multimodal Route Navigation



| Introduction | MUMS System | Material | Contact | Suomeksi (in Finnish) |

Introduction

PUMS Project

In the PUMS ("New Methods and Applications of Speech Technology") project, speech-based interaction is studied from both the technology and the human-computer interaction perspectives. The consortium includes several major Finnish universities, research institutes and companies. The project is part of the FENIX - Interactive Computing technology programme of the National Technology Agency of Finland. The Applications subproject studies local transportation services suited to mobile devices and to user groups such as visually impaired users. User-centered research methods are applied throughout the iterative development. An additional goal of this subproject is to implement a passenger information system for the international UITP-2007 public transportation conference held in Helsinki.

Multimodality & MUMS

As mobile devices decrease in size, the importance of interface design rises significantly. Speech is a natural and efficient way of communicating, but because of the challenges in speech recognition, flexible commercial speech-based applications have yet to be introduced. This is where parallel information channels, e.g. gestures, come in. This so-called multimodality has already been used for some time in various limited-functionality applications. Multimodality makes interaction natural and efficient, but at the same time it requires additional effort in the design and implementation phases. This is one of the reasons why multimodal applications haven't gained much ground.

Image 1. The MUMS client application in use

The MUMS system is accessed from a standard PDA device (some features must be disabled on simpler client devices, e.g. mobile phones), and is aimed at public transportation commuters. Users interact with the system using natural speech and map gestures. The system responds with synthetic speech and graphical map representations. The system also utilizes GPS information, which not only helps users pinpoint their location, but also simplifies the dialogue.

Image 2. Depiction of the MUMS system

Recognition of the user's utterances can also be improved with a custom-made interaction (dialogue) model. In the MUMS system we chose to divide the dialogue into two phases: 1. the user and the system work out a route that is valid and suits the user, and 2. the system guides the user along this route.
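The two-phase split can be pictured as a small state machine. The sketch below is illustrative only: the class, the phase names, the prompt texts and the use of "navigate" as the trigger word are assumptions, not the actual MUMS dialogue manager.

```python
from enum import Enum


class Phase(Enum):
    ROUTE_DEFINITION = 1  # user and system negotiate a valid, suitable route
    NAVIGATION = 2        # system guides the user along the agreed route


class Dialogue:
    """Hypothetical two-phase dialogue model (not the MUMS implementation)."""

    def __init__(self):
        self.phase = Phase.ROUTE_DEFINITION
        self.route = None  # list of leg descriptions once a route is proposed

    def propose_route(self, legs):
        # A route can only be proposed while it is still being defined.
        if self.phase is not Phase.ROUTE_DEFINITION:
            raise RuntimeError("route already accepted")
        self.route = list(legs)

    def handle(self, utterance):
        # Accepting a proposed route switches the dialogue to navigation.
        if self.phase is Phase.ROUTE_DEFINITION:
            if utterance == "navigate" and self.route:
                self.phase = Phase.NAVIGATION
                return "Starting navigation."
            return "Please give a departure and a destination."
        return "Guiding along the route."
```

Keeping the phase explicit means the set of utterances the system has to expect at any moment stays small, which is one way a dialogue model can help recognition.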

Future plans include using positioning data in the navigation phase as well. This would enable more accurate and flexible navigational instructions, since the user's location would be known at all times.

A fully functional prototype of the MUMS system is currently in test use. A more thorough usability test was carried out in late 2005, with 20 participants using the system in scenario-based route-finding tasks. New features and improvements will be added to the system based on user experiences, and a detailed analysis of the evaluation as a whole will follow soon.


MUMS System

Client Application

Users can freely pan and zoom the map in real time. User input (speech and map gestures) is recorded simultaneously with the help of a recording button. The user is free to present input unimodally with either speech or gestures, or multimodally with a combination of both (see Image 3 below for an example). Image 3 also shows a system response, where synthetic speech is combined with a graphical representation. Once the user accepts the route the system proposes, the user starts the navigation phase by saying, e.g., "navigate". The user is then guided along the route one leg at a time, at either the normal or the detailed guidance level.

Image 3. The user interface
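The leg-by-leg guidance with two levels of detail can be sketched as a simple generator. This is a minimal illustration under assumed data: the dictionary keys `summary` and `details` and the wording of the instructions are hypothetical, not the format MUMS uses.

```python
def guidance(legs, detailed=False):
    """Yield one spoken instruction per route leg.

    Each leg is a dict with hypothetical keys: 'summary' (always spoken)
    and 'details' (extra steps spoken only at the detailed level).
    """
    for number, leg in enumerate(legs, start=1):
        text = f"Leg {number}: {leg['summary']}"
        if detailed and leg.get("details"):
            text += " " + " ".join(leg["details"])
        yield text
```

Because the function is a generator, the next instruction is produced only when the caller asks for it, which matches guiding the user one leg at a time.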

If the client device is coupled with a GPS device, the system attempts to utilize the positioning data, so that the user can present route queries in a simpler form:

     Without GPS: "I need to get from Aleksanterinkatu thirteen to Länsi-Pasila."

     With GPS: "I need to get from here to Länsi-Pasila." or just: "I want to get to Länsi-Pasila."
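One way to picture how positioning data shortens the query is a fallback for the departure point. The function below is a sketch under assumptions: its name, the returned dictionary and the clarification prompt are hypothetical and do not come from the MUMS system.

```python
def resolve_query(origin, destination, gps_fix=None):
    """Fill in the departure point of a route query.

    origin: place name, "here", or None when the user named no origin.
    gps_fix: (latitude, longitude) tuple, or None when no GPS is attached.
    """
    if origin in (None, "here"):
        if gps_fix is None:
            # Without positioning data the system has to ask the user.
            return {"ask": "Where are you leaving from?"}
        origin = gps_fix
    return {"from": origin, "to": destination}
```

With a GPS fix available, both "from here to Länsi-Pasila" and a query that names only the destination resolve to the same request, so no clarification turn is needed.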

The accepted input modes and the output modes used by the system are freely customizable in the application preferences. A detailed route description in textual form can be found in the 'Route' menu template.

MUMS Server

The MUMS server is built on the highly customizable Jaspis framework. The system is connected to an external routing service. Some parts of the dialogue between the user and the system are carried out on the server side and some on the client device, depending on the device's capabilities. For example, when a PDA is used, synthetic speech is produced on the client device and speech recognition is carried out on the server. To achieve the desired level of flexibility, all information exchanged between the different parts of the system is in XML format (except, of course, recorded speech).
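As an illustration of XML-based message passing between server and client, the sketch below serializes a system response and reads it back. The element and attribute names (`response`, `speech`, `route`, `leg`, `number`) are invented for this example; the actual MUMS message schema is not described here.

```python
import xml.etree.ElementTree as ET


def build_response(speech_text, legs):
    """Serialize a system response as XML (hypothetical schema)."""
    root = ET.Element("response")
    ET.SubElement(root, "speech").text = speech_text
    route = ET.SubElement(root, "route")
    for index, description in enumerate(legs, start=1):
        leg = ET.SubElement(route, "leg", number=str(index))
        leg.text = description
    return ET.tostring(root, encoding="unicode")


def read_speech(xml_text):
    """Extract the text the client should synthesize."""
    return ET.fromstring(xml_text).findtext("speech")
```

A client without speech synthesis could ignore the `speech` element and render only the route, which is the kind of flexibility a shared text format allows.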

Speech and gestural information received from the client device first undergo separate recognition processes, after which the two information streams are combined into an n-best list of candidates in a multi-phase fusion process. These combined forms are then used to determine the user's intentions. The dialogue management component chooses the best input candidate and uses it to build a suitable response: it either requests additional data from the user, or produces a route description or a route navigation leg. This response is finally sent to the client device.
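The idea of combining two recognition results into one ranked candidate list can be sketched as follows. This is a deliberately simplified, single-step stand-in for the multi-phase fusion mentioned above: the function names, the hypothesis strings and the product-of-confidences scoring are all assumptions made for illustration.

```python
from itertools import product


def fuse(speech_nbest, gesture_nbest, n=3):
    """Combine two n-best lists into one ranked list of joint candidates.

    Each input is a list of (hypothesis, confidence) pairs with
    confidence in [0, 1]. The joint score is a plain product, which is
    an assumption made for this sketch.
    """
    combined = [
        ((speech, gesture), s_conf * g_conf)
        for (speech, s_conf), (gesture, g_conf) in product(speech_nbest, gesture_nbest)
    ]
    combined.sort(key=lambda item: item[1], reverse=True)
    return combined[:n]


def choose(candidates):
    """Dialogue-manager step: pick the top-ranked candidate, if any."""
    return candidates[0][0] if candidates else None
```

For example, fusing two speech hypotheses with two gesture hypotheses yields four joint candidates, of which the `n` highest-scoring ones are kept for the dialogue manager to choose from.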

See the section 'Material' below for detailed information about the system and the project as a whole.


Material

Publications

Miscellaneous


Contact

Topi Hurtig, firstname.lastname@helsinki.fi, 050-385 0623
Kristiina Jokinen, firstname.lastname@helsinki.fi

Department of General Linguistics
PL 9 (Siltavuorenpenger 20 A)
00014 University of Helsinki




Updated: 2006-04-10