Networked Computing for the 21st Century
STIMULATE


Overview

The Speech, Text, Image, and MULtimedia Advanced Technology Effort (STIMULATE) supports fundamental research devoted to understanding multimodal human communication and its application to computer technology. STIMULATE is a multiagency collaboration that includes researchers at NSF, NSA, and DARPA. The aim is to accelerate the progress of information technology by supporting new directions in R&D for understanding human communications in multiple languages and modalities, such as text, images, video, gestures, facial expressions, handwriting, and other means by which humans communicate. STIMULATE projects include:



Synergistic modalities for human/machine communication


Natural communication with machines is a crucial factor in bringing the benefits of networked computers to mass markets, and the sensory dimensions of sight, sound, and touch are, at present, comfortable and convenient modalities for the human user to interact with these computers. However, new technologies are now emerging that can support human/machine communication with features that emulate face-to-face interaction. Because speech is a preferred means for human information exchange, conversational interaction with machines will play a central role in collaborative knowledge work performed with networked computers. Using speech in combination with simultaneous visual gestures and physical signaling requires software agents that can fuse error-susceptible sensory information into reliable interpretations that are responsive to and anticipatory of human user intentions. A Rutgers University project seeks to design methods and evaluation metrics for providing human users the benefits of natural communication with computers.



Information retrieval

The Center for Intelligent Information Retrieval (CIIR) at the University of Massachusetts is investigating how to retrieve information from general image databases. Retrieval is based on image content and any associated text. Given an example image, the system retrieves the images in the database that are most similar to it. Similarity is evaluated on the basis of appearance (that is, the shape of the gray-level intensity surface), color, and texture. The CIIR's approach is not limited to specific image types, nor does it depend upon learning.
 
 
This figure is an example of a retrieval performed at the CIIR at the University of Massachusetts. The queried database is a set of 2048 trademark images from the U.S. Patent and Trademark Office. The figure shows the retrieved images ranked as most similar to the first image (the query).
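The query-by-example idea described above can be illustrated with a minimal sketch. It is not the CIIR's method: it stands in a much simpler technique, gray-level histogram intersection, for the appearance, color, and texture measures mentioned in the text. The function names, the nested-list image representation, and the bin count are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical representation: an image is rows of gray levels in 0..255.
Image = Sequence[Sequence[int]]


def gray_histogram(image: Image, bins: int = 8) -> List[float]:
    """Return a normalized histogram of the image's gray levels."""
    counts = [0] * bins
    total = 0
    for row in image:
        for value in row:
            # Map a gray level in 0..255 to one of `bins` equal-width bins.
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    if total == 0:
        raise ValueError("image has no pixels")
    return [count / total for count in counts]


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity in [0, 1]; 1.0 means the normalized histograms match."""
    return sum(min(x, y) for x, y in zip(a, b))


def rank_by_similarity(
    query: Image, database: Dict[str, Image], bins: int = 8
) -> List[Tuple[str, float]]:
    """Rank database images from most to least similar to the query."""
    query_hist = gray_histogram(query, bins)
    scored = [
        (name, histogram_intersection(query_hist, gray_histogram(image, bins)))
        for name, image in database.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    database = {
        "light": [[200, 210], [220, 250]],
        "dark": [[0, 10], [20, 30]],
    }
    query = [[5, 15], [25, 28]]
    for name, score in rank_by_similarity(query, database):
        print(name, round(score, 3))
```

Like the approach described in the text, this sketch involves no learning step: the ranking depends only on a fixed similarity measure between the query and each stored image. A histogram ignores spatial layout, however, so it captures far less than a measure based on the shape of the intensity surface.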



Search over live multimedia information

Research at Columbia University focuses on developing technologies to help people find and track the information they need to keep current in their jobs. The Columbia Digital News System provides up-to-the-minute news briefings on topics of interest, linking the user into an integrated collection of related multimedia documents. Depending on the user's profile or query, the system tracks events over time and provides an automatically generated summary of the most recent developments. A representative set of images or videos can be incorporated into the summary, and the user can follow up with multimedia queries for more details and further information. To allow searches over images, a major component of this research is categorizing images through the coordinated use of image features and associated text.
 
 

 
The figure above depicts the results of a sample query made to the Columbia Digital News System. After examining the retrieved images, the user can link to articles, captions, and other information associated with them.
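One way to picture the coordinated use of image features and associated text is a weighted combination of a text score and a feature score. The sketch below is an illustrative assumption, not the Columbia system's algorithm: the category prototypes, the caption word-overlap measure, the feature vectors, and the `text_weight` parameter are all hypothetical.

```python
from typing import Dict, Sequence, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Word-overlap score in [0, 1] between two sets of words."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def feature_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection of two normalized feature vectors, in [0, 1]."""
    return sum(min(x, y) for x, y in zip(a, b))


def categorize(
    caption: str,
    features: Sequence[float],
    categories: Dict[str, dict],
    text_weight: float = 0.5,
) -> str:
    """Return the category whose combined text and feature score is highest."""
    if not categories:
        raise ValueError("at least one category is required")
    words = set(caption.lower().split())
    best_name, best_score = "", -1.0
    for name, prototype in categories.items():
        text_score = jaccard(words, prototype["keywords"])
        image_score = feature_similarity(features, prototype["features"])
        score = text_weight * text_score + (1.0 - text_weight) * image_score
        if score > best_score:
            best_name, best_score = name, score
    return best_name


if __name__ == "__main__":
    # Hypothetical prototypes: keywords plus a 3-bin normalized feature vector.
    categories = {
        "outdoor": {"keywords": {"sky", "field", "mountain"}, "features": [0.1, 0.2, 0.7]},
        "indoor": {"keywords": {"office", "room", "desk"}, "features": [0.6, 0.3, 0.1]},
    }
    print(categorize("Reporters in an office room", [0.5, 0.4, 0.1], categories))
```

Setting `text_weight` to 1.0 or 0.0 reduces the sketch to text-only or feature-only categorization, which makes it easy to see what each source of evidence contributes.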
