National Coordination Office for Networking and Information Technology Research and Development
 
 
 
 

Graphic by Janet Ward of NOAA's High Performance Computing and Communications Program



Representative FY 2002 agency activities

NSF: Develop digital library collections, including research in architectures, tools, and technologies for organizing, annotating, searching, and preserving distributed multimedia archives; expand online scientific data and large-scale data-mining research to accelerate the use of existing data to supplement new observations

DARPA: Deploy scalable prototype analysis environment in defense application with cross-repository information analysis functionality (semantic retrieval, indexing, value filtering, user defined alerting, and categorizing)

NIH: Continue work on query by image content to produce a reliable method for computer-assisted x-ray image segmentation, indexing, and query; establish an information storage, curation, analysis, and retrieval (ISCAR) program for biological data

DOE Office of Science: Modular electronic notebook prototype, whiteboard, and related tools for collaboratory sharing of scientific data, instrumentation, and research results

NIST: Initial Internet-accessible repository of full structural crystallographic data for inorganic materials and a second repository of Internet-accessible molecular recognition knowledge; intelligent interfaces for using existing bioinformatics tools for protein databases

NOAA: Extend real-time collaborative access to chemical disaster information by surrounding this functionality with synchronous collaborative tools to enable experts nationwide to consult while maintaining a consistent view of the data

AHRQ: Develop Web-based applications to improve health data systems and quality of care; innovative strategies for data collection in clinical settings; approaches for integrating quality and outcomes data into the care process

Early Federal IT investments have pioneered development and implementation of digital repositories of information and such basic enabling technologies as search engines, record management systems, and linkages among distributed archives. Creating digital libraries across the range of human knowledge and developing the technologies and tools to make that knowledge universally available on demand is a core challenge in information technology whose advances benefit every profession, every academic discipline, every learner, and every citizen.

Digital libraries form the basis of the Nation's 21st century knowledge network. The Federally supported research to decode the human genome, for example, was accelerated by many years because researchers could create, store, and immediately share over the Internet massive databases of genetic information representing pieces of the enormous biological puzzle. Federal digital libraries funding not only established major digital collections in such areas as Earth and space sciences, the humanities, law, medicine, oral history, and science, mathematics, and engineering education, but also spun off search engine technologies that have become successful commercial enterprises.

Developmental issues in the digital libraries field are growing in tandem with today's knowledge explosion. Their scale is suggested by a recent University of California at Berkeley study estimating that the world now produces between one and two exabytes of information annually (an exabyte is a billion billion 8-bit bytes). Most of this vast output is images, sound, and numeric data already in digital formats; only 0.003 percent represents print documents. At the same time, barely 10 percent of all public information ever produced in print has been digitized and made available on the Internet. For archivists, the question of how to determine, collect, and preserve what is of value in the world's dizzying new digital output now joins older questions of how and what to digitize from humanity's pre-digital knowledge stores.
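To give a sense of scale, the figures quoted above can be turned into an absolute volume for print documents. The short sketch below is only illustrative arithmetic on the numbers in the text; it assumes a decimal terabyte (10^12 bytes) for reporting, and the names `EXABYTE`, `TERABYTE`, `PRINT_SHARE`, and `print_bytes` are hypothetical helpers, not part of the Berkeley study.

```python
from fractions import Fraction

EXABYTE = 10**18    # bytes: "a billion billion" bytes, as defined in the text
TERABYTE = 10**12   # bytes: decimal terabyte (an assumption made for reporting)
PRINT_SHARE = Fraction(3, 100_000)  # 0.003 percent, the print share quoted in the text


def print_bytes(total_exabytes: int) -> int:
    """Return the bytes of print documents implied by the quoted share."""
    return int(total_exabytes * EXABYTE * PRINT_SHARE)


# The text gives a range of one to two exabytes produced per year.
low, high = print_bytes(1), print_bytes(2)
print(f"Print documents: roughly {low // TERABYTE} to {high // TERABYTE} terabytes per year")
```

Under these assumptions, the print share works out to roughly 30 to 60 terabytes per year, a small fraction of the total annual output.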

Building archives is only one step in generating the technological framework that makes a digital library usable. It also takes advanced technologies for managing and working with digital information, ranging from visualization, data fusion, and analysis capabilities to remote collaboration, metadata notation schemes, and advanced interoperable systems. The NITRD effort is building on early Federal successes to develop the next-generation technologies that are needed to help realize the full potential of electronic information. Today's search engines, for example, are based on fundamental algorithms developed 20 years ago; current search tools cannot locate audio or image information by content description. Strategies to assure long-term preservation of digital records constitute another particularly pressing issue for research. As storage technologies evolve with increasing speed to cope with the growing demand for storage space, the obsolescence of older storage hardware and software threatens to cut us off from the electronically stored past.

Federal agencies' FY 2002 research efforts will include development of large-scale digital collections in engineering, sciences, and humanities; research to increase interoperability and integration of software in distributed systems; protocols and tools for data annotation and management; and research in technical issues in preservation.

Long-Term Research Needs

  • Data storage and management technologies:
    • Tools for collection, indexing, synthesis, and archiving
    • Protocols for data compatibility, conversion, interoperability, interpretation
    • Technologies and tools for fusion of databases, such as molecules and macromolecular structures in biology or disparate real-time weather observations, with remote access and analysis capabilities
    • Component technologies and integration of dynamic, scalable, flexible information environments
    • Digital representation, preservation, and storage of multimedia collections
    • Protocols and tools to address legal issues such as copyright protection, privacy, and intellectual property management
  • Usability of large-scale data sets:
    • Intelligent search agents, improved abstracting and summarizing techniques, and advanced interfaces
    • Digital classification frameworks and interoperable search architectures
    • Metadata technologies and tools for distributed multimedia archives
    • Ultra-scale data-mining technologies
    • Testbeds for prototyping and evaluating media integration, software functionality, and large-scale applications
4201 Wilson Blvd, Suite II-405, Arlington, VA 22230 | (703) 292-4873 | (703) 292-9097 (fax)
 