 
National Coordination Office for Networking and Information Technology Research and Development
 
 
 
 

Information Technology: The 21st Century Revolution
High End Computing -- Infrastructure and Applications
Introduction
Computing environments and toolkits
Modeling tools
HECC applications
Biomedical applications
Aerospace applications
Advanced chemistry application
Quantum physics applications
Weather applications


Introduction

High end computing infrastructure and applications (HEC I&A) are needed for the advancement of science and technology, for large-scale information management approaches to product and process design, and for the support of national security, all of which fall within or directly support the missions of various Federal agencies. HEC I&A encompasses projects to develop the software that underlies high end applications. Such software tends to support multiple applications or provide an application-level or middleware infrastructure where applications can run. This section describes that software as well as select high end applications. The IT R&D facilities section (page 34) describes HECC infrastructure facilities in detail.



Computing environments and toolkits


Architecture adaptive computing environment (aCe)

NASA's aCe is a data-parallel computing environment being designed to improve the adaptability of algorithms to diverse architectures. aCe will encourage programmers to implement applications on parallel architectures by assuring them that future architectures will run their applications with minimal modification, and will encourage computer architects to develop new architectures by providing an easily implemented software development environment and a library of test applications.

aCe is intended to help programmers:

  • Express algorithms easily and independently of architecture
  • Port algorithms among diverse computer architectures
  • Adapt algorithms to different architectures
  • Easily and efficiently implement algorithms on diverse architectures
  • Optimize algorithms on diverse computing architectures
  • Develop applications on heterogeneous computing environments
  • Develop programming environments for new computer architectures

Structured parallel execution is the concept behind aCe. Beginning with a virtual architecture that reflects the spatial organization of an algorithm, a programmer develops software reflecting the algorithm's temporal organization. Today, aCe has been implemented for a superset of C for Linux and Linux with parallel virtual machine (PVM), and has a Linux and Tera/Cray debugger.


This KeLP application illustrates the results of a hierarchical adaptive mesh. The larger dark areas represent regions of high error, such as atomic nuclei in materials design applications.


Kernel lattice parallelism (KeLP)

KeLP, developed under NSF funding at UCSD and NPACI, is a framework for implementing portable scientific applications on distributed memory parallel computing systems. KeLP supports coarse-grained data parallelism in distributed collections of structured data blocks. It is intended for applications with special needs, such as adapting to data-dependent or hardware-dependent runtime conditions. KeLP is currently used in full-scale applications including subsurface modeling, turbulence studies, and first principles simulation of real materials.

The KeLP infrastructure, implemented as a C++ class library, provides high-level tools to computational scientists. These tools let them concentrate on applications and mathematics instead of low-level data distribution and interprocessor communication concerns, and develop complicated applications in a fraction of the time formerly required.


The visualization in this KeLP application, done at the San Diego Supercomputer Center's (SDSC's) Advanced Scientific Visualization Lab, depicts small-scale structure in terms of the relative significance of rotational vs. straining motion. The red areas correspond to rotation-dominated (high vorticity) regions that concentrate into tube-like structures. The green regions are those with comparable rotation and strain, which tend to form sheets and surround the tube-like structures. The blue areas correspond to strain-dominated regions that indicate locally high-energy dissipation. This direct numerical simulation is the numerical solution to the exact 3-D time-dependent Navier-Stokes equations governing fluid motion.



This image is from a simulation of a halo-alkane dehalogenase enzyme after 100 picoseconds. The simulation was performed by molecular dynamics software that uses global arrays, part of DOE's ACTS toolkit.


Parallel algorithms and software for irregular scientific applications

This NSF-funded project at New Mexico State University is applying techniques from parallel computing and computational geometry to develop theoretically sound and practically efficient parallel algorithms for a class of irregular scientific applications that depend upon interactions among entities such as atoms located in 2-D or 3-D space. Researchers will explore techniques for efficient parallel execution of such applications and will develop software to aid applications programming in this environment. Applications include the N-body problem useful in astrophysics, plasma physics, molecular and fluid dynamics, computer graphics, numerical complex analysis, and protein-accessible surface area calculations for computational molecular biology.
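To make the class of problem concrete, the following minimal sketch computes forces among point entities in 3-D space by direct summation over every pair. It is only a serial baseline, not the project's algorithms: the function name, the unit gravitational constant, and the softening term are assumptions made for illustration. The quadratic cost of this double loop is what motivates the parallel and geometric techniques described above.

```python
import math


def direct_sum_forces(positions, masses, softening=1e-9):
    """Return one [fx, fy, fz] force per body from all pairwise
    inverse-square attractions (gravitational constant taken as 1).

    Hypothetical helper for illustration; cost grows as N*(N-1)/2 pairs.
    """
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Displacement from body i to body j.
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            # Softened squared distance avoids division by zero.
            r2 = sum(c * c for c in d) + softening
            scale = masses[i] * masses[j] / (r2 * math.sqrt(r2))
            for k in range(3):
                forces[i][k] += scale * d[k]  # i is pulled toward j
                forces[j][k] -= scale * d[k]  # equal and opposite on j
    return forces


if __name__ == "__main__":
    bodies = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
    for force in direct_sum_forces(bodies, [1.0, 1.0, 1.0]):
        print(force)
```

Because every pair is visited once and contributes equal and opposite forces, the total force over all bodies sums to zero up to rounding.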

 

Advanced computational testing and simulation (ACTS) toolkit

DOE's ACTS toolkit is a set of software tools to help programmers write high performance scientific codes for parallel computers. It focuses primarily on software used inside an application, rather than software used to develop an application. Consisting primarily of software libraries, ACTS tools are designed to run on distributed memory parallel computing systems using the message passing interface (MPI) for communication, with portability and performance important considerations in their design. The tools fall into four broad categories:

  • Numerical tools implement numerical methods and include sparse linear system solvers, ordinary differential equation solvers, and others.
  • Frameworks provide infrastructure to manage some of the complexity of parallel computing, such as distributing arrays and communicating boundary information.
  • Execution support provides application-level tools including performance analysis and visualization support.
  • Developer support is provided transparently for tool developers.
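The framework category above mentions distributing arrays and communicating boundary information. The sketch below imitates that pattern in a single process: a 1-D array is split into blocks, each block receives one boundary ("halo") value from each neighbor, and a three-point average is then computed block by block. It does not use MPI or any ACTS tool; all function names are hypothetical, and the sketch assumes there are at least as many data values as blocks.

```python
def split_blocks(data, n_blocks):
    """Split data into n_blocks contiguous, nearly equal blocks."""
    base, extra = divmod(len(data), n_blocks)
    blocks, start = [], 0
    for b in range(n_blocks):
        size = base + (1 if b < extra else 0)
        blocks.append(list(data[start:start + size]))
        start += size
    return blocks


def exchange_halos(blocks, boundary=0.0):
    """Pad each block with its neighbors' edge values (the halo).

    The outermost blocks are padded with a fixed boundary value.
    """
    padded = []
    for b, block in enumerate(blocks):
        left = blocks[b - 1][-1] if b > 0 else boundary
        right = blocks[b + 1][0] if b < len(blocks) - 1 else boundary
        padded.append([left] + block + [right])
    return padded


def smooth(padded_block):
    """Three-point average over the interior of one padded block."""
    return [
        (padded_block[i - 1] + padded_block[i] + padded_block[i + 1]) / 3.0
        for i in range(1, len(padded_block) - 1)
    ]


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    padded = exchange_halos(split_blocks(data, 3))
    print([smooth(p) for p in padded])
```

With the halo values in place, each block can be updated independently, and the block results concatenated match the result of smoothing the whole array at once.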

This image displays the results of a simulation of liquid octanol. The 1,000-step simulation, which uses the ACTS molecular dynamics software NWArgos, included 216,000 atoms and was performed on a 1,300-node Cray T3E-900.


Scalable visualization toolkits

Scientists routinely use desktop computers to visualize 100-megabyte data sets. But biomedical researchers, astronomers, oceanographers, and other scientists often need to analyze and visualize hundreds of gigabytes of data at a time. These files are so large that only a supercomputer can process them. Yet even a supercomputer's memory has difficulty accommodating the data. NPACI researchers are creating versatile supercomputer-based tools for rendering, visualizing, and interacting with very large data sets from a variety of scientific disciplines. These scalable visualization toolkits will support the next generation of large-scale simulations on teraflops computers, spurring new collaborations within and between scientific disciplines by providing a graphical user environment for sharing data and insights.


NIST staff members discussing the result of a micromagnetic simulation.

 

Tools to explore geometric complexity

Computational geometry began with the promise of unifying efforts to solve geometric problems in statistics, biology, robot motion planning, graphics, image analysis, virtual reality, and data mining. In two decades, the field has produced a number of tools for solving geometric algorithmic problems. A recurring feature in the design and analysis of geometric problems is the strong link between the computational and combinatorial aspects of the questions under investigation. Understanding the combinatorial geometry behind the problem is fundamental to being able to find an efficient solution. This NSF-sponsored project at the Polytechnic University of New York will explore combinatorial problems arising in geometric contexts to develop new tools and refine tools already available to design and analyze geometric algorithms, with a goal of constructing simpler and more efficient algorithms. The project is also investigating techniques to more realistically estimate the behavior of geometric algorithms on typical inputs.



Modeling Tools


NASA Earth system modeling framework
In software engineering, a software architecture and a set of software entities (objects, programs, routines, interface definitions, type systems, and so forth) that allow the construction, storage, management, and aggregation of software components are called a "framework." Frameworks are used to:
  • Foster reusability among software components and portability among computing architectures
  • Reduce the time needed to modify research applications software
  • Structure systems to better manage evolving software
  • Enable software exchange among major research centers

This multiyear project, begun in FY 2000, will improve the interoperability, performance, and manageability of NASA's Earth and Space Science (ESS) applications through the development of a common Earth system modeling framework (ESMF). The overall goal of the ESS project is to demonstrate the potential afforded by balanced teraflops systems' performance to further our understanding of and ability to predict the dynamic interaction of physical, chemical, and biological processes affecting the Earth, the solar-terrestrial environment, and the universe. (For more about ESS, please see page 19.)

 

Micromagnetic modeling
NIST's micromagnetic modeling project is developing computational tools for accurate and efficient micromagnetic calculations, which are essential in the magnetic disk drive industry to achieve higher densities and faster read-write times. NIST has released OOMMF, a modular object-oriented micromagnetic modeling framework and reference software that allows software developers to swap their code in and out as desired. OOMMF will help establish a baseline level of competence in 2-D modeling and compare competing algorithmic components. A 3-D version of the code is under development.

 

Modeling realistic material microstructures

The behavior of a material on the macroscopic scale depends to a large extent on its microstructure-the complex ensemble of polycrystalline grains, second phases, cracks, pores, and other features that are large compared to atomic sizes. Modeling such structures is challenging due, in part, to their complicated geometries. NIST has developed the OOF object-oriented finite-element software to analyze material microstructures and simulate physical property measurements. OOF allows materials scientists to study the influence of microstructure on a material's macroscopic properties through an easy-to-use graphical interface. By applying stresses, strains, or temperature changes, the user can measure the effective macroscopic material behavior or examine internal stress, strain, and energy density distributions. By modifying a microscopic material property, the user can find the effect of that property on the macroscopic behavior; by modifying the microstructure, the effect of geometry on a particular material can be determined. OOF is being extended to handle other properties in addition to thermoelasticity. The software won a 1999 Technology of the Year Award from Industry Week magazine.

The graphic at left illustrates the steps in the use of NIST's object-oriented finite-element software OOF, which allows materials scientists to study the influence of microstructure on a material's macroscopic behavior by means of a graphical interface.

 

Numerical and data manipulation techniques for environmental modeling

The primary objective of EPA's numerical and data manipulation techniques program is to improve the performance of key numerical algorithms that form the computational foundation of environmental models. This research develops and evaluates parallel computing techniques encompassing interconnected workstations, vector and parallel supercomputers, parallel software and algorithms, and communication to determine the most effective approach to complex, multipollutant, and cross-media environmental modeling. Fundamental research is also conducted on computational techniques to quantify uncertainty as an integral part of the numerical computation.

 



HECC applications


HECC applications research applies the raw speed and data storage capacity of advanced computing platforms to science's most data-intensive, complex, and challenging problems, such as the design and properties of materials in weapons, aerospace, and industrial systems; the shapes and processes of biomolecular structures; and synthesis and analysis of terascale data sets. Computationally intensive high end applications include modeling, 3-D visualization, and tools for data mining and data fusion.

 



Biomedical applications




Neuroscience imaging

During the next century there is a real possibility that we will discover in detail how the brain works and how to treat or prevent common neurological diseases and traumas. Developments in modern computer-aided microscopes and advances in high performance computing promise to uncover new information about the structural and functional dynamics of the nervous system.

Neuroscientists are involved in research covering a wide range of scales, from modeling molecular events and subcellular organelles to mapping brain systems. They are also investigating the ways in which single neurons and small networks of neurons process and store information. Newly possible detailed models of single neurons are being used to model the complex properties of neurons and neuronal networks. Breakthroughs in optical imaging and image processing provide opportunities for deriving information about the 3-D relationships among biological structures, and structure-function work is moving into 4-D (3-D plus time) imaging.

MCell

A growing interest in neuron modeling parallels the increasing experimental evidence that the nervous system is extremely complex. In fact, modeling is as essential as laboratory experimentation in understanding structure-function relationships in the brain. The models may become as complex as the nervous system itself, thereby requiring use of advanced computing. In National Partnership for Advanced Computational Infrastructure (NPACI)-supported neuroscience research, both widely used and newly developed neuron modeling systems are being extended and linked to large-scale, high performance capabilities.

The ongoing NPACI MCell project has developed a general Monte Carlo (pseudo-random number-based) simulator of cellular microphysiology. Biological structures, like neurons, show tremendous complexity and diversity at the subcellular level. For example, a single cubic millimeter of cerebral cortex may contain on the order of five billion interdigitated synapses of different shapes and sizes, and subcellular communication is based on a wide variety of chemical signaling pathways. A process like synaptic transmission encompasses neurotransmitter and neuromodulator molecules, proteins involved with exo- and endocytosis, receptor proteins, transport proteins, and oxidative and hydrolytic enzymes.

MCell incorporates high-resolution ultrastructure into models of ligand diffusion and signaling. The modeler specifies ligands, effectors, reaction mechanisms, and the surfaces on which reactions take place, using the MCell model description language to build the simulation objects. MCell then carries out the simulation for a specified number of iterations, after which numerical results and images can be produced. Optimized software for widely used and newly developed models is being ported to the University of California-San Diego's (UCSD's) Cray T3E and IBM teraflops systems.
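The general idea of a Monte Carlo diffusion simulation that runs for a specified number of iterations can be sketched as follows. This is not MCell's algorithm or its model description language: it is a deliberately simplified 1-D random walk in which ligands released at one side of a gap bind permanently on reaching the other side. The function name, parameters, Gaussian step, and boundary rules are all assumptions chosen for illustration.

```python
import random


def simulate_ligand_binding(n_ligands, n_steps, gap, step_sigma, seed=0):
    """Random-walk ligands across a gap and count how many have bound.

    Hypothetical toy model: x = 0 is a reflecting release surface and
    x = gap is an absorbing receptor surface. Returns the cumulative
    number of bound ligands after each iteration.
    """
    rng = random.Random(seed)        # seeded pseudo-random generator
    positions = [0.0] * n_ligands    # every ligand starts at the release side
    bound = [False] * n_ligands
    bound_counts = []
    for _ in range(n_steps):
        for i in range(n_ligands):
            if bound[i]:
                continue             # bound ligands no longer move
            x = positions[i] + rng.gauss(0.0, step_sigma)
            if x < 0.0:
                x = -x               # reflect off the release surface
            if x >= gap:
                bound[i] = True      # absorbed at the receptor surface
                x = gap
            positions[i] = x
        bound_counts.append(sum(bound))
    return bound_counts


if __name__ == "__main__":
    counts = simulate_ligand_binding(1000, 100, gap=1.0, step_sigma=0.1)
    print(counts[::10])
```

Fixing the seed makes a run reproducible, which is useful when comparing results across machines.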

MCell has also been tested on a 40-machine NetSolve cluster. A collaborative effort among scientists at UCSD, the University of Tennessee, and Oak Ridge National Laboratory (ORNL), NetSolve turns a loosely associated collection of machines into a fault-tolerant, client-server compute cluster. The initial test of MCell on the NetSolve cluster demonstrated the need for a distributed file-checking mechanism that would allow NetSolve to support larger MCell runs.


Simulation of a synapse between a nerve cell (not shown) and a mouse sternomastoid muscle cell. The neurotransmitter acetylcholine diffuses from a synaptic vesicle to activate receptors (pictured as a cloud of dots) on the muscle membrane. Snapshot is at 300 microseconds at peak activation.


Protein folding

Understanding how proteins form may yield exciting medical and scientific possibilities. In nature's ultimate origami, cells use information encoded in genes to construct a long chain of amino acids that compacts into a tangle of loops, helices, and sheets. A protein's unique geometry enables it to interact with other molecules and do the body's biochemical heavy lifting, regulating digestion, for example, or turning genes on and off during fetal development. Because of its complexity, however, simulating protein folding and proteins' interactions with other molecules is one of the toughest problems in computational biology. Solvation models (so called because water is the natural environment for proteins) calculate the forces acting between every possible pairing of the atoms in the protein as well as the surrounding solution, but such accuracy comes at a high cost. Simulating, with full atomic detail, just one-millionth of a second of the folding process in a small protein can take months of computation, even on today's high performance computers. "Cutoff models" that include only pairs of nearby atoms miss significant effects from greater distances. Methods that group atoms (originally developed in the 1980s to study interactions among stars) are yielding more accurate results. Research in this area is funded by NIH and is being carried out by National Computational Science Alliance (Alliance) scientists.
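The trade-off between all-pairs and cutoff calculations can be illustrated with a small sketch. It sums a Coulomb-like term q_i * q_j / r over pairs of points, either for every pair or only for pairs within a cutoff distance. This is not the researchers' software or force field: the function name, the unit constants, and the energy form are assumptions, and the points are assumed to be distinct.

```python
import itertools
import math


def pair_energy(points, charges, cutoff=None):
    """Sum q_i * q_j / r over pairs of distinct points.

    With cutoff=None every pair is included; otherwise pairs farther
    apart than cutoff are skipped. Returns (energy, pairs_used).
    Hypothetical helper for illustration only.
    """
    energy = 0.0
    pairs_used = 0
    for (i, p), (j, q) in itertools.combinations(enumerate(points), 2):
        r = math.dist(p, q)
        if cutoff is not None and r > cutoff:
            continue  # the cutoff model ignores this distant pair
        energy += charges[i] * charges[j] / r
        pairs_used += 1
    return energy, pairs_used


if __name__ == "__main__":
    points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    charges = [1.0, 1.0, 1.0]
    print(pair_energy(points, charges))              # all three pairs
    print(pair_energy(points, charges, cutoff=2.0))  # only the nearby pair
```

In this three-point example the cutoff run evaluates one pair instead of three but drops the contributions from the distant point, which is the kind of long-range effect the text says cutoff models miss.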

Researchers at the University of California-Los Angeles's (UCLA's) Laboratory of Neuro-Imaging (LONI) are building population-based digital brain atlases to discover how brain structures are altered by disease. In 3-D maps of variability from an average cerebral cortex surface derived from 26 Alzheimer's disease patients, individual variations in brain structure are calculated based on the amount of deformation needed to drive each subject's convolution pattern into correspondence with the group average. Surface matching figures are computed using fluid flow equations with more than 100 million parameters. This requires parallel processing and very high memory capacity.


Emerge: Portable biomedical information retrieval and fusion

National Center for Supercomputing Applications (NCSA) researchers supported by NIH's National Cancer Institute (NCI) and NSF are addressing a recurring science problem: finding and relating information scattered across many data sources. To pinpoint the defective genes that cause cancer cells to run amok, for example, biological researchers comb the Internet weekly, scanning vast online databases for clues. The next essential clue to a tumor suppressor could lie hidden in the billions of bases of human DNA being archived in GenBank, or in any of dozens of other online databases. A user needs a skilled translator such as Emerge, a portable collection of information-retrieval programs developed at NCSA. Emerge translates a single query into the idioms of separate databases, collects the results, translates them back to a common computer language, and displays them on a user's screen.

A cancer researcher, for example, could enter the phrase "small-cell lung cancer" into a form displayed by a Web browser. The query is converted by Emerge into a data format called Extensible Markup Language (XML), a versatile offshoot of HyperText Markup Language (HTML) that may soon become the foundational data format on the Web. The query is then sent to Gazebo, the heart of the Emerge software system. Gazebo translates the query into Z39.50, a standard language recognized by many library catalogs and databases of scientific literature.
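The flow described above (wrap the user's phrase in XML, then translate it into each database's own query idiom) can be sketched in a few lines. This is not Emerge's or Gazebo's actual schema or interface: the XML element names, the backend names, and both translator functions are hypothetical, and the first translator only imitates the style of prefix-notation strings used by some Z39.50 client toolkits.

```python
import xml.etree.ElementTree as ET


def build_query_xml(phrase):
    """Wrap a search phrase in a tiny, made-up XML query document."""
    root = ET.Element("query")
    term = ET.SubElement(root, "term")
    term.text = phrase
    return ET.tostring(root, encoding="unicode")


def parse_phrase(query_xml):
    """Recover the search phrase from the XML query document."""
    return ET.fromstring(query_xml).findtext("term")


def to_prefix_style(phrase):
    """Hypothetical translator: a prefix-notation style query string."""
    escaped = phrase.replace('"', '\\"')
    return '@attr 1=1016 "{}"'.format(escaped)


def to_keyword_style(phrase):
    """Hypothetical translator: keywords joined with AND."""
    return " AND ".join(phrase.split())


# Made-up backend names mapped to their query translators.
TRANSLATORS = {"catalog": to_prefix_style, "keyword_db": to_keyword_style}


def fan_out(query_xml):
    """Translate one XML query into one query string per backend."""
    phrase = parse_phrase(query_xml)
    return {name: translate(phrase) for name, translate in TRANSLATORS.items()}


if __name__ == "__main__":
    xml_query = build_query_xml("small-cell lung cancer")
    print(xml_query)
    for backend, query in fan_out(xml_query).items():
        print(backend, "->", query)
```

Keeping the translators in a lookup table means a new data source can be supported by adding one function, without changing the code that accepts the user's query.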

Because the cancer literature often uses synonymous terms to describe a single concept, NCI has funded an interface to the Unified Medical Language System (UMLS) metathesaurus developed over the past 10 years by NIH's National Library of Medicine (NLM) to integrate collections of medical terminologies. By next year, NCI plans to link cancer-related terms with Emerge, allowing patients to click on a highlighted term and search for related information from a universe of cancer databases. Physicians and cancer patients and their families may also discover, with a few clicks of the mouse, how many people suffer from a particular type of cancer, the status of cancer-related legislation, and information on potential drug treatments. Emerge is also part of a comprehensive science information system that has up-to-date news about research grants.



Aerospace applications



Computational Aerosciences (CAS)

The NASA Computational Aerosciences (CAS) project is working with industry toward the goal of trimming the time and cost of designing airplanes. Researchers propose to develop the high end computing hardware and systems and applications software to enable 1,000x speed-ups in systems performance. NASA-supported researchers have demonstrated a full compressor simulation in 15 hours, 400 times faster than was possible in 1992. In overnight supercomputing calculations, numerical propulsion system simulation (NPSS) software will simulate a full range of engine functions. These simulations let designers try out potential changes without building and testing real hardware.

CAS is developing a framework to enable multidisciplinary design optimization of complete aircraft, which requires enormous computing resources. By integrating two Silicon Graphics, Inc. (SGI) Origin 2000 systems, CAS created a 100-gigaflop testbed that presents a single system image with global shared memory. CAS supports the development of cost-effective, high end computing solutions. For example, CAS found a 92 percent cost savings for certain design applications using 10 workstations as opposed to a single-processor supercomputer.


A simulation of the GE90's high-pressure compressor. NASA-supported CAS researchers have demonstrated a full compressor simulation in 15 hours, 400 times faster than was possible in 1992.

 


Earth and Space Science (ESS)

NASA's ESS R&D applies high end computing technologies to topics as cosmic as the collisions of ultra-dense neutron stars and as compelling as the fate of the Earth's climate. ESS, which is managed by NASA's Goddard Space Flight Center (GSFC) with the Jet Propulsion Laboratory (JPL), encompasses the following research areas:

  • Earth imaging science employs multiple supercomputing systems to process and visualize satellite-collected radar data to monitor regional changes in the Earth's environment. A detailed imaging study that followed seasonal changes in the Amazon basin was recently completed.
  • Relativistic astrophysics combines fluid dynamics and general relativity to probe the violent merger of ultra-dense neutron stars. These mergers may be the cause of still-mysterious hyper-energetic gamma-ray bursts.
  • Simulations of the Earth's interior aim to provide insight into the chaotic processes that drive changes in the planet's interior.
  • Four-dimensional data assimilation will build comprehensive views of the atmosphere by merging observations with climate models.
  • Earth system modeling seeks to understand the Earth's climate via complex high-resolution models of coupled atmospheric/oceanic circulation and chemistry.
  • Fluids in microgravity R&D studies fluids in low-gravity environments to simulate space-based manufacturing, life support, and safety systems.
  • Convection and dynamos in stars focuses on the most fundamental and least understood turbulent processes in the interior of stars such as the sun.
  • Multiscale heliosphere modeling uses computational studies to probe interactions of the solar wind with the local space environment.
  • Solar activity/heliospheric dynamics investigates the tangled 3-D magnetic structures in the sun's corona or outer atmosphere. These structures play a key role in the physics of solar activity.


Advanced chemistry application


Understanding combustion

Most of the energy the world uses comes from the combustion of fossil fuels. Increases in computing power over the next few decades will make possible predictive computer models that will enable us to understand the complex interactions of fluid flow, chemistry, surface physics, and materials properties that determine the efficiency of combustion devices as well as the output of undesirable combustion byproducts such as soot and NOx. In the past year, researchers at DOE's Lawrence Berkeley National Laboratory (LBNL) have brought to light aspects of methane combustion that have confounded scientists for a number of years. This research combined advanced adaptive mesh refinement technologies and a new understanding of chemical reaction rates to yield simulations that agree closely with experiments. Future extensions of the research must incorporate the chemistry of more complex hydrocarbons, such as diesel fuel, which have thousands of reaction pathways, as well as more realistic surface physics and more complex geometries.


Collaborators at DOE's LBNL and LLNL and the University of California-Davis have used supercomputers to obtain a complete solution of the ionization of a hydrogen atom by collision with an electron, the simplest nontrivial example of the problem's last unsolved component. Pictured at left is a representative radial wave function of two electrons scattered in the collision of an electron with a hydrogen atom.



Quantum physics applications


Ionization by electron impact

For over half a century, theorists have tried and failed to provide a complete solution to scattering in a quantum system of three charged particles, one of the most fundamental phenomena in atomic physics. Such interactions abound: Ionization by electron impact, for example, is responsible for the glow of fluorescent lights and the ion beams that engrave silicon chips. Collaborators at DOE's Lawrence Berkeley National Laboratory (LBNL) and Lawrence Livermore National Laboratory (LLNL), and the University of California-Davis recently used supercomputers to solve the ionization of a hydrogen atom by collision with an electron-the simplest nontrivial example of the problem's last unsolved component. The breakthrough employs a mathematical transformation of the Schrödinger wave equation in which the wave functions of outgoing particles vanish at large distances from the nucleus rather than extending to infinity.



Weather applications


Hurricane intensity prediction

As part of NOAA's efforts to understand and forecast climate and weather, researchers at the agency's Geophysical Fluid Dynamics Laboratory (GFDL) seek to:

  • Understand the genesis, development, and decay of tropical disturbances by investigating thermo-hydrodynamical processes using numerical simulation models
  • Study small-scale features of hurricane systems such as the collective role of deep convection, the exchange of physical quantities at the lower boundary, and the formation of organized spiral bands
  • Investigate the ability of numerical models to predict hurricane movement and intensity and transition those models to operational use

While the GFDL models are excellent at predicting the intensities of weak to moderate hurricanes, better prediction of surface wind intensities in stronger hurricanes is anticipated in FY 2000-FY 2001. Increased computing power, a result of HEC R&D, allows hurricane models to operate at higher grid resolutions, account for asymmetries in storms, and improve physical parameterization. During FY 2000, developmental work to improve the hurricane model initialization, ocean interaction, model physics, and resolution continues; case studies will be used to evaluate the models' impact on forecasting skills. Work on assimilating more data into the forecast and analysis system will continue. The effects of evaporation of rain and sea spray, together with dissipative heating, will be evaluated.


NOAA-supported researchers have simulated samples of hurricanes from today's climate and a projected greenhouse gas-warmed climate by linking information from GFDL's global climate model into the high-resolution GFDL hurricane prediction model (left). This now operational model has been used successfully by NOAA's National Centers for Environmental Prediction to predict tropical storm paths over the last several hurricane seasons.


Hurricanes and global warming

The strongest hurricanes in the present climate may be upstaged by even more intense hurricanes over the next century if the Earth's climate continues to be warmed by increasing levels of greenhouse gases in the atmosphere. Most hurricanes do not reach their maximum potential intensity before weakening over land or cooler ocean regions. However, those storms that do approach their upper-limit intensity are expected to be slightly stronger in the warmer climate due to the higher sea surface temperatures.

NOAA researchers have simulated samples of hurricanes from the present-day climate and from a projected greenhouse gas-warmed climate by linking information from GFDL's global climate model into the high-resolution GFDL hurricane prediction model. This is the operational model that has been used by NOAA's National Centers for Environmental Prediction to predict tropical storm tracks for the last several hurricane seasons. The simulation projects that wind speeds in the northwest tropical Pacific will increase by 5-12 percent if tropical sea surfaces warm by a little more than 2 degrees Centigrade. This study represents the first use of an operational model to study a phenomenon that was theorized a decade ago. It illustrates the use of high performance computing to investigate the potential impact of global climate change on weather systems.
