Information Technology:
The 21st Century Revolution
High End
Computing -- Infrastructure and Applications

Introduction
|
High end computing infrastructure and applications are needed for the advancement
of science and technology, for large-scale information management approaches
to product and process design, and for the support of national security, all
of which are within or directly support the missions of various Federal
agencies. HEC I&A encompasses projects to develop the software that underlies
high end applications. Such software tends to support multiple applications
or provide an application-level or middleware infrastructure where applications
can run. This section describes that software as well as select high end
applications. The IT R&D
facilities section (page 34) describes HECC infrastructure
facilities in detail.
|
|
|
Computing environments
and toolkits

|
Architecture adaptive
computing environment
(aCe)
|
NASA's aCe is a data-parallel
computing environment being designed to improve the adaptability of algorithms
to diverse architectures. aCe will encourage programmers to implement
applications on parallel architectures by assuring them that future architectures
will run their applications with minimal modification, and will encourage
computer architects to develop new architectures by providing an easily
implemented software development environment and a library of test applications.
aCe is intended to help programmers:
- Express algorithms easily and
independently of architecture
- Port algorithms among diverse
computer architectures
- Adapt algorithms to different
architectures
- Easily and efficiently implement
algorithms on diverse architectures
- Optimize algorithms on diverse
computing architectures
- Develop applications on heterogeneous
computing environments
- Develop programming environments
for new computer architectures
Structured parallel execution is the
concept behind aCe. Beginning with a virtual architecture that reflects the
spatial organization of an algorithm, a programmer develops software reflecting
the algorithm's temporal organization. Today, aCe has been implemented as
a superset of C for Linux and for Linux with Parallel Virtual Machine (PVM), and
has a debugger for Linux and Tera/Cray systems.
|

|
This KeLP application illustrates the results of a hierarchical adaptive
mesh. The larger dark areas represent regions of high error, such as atomic
nuclei in materials design applications.
|
|
Kernel
lattice parallelism (KeLP)
|
KeLP, developed under NSF funding
at UCSD and NPACI, is a framework for implementing portable scientific
applications on distributed memory parallel computing systems. KeLP supports
coarse-grained data parallelism in distributed collections of structured
data blocks. It is intended for applications with special needs, such
as to adapt to data-dependent or hardware-dependent runtime conditions.
KeLP is currently used in full-scale applications including subsurface
modeling, turbulence studies, and first principles simulation of real
materials.
The KeLP infrastructure, implemented
as a C++ class library, provides high-level tools to computational scientists,
allowing them to concentrate on applications and mathematics instead of low-level
data distribution and interprocessor communication concerns, enabling them
to develop complicated applications in a fraction of the time formerly required.
 |
The visualization in this KeLP application, done at the San Diego Supercomputer
Center's (SDSC's) Advanced Scientific Visualization Lab, depicts small-scale
structure in terms of the relative significance of rotational vs. straining
motion. The red areas correspond to rotation-dominated (high vorticity)
regions that concentrate into tube-like structures. The green regions are
those with comparable rotation and strain, which tend to form sheets and
surround the tube-like structures. The blue areas correspond to strain-dominated
regions that indicate locally high-energy dissipation. This direct numerical
simulation is the numerical solution to the exact 3-D time-dependent Navier-Stokes
equations governing fluid motion.
 |
This image is from a simulation of a halo-alkane dehalogenase enzyme after
100 picoseconds. The simulation was performed by molecular dynamics software
that uses global arrays, part of DOE's ACTS toolkit.
|
|
Parallel
algorithms and
software for irregular
scientific applications
|
This NSF-funded project at New
Mexico State University is applying techniques from parallel computing
and computational geometry to develop theoretically sound and practically
efficient parallel algorithms for a class of irregular scientific applications
that depend upon interactions among entities such as atoms located in
2-D or 3-D space. Researchers will explore techniques for efficient parallel
execution of such applications and will develop software to aid applications
programming in this environment. Applications include the N-body problem
useful in astrophysics, plasma physics, molecular and fluid dynamics,
computer graphics, numerical complex analysis, and protein-accessible
surface area calculations for computational molecular biology.
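As a point of reference for the kind of computation involved, the following is a minimal sketch of the direct (all-pairs) form of the N-body problem, which is the baseline that parallel and geometric techniques aim to partition and accelerate. It is not the project's software. The function name, the choice of gravitational units with G = 1, and the `softening` parameter are assumptions made for illustration.

```python
import math

def pairwise_accelerations(positions, masses, softening=1e-3):
    """Direct-sum gravitational accelerations (G = 1) for bodies in 3-D.

    Every body interacts with every other body, so the work grows with
    the number of pairs, N * (N - 1) / 2.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Separation vector from body i to body j.
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(c * c for c in d) + softening ** 2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                # Equal and opposite forces, scaled by the other body's mass.
                acc[i][k] += masses[j] * d[k] * inv_r3
                acc[j][k] -= masses[i] * d[k] * inv_r3
    return acc
```

Because the positions of the entities are irregular and change over time, the pairs that matter most cannot be assigned to processors by a fixed pattern, which is what makes efficient parallel execution of this class of applications difficult.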
|
Advanced
computational
testing and simulation
(ACTS) toolkit
|
DOE's ACTS toolkit is a set of
software tools to help programmers write high performance scientific codes
for parallel computers. It focuses primarily on software used inside an
application, rather than software used to develop an application. Consisting
primarily of software libraries, ACTS tools are designed to run on distributed
memory parallel computing systems using the message passing interface
(MPI) for communication, with portability and performance important considerations
in their design. The tools fall into four broad categories:
- Numerical tools implement numerical
methods and include sparse linear system solvers, ordinary differential
equation solvers, and others.
- Frameworks provide infrastructure
to manage some of the complexity of parallel computing, such as distributing
arrays and communicating boundary information.
- Execution support provides application-level
tools including performance analysis and visualization support.
- Developer support is provided
transparently for tool developers.
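The framework category in the list above can be illustrated with a minimal sketch of what "distributing arrays and communicating boundary information" means. This is plain Python standing in for message passing, not ACTS code or MPI calls. The function names `block_ranges` and `exchange_boundaries` are hypothetical, and the sketch assumes every block is non-empty.

```python
def block_ranges(n, parts):
    """Split indices 0..n-1 into `parts` contiguous, near-equal blocks."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges

def exchange_boundaries(blocks):
    """Return each block padded with one ghost value from each neighbor.

    Stands in for the messages that neighboring processes would exchange;
    the outermost blocks get None where no neighbor exists.
    """
    padded = []
    for p, block in enumerate(blocks):
        left = blocks[p - 1][-1] if p > 0 else None
        right = blocks[p + 1][0] if p < len(blocks) - 1 else None
        padded.append([left] + list(block) + [right])
    return padded

data = list(range(10))
ranges = block_ranges(len(data), 3)            # [(0, 4), (4, 7), (7, 10)]
blocks = [data[a:b] for a, b in ranges]
padded = exchange_boundaries(blocks)           # middle block: [3, 4, 5, 6, 7]
```

A framework hides this bookkeeping so that the application code can update each block as if its neighbors' edge values were local.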
This image displays the results of a simulation of liquid octanol. The 1,000-step
simulation, which uses the ACTS molecular dynamics software NWArgos, included
216,000 atoms and was performed on a 1,300-node Cray T3E-900.
|
|
Scalable
visualization toolkits
|
Scientists routinely use desktop
computers to visualize 100-megabyte data sets. But biomedical researchers,
astronomers, oceanographers, and other scientists often need to analyze and
visualize hundreds of gigabytes of data at a time. These files are so large
that only a supercomputer can process them. Yet even a supercomputer's memory
has difficulty accommodating the data. NPACI researchers are creating versatile
supercomputer-based tools for rendering, visualizing, and interacting with
very large data sets from a variety of scientific disciplines. These scalable
visualization toolkits will support the next generation of large-scale simulations
on teraflops computers, spurring new collaborations within and between scientific
disciplines by providing a graphical user environment for sharing data and
insights.
 |
NIST staff members discussing the
result of a micromagnetic simulation.
|
Tools
to explore geometric
complexity
|
Computational geometry began with
the promise of unifying efforts to solve geometric problems in statistics,
biology, robot motion planning, graphics, image analysis, virtual reality,
and data mining. In two decades, the field has produced a number of tools
for solving geometric algorithmic problems. A recurring feature in the
design and analysis of geometric problems is the strong link between the
computational and combinatorial aspects of the questions under investigation.
Understanding the combinatorial geometry behind the problem is fundamental
to being able to find an efficient solution. This NSF-sponsored project
at the Polytechnic University of New York will explore combinatorial problems
arising in geometric contexts to develop new tools and refine tools already
available to design and analyze geometric algorithms, with a goal of constructing
simpler and more efficient algorithms. The project is also investigating
techniques to more realistically estimate the behavior of geometric algorithms
on typical inputs.
|
|
Modeling Tools
|
|
NASA
Earth system modeling framework
|
In
software engineering, a software architecture and a set of software entities
(objects, programs, routines, interface definitions, type systems, and so forth)
that allow the construction, storage, management, and aggregation of software
components are called a "framework." Frameworks are used to:
- Foster reusability among software
components and portability among computing architectures
- Reduce the time needed to modify
research applications software
- Structure systems to better
manage evolving software
- Enable software exchange among
major research centers
This multiyear project, begun in
FY 2000, will improve the interoperability, performance, and manageability
of NASA's Earth and Space Science (ESS) applications through the development
of a common Earth system modeling framework (ESMF). The overall goal of
the ESS project is to demonstrate the potential afforded by balanced teraflops
systems' performance to further our understanding of and ability to predict
the dynamic interaction of physical, chemical, and biological processes
affecting the Earth, the solar-terrestrial environment, and the universe.
(For more about ESS, please see page 19.)
|
Micromagnetic
modeling
|
NIST's
micromagnetic modeling project is developing computational tools for accurate
and efficient micromagnetic calculations, which are essential in the magnetic disk drive
industry to achieve higher densities and faster read-write times. NIST has
released OOMMF, a modular object-oriented micromagnetic modeling framework
and reference software that allows software developers to swap their code in
and out as desired. OOMMF will help establish a baseline level of competence
in 2-D modeling and compare competing algorithmic components. A 3-D version
of the code is under development.
|
Modeling
realistic material microstructures
|
The
behavior of a material on the macroscopic scale depends to a large extent
on its microstructure, the complex ensemble of polycrystalline grains, second
phases, cracks, pores, and other features that are large compared to atomic
sizes. Modeling such structures is challenging due, in part, to their complicated
geometries. NIST has developed the OOF object-oriented finite-element software
to analyze material microstructures and simulate physical property measurements.
OOF allows materials scientists to study the influence of microstructure on
a material's macroscopic properties through an easy-to-use graphical interface.
By applying stresses, strains, or temperature changes, the user can measure
the effective macroscopic material behavior or examine internal stress, strain,
and energy density distributions. By modifying a microscopic material property,
the user can find the effect of that property on the macroscopic behavior;
by modifying the microstructure, the effect of geometry on a particular material
can be determined. OOF is being extended to handle other properties in addition
to thermoelasticity. The software won a 1999 Technology of the Year Award
from Industry Week magazine.
 |
The graphic at left
illustrates the steps in the use of NIST's object-oriented finite-element
software OOF, which allows materials scientists to study the influence of
microstructure on a material's macroscopic behavior by means of a graphical
interface.
|
Numerical
and data
manipulation techniques
for environmental
modeling
|
The
primary objective of EPA's numerical and data manipulation techniques program
is to improve the performance of key numerical algorithms that form the computational
foundation of environmental models. This research develops and evaluates parallel
computing techniques encompassing interconnected workstations, vector and parallel
supercomputers, parallel software and algorithms, and communication to determine
the most effective approach to complex, multipollutant, and cross-media environmental
modeling. Fundamental research is also conducted on computational techniques
to quantify uncertainty as an integral part of the numerical computation.
|
|
HECC applications
|
HECC applications research applies the raw speed and data storage capacity
of advanced computing platforms to science's most data-intensive, complex,
and challenging problems, such as the design and properties of materials
in weapons, aerospace, and industrial systems; the shapes and processes
of biomolecular structures; and synthesis and analysis of terascale data
sets. Computationally intensive high end applications include modeling,
3-D visualization, and tools for data mining and data fusion.
|
|
Biomedical applications
|
|
Neuroscience imaging
|
During the next century there
is a real possibility that we will discover in detail how the brain works
and how to treat or prevent common neurological diseases and traumas.
Developments in modern computer-aided microscopes and advances in high
performance computing promise to uncover new information about the structural
and functional dynamics of the nervous system.
Neuroscientists are involved in
research covering a wide range of scales, from modeling molecular events
and subcellular organelles to mapping brain systems. They are also investigating
the ways in which single neurons and small networks of neurons process
and store information. Newly possible detailed models of single neurons
are being used to model the complex properties of neurons and neuronal
networks. Breakthroughs in optical imaging and image processing provide
opportunities for deriving information about the 3-D relationships among
biological structures, and structure-function work is moving into 4-D
(3-D plus time) imaging.
|
MCell
|
A growing interest in neuron modeling
parallels the increasing experimental evidence that the nervous system
is extremely complex. In fact, modeling is as essential as laboratory
experimentation in understanding structure-function relationships in the
brain. The models may become as complex as the nervous system itself,
thereby requiring use of advanced computing. In National Partnership for
Advanced Computational Infrastructure (NPACI)-supported neuroscience research,
both widely used and newly developed neuron modeling systems are being
extended and linked to large-scale, high performance capabilities.
The ongoing NPACI MCell project
has developed a general Monte Carlo (pseudo-random number-based) simulator
of cellular microphysiology. Biological structures, like neurons, show
tremendous complexity and diversity at the subcellular level. For example,
a single cubic millimeter of cerebral cortex may contain on the order
of five billion interdigitated synapses of different shapes and sizes,
and subcellular communication is based on a wide variety of chemical signaling
pathways. A process like synaptic transmission encompasses neurotransmitter
and neuromodulator molecules, proteins involved with exo- and endocytosis,
receptor proteins, transport proteins, and oxidative and hydrolytic enzymes.
MCell incorporates high-resolution
ultrastructure into models of ligand diffusion and signaling. Ligands
and effectors, reaction mechanisms, and surfaces on which reactions take
place are specified by the modeler, who uses the MCell model description
language to build the simulation objects. MCell then carries out the simulation
for a specified number of iterations, after which numerical results and
images can be produced. Optimized software for widely used and newly developed
models is being ported to the University of California-San Diego's (UCSD's)
Cray T3E and IBM teraflops systems.
MCell has also been tested on a 40-machine
NetSolve cluster. A collaborative effort among scientists at UCSD, the University
of Tennessee, and Oak Ridge National Laboratory (ORNL), NetSolve turns a
loosely associated collection of machines into a fault-tolerant, client-server
compute cluster. The initial test of MCell on the NetSolve cluster demonstrated
the need for a distributed file-checking mechanism that would allow NetSolve
to support larger MCell runs.
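The general idea of a Monte Carlo diffusion simulation that runs for a fixed number of iterations can be shown with a toy one-dimensional sketch. It is not MCell, it does not use the MCell model description language, and it ignores the realistic 3-D ultrastructure that MCell is built to handle. The function name, the parameter values, and the simple reflect-or-bind rule are all assumptions chosen for illustration, and the sketch assumes the step size is much smaller than the cleft width.

```python
import random

def simulate_release(n_ligands=500, n_steps=200, step=0.1,
                     cleft_width=1.0, bind_prob=0.5, seed=0):
    """Random-walk ligands across a 1-D cleft toward a receptor surface.

    Ligands start at z = 0 (a reflecting release site) and may bind when
    they reach z = cleft_width (the receptor surface). Returns the
    cumulative number of bound ligands after each iteration.
    """
    rng = random.Random(seed)          # pseudo-random number source
    free = [0.0] * n_ligands           # positions of unbound ligands
    bound = 0
    history = []
    for _ in range(n_steps):
        still_free = []
        for pos in free:
            pos += rng.gauss(0.0, step)          # Brownian displacement
            if pos < 0.0:
                pos = -pos                       # reflect at the release side
            if pos >= cleft_width:
                if rng.random() < bind_prob:
                    bound += 1                   # ligand binds a receptor
                    continue
                pos = 2 * cleft_width - pos      # otherwise reflect back
            still_free.append(pos)
        free = still_free
        history.append(bound)
    return history

history = simulate_release()
```

Fixing the seed makes a run reproducible, which matters when the same simulation is split across many machines and the results must be checked against one another.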
 |
Simulation of a synapse between a nerve cell (not shown) and a mouse sternomastoid
muscle cell. The neurotransmitter acetylcholine diffuses from a synaptic
vesicle to activate receptors (pictured as a cloud of dots) on the muscle
membrane. Snapshot is at 300 microseconds at peak activation.
|
|
Protein folding
|
Understanding how proteins form may
yield exciting medical and scientific possibilities. In nature's ultimate
origami, cells use information encoded in genes to construct a long chain
of amino acids that compacts into a tangle of loops, helices, and sheets.
A protein's unique geometry enables it to interact with other molecules and
do the body's biochemical heavy lifting, regulating digestion, for example,
or turning genes on and off during fetal development. Because of its complexity,
however, simulating protein folding and proteins' interactions with other
molecules is one of the toughest problems in computational biology. Solvation
models (so called because water is the natural environment for proteins) calculate
the forces acting between every possible pairing of the atoms in the protein
as well as the surrounding solution, but such accuracy comes at a high cost.
Simulating, with full atomic detail, just one-millionth of a second of the
folding process in a small protein can take months of computation, even on
today's high performance computers. "Cutoff models" that include only pairs
of nearby atoms miss significant effects from greater distances. Methods that
group atoms, originally developed in the 1980s to study interactions among
stars, are yielding more accurate results. Research in this area is funded
by NIH and is being carried out by National Computational Science Alliance
(Alliance) scientists.
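The trade-off among the three approaches described above can be sketched in a few lines. The code counts how many pairings an all-pairs model and a cutoff model evaluate, then shows the grouping idea by replacing a distant cluster of charges with a single charge at its center. This is an illustration in the spirit of tree and multipole methods, which is an assumption about the methods the text refers to. All function names and the test geometry are hypothetical, and the simple 1/r potential stands in for a real force field.

```python
import itertools
import math

def count_pairs(coords, cutoff=None):
    """Count atom pairs evaluated: all pairs, or only those within cutoff."""
    return sum(
        1
        for a, b in itertools.combinations(coords, 2)
        if cutoff is None or math.dist(a, b) <= cutoff
    )

def exact_potential(point, charges):
    """Sum q / r over every charge individually (charges: (position, q))."""
    return sum(q / math.dist(point, pos) for pos, q in charges)

def grouped_potential(point, charges):
    """Treat a distant group as one charge at its charge-weighted center.

    Assumes all charges share one sign so the total is nonzero.
    """
    total_q = sum(q for _, q in charges)
    center = tuple(
        sum(q * pos[k] for pos, q in charges) / total_q for k in range(3)
    )
    return total_q / math.dist(point, center)

# Four atoms on a line, one unit apart.
line = [(float(i), 0.0, 0.0) for i in range(4)]
all_pairs = count_pairs(line)                # 6 pairs: every pairing
near_pairs = count_pairs(line, cutoff=1.0)   # 3 pairs: adjacent atoms only

# Eight unit charges on the corners of a unit cube, seen from far away.
cube = [((x, y, z), 1.0)
        for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (-0.5, 0.5)]
far_point = (20.0, 0.0, 0.0)
exact = exact_potential(far_point, cube)
relative_error = abs(grouped_potential(far_point, cube) - exact) / exact
```

The cutoff model simply drops the distant pairs, while the grouped model keeps their contribution at the cost of a small approximation error, which is why grouping can be both cheaper than all-pairs and more accurate than a cutoff.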
 |
Researchers at the
University of California-Los Angeles's (UCLA's) Laboratory of Neuro-Imaging
(LONI) are building population-based digital brain atlases to discover how
brain structures are altered by disease. In 3-D maps of variability from
an average cerebral cortex surface derived from 26 Alzheimer's disease patients,
individual variations in brain structure are calculated based on the amount
of deformation needed to drive each subject's convolution pattern into correspondence
with the group average. Surface matching figures are computed using fluid
flow equations with more than 100 million parameters. This requires parallel
processing and very high memory capacity.
|
Emerge: Portable
biomedical information
retrieval and fusion
|
National Center for Supercomputing
Applications (NCSA) researchers supported by NIH's National Cancer Institute
(NCI) and NSF are addressing a recurring science problem: finding and relating
information scattered across many data sources. To pinpoint the defective
genes that cause cancer cells to run amok, for example, biological researchers
comb the Internet weekly, scanning vast online databases for clues. The
next essential clue to a tumor suppressor could lie hidden in the billions
of bases of human DNA being archived in GenBank, or in any of dozens of
other online databases. A user needs a skilled translator such as Emerge,
a portable collection of information-retrieval programs developed at NCSA.
Emerge translates a single query into the idioms of separate databases,
collects the results, translates them back to a common computer language,
and displays them on a user's screen.
A cancer researcher, for example,
could enter the phrase "small-cell lung cancer" into a form displayed
by a Web browser. The query is converted by Emerge into a data format
called Extensible Markup Language (XML), a versatile relative of HyperText
Markup Language (HTML) that may soon become the foundational data format
on the Web. The query is then sent to Gazebo, the heart of the Emerge
software system. Gazebo translates the query into Z39.50, a standard protocol
recognized by many library catalogs and databases of scientific literature.
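The flow described above (one query, wrapped in XML, translated for each data source, with the results merged into a common format) can be sketched as follows. This is not Emerge or Gazebo code and it does not speak Z39.50. The element names, the two stand-in data sources, and their query idioms are all hypothetical.

```python
import xml.etree.ElementTree as ET

def query_to_xml(phrase):
    """Wrap the user's phrase in a small XML query document."""
    root = ET.Element("query")
    ET.SubElement(root, "term").text = phrase
    return ET.tostring(root, encoding="unicode")

def federated_search(xml_query, backends):
    """Translate one query for each backend and merge the hits into XML.

    `backends` maps a source name to a (translate, search) pair of callables.
    """
    phrase = ET.fromstring(xml_query).findtext("term")
    results = ET.Element("results")
    for name, (translate, search) in backends.items():
        native_query = translate(phrase)         # source-specific idiom
        for hit in search(native_query):
            ET.SubElement(results, "hit", source=name).text = hit
    return ET.tostring(results, encoding="unicode")

# Hypothetical stand-ins for two data sources with different query idioms.
def literature_translate(phrase):
    return " AND ".join(phrase.split())

def literature_search(native_query):
    return [f"article matching [{native_query}]"]

def sequence_translate(phrase):
    return f'"{phrase}"'

def sequence_search(native_query):
    return [f"record matching {native_query}"]

backends = {
    "literature": (literature_translate, literature_search),
    "sequences": (sequence_translate, sequence_search),
}
merged = federated_search(query_to_xml("small-cell lung cancer"), backends)
```

Keeping the per-source translation in separate functions means a new database can be added without changing the code that accepts the query or displays the merged results.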
Because the cancer literature often
uses synonymous terms to describe a single concept, NCI has funded an
interface to the Unified Medical Language System (UMLS) metathesaurus
developed over the past 10 years by NIH's National Library of Medicine
(NLM) to integrate collections of medical terminologies. By next year,
NCI plans to link cancer-related terms with Emerge, allowing patients
to click on a highlighted term and search for related information from
a universe of cancer databases. Physicians and cancer patients and their
families may also discover, with a few clicks of the mouse, how many people
suffer from a particular type of cancer, the status of cancer-related
legislation, and information on potential drug treatments. Emerge is also
part of a comprehensive science information system that has up-to-date
news about research grants.
|
|
Aerospace applications
|
|
Computational
Aerosciences (CAS)
|
The NASA Computational Aerosciences
(CAS) project is working with industry toward the goal of trimming the
time and cost of designing airplanes. Researchers propose to develop the
high end computing hardware and systems and applications software to enable
1,000x speed-ups in systems performance. NASA-supported researchers have
demonstrated a full compressor simulation in 15 hours, 400 times faster
than was possible in 1992. In overnight supercomputing calculations, numerical
propulsion system simulation (NPSS) software will simulate a full range
of engine functions. These simulations let designers try out potential
changes without building and testing real hardware.
CAS is developing a framework to enable
multidisciplinary design optimization of complete aircraft, which requires
enormous computing resources. By integrating two Silicon Graphics, Inc. (SGI)
Origin 2000 systems, CAS created a 100-gigaflop testbed that presents a single
system image with global shared memory. CAS supports the development of cost-effective,
high end computing solutions. For example, CAS found a 92 percent cost savings
for certain design applications using 10 workstations as opposed to a single-processor
supercomputer.
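The figures quoted in this section can be put in perspective with a little arithmetic. The sketch below derives only what follows directly from the stated numbers (15 hours, a factor of 400, and a 92 percent saving). The function names and the normalized baseline cost of 100 are assumptions for illustration.

```python
def baseline_hours(current_hours, speedup):
    """Run time before a speed-up, given the run time after it."""
    return current_hours * speedup

def cost_after_savings(baseline_cost, savings_fraction):
    """Cost that remains after saving a given fraction of the baseline."""
    return baseline_cost * (1.0 - savings_fraction)

# A 15-hour simulation that is 400 times faster than in 1992.
hours_1992 = baseline_hours(15, 400)     # 6000 hours
days_1992 = hours_1992 / 24              # 250 days of continuous computing

# A 92 percent saving leaves about 8 percent of the baseline cost.
remaining_cost = cost_after_savings(100.0, 0.92)
```

In other words, a calculation that now fits in an overnight run would have occupied a 1992 system for most of a year, which is the difference between a simulation that informs a design cycle and one that cannot.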
 |
A simulation of the GE90's high-pressure compressor. NASA-supported CAS
researchers have demonstrated a full compressor simulation in 15 hours, 400
times faster than was possible in 1992.
|
|
Earth
and Space Science (ESS)
|
NASA's ESS R&D applies high end
computing technologies to topics as cosmic as the collisions of ultra-dense
neutron stars and as compelling as the fate of the Earth's climate. ESS,
which is managed by NASA's Goddard Space Flight Center (GSFC) with the
Jet Propulsion Laboratory (JPL), encompasses the following research areas:
- Earth imaging science employs
multiple supercomputing systems to process and visualize satellite-collected
radar data to monitor regional changes in the Earth's environment. A
detailed imaging study that followed seasonal changes in the Amazon
basin was recently completed.
- Relativistic astrophysics combines
fluid dynamics and general relativity to probe the violent merger of
ultra-dense neutron stars. These mergers may be the cause of still-mysterious
hyper-energetic gamma-ray bursts.
- Simulations of the Earth's interior
will gain insight into the chaotic processes that drive changes in the
planet's interior.
- Four-dimensional data assimilation
will build comprehensive views of the atmosphere by merging observations
with climate models.
- Earth system modeling seeks
to understand the Earth's climate via complex high-resolution models
of coupled atmospheric/oceanic circulation and chemistry.
- Fluids in microgravity R&D studies
fluids in low-gravity environments to simulate space-based manufacturing,
life support, and safety systems.
- Convection and dynamos in stars
focuses on the most fundamental and least understood turbulent processes
in the interior of stars such as the sun.
- Multiscale heliosphere modeling
uses computational studies to probe interactions of the solar wind with
the local space environment.
- Solar activity/heliospheric
dynamics investigates the tangled 3-D magnetic structures in the sun's
corona or outer atmosphere. These structures play a key role in the
physics of solar activity.
|
|
|
Advanced chemistry
application

|
Understanding
combustion
|
Most of the energy the world uses
comes from the combustion of fossil fuels. Increases in computing power over
the next few decades will make possible predictive computer models that will
enable us to understand the complex interactions of fluid flow, chemistry,
surface physics, and materials properties that determine the efficiency of
combustion devices as well as the output of undesirable combustion byproducts
such as soot and NOx. In the past year, researchers at DOE's Lawrence
Berkeley National Laboratory (LBNL) have brought to light aspects of methane
combustion that have confounded scientists for a number of years. This research
combined advanced adaptive mesh refinement technologies and a new understanding
of chemical reaction rates to yield simulations that agree closely with experiments.
Future extensions of the research must incorporate the chemistry of more complex
hydrocarbons, such as diesel fuel, which have thousands of reaction pathways,
as well as more realistic surface physics and more complex geometries.
 |
Collaborators at DOE's LBNL and LLNL and the University of California-Davis
have used supercomputers to obtain a complete solution of the ionization
of a hydrogen atom by collision with an electron, the simplest nontrivial
example of the problem's last unsolved component. Pictured at left is a
representative radial wave function of two electrons scattered in the collision
of an electron with a hydrogen atom.
|
|
|
Quantum physics
applications

|
Ionization
by electron impact
|
For over half a century, theorists
have tried and failed to provide a complete solution to scattering in
a quantum system of three charged particles, one of the most fundamental
phenomena in atomic physics. Such interactions abound: Ionization by electron
impact, for example, is responsible for the glow of fluorescent lights
and the ion beams that engrave silicon chips. Collaborators at DOE's Lawrence
Berkeley National Laboratory (LBNL) and Lawrence Livermore National Laboratory
(LLNL), and the University of California-Davis recently used supercomputers
to solve the ionization of a hydrogen atom by collision with an electron, the
simplest nontrivial example of the problem's last unsolved component.
The breakthrough employs a mathematical transformation of the Schrödinger
wave equation in which the wave functions of outgoing particles vanish
at large distances from the nucleus rather than extending to infinity.
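The text does not name the transformation. One well-known transformation with the property described is exterior complex scaling, shown here as an illustrative assumption rather than as the collaborators' stated method. The symbols R_0 (the radius beyond which the coordinate is rotated), eta (the rotation angle), and k (the wavenumber of an outgoing particle) are introduced only for this sketch.

```latex
% Exterior complex scaling of a radial coordinate r beyond a radius R_0:
r \;\longrightarrow\; R(r) =
\begin{cases}
  r, & r \le R_0, \\
  R_0 + (r - R_0)\, e^{i\eta}, & r > R_0,
\end{cases}
\qquad 0 < \eta < \tfrac{\pi}{2}.

% An outgoing wave with wavenumber k > 0 then decays for r > R_0:
e^{ikR(r)} = e^{ikR_0}\, e^{ik(r - R_0)\cos\eta}\, e^{-k(r - R_0)\sin\eta}
\;\xrightarrow[\,r \to \infty\,]{}\; 0 .
```

Inside R_0 the coordinate is unchanged, so the physical wave function there is untouched, while outside R_0 the decaying factor lets the problem be solved on a finite grid with a simple zero boundary condition.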
|
|
Weather applications
|
|
Hurricane
intensity prediction
|
As part of NOAA's efforts to understand
and forecast climate and weather, researchers at the agency's Geophysical
Fluid Dynamics Laboratory (GFDL) seek to:
- Understand the genesis, development,
and decay of tropical disturbances by investigating thermo-hydrodynamical
processes using numerical simulation models
- Study small-scale features of
hurricane systems such as the collective role of deep convection, the
exchange of physical quantities at the lower boundary, and the formation
of organized spiral bands
- Investigate the ability of numerical
models to predict hurricane movement and intensity and transition those
models to operational use
While the GFDL models are excellent
in predicting intensities of weak to moderate hurricanes, better prediction
of surface wind intensities in stronger hurricanes is anticipated in FY 2000-FY
2001 due to increased computing power, a result of HEC R&D, that allows hurricane
models to operate at higher grid resolutions, account for asymmetries in storms,
and improve physical parameterization. During FY 2000, developmental work
to improve the hurricane model initialization, ocean interaction, model physics,
and resolution continues; case studies will be used to evaluate the models'
impact on forecasting skills. Work on assimilating more data into the forecast
and analysis system will continue. The effects of evaporation of rain and
sea spray, together with dissipative heating, will be evaluated.
 |
NOAA-supported researchers have
simulated samples of hurricanes from today's climate and a projected greenhouse
gas-warmed climate by linking information from GFDL's global climate model
into the high-resolution GFDL hurricane prediction model (left). This now
operational model has been used successfully by NOAA's National Centers
for Environmental Prediction to predict tropical storm paths over the last
several hurricane seasons.
|
Hurricanes and global
warming
|
The
strongest hurricanes in the present climate may be upstaged by even more
intense hurricanes over the next century if the Earth's climate continues
to be warmed by increasing levels of greenhouse gases in the atmosphere.
Most hurricanes do not reach their maximum potential intensity before
weakening over land or cooler ocean regions. However, those storms that
do approach their upper-limit intensity are expected to be slightly stronger
in the warmer climate due to the higher sea surface temperatures.
NOAA researchers have simulated
samples of hurricanes from the present-day climate and from a projected
greenhouse gas-warmed climate by linking information from GFDL's global
climate model into the high-resolution GFDL hurricane prediction model.
This is the operational model that has been used by NOAA's National Centers
for Environmental Prediction to predict tropical storm tracks for the
last several hurricane seasons. The simulation projects that wind speeds
in the northwest tropical Pacific will increase by 5-12 percent if tropical
sea surfaces warm by a little more than 2 degrees Centigrade. This study
represents the first use of an operational model to study a phenomenon
that was theorized a decade ago. It illustrates the use of high performance
computing to investigate the potential impact of global climate change
on weather systems.