Information Technology: 21st
Century Revolution
DOE's ASCI Program
|

Overview
|
DOE's Accelerated Strategic Computing Initiative (ASCI) applies advanced
capabilities in scientific and engineering computing to one of the most
complex challenges in the nuclear era: maintaining the performance, safety,
and reliability of the Nation's nuclear weapons without physical testing.
A critical component of the agency's Stockpile Stewardship Program (SSP),
ASCI research develops computational and simulation technologies to help
scientists understand aging weapons, predict when components will have
to be replaced, and evaluate the implications of changes in materials
and fabrication processes for the design life of aging weapons systems.
ASCI was established in FY 1996 in response to the Administration's commitment
to pursue a comprehensive ban on nuclear weapons testing.
ASCI researchers are developing high end computing capabilities far above
the current level of performance, and advanced simulation applications
that can reduce the current reliance on empirical judgments by achieving
higher resolution, higher fidelity, 3-D physics and full-system modeling
capabilities for assessing the state of nuclear weapons. DOE has established
an FY 2004 deadline for attaining working ASCI hardware and software and
FY 2010 as the full implementation date for ASCI's R&D products.
ASCI activities center on development
of the advanced applications software, more powerful computing platforms,
and high end computational infrastructures needed to achieve four major
objectives of the R&D effort:
- Performance: Create predictive simulations of nuclear weapons systems to
analyze behavior and assess performance in an environment without nuclear
testing
- Safety: Predict with high certainty the behavior of full weapons systems
in complex accident scenarios
- Reliability: Achieve sufficient validated predictive simulations to extend
the lifetime of the stockpile, predict failure mechanisms, and reduce
routine maintenance
- Renewal: Use virtual prototyping and modeling to understand how new
production processes and materials affect performance, safety, reliability,
and aging. This understanding will help define the right configuration of
production and testing facilities necessary for managing the stockpile
throughout the next several decades.
DOE's three national Defense Program (DP) laboratories (LANL, LLNL, and
SNL) collaborate on ASCI-related activities.
|
|
PathForward
|
The performance simulation and virtual prototyping applications required
for the SSP call for far more powerful computing platforms than the industry
now produces. ASCI's strategy, called PathForward, is to build the high
end computing systems the program requires by scaling commercially viable
building blocks, both hardware and software, to 30 teraops (30 trillion
computing operations per second) and beyond. PathForward has established multiple
partnerships with computer companies, government agencies, and academia
to develop and accelerate technologies that are either not in the current
business plans of manufacturers or not expected to be available in the timeframe
or scale required by ASCI. For example, DOE and its three DP laboratories
are partnering on a cost-sharing basis with Compaq/DEC, IBM, SGI, Cray,
and Sun Microsystems to develop and engineer high-bandwidth, low-latency
technologies to interconnect the 10,000 commodity processors needed to build
a 30 teraops computer. In another PathForward partnership, DOE, DoD, and
NASA are developing high performance storage technologies to reduce the
physical size of ultra-scale data storage systems while significantly improving
the speed at which data can be written into these systems. Such attributes
are vital due to the massive storage requirements of the complex SSP simulations.
The goal of this research is to develop optical tape drive technologies
that can write 25 MBps (25 million bytes per second) to an optical tape
cartridge with a capacity of 1 terabyte (1 trillion bytes) in a
conventionally sized unit.
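The figures quoted above imply some simple per-unit numbers. The following is a minimal back-of-envelope sketch in Python that uses only the values stated in the text; the function and constant names are illustrative, and the results assume ideal conditions (perfect scaling across processors and a sustained full write rate).

```python
# Back-of-envelope figures derived only from the numbers quoted in the text.
# Names are illustrative; results assume ideal scaling and sustained rates.
TERAOPS_TARGET = 30e12      # 30 teraops, in operations per second
PROCESSORS = 10_000         # commodity processors in the target system
TAPE_WRITE_RATE = 25e6      # 25 MBps, in bytes per second
CARTRIDGE_CAPACITY = 1e12   # 1 terabyte, in bytes


def ops_per_processor(total_ops, processors):
    """Average operations per second each processor must deliver."""
    return total_ops / processors


def fill_time_hours(capacity_bytes, rate_bytes_per_s):
    """Hours needed to fill a cartridge at a constant write rate."""
    return capacity_bytes / rate_bytes_per_s / 3600


per_cpu = ops_per_processor(TERAOPS_TARGET, PROCESSORS)
hours = fill_time_hours(CARTRIDGE_CAPACITY, TAPE_WRITE_RATE)
print(f"{per_cpu / 1e9:.1f} billion operations per second per processor")
print(f"{hours:.1f} hours to fill one cartridge at the full write rate")
```

Under these assumptions each processor would need to deliver about 3 billion operations per second, and writing a full cartridge would take roughly 11 hours.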
|
|
ASCI computing
platforms
|
A new IBM system dubbed "Baby Huey," the scalable prototype for a 10-teraops
system that will be the world's fastest computing platform, has been installed
at LLNL. Baby Huey consists of 16 IBM Nighthawk 1 nodes with a peak computing
capacity of 114 Gflops and 32 GB of memory. It is running ASCI applications
to evaluate the performance of IBM's latest 64-bit computing technology in
preparation for the scheduled full installation in June 2000 of ASCI's
10-teraops "Option White" platform, a massively parallel system consisting
of 512 multiprocessor nodes. For comparison, "Deep Blue," the system that
defeated the world chess champion in a highly publicized series of matches
in 1997, was a single 32-node RS/6000 system (with specialized chess
co-processors). Housing Option White requires 17,000 square feet of floor
space and over 6.2 megawatts of electricity for power, cooling, and mechanical
equipment, enough electricity to supply a small town with air conditioning.
Option White is ASCI's fourth custom-built high-speed platform and the latest
step toward the goal of having a 100-teraops system in place by 2004.
|
|
Visual Interactive
Environment for
Weapons Simulation
(VIEWS)
|
ASCI's computing platforms enable DOE-supported scientists to store, retrieve,
and manipulate complex data on a scale not possible on any other computing
system. But researchers must develop equally advanced tools for organizing,
managing, and visualizing the vast 3-D data sets representing the physical,
chemical, and engineering properties of the nuclear weapons stockpile. ASCI
applications will use extremely high fidelity 3-D models, on the order of
one billion cells, to generate terabytes of raw data, a volume of information
that would overwhelm scientists attempting to analyze it without tools
that help them manage it and "see" what it means. VIEWS integrates
high performance storage, high-speed networking, visualization hardware,
and advanced data exploration and management software to provide the capabilities
for high-level scientific data analyses.
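To give a sense of how a billion-cell model reaches terabytes of raw data, the sketch below multiplies out one plausible case. Only the cell count comes from the text; the number of variables per cell, the bytes per value, and the number of saved time steps are assumptions chosen for illustration, not ASCI figures.

```python
# Rough data-volume estimate for a billion-cell 3-D model.
CELLS = 1_000_000_000      # from the text: on the order of one billion cells
VARIABLES_PER_CELL = 10    # assumption: e.g. density, pressure, velocity...
BYTES_PER_VALUE = 8        # assumption: double-precision values
SAVED_TIME_STEPS = 100     # assumption: snapshots written during one run


def snapshot_bytes(cells, variables, bytes_per_value):
    """Raw size of one saved snapshot of the whole model."""
    return cells * variables * bytes_per_value


snap = snapshot_bytes(CELLS, VARIABLES_PER_CELL, BYTES_PER_VALUE)
total = snap * SAVED_TIME_STEPS
print(f"{snap / 1e9:.0f} GB per snapshot, {total / 1e12:.0f} TB per run")
```

With these assumed values a single snapshot is about 80 GB and a run of 100 snapshots is about 8 TB; different assumptions scale the result proportionally.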
The hardware infrastructure required to support this work includes a high
performance scalable network of graphics workstations, visualization servers,
and storage systems, all connected to the terascale computing platforms
via high-speed interconnects. Through technologies such as video fiber modems,
image compression and transmission, and hardware-parallel visualization
systems, ASCI researchers can transmit real-time high-quality images from
the visualization servers and supercomputers into offices and graphics labs.
But the enormous sizes of ASCI data sets render existing visualization software
ineffective for interactive data interrogation, that is, manipulating the
data and visual representation for purposes of analysis in a distributed
computing environment. Researchers are exploring parallel and demand-driven
visualization strategies, including multiresolution and hierarchical techniques
to moderate the data levels needed for visualization, and parallel and distributed
algorithms to increase system capacity to meet scientists' visualization
requirements.
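The idea behind multiresolution, demand-driven visualization can be illustrated with a small sketch. The Python code below is not ASCI software; it assumes a cubic grid whose side is a power of two, builds a hierarchy of coarser copies by averaging 2x2x2 blocks, and then picks the finest level that fits a viewer's cell budget. All function names are hypothetical.

```python
# Illustrative multiresolution pyramid for a cubic 3-D grid (nested lists).
# Assumes the grid side length is a power of two.

def downsample(grid):
    """Halve the resolution by averaging each 2x2x2 block of cells."""
    n = len(grid) // 2
    return [[[sum(grid[2 * i + di][2 * j + dj][2 * k + dk]
                  for di in (0, 1) for dj in (0, 1) for dk in (0, 1)) / 8
              for k in range(n)]
             for j in range(n)]
            for i in range(n)]


def build_pyramid(grid):
    """Return [full resolution, half, quarter, ...] down to a single cell."""
    levels = [grid]
    while len(levels[-1]) > 1:
        levels.append(downsample(levels[-1]))
    return levels


def pick_level(levels, max_cells):
    """Demand-driven choice: the finest level within the cell budget."""
    for level in levels:
        if len(level) ** 3 <= max_cells:
            return level
    return levels[-1]


size = 8
grid = [[[float(i + j + k) for k in range(size)]
         for j in range(size)]
        for i in range(size)]
levels = build_pyramid(grid)
view = pick_level(levels, max_cells=100)
print([len(level) ** 3 for level in levels])  # cells at each level
print(len(view))                              # side length of chosen level
```

Each coarser level holds one eighth of the cells of the level above it, so a viewer can start from a small level and request finer data only where it is needed.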
|
|
Scientific Data
Management (SDM)
|

VIEWS research is developing an organizational framework to speed and
enhance a user's ability to browse and search the complex SSP data collections
by integrating SDM-developed application software libraries and Web/Java-based
components with commercial databases, mass storage systems, networking,
and computing infrastructure. SDM efforts focus on capturing and sharing
simulation data from application codes; organizing, searching, and managing
a variety of data; and automating the computer-based data discovery process.
To have large-scale and diverse data sets flow smoothly among physics
applications and between calculation and analysis, the data must be modeled
in a machine- and application-neutral way. To meet this challenge, SDM researchers
developed a common data model for ASCI simulation data based on principles
from topology, along with a common API, and are working on metadata strategies
to improve data organization and management. Such metadata range from documentation
of the size, type, and creation date of a data set to a scientist's notes
about the data. The researchers have developed a number of accompanying
tools, including a calculation summary, a Web-accessible knowledge base
of weapons data archives, a metadata editing and browsing capability, and
Data Discovery, a suite of techniques and tools for automated querying,
representation, and extraction of information from terascale simulation
data sets.
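The kind of metadata described above (size, type, creation date, and a scientist's notes) can be pictured as a simple record attached to each data set. The sketch below is a hypothetical illustration in Python, not the SDM data model or API; the class, field, and function names are invented for this example, and it covers only the metadata, not the topology-based model of the simulation data itself.

```python
# Hypothetical metadata record and catalog search; not the SDM API.
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class DatasetMetadata:
    name: str
    size_bytes: int
    data_type: str                 # e.g. "mesh" or "time series"
    created: date
    notes: List[str] = field(default_factory=list)  # free-text annotations


def search(catalog: List[DatasetMetadata],
           data_type: Optional[str] = None,
           keyword: Optional[str] = None) -> List[DatasetMetadata]:
    """Return entries matching a type and/or a keyword found in the notes."""
    results = []
    for entry in catalog:
        if data_type is not None and entry.data_type != data_type:
            continue
        if keyword is not None and not any(
                keyword.lower() in note.lower() for note in entry.notes):
            continue
        results.append(entry)
    return results


catalog = [
    DatasetMetadata("run_a", 2 * 10**12, "mesh", date(2000, 1, 15),
                    ["Baseline case", "Coarse mesh near boundary"]),
    DatasetMetadata("run_b", 5 * 10**11, "time series", date(2000, 2, 3),
                    ["Sensor traces for baseline comparison"]),
]
hits = search(catalog, keyword="baseline")
print([entry.name for entry in hits])
```

Keeping the record independent of any one application code is what lets the same search work across data sets produced by different simulations.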
|
|
Problem Solving
Environment (PSE)
|

ASCI's Problem Solving Environment (PSE) consists of the high end computational
technologies and tools needed to conduct advanced scientific analyses on
a secure, very high performance distributed computing system. PSE teams'
work includes:
- ASCI distributed computing environment (DCE). The DCE team is establishing
a common set of secure distributed computing services for use at each of
ASCI's supercomputer sites, focusing on the middleware that enables desktop
users to work smoothly across the heterogeneous ASCI network using
heterogeneous computing and operating systems.
- High performance storage system (HPSS). More than 20 organizations
representing academia, industry, and the Federal government (DOE, NASA,
NOAA, and NSF) are collaborating to prototype a next generation storage
system based on commercially available products.
- Accelerated data transfer. Moving large amounts of data from one device
to another is one of computing's most time-consuming tasks. ASCI researchers
linked a new data-moving protocol to the HPSS parallel file transfer
protocol, enabling dramatic improvements in data transfer rates.
- Scalable linear solvers. Because many currently used algorithms are not
scalable, the computational workload for larger problems grows faster than
the optimal linear rate. ASCI researchers are developing scalable algorithms
that decrease computation time. For example, these algorithms can reduce
a two-day run on a massively parallel processing machine to 30 minutes.
Applications include studying complex physical phenomena in areas such as
energy, the environment, and biological systems.
- Scalable input/output (I/O). This research aims to speed up the transfer
of data through the various hardware and software components of a
supercomputing system. The goal is an overall improvement in "end-to-end"
performance, achieved by increasing data transfer speeds at every layer
between an application and hardware.
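The scalability point made for linear solvers can be shown with a small calculation. In the Python sketch below, the two-day and 30-minute run times come from the text; the cost exponents are hypothetical and serve only to contrast linear growth in workload with faster-than-linear growth.

```python
# Illustrative cost model: work grows as n**exponent for problem size n.
# The exponents are assumptions, not measured solver behavior.

def work(n, exponent):
    """Abstract operation count for a solver whose cost grows as n**exponent."""
    return n ** exponent


def growth_factor(scale, exponent):
    """Factor by which work grows when the problem size grows by `scale`."""
    return work(scale, exponent) / work(1, exponent)


# Speedup implied by the example in the text: two days reduced to 30 minutes.
speedup = (2 * 24 * 60) / 30

for exponent in (1.0, 1.5, 2.0):
    factor = growth_factor(10, exponent)
    print(f"exponent {exponent}: 10x larger problem needs {factor:.1f}x work")
print(f"two days to 30 minutes is a {speedup:.0f}x reduction in run time")
```

A solver with linear cost needs 10 times the work for a problem 10 times larger, while one with quadratic cost needs 100 times the work; the run-time example in the text corresponds to a 96-fold reduction.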
|
|
Academic Strategic
Alliances Program
(ASAP)
|

By supporting ongoing technical interactions between ASCI researchers and
leading-edge academic R&D, ASAP accelerates new developments in simulation
science and high performance technologies for computer modeling. In ASAP
Level One, ASCI has established five major university centers to engage
in long-term, large-scale unclassified research in simulation science and
computational mathematics on advanced scientific problems. In Level Two,
ASCI supports strategic investigations: smaller, discipline-oriented projects
in computer science and computational mathematics critical to ASCI's success.
ASAP Level Three supports individual collaborations focused on near-term
ASCI research.