NSF's new $53-million initiative aims to bring the power and benefits
of terascale computing to
the Nation's college and university campuses. The effort will build
a unique distributed terascale
system outfitted for grid computing that will serve as the experimental
prototype for a wider-scale academic computing infrastructure of
the future.
A longtime champion of improved computing infrastructure for academe,
NSF has, since the mid-1990s, supported development of the Partnerships
for Advanced Computational Infrastructure (PACI), two networks of
universities working to expand campus researchers' access to high-end
computing facilities and tools. The National Partnership for Advanced
Computational Infrastructure (NPACI), based at the University of
California-San Diego's (UCSD) San Diego Supercomputer Center (SDSC),
is a collaboration with 30 other funded partners and 16 domestic and
international affiliates; the National Computational Science Alliance
(the Alliance) is a group of more than 50 academic, government,
and business organizations based at the National Center for Supercomputing
Applications (NCSA) on the campus of the University of Illinois
at Urbana-Champaign.
In FY 2000, NSF commissioned an initial effort to develop a terascale
academic computing platform at the Pittsburgh Supercomputing Center.
That system, built in partnership with Compaq, came online ahead
of schedule in 2001 with a peak performance of 6 teraflops and realized
75 percent of peak with an existing application code. The three-year
Distributed Terascale Facility (DTF) program will add to this developing
academic infrastructure the world's first multisite terascale computing
system, with peak performance of 11.6 teraflops and more than 450
terabytes of storage.
The four sites chosen for DTF awards -
SDSC, NCSA, DOE's Argonne National Laboratory (ANL), and the California
Institute of Technology (Caltech) - will work with primary corporate
partners IBM, Intel, Myricom, Oracle, Qwest, and Sun to build
the computing platforms and link them through a 40-gigabit-per-second
(40 billion bits per second) optical network. The DTF institutions will divide
up the developmental responsibilities:
- NCSA will be the lead in computational aspects with an IBM Linux
cluster using Intel's 64-bit Itanium "McKinley" processors. Peak
performance will be 6.1 teraflops, with 240 terabytes of secondary
storage
- SDSC will head the project's data- and knowledge-management
effort, with a 4-teraflops IBM Linux cluster using McKinley processors,
225 terabytes of storage, and a Sun high-end server for managing
data
- ANL will have a 1-teraflops IBM Linux cluster to host advanced
software for high-resolution rendering, remote visualization,
and grid computing
- Caltech will focus on data, with a 0.54-teraflops McKinley cluster
and a 32-node IA-32 cluster to manage 86 terabytes of online storage
These computational resources will be
woven together as the "TeraGrid," a grid computing framework
using Globus that participants see as the potential blueprint for
tomorrow's advanced
academic computing. The four sites are expected to be linked and
operational in FY 2003.
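The performance figures quoted above can be cross-checked with simple arithmetic. The following Python sketch is illustrative only and not part of the program documentation. The names `DTF_SITE_TERAFLOPS`, `total_peak_teraflops`, and `sustained_teraflops` are hypothetical, and the values are copied from the text. It sums the four sites' peak figures for comparison with the stated 11.6-teraflops aggregate, and applies the 75 percent efficiency reported for the 6-teraflops Pittsburgh system.

```python
# Peak performance per DTF site, in teraflops, as quoted in the text above.
# The dictionary and function names are illustrative assumptions.
DTF_SITE_TERAFLOPS = {
    "NCSA": 6.1,
    "SDSC": 4.0,
    "ANL": 1.0,
    "Caltech": 0.54,
}


def total_peak_teraflops(sites):
    """Sum the peak teraflops across all sites."""
    return sum(sites.values())


def sustained_teraflops(peak, fraction_of_peak):
    """Sustained performance given a peak rate and the fraction of peak realized."""
    return peak * fraction_of_peak


if __name__ == "__main__":
    total = total_peak_teraflops(DTF_SITE_TERAFLOPS)
    print(f"DTF aggregate peak: {total:.2f} teraflops")

    # Pittsburgh system: 6 teraflops peak, 75 percent of peak realized.
    pittsburgh = sustained_teraflops(6.0, 0.75)
    print(f"Pittsburgh sustained: {pittsburgh:.1f} teraflops")
```

Under these figures the four sites total about 11.64 teraflops, consistent with the rounded 11.6 quoted for the DTF, and the Pittsburgh system's realized rate works out to roughly 4.5 teraflops on that application code.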
Scientific Discovery Through Advanced Computing (SciDAC) at DOE
Scientific inquiry at the frontier of complexity requires computational
modeling and simulation, a critical component of research across
all areas of DOE's Office of Science. Today, the astonishing advances
in processing speeds over the last decade have opened the prospect
of new, even more accurate physical, chemical, and biological models.
But these highly complex models can only be effectively developed
by interdisciplinary teams of applications scientists, applied mathematicians,
and computer scientists.
The Office of Science's Scientific Discovery through Advanced Computing
(SciDAC) program, a new initiative in FY 2001, has assembled such
teams to develop the scientific computing software and hardware
infrastructure needed to use terascale computers to advance fundamental
research in areas related to DOE missions. Under the multiyear SciDAC
program, 51 projects have received a total of $57 million to build
terascale capabilities for climate modeling, fusion energy sciences,
chemical sciences, nuclear astrophysics, high-energy physics, and
high-performance computing.
The SciDAC effort will help create a new generation of scientific
simulation codes. The codes will take full advantage of the extraordinary
computing capabilities of terascale platforms to address ever-larger,
more complex problems. The program also includes research on improved
mathematical and computing systems software that will enable these
codes to use modern parallel computers effectively. Collaboratory
software developed within the SciDAC program will enable geographically
separated scientists to use scientific instruments and computers
remotely and work together with distant colleagues as a team, sharing
data more readily.
Selected from more than 150 proposals, the SciDAC activities include
large projects funded for three to five years and smaller projects
supported for three years. Success of the SciDAC program depends
on multidisciplinary teams from universities and laboratories working
in close partnership. The projects involve collaborations among
13 DOE laboratories and more than 50 colleges and universities.
Thirty-three projects are in the biological,
chemical, and physical sciences. Specifically, 14 university projects
will advance the science of climate simulation and prediction. These
projects involve novel methods and computationally efficient approaches
for simulating components of the climate system and work on the
integrated "climate model of the future."
Ten projects will address the areas of
quantum chemistry and fluid dynamics, which are critical for modeling
energy-related chemical transformations such as combustion, catalysis,
and photochemical energy conversion. The scientists involved in
these activities will develop new theoretical methods and efficient
computational algorithms to predict complex molecular structures
and reaction rates with unprecedented accuracy.
Five projects are focused on developing
and improving the physics models needed for integrated simulations
of plasma systems to advance fusion energy science. These projects
will focus on such fundamental phenomena as electromagnetic wave-plasma
interactions, plasma turbulence, and macroscopic stability of magnetically
confined plasmas.
Four projects in high energy and nuclear
physics will significantly extend our exploration of the fundamental
processes of nature. The projects include the search for the explosion
mechanism of core-collapse supernovae, development of a new generation
of accelerator simulation codes, and simulations of quantum chromodynamics.
Seventeen projects will develop the
software infrastructure to support research collaboration using
distributed resources and scientific simulation on terascale computers.
Three Applied Mathematics Integrated Software
Infrastructure Centers will take on the challenge of providing scalable
numerical libraries. The centers will provide new tools for near-optimal
complexity solvers for nonlinear partial differential equations
based on multilevel methods; hybrid and adaptive mesh generation
and high-order discretization techniques for representing complex,
evolving domains; and tools for the efficient solution of partial
differential equations based on locally structured grids, hybrid
particle/mesh simulations, and problems with multiple length scales.
Four Computer Science Integrated Software Infrastructure Centers
will address critical issues in high-performance component software
technology, large-scale scientific data management, understanding
application/architecture relationships for improved sustained performance,
and scalable system software tools for improved management and utility
of systems with thousands of processors.
Four national collaboratory, two middleware,
and four network research projects will have general applicability.
This work will investigate, develop, deploy, and refine the underpinning
software environment that will enable innovative approaches to scientific
computing through secure remote access to shared distributed resources,
large-scale transfers over high-speed networks, and integration
of collaborative tools with the researcher's desktop.
Long-Range Research in Revolutionary Architectures
Radically new component technologies and system architectures are
needed to make it possible to design smaller supercomputing platforms
that cost less to build and maintain but increase speeds, portability,
and scalability. The current generation of U.S. high-end platforms
requires many thousands of square feet of floor space and megawatts
of power. This approach, which packages many commodity multiprocessor
nodes into one large system, is reaching the limits of scalability
and affordability.
The NITRD research agenda supports long-range efforts seeking fundamental
breakthroughs in high-end processor and systems architectures to
reduce the size, cost, and power requirements of platforms and mass
storage devices. This work includes high-risk experimentation with
promising concepts in biomolecular, quantum, and hybrid nanotechnologies
for processing and storage; reconfigurable systems on a chip; systems
architectures integrating component and device technologies; and
programming environments.
To create a supercomputing platform, very large numbers of components
must be brought together and assembled. Achieving maximum possible
computational speeds dictates that all these components be tightly
spaced and closely interconnected. To build such a platform on a
scale that increases portability and scalability will require solution
of the high-end field's most difficult challenges in fundamental
science, including power requirements, thermal
management, component and system architecture miniaturization, and
superconducting switches and interconnects.
Advanced national defense applications are a key area in which
new approaches to high-end computing systems are urgently needed,
a DoD official noted at a March 2002 conference. To achieve the
necessary breakthroughs, DARPA is undertaking an ambitious "High
Productivity Computing Systems" program, which challenges industry,
working with academic research partners, to think "out-of-the-box"
about architectures and component technologies, with the goal of
producing an entirely new commercial system by 2010.
Working through three R&D phases from
innovative concept to technical design to prototype fabrication,
the DARPA development teams will be expected to achieve the following
metrics: 10 to 40 times today's supercomputing performance;
significantly increased productivity through reduced application
development time and operational costs; better portability (application
software insulated from system specifics); and substantially improved
robustness and reliability.
In addition, NSA is leading an effort
involving the national security R&D agencies and user community
to develop a plan for a long-term integrated R&D program in
high-end computing. Congress has asked the Secretary of Defense
to provide the research blueprint by July 2002.