Advanced Research Computing at Virginia Tech
VT-ARC is a unit within the Office of the Vice President of Information Technology.
Mission
Advanced Research Computing (ARC) at Virginia Tech is an innovative
and interdisciplinary environment advancing computational science,
engineering and technology. Its mission is to:
- Provide leadership, advanced infrastructure and support to invigorate computational science
and engineering at Virginia Tech
- Provide partnerships and support for joint faculty appointments in VT academic departments,
building areas of excellence in computational science and engineering
across disciplines, and providing opportunities for innovation in
scientific computing
- Offer educational programs and training on
scientific computing, encouraging the development of knowledge and
skills in computational tools and techniques for undergraduate and graduate
students, research faculty, and staff
- Offer programs to stimulate and expand
interdisciplinary and computationally driven research activity at VT,
including visiting researcher, travel, events, distinguished
postdoctoral fellow, and graduate student programs that provide new
sources of support for collaboration, research, and development
- Affiliate with business, industry, and government to help drive economic
development in Virginia by building connections between research
and applications for emerging tools and techniques in computational
science and engineering, and by establishing research agreements that
facilitate knowledge creation and application in industry
- Collaborate with other research centers driven by computational science and engineering
in advancing knowledge and leading the evolution of scientific computing
tools, techniques, and facilities that accelerate scientific discovery
The Virginia Tech Data Center is located within the 51,000 square-foot Andrews
Information Systems Building (AISB) in the Virginia Tech Corporate
Research Center. This building contains office areas; the Data
Center, a 12,000 square-foot facility that already houses the System X
supercomputer as well as the university's main computing systems; and a
telecommunications switch center. This building is a secure facility
with emergency power, an electronic access control system, surveillance
cameras, and, after normal business hours, security guards. The Data
Center is protected against fire by a Halon gaseous automatic fire
suppression system.
User Support
Accelerating time to discovery depends critically on effective user
support sustained by a realistic and manageable operating budget.
Virginia Tech's long track record in research computing, and especially
our recent experience with System X, gives us the ability to meet the
requirements of a broad HPC user community in an extremely
cost-effective manner. The sine qua non of HPC user requirements is high
availability and consistent access to the resource. If users cannot
plan for, submit, monitor, and retrieve results from their jobs in a
predictable and intuitive way, then the center is a failure.
The Systems Support Department provides a "5 deep" call list of trained
system administrators to handle hardware and operating system issues.
Customized scripts and automated tools are used to efficiently manage
tasks such as log review, patch analysis, security and performance
monitoring, and hardware status. Similarly, CNS is set up with an
extensive cadre of network engineers available and on-call to respond to
communications infrastructure issues. A user of an HPC system needs
only to contact the Virginia Tech Operations Center (VTOC), staffed 24/7 (363
days per year), to report an issue. The VTOC handles all IT issues for
the VT central and remote campuses, as well as users of "Network
Virginia," a 1.5 million connection network serving users across the
Commonwealth. Problem dispatch can be accomplished and tracked by a
Problem Reporting System (Remedy). Issues can be passed to Systems
Support staff, CNS staff, or the specific user support group assigned to
a particular research area or researcher.
Application support is provided Monday-Friday, 8:00 a.m. - 5:00 p.m.
EST. The primary goals of the application support group are to assist
users in accessing and using the system, to help users resolve
application errors encountered on the system, and to provide user training.
This group also works with users on refining and tuning application
programs to run on the system, porting community codes to the system,
and developing new applications to run on
cluster-based systems.
Surveys, offered through the VTOC and other support groups working
directly with the researchers, are typically conducted on an annual basis.
User feedback and the results of users' computational work are the primary measure of
satisfaction and performance of the system. Other metrics, including
job success/job abort ratio, computing resources used in a given period,
and active storage, are also collected and charted for trend
analysis. These metrics, along with user feedback and input through
surveys and other avenues, will also help determine directions for
increasing capacity and adding software components.
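As a rough illustration of how two of the metrics named above could be derived from job accounting data, the following Python sketch computes a job success/job abort ratio and the computing resources used per period. The JobRecord fields, the period labels, and the sample values are hypothetical and are not taken from any ARC system or scheduler; a real implementation would read from the scheduler's own accounting logs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class JobRecord:
    """Hypothetical accounting record for one finished job."""
    period: str        # reporting period label, e.g. "2024-01" (illustrative)
    succeeded: bool    # True if the job completed, False if it aborted
    core_hours: float  # computing resources consumed by the job


def success_abort_ratio(jobs):
    """Return successful jobs divided by aborted jobs.

    Returns infinity when no job aborted, to avoid dividing by zero.
    """
    succeeded = sum(1 for job in jobs if job.succeeded)
    aborted = len(jobs) - succeeded
    return succeeded / aborted if aborted else float("inf")


def core_hours_by_period(jobs):
    """Sum core hours per reporting period, for charting trends over time."""
    totals = defaultdict(float)
    for job in jobs:
        totals[job.period] += job.core_hours
    return dict(totals)


# Made-up sample data, for illustration only.
sample_jobs = [
    JobRecord("2024-01", True, 128.0),
    JobRecord("2024-01", False, 16.0),
    JobRecord("2024-02", True, 64.0),
    JobRecord("2024-02", True, 256.0),
]

print(success_abort_ratio(sample_jobs))
print(core_hours_by_period(sample_jobs))
```

Keeping the totals keyed by period makes it straightforward to chart them alongside survey results when deciding where to add capacity.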