Athena is a cluster system with GPUs and a large per-node memory footprint, running CentOS Linux 5. There are 42 quad-socket nodes with 2.3 GHz AMD Magny-Cours octa-core processors (1,344 cores total) and 64 GB of RAM each (12.4 TFLOP peak). Sixteen of the nodes also have access to eight nVidia S870 Fermi units (four GPUs each) with 6 GB of memory. The GPUs support OpenGL as well as single- and double-precision operations (96 TFLOP single-precision peak), and C++ and Fortran compilers (PGI) are available for them. The nodes are connected via quad-data-rate (QDR) InfiniBand (40 Gb/sec), and 40 TB of file storage are attached to the system. James River Technical/Appro supplied the system. This machine is intended for computation and visualization of large data sets. Its large per-node memory footprint makes it a unique machine, crucial for managing large time-series data and global/serial statistical operations; this directly enables our computational scientists and engineers to tackle bigger problems.
The GPUs in this cluster are primarily intended to accelerate rendering tasks (drawing high-resolution plots, animations, 3D graphics, and video). For more information, see the Visualization section.
Athena is divided into two queues:
- The general compute queue "athena_q" is for non-GPU jobs and has access to nodes athena005-athena009, athena012-athena018, athena023-athena024, athena029-athena033, and athena036-athena042.
- The GPU queue "athenagpu_q" has access to nodes athena001-athena004, athena010-athena011, athena019-athena022, athena025-athena028, and athena034-athena035, with each node having access to two NVIDIA Tesla S870 GPUs.
The two queues are governed by different policies:
| | athena_q | athenagpu_q |
|---|---|---|
| Intended Use | Non-GPU jobs | GPU jobs |
| Max Jobs/User* | 128 (soft) to 256 (hard) | 128 (soft) to 192 (hard) |
| Max Cores/User* | 128 (soft) to 256 (hard) | 128 (soft) to 192 (hard) |
| Max Walltime | 300 hours | 300 hours |
* There are soft and hard limits to the number of jobs and cores per user. The soft limits apply when other users have jobs waiting in the queue. The hard limits define the maximum resources that a given user will ever be able to access at a given time. The two limits help ensure that the system is fully utilized while not allowing a single user to dominate its resources in times of high demand.
Note that Athena is not a "dedicated node" system - i.e., more than one job (potentially from more than one user) can run on a single node. So if you want dedicated access to a node's memory, you must reserve all 32 of the node's cores.
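For example, a job launch script could include the following resource request (a hypothetical fragment; the #PBS directive syntax is described in the Examples section below) to claim a whole node:

```shell
# Hypothetical fragment of a job launch script: requesting all 32 cores
# of a node keeps other jobs off that node, so its full 64 GB of RAM
# is available to your job alone.
#PBS -l nodes=1:ppn=32
echo "requested: 1 node, all 32 cores"
```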
Note also that Athena's scheduler will not spread a job across more nodes than its core request requires. If a job is submitted with a request of 10 nodes and 10 cores per node, the scheduler will instead pack those 100 cores onto as few nodes as possible. This minimizes the number of nodes in use at a given time, leaving more room for larger jobs. It also helps ensure that jobs do not span multiple nodes without utilizing all of the cores on those nodes, which minimizes the risk that a job will consume memory that should be available to other cores.
When you submit a job to one of these queues, you must specify a "walltime" - the time that you expect your job to take to complete. If your job exceeds the walltime estimated during submission, the scheduler will kill it. So it is important to be conservative (i.e., to err on the high side) with the walltime that you estimate.
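The walltime is given as hours:minutes:seconds and can be set either in the job script or directly on the qsub command line (a hypothetical invocation; -l walltime is the standard Torque resource syntax):

```
$ qsub -l walltime=48:00:00 ./JobScript.sh
```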
In general, users should utilize the module avail command to get an up-to-date list of the software available on a given system. However, as of February 2012, the software available on Athena included (but was not limited to):
- CUDA 3.1, 3.2, and 4.0
- GCC 4.4.0
- Linear Algebra:
  - CUDA BLAS
  - BLACS 1.1
  - GotoBLAS 1.26
  - LAPACK 3.2.1
  - ScaLAPACK 1.8.0
  - MKL 10.2.7.041
  - ATLAS 3.9.25
- ABAQUS 6.10
- ANSYS 14.0
- FFTW3 3.2.2
- MATLAB R2010b, R2011b, R2012a
- NAMD 2.7
- OpenFOAM 2.1.1
- ParaView 3.14.1, 3.8.1
- VisIt 2.3.2
Access to applications is enabled via the module command interface to the Modules package. The Modules package provides for the dynamic modification of the user's environment for an application or set of applications. You must use the module command to add the module for an application to your environment in order to use that application. The most common module subcommands are:
- module avail - View a list of available modules
- module list - View a list of modules loaded in your environment
- module add <module name> - Add a module to your environment (Also: module load)
- module rm <module name> - Remove a module from your environment (Also: module unload)
- module purge - Remove all modules from your environment
The module command can be used at the command line and within qsub job launch scripts.
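A typical session might look like the following sketch (the module names and version strings here are illustrative only; run module avail to see what is actually installed):

```
$ module avail            # list everything installed on the system
$ module add cuda/4.0     # add CUDA (hypothetical version string)
$ module load matlab      # "load" is a synonym for "add"
$ module list             # confirm what is now in your environment
$ module purge            # remove everything and start clean
```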
The cluster is accessed via ssh to the login node athena1.arc.vt.edu. Log in using your username (usually Virginia Tech PID) and password. You will need an SSH Client to log in; see here for information on how to obtain and use an SSH Client. You must be on a campus network to access the login node, so off-campus use requires connecting to the CNS VPN first.
Access to cluster compute nodes, including the GPU nodes, is via the Torque/Moab queueing system. Normally one submits batch jobs using the qsub command via qsub scripts. However, you may occasionally have a need for an interactive shell, such as for initial development and testing; if so, you may request interactive shells on compute nodes using interactive qsub commands. Both methods are discussed below:
Submitting Jobs to a Queue
Job submission is done by submitting a job launch script. Example submission scripts are provided in the Examples section below. To submit your job to the queuing system, use the qsub command. For example, if your script is in "JobScript.sh", the command would be: qsub ./JobScript.sh
This will return a job name of the form xxxx.admin01.cm.cluster (e.g., 62.admin01.cm.cluster, where 62 is the job number). To remove a job from the queue, or to stop a running job, use the qdel command. For job number 62, the command would be: qdel 62
To see status information about your job, you can use:
- qstat -f <job number> - Provide detailed information about the job. You can also use qstat -u <username> to see all of your jobs or qstat <queuename> to see all of the jobs in a given queue. Using the -n flag will also list the nodes that each job is using.
- showstart <job number> - Moab command that will tell you expected start and finish times.
- checkjob -v <job number> - Moab command that will provide detailed information about the job.
Note: The Moab commands may report an error of the form "ERROR: cannot locate job '<job name>'" if the scheduler has not yet picked up the newly submitted job. If so, just wait a minute and try again.
When your job has finished running, any output to stdout or stderr will be placed in the files <script name>.o<job number> and <script name>.e<job number>, respectively. These two files will appear in the directory from which you submitted the job.
To find information about all your queued or running jobs you can use the commands qstat and showq. The qstat command without a <job name> argument will show all Athena jobs from the Torque resource manager's perspective. The showq command without arguments will show all of the running jobs over all ARC systems from the Moab scheduler's perspective. If you wish to only view Athena jobs with showq, use
showq -p ATHENA
Note: Users generally find showq to be more useful than qstat.
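Putting these monitoring commands together, a typical check on your jobs might look like this (hypothetical username and job number):

```
$ qstat -u myuser     # all of my jobs, from Torque's perspective
$ qstat -n athena_q   # all athena_q jobs, with the nodes each is using
$ showq -p ATHENA     # all Athena jobs, from Moab's perspective
$ checkjob -v 62      # detailed information on job 62
```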
If you have a job sitting in the queue that you think should be able to run, use the following command to see the reason the job is not running, as shown at the bottom of the output:
checkjob -v <job name>
You can get an interactive shell on a node via the qsub command. See the file /apps/docs/Interactive-qsub-GPU-Example-Commands for example commands. The job will be queued and scheduled as any batch job, but when executed, the standard input, output, and error streams of the job are connected through qsub to the terminal session in which qsub is running. Please note that there will be a delay between submitting an interactive qsub command and receiving a compute node shell.
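An interactive request might look like the following (hypothetical resource values; -I is Torque's standard flag for interactive jobs - see the file noted above for the cluster's own example commands):

```
$ qsub -I -l walltime=2:00:00 -l nodes=1:ppn=8 -q athenagpu_q
qsub: waiting for job 64.admin01.cm.cluster to start
```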
Shell script templates for submission of jobs on Athena:
To utilize these script templates, create your own copy and edit the following:
- The walltime (denoted by #PBS -lwalltime). This is the time that you expect your job to run; so if you submit your job at 5:00pm on Wednesday and you expect it to finish at 5:00pm on Thursday, the walltime would be 24:00:00. Note that if your job exceeds the walltime estimated during submission, the scheduler will kill it. So it is important to be conservative (i.e., to err on the high side) with the walltime that you include in your submission script.
- The number of nodes and cores per node (denoted by #PBS -lnodes...). This is the number of nodes and cores per node that you want to reserve for your job. Each Athena node has 32 cores, so ppn can range from 1 to 32; in general you should try to utilize all of the cores on a node before moving on to multiple nodes. The number of nodes available in each queue is described in the Policies section.
- Modules (denoted by module add ...). Use module add (or module load) commands to add the modules that your job will need to run.
- The part where you actually run your job (beginning with echo "Hello world!"). Add commands here to execute your job - this can be execution of your own program or a call to a software package, such as NAMD.
Once these changes have been made, your script is ready for submission (see Submitting Jobs, above).
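As a concrete illustration, a minimal script along the lines described above might look like this (a sketch only, not ARC's official template; the walltime, node counts, queue, and module name are placeholders to edit for your own job):

```shell
#!/bin/bash
# --- Scheduler directives (comments to the shell; read by qsub) ---
#PBS -l walltime=24:00:00     # kill the job if it runs past 24 hours
#PBS -l nodes=1:ppn=32        # one full Athena node (all 32 cores)
#PBS -q athena_q              # the general (non-GPU) compute queue

# Add the modules your job needs here, e.g.:
#   module add gcc

# Torque starts jobs in your home directory; move to where qsub was run.
cd "${PBS_O_WORKDIR:-.}"

# The part where you actually run your job:
echo "Hello world!"
```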
To compile and run the example OpenMP matrix multiplication program on Athena:
- SSH into Athena (see the login instructions above).
- Download the source code file (link above) and put it in a folder in your /home directory. See here for information on how to transfer files to and from ARC's systems.
- Use the Unix cd command to navigate to that folder.
- Compile the source code file into an executable:
- To use the GNU compiler:
- Type module avail to get a list of available modules. Locate the GNU compiler set that you want to use (e.g. gcc/4.3.4).
- Load the GNU compiler using the path you identified in the previous step: module load gcc/4.3.4
- Compile the source code: gcc openmp_mmult.c -fopenmp -o gcc_mm (Note that here we use the -fopenmp flag since it is a program that uses OpenMP.)
- Your executable is now in the file gcc_mm. The program takes two arguments: the number of threads to use and the dimension of the matrix to multiply. To execute the program with 8 threads and 100x100 matrices, you would use the command: ./gcc_mm 8 100. Doing so here, however, would run the job on a head node, slowing down the system for other users. Therefore, Athena jobs should be submitted to the scheduler using a qsub command (see the next step).
- To use the Intel compiler:
- Type module avail to get a list of available modules. Locate the Intel compiler set that you want to use (e.g. intel/compiler/64/11.1/073).
- Load the Intel compiler using the path you identified in the previous step: module load intel/compiler/64/11.1/073
- Compile the source code: icc openmp_mmult.c -openmp -o icc_mm (Note that here we use the -openmp flag since it is a program that uses OpenMP.)
- Your executable is now in the file icc_mm. The program takes two arguments: the number of threads to use and the dimension of the matrix to multiply. To execute the program with 8 threads and 100x100 matrices, you would use the command: ./icc_mm 8 100. Doing so here, however, would run the job on a head node, slowing down the system for other users. Therefore, Athena jobs should be submitted to the scheduler using a qsub command (see the next step).
- To submit your program to the scheduler, download and open the sample Athena general compute queue submission script.
- Edit the line "#PBS -lnodes=1:ppn=1" to set the number of nodes (lnodes) and processors per node (ppn) that you want to utilize. Since this is an OpenMP (shared memory) program, you cannot use more than one node. The maximum number of processors per node on Athena is 32.
- Replace the "./a.out" line in the sample script with the command that runs your program (e.g. ./gcc_mm 8 100 or ./icc_mm 8 100).
- Save the script file in your /home directory.
- To submit the script, use the qsub command. For example, if you saved your script file as "openmp_mmult.sh", the command would be qsub ./openmp_mmult.sh.
- The system will return your job name of the form xxxx.admin01.cm.cluster (e.g., 53318.admin01.cm.cluster, where 53318 is the job number). Follow the instructions above to use qstat to track the progress of your job, qdel to delete your job, etc. The program output will be held in the file with the extension .o followed by your job number (e.g. "openmp_mmult.sh.o53318").
The GPUs in this cluster are primarily intended to accelerate rendering tasks (drawing high-resolution plots, animations, 3D graphics, and video), enabling cluster visualization via offline batch usage or real-time remote visualization to thin clients. Both of these visualization models are proven and heavily used resources in the national labs and TeraGrid/XSEDE.
For example, Athena includes VisIt and ParaView, open source applications that allow data, computational, and graphic processing to be performed on the cluster. This can be done in two ways:
- Via offscreen batch scripting (e.g. using Python), where the results of the visualization are saved to image or video files.
- During an interactive session where visualization is viewed and manipulated on the client machine in real-time.
Doing the processing where the data is housed substantially accelerates the speed with which complex rendering can occur. The Visionarium provides detailed documentation on these software packages at the links below: