BlueRidge (Sandy Bridge)

Overview / Technical Specifications

BlueRidge is a 408-node Cray CS-300 cluster. Each node is outfitted with two octa-core Intel Sandy Bridge CPUs and 64 GB of memory, for a total of 6,528 cores and 27.3 TB of memory systemwide. Eighteen nodes have 128 GB of memory. In addition, 130 nodes are outfitted with two Intel MIC (Xeon Phi) coprocessors. Each of the 260 coprocessors on BlueRidge has 60 cores clocked at 1.05 GHz and a theoretical double-precision peak performance of approximately 1 teraflop/s. The MIC is compatible with widely used multi-core CPU programming paradigms such as OpenMP and can be exploited in various ways to accelerate scientific computations. For more information on using the MIC coprocessors, see the MIC User Guide.

BlueRidge became available to the Virginia Tech research community in March 2013 with 318 nodes (five with 128 GB of memory), 5,088 cores, and 20.4 TB of memory. At that time, BlueRidge was ranked No. 402 on the Top500 list of the world’s most powerful supercomputers. The MIC coprocessors came online in February 2014, and an additional 90 compute nodes, including 13 with 128 GB of memory, were added in June 2014.

The table below lists the specifications for BlueRidge’s CPUs and MICs:

Specification | CPU (2 per node) | MIC (2 per node*)
Model | Intel Xeon E5-2670 (Sandy Bridge) | Intel Xeon Phi 5110P
Cores | 8 | 60
Clock Speed | 2.60 GHz | 1.05 GHz
Memory | 64 GB (per node) | 8 GB (per card)
L1 Cache | 32 KB (per core) | 32 KB (per core)
L2 Cache | 512 KB (per core) | 512 KB (per core)
L3 Cache | 20 MB (shared) | N/A
Vector Unit | 256-bit (4 DPFP) | 512-bit (8 DPFP)
Theoretical Peak (DP) | 166 Gflop/s | 1,011 Gflop/s

* Two MICs are available on each of BlueRidge nodes 1-130.


Policies

Note: BlueRidge is governed by an allocation manager, meaning that in order to run most jobs on it, you must be an authorized user of an allocation that has been submitted and approved. The open_q queue is available to jobs that are not charged to an allocation, but it has tight usage limits (see below for details) and so is best used for initial testing when preparing allocation requests. For more on allocations, click here.

Each of the first 130 nodes on BlueRidge (br001 to br130) is equipped with two Intel MIC (Xeon Phi) cards. These can be requested by adding :mic to the node request in your submission script. For more information on using these nodes, see the MIC User Guide.

The next four nodes (br131 to br134) are each equipped with two Nvidia Tesla K40m GPUs. The GPUs can be used for visualization and GPGPU programming. These can be requested by adding :gpu to the node request in your submission script.

The last 18 nodes (br391 to br408) are equipped with additional memory (128 GB). These can be requested by adding :highmem to the node request in your submission script.
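For example, adding these properties to the node request line of a submission script:

    #PBS -l nodes=2:ppn=16:mic
    #PBS -l nodes=1:ppn=16:gpu
    #PBS -l nodes=1:ppn=16:highmem

would request, respectively, two MIC-equipped nodes, one GPU node, and one 128 GB node. (The node and core counts shown here are only examples; adjust them to your job.)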

Note that if “normal” nodes fill up, then jobs will automatically overflow onto the mic nodes, then onto the highmem nodes, and finally onto the gpu nodes.

BlueRidge has four queues:

  • normal_q for production (research) runs
  • dev_q for short testing, debugging, and interactive sessions. dev_q provides slightly elevated job priority and 16 dedicated nodes to facilitate code development and job testing prior to production runs.
  • vis_q for interactive visualization.
  • open_q provides access for small jobs and evaluating system features. open_q does not require an allocation; it can be used by new users or researchers evaluating system performance for an allocation request.

The settings for the queues are:

Queue | normal_q | dev_q | vis_q | open_q
Intended Use | Production jobs | Development and testing | Interactive visualization | Small jobs & system evaluation
Available Nodes | br017-br130, br133-br408 | All | br131-br134 | br017-br130, br133-br408
Max Jobs/User | 10 | 1 | 1 | 2
Max Nodes/User | 128 | 128 | 4 | 4
Max Cores/User | 2,048 | 2,048 | 64 | 64
Max Run Time | 144 hours | 2 hours | 4 hours | 4 hours
Max Core-Hours/User* | 36,864 | 128** | 256 | 256
Priority | Normal | Elevated | Normal | Normal

* A user cannot, at any one time, have more than this many core-hours allocated across all of their running jobs. (Core-hours are defined as the number of cores used multiplied by the remaining requested walltime.) For example, this setting means that a user wishing to run a job with the maximum of 2,048 cores will be limited to an 18-hour walltime (36,864/2,048 = 18). Similarly, a 512-core job cannot run for more than 72 hours. This also implies that a job using the maximum runtime of 144 hours cannot use more than 256 cores (36,864/144 = 256).

** This setting means that large development (dev_q) jobs have to be quite short; for example, a 512-core dev_q job can only run for 15 minutes (128/512 = 0.25 hours). This is to ensure that the development queue is not used for production runs.
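To direct a job to one of these queues, include the scheduler's queue option in your submission script. For example, to send a job to the development queue:

    #PBS -q dev_q

(This is the standard PBS queue directive; substitute the queue name appropriate for your job.)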

Software and Compilers

For a list of software available on BlueRidge, as well as a comparison of the software available on all ARC systems, click here.

Note that a user will have to load the appropriate module(s) in order to use a given software package on the cluster. The module avail and module spider commands can also be used to find software packages available on a given system.
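For example, the following commands list the modules currently loaded, list all available modules, and search for a particular package (openmpi is used here only as an illustration):

    module list
    module avail
    module spider openmpi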


Usage

The cluster is accessed via SSH to one of the login nodes. Log in using your username (usually your Virginia Tech PID) and password. You will need an SSH client to log in; see here for information on how to obtain and use an SSH client. You must be on a campus network to access the login nodes, so off-campus use requires connecting to the VT VPN first.
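A typical login looks like the following, where yourpid is your VT PID and <blueridge-login-hostname> is a placeholder for one of the BlueRidge login node addresses:

    ssh yourpid@<blueridge-login-hostname>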

Jobs are submitted to ARC resources through a job queuing system, or “scheduler”. Submitting jobs through a queuing system means that a job may not run immediately; instead, it waits until the resources it requires are available. The queuing system thus keeps the compute servers from being overloaded and allocates dedicated resources to each running job, allowing each job to run optimally once it leaves the queue.

Job management (submission, checking) is described in the Scheduler Interaction tutorial. Please take note of BlueRidge’s policies and queues when creating your submission script. A step-by-step example is provided below.
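At a glance, the basic scheduler commands look like this (the script name, PID, and job number are illustrative):

    qsub br_mpiqd.qsub     # submit a job script to the scheduler
    qstat -u yourpid       # check the status of your queued and running jobs
    qdel 53318             # delete a queued or running job by its job number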


Submission Templates

This shell script provides a template for submission of jobs on BlueRidge. The comments in the script include notes about how to add modules, submit MPI jobs, etc.

To utilize this script template, create your own copy and edit as described here.
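For orientation, a minimal sketch of the structure such a script follows is shown below; the walltime, node request, queue, module names, and allocation name are placeholders to be replaced with your own values, and the full template linked above contains additional notes.

    #!/bin/bash
    #PBS -l walltime=01:00:00
    #PBS -l nodes=1:ppn=16
    #PBS -q normal_q
    #PBS -A youraccount

    # Load the compiler and MPI stack used to build your executable
    module purge
    module load intel mvapich2

    # Run from the directory the job was submitted from
    cd $PBS_O_WORKDIR

    echo "Hello world!"
    mpiexec -np $PBS_NP ./mpiqd

    exit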

Step-by-Step Examples

Note: Step-by-step examples for using the Intel MIC Cards are provided on a separate page.

To compile and run the example MPI Quadrature C program on BlueRidge (a condensed summary of the commands is given after the steps below):

  1. SSH into BlueRidge (See Usage, above).
  2. Download the source code file (link above) and put it in a folder in your Home directory. See here for information on how to transfer files to and from ARC’s systems.
  3. Use the Unix cd command to navigate to that folder.
  4. Compile the source code file into an executable:
    1. To use the Intel (default) compiler and mvapich2 (default) MPI stack:
      1. The compiler module should already be loaded; you can check this with the module list command.
      2. Compile the source code: mpicc -o mpiqd mpi_quad.c (Note that here we use the mpicc command since it is an MPI program.)
      3. Your executable is now in the file mpiqd. To execute the program on 16 cores, you would use the command: mpirun -np 16 ./mpiqd. To do so, however, would run the job on a head node, thereby slowing down the system for other users. Therefore, BlueRidge jobs should be submitted to the scheduler using a qsub command (see the next step).
    2. To use the GNU compiler:
      1. Since the Intel compiler is loaded by default, we’ll need to replace it with the gcc compiler. Type module avail to get a list of available modules. Locate the GNU compiler set that you want to use (e.g. gcc/4.7.2).
      2. Replace the Intel compiler module with the gcc module you identified in the previous step: module swap intel gcc
      3. Compile the source code, as with the Intel compiler (see above), though you may need to add
        #include <stdio.h>
        #include <math.h>

        statements to mpi_quad.c to get it to compile correctly.

    3. To use the OpenMPI MPI stack:
      1. Since the mvapich2 MPI stack is loaded by default, we’ll need to replace it with OpenMPI. Type module avail to get a list of available modules. Locate the OpenMPI module that you want to use. Replace the mvapich2 module with the openmpi module:
        module swap mvapich2 openmpi
      2. Compile the source code, as with the Intel compiler and mvapich2 MPI stack (see above).
  5. To submit your program to the scheduler, download and open the sample BlueRidge submission script.
  6. Edit the script to run your program:
    1. The walltime is set with the command #PBS -l walltime. This is the time that you expect your job to run; so if you submit your job at 5:00pm on Wednesday and you expect it to finish at 5:00pm on Thursday, the walltime would be 24:00:00. Note that if your job exceeds the walltime estimated during submission, the scheduler will kill it. So it is important to be conservative (i.e., to err on the high side) with the walltime that you include in your submission script. The walltime in the sample script is set to one hour; the quadrature code will run quickly so we’ll change the walltime to 10 minutes using the command #PBS -l walltime=00:10:00.
    2. Edit the line #PBS -l nodes=1:ppn=16 to set the number of nodes (nodes) and processors per node (ppn) that you want to utilize. Since BlueRidge has 16 cores per node, ppn should generally be set to 16. The number of nodes available in each queue is described in the Policies section. For this example, we’ll use 32 cores across 2 nodes with the command #PBS -l nodes=2:ppn=16.
    3. Edit the line #PBS -A youraccount to replace youraccount with the name of your allocation.
    4. The MPI quadrature program is a simple code that doesn’t require any special software modules to run. But it does require that you have the compiler and MPI stack that you used to compile it loaded. So use the module purge and module load commands to ensure that the modules that you need are loaded. For more on BlueRidge’s module structure, click here.
    5. Replace everything between the echo "Hello world!" line and (but not including) the exit lines in the sample script with the command to run your job: mpiexec -np $PBS_NP ./mpiqd. ($PBS_NP is simply a variable that holds the number of cores you requested earlier – since we requested 2 nodes and 16 cores each, $PBS_NP will automatically hold 32.)
  7. Your script should look something like this. Save the script file.
  8. Copy the compiled file and script to your Work directory (example command: cp mpiqd $WORK). Running your program from Work (or Scratch) will ensure that it gets the fastest possible read/write performance.
  9. Navigate to your Work directory: cd $WORK
  10. To submit the script, use the qsub command. For example, if you saved your script file as “br_mpiqd.qsub”, the command would be
    qsub br_mpiqd.qsub
  11. The system will return your job identifier, which has the form xxxx.master.cluster (e.g., 53318.master.cluster, where 53318 is the job number). Follow the instructions above to use qstat to track the progress of your job, qdel to delete your job, etc.
  12. When complete, the program output will be held in the file with the extension .o followed by your job number (e.g. “br_mpiqd.qsub.o53318”). Any errors will be held in the analogous .e file (e.g. “br_mpiqd.qsub.e53318”).
  13. Work and Scratch are wiped periodically, so to ensure that you have long-term access to the results, copy them back to your Home directory:
    cp br_mpiqd.qsub.o53318 $HOME
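For reference, once the submission script is ready the steps above condense to the following sequence of commands (the job number 53318 is illustrative):

    # Compile with the default Intel compiler and mvapich2 MPI stack
    mpicc -o mpiqd mpi_quad.c

    # Copy the executable and submission script to Work, then submit
    cp mpiqd br_mpiqd.qsub $WORK
    cd $WORK
    qsub br_mpiqd.qsub

    # Check progress, then copy the output back to Home when the job finishes
    qstat -u yourpid
    cp br_mpiqd.qsub.o53318 $HOME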