NewRiver

Overview

NewRiver is a 173-node system capable of tackling the full spectrum of computational workloads, from problems requiring hundreds of compute cores to data-intensive problems requiring large amounts of memory and storage. NewRiver contains six compute engines designed for distinct workloads.

  • General – Distributed, scalable workloads. With two of Intel’s latest-generation Haswell processors, 24 cores, and 128 GB memory on each node, this 100-node compute engine is suitable for traditional HPC jobs and large codes using MPI.
  • Big Data – High performance data analytics. With 43.2 TB of direct-attached storage for each node, this system enables processing and analysis of massive datasets. The 16 nodes in this compute engine also have 512 gigabytes (GB) of memory, making them suitable for jobs requiring large memory.
  • Visualization (K80 GPU) – Data visualization. Nvidia K80 GPUs in this system can be used for remote rendering, allowing large datasets to be viewed in place without lengthy data transfers to desktop PCs. The 8 nodes in this compute engine also have 512 GB of memory, making them suitable for jobs requiring large memory.
  • GPU (P100) – Code acceleration. Nvidia P100 GPUs in this system can be used for code acceleration. There are 40 nodes in this compute engine, but one node is reserved for system maintenance. Each node also has 512 GB of memory, making these nodes suitable for jobs requiring large memory.
  • Interactive – Rapid code development and interactive usage. Each of the eight nodes in this compute engine has 256 GB of memory and a K1200 graphics card. A browser-based, client-server architecture allows for a responsive, desktop-like experience when interacting with graphics-intensive programs.
  • Very Large Memory – Graph analytics and very large datasets. With 3 TB (3,072 GB) of memory, four 15-core processors, and six direct-attached hard drives, each of the two servers in this system enables analysis of large, highly connected datasets, in-memory database applications, and faster solution of other large problems.

Technical Specifications

The table below lists the specifications for NewRiver’s compute engines. View a detailed schematic of NewRiver and the ARC Core Network here.

Compute Engine | # | Hosts | CPU | Cores | Memory | Local Disk | Other Features
General | 100 | nr027-nr126 | 2 x E5-2680v3 2.5 GHz (Haswell) | 24 | 128 GB, 2133 MHz | 1.8 TB 10K RPM SAS |
Big Data | 16 | nr003-nr018 | 2 x E5-2680v3 2.5 GHz (Haswell) | 24 | 512 GB | 43.2 TB (24 x 1.8 TB) 10K RPM SAS | 2 x 200 GB SSD
Visualization (K80 GPU) | 8 | nr019-nr026 | 2 x E5-2680v3 2.5 GHz (Haswell) | 24 | 512 GB | 3.6 TB (2 x 1.8 TB) 10K RPM SAS | NVIDIA K80 GPU
GPU (P100) | 39 | nr127-nr165 | 2 x E5-2680v4 2.4 GHz (Broadwell) | 28 | 512 GB | 2 x 200 GB SSD | 2 x NVIDIA P100 GPU
Interactive | 8 | newriver1-newriver8 | 2 x E5-2680v3 2.5 GHz (Haswell) | 24 | 256 GB | | NVIDIA K1200 GPU
Very Large Memory | 2 | nr001-nr002 | 4 x E7-4890v2 2.8 GHz (Ivy Bridge) | 60 | 3 TB | 10.8 TB (6 x 1.8 TB) 10K RPM SAS |

Notes:

  • Theoretical peak (CPU-only) performance of a single NewRiver Haswell node is 960 GFlop/s, compared with 330 GFlop/s for a BlueRidge node. In testing, threaded matrix multiplies (dgemm) peaked at ~800 GFlop/s for gcc/ATLAS, ~780 GFlop/s for Intel/MKL, and ~730 GFlop/s for gcc/OpenBLAS.
  • 1.8 TB NVMe drives on BigData nodes will be made available soon.
  • K80 GPU note: although each K80 is a single physical card in one PCIe slot, it contains two separate GPU chips. CUDA code will therefore see two devices per K80; nvidia-smi will show this as well.
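
For example, the two chips can be confirmed from a shell on an allocated K80 node (a quick check, not specific to NewRiver):

nvidia-smi -L    # lists each GPU chip as a separate CUDA device; one K80 card appears as two entries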

Network

  • 100 Gbps EDR-Infiniband provides low latency communication between compute nodes for MPI traffic.
  • Compute nodes are connected to storage (GPFS) via 100 Gbps EDR-IB. They can also be connected via dual 10 Gbps Ethernet.
  • Interactive nodes are connected to the public internet by 10 Gbps Ethernet.

Policies

Note: NewRiver is governed by an allocation manager, meaning that in order to run most jobs on it, you must be an authorized user of an allocation that has been submitted and approved. The open_q queue is available to jobs that are not charged to an allocation, but it has tight usage limits (see below for details) and so is best used for initial testing in preparing allocation requests. For more on allocations, click here.

NewRiver has seven queues:

  • normal_q for production (research) runs.
  • largemem_q for production (research) runs on the large memory nodes.
  • dev_q for short testing, debugging, and interactive sessions. dev_q provides slightly elevated job priority to facilitate code development and job testing prior to production runs.
  • vis_q for interactive visualization on the GPU nodes.
  • open_q provides access for small jobs and evaluating system features. open_q does not require an allocation; it can be used by new users or researchers evaluating system performance for an allocation request.
  • p100_normal_q for production (research) runs on the P100 GPU nodes.
  • p100_dev_q for short testing, debugging, and interactive sessions on the P100 GPU nodes. p100_dev_q provides slightly elevated job priority to facilitate code development and job testing prior to production runs.

The settings for the queues are:

Queue | normal_q | largemem_q | dev_q | vis_q | open_q | p100_normal_q | p100_dev_q
Access to | nr004-nr018, nr029-nr126 | nr001-nr002 | nr003-nr018, nr027-nr126 | nr019-nr026 | nr004-nr018, nr029-nr126 | nr131-nr165 | nr127-nr165
Max Nodes* | 32 per user, 48 per allocation | N/A | 32 per user | 2 per user | 4 per user | 12/24 per user, 18/24 per allocation | 12/24 per user
Max Jobs | 6 per user, 12 per allocation | 1 per user | 1 per user | 1 per user | 1 per user | 8 per user, 16 per allocation | 1 per user
Max Cores* | 768 per user, 1,152 per allocation | 60 per user | 768 per user | 48 per user | 96 per user | 336/672 per user, 504/672 per allocation | 336/672 per user
Max Memory* | 4 TB per user, 6 TB per allocation | 3 TB per user | 4 TB per user, 6 TB per allocation | 384 GB per user | 1 TB per user | 6/12 TB per user, 9/12 TB per allocation | 6/12 TB per user
Max Walltime | 144 hr | 144 hr | 2 hr | 4 hr | 4 hr | 144 hr | 2 hr
Max Core-Hours* | 27,648 per user, 41,472 per allocation | 8,640 per user | 192 per user | 192 per user | 192 per user | 16,128 per user, 24,192 per allocation | 168/336 per user

*Entries with two values (e.g., XX/YY) represent soft and hard limits, respectively. Hard limits apply when the queue is not busy (there are idle resources); soft limits apply when the queue is full. The two different limits allow users to use more computational resources when those resources are not in demand.

Other notes:

  • Shared node access: more than one job can run on a node. (Note: this is different from other ARC systems.)

Software

For a list of software available on NewRiver, as well as a comparison of software available on all ARC systems, click here.

Note that a user will have to load the appropriate module(s) in order to use a given software package on the cluster. The module avail and module spider commands can be used to find software packages available on a given system.
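
For example, a typical sequence for finding and loading software looks like the following (the package names are illustrative placeholders; check the output of module avail or module spider for what is actually installed on NewRiver):

module avail                  # list modules available with the currently loaded compiler/MPI stack
module spider hdf5            # search the full module tree for a package (hdf5 is only an example)
module load intel mvapich2    # load a compiler and MPI stack (assumed names; adjust to what module avail shows)
module list                   # show which modules are currently loaded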

Usage

NewRiver provides two access methods: traditional terminal access and browser-based access for an improved graphical experience.

Terminal Access

The cluster is accessed via ssh to one of the login nodes below. Log in using your username (usually Virginia Tech PID) and password. You will need an SSH Client to log in; see here for information on how to obtain and use an SSH Client.

  • newriver1.arc.vt.edu
  • newriver2.arc.vt.edu
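
For example, from a terminal (replace yourpid with your Virginia Tech PID):

ssh yourpid@newriver1.arc.vt.edu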

High-Performance Remote Application Access: ETX

NewRiver offers web browser-based access to interactive nodes via ETX. This interface provides faster, more interactive access to graphical user interfaces than standard X11 forwarding (e.g. ssh -X). ETX will automatically load balance users between the eight interactive nodes.

To access it, use a web browser (e.g. Firefox or Safari, not Chrome) to go to

http://newriver.arc.vt.edu

For more information, see the ETX page.

Job Submission

Access to all compute engines (aside from interactive nodes) is controlled via the job scheduler. See the Job Submission page here. The basic flags are:

#PBS -l walltime=dd:hh:mm:ss
#PBS -l [resource request, see below]
#PBS -q normal_q (or other queue, see Policies)
#PBS -A <yourAllocation> (see Policies)
#PBS -W group_list=newriver
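
Once these directives are in a script, the job is submitted and monitored with the standard PBS/Torque commands, for example (myjob.qsub, yourpid, and the job ID are placeholders):

qsub myjob.qsub      # submit the job script; prints the job ID
qstat -u yourpid     # list the status of your jobs
qdel <jobid>         # cancel a job if needed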

Shared Node

Unlike on BlueRidge, compute nodes are not dedicated to a single job. NewRiver has more options for requesting resources so that the scheduler can place jobs optimally. Resources can be requested by specifying nodes and processors per node (nodes:ppn, as on BlueRidge), but also by cores, memory, GPUs, etc. See the example resource requests below:

Request 2 nodes with 24 cores each
#PBS -l nodes=2:ppn=24

Request 4 cores (on any number of nodes)
#PBS -l procs=4

Request 12 cores with 20gb memory per core
#PBS -l procs=12,pmem=20gb

Request 2 nodes with 24 cores each and 20 GB of memory per core (24 x 20 GB = 480 GB per node, so this will be placed on two 512 GB nodes)
#PBS -l nodes=2:ppn=24,pmem=20gb

Request 2 nodes with 24 cores per node and 1 gpu per node
#PBS -l nodes=2:ppn=24:gpus=1

Request 2 cores with 1 gpu each
#PBS -l procs=2,gpus=1
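
The same resource requests can also be given directly on the qsub command line. For example, to start an interactive session on dev_q for debugging (a sketch; substitute your own allocation name):

qsub -I -l walltime=1:00:00 -l nodes=1:ppn=4 -q dev_q -A <yourAllocation> -W group_list=newriver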

Examples

This shell script provides a template for submission of jobs on NewRiver. The comments in the script include notes about how to request resources, load modules, submit MPI jobs, etc.

To utilize this script template, create your own copy and edit as described here.

A simple script (with comments removed) for running on the P100 nodes is here. (Please make sure to change youraccount in the -A line to your allocation name.)
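
As an illustration of how the pieces above fit together, a minimal MPI job script might look like the following (the module names and the executable are placeholders; the linked templates above are the authoritative starting point):

#!/bin/bash
#PBS -l walltime=01:00:00
#PBS -l nodes=2:ppn=24
#PBS -q normal_q
#PBS -A <yourAllocation>
#PBS -W group_list=newriver

# Load a compiler and MPI stack (assumed module names; check module avail on NewRiver)
module purge
module load intel mvapich2

# Run from the directory the job was submitted from
cd $PBS_O_WORKDIR

# Launch one MPI process per allocated core
mpirun -np $(wc -l < $PBS_NODEFILE) ./your_mpi_program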