DragonsTooth is a 48-node system designed to support general batch HPC jobs. The table below lists the technical details of each DragonsTooth node. Nodes are connected to each other and to storage via 10 gigabit Ethernet (10GbE), a communication channel with high bandwidth but higher latency than InfiniBand (IB). As a result, DragonsTooth is best suited to jobs that need less internode communication (or none) and/or less I/O to and from storage than jobs run on NewRiver, which has similar nodes but a low-latency IB interconnect. To support I/O-intensive jobs, each DragonsTooth node is outfitted with nearly 2 TB of solid-state local disk. DragonsTooth was released to the Virginia Tech research community in August 2016.
|CPU||2 x Intel Xeon E5-2680v3 (Haswell) 2.5 GHz 12-core|
|Memory||256 GB 2133 MHz DDR4|
|Local Storage||4 x 480 GB SSD Drives|
|Theoretical Peak (DP)||806 GFlops/s|
Note: DragonsTooth is governed by an allocation manager, meaning that in order to run most jobs on it, you must be an authorized user of an allocation that has been submitted and approved. The open_q queue is available to jobs that are not charged to an allocation, but it has tight usage limits (see below for details) and so is best used for initial testing when preparing allocation requests. For more on allocations, click here.
As described above, communications between nodes and between a node and storage will have higher latency on DragonsTooth than on other ARC clusters. For this reason the queue structure is designed to allow more jobs and longer-running jobs than on other ARC clusters.
DragonsTooth has three queues:
- normal_q for production (research) runs.
- dev_q for short testing, debugging, and interactive sessions. dev_q provides slightly elevated job priority to facilitate code development and job testing prior to production runs.
- open_q for small jobs and for evaluating system features. open_q does not require an allocation; it can be used by new users or by researchers evaluating system performance for an allocation request.
The settings for the queues are:
|Queue||normal_q||dev_q||open_q|
|Max Jobs||288 per user, 432 per allocation||1 per user||1 per user|
|Max Nodes||12 per user, 18 per allocation||12 per user||4 per user|
|Max Cores||288 per user, 432 per allocation||288 per user||96 per user|
|Max Memory||3 TB per user, 4.5 TB per allocation||3 TB per user||256 GB per user|
|Max Walltime||30 days||2 hr||4 hr|
|Max Core-Hours*||34,560 per user, 51,840 per allocation||96 per user||192 per user|
- Shared node access: more than one job can run on a node
* A user cannot, at any one time, have more than this many core-hours allocated across all of their running jobs. So you can run long jobs or large/many jobs, but not both. For illustration, the following table describes how many nodes a user can allocate for a given amount of time:
|Walltime||Max Nodes (per user)||Max Nodes (per allocation)|
|72 hr (3 days)||12||18|
|144 hr (6 days)||10||15|
|360 hr (15 days)||4||6|
|720 hr (30 days)||2||3|
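The node counts in this table follow from dividing the core-hour cap by the core-hours one node consumes over the walltime, then applying the queue's node cap. A minimal bash sketch of that arithmetic is shown here; the function name `max_nodes` is hypothetical, and the default of 24 cores per node comes from the two 12-core CPUs listed in the node specifications.

```bash
#!/usr/bin/env bash
# Sketch: how many whole nodes fit under a core-hour cap for a given walltime.
# Usage: max_nodes WALLTIME_HOURS CORE_HOUR_LIMIT NODE_CAP [CORES_PER_NODE]
# (max_nodes is an illustrative helper, not an ARC-provided tool.)
max_nodes() {
  local walltime_hr=$1 core_hour_limit=$2 node_cap=$3 cores_per_node=${4:-24}
  # Integer division rounds down to whole nodes.
  local n=$(( core_hour_limit / (walltime_hr * cores_per_node) ))
  # The queue's Max Nodes setting still applies to short jobs.
  if (( n > node_cap )); then
    n=$node_cap
  fi
  echo "$n"
}

# Per-user limits of normal_q: 34,560 core-hours and 12 nodes.
for hours in 72 144 360 720; do
  printf '%4d hr: %d nodes\n' "$hours" "$(max_nodes "$hours" 34560 12)"
done
```

Passing `51840` and `18` as the limit and cap gives the per-allocation column instead.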
For a list of software available on DragonsTooth, as well as a comparison of software available on all ARC systems, click here.

Note that a user will have to load the appropriate module(s) in order to use a given software package on the cluster. The module avail and module spider commands can also be used to find software packages available on a given system.
The cluster is accessed via ssh to one of the login nodes below. Log in using your username (usually Virginia Tech PID) and password. You will need an SSH Client to log in; see here for information on how to obtain and use an SSH Client.
Access to all compute engines (aside from interactive nodes) is controlled via the job scheduler. See the Job Submission page here. The basic flags are:
#PBS -l walltime=dd:hh:mm:ss
#PBS -l [resource request, see below]
#PBS -q normal_q (or other queue, see Policies)
#PBS -A <yourAllocation> (see Policies)
#PBS -W group_list=dragonstooth
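Because the walltime flag uses a `dd:hh:mm:ss` layout, a run length expressed in hours has to be split into days and hours first. The bash sketch below does that for whole hours only; `to_walltime` is a hypothetical helper name, and the input is assumed to be a plain non-negative integer without leading zeros.

```bash
#!/usr/bin/env bash
# Sketch: convert a whole number of hours into the dd:hh:mm:ss walltime layout.
# Usage: to_walltime HOURS
# (to_walltime is an illustrative helper, not part of the scheduler.)
to_walltime() {
  local hours=$1
  # Days and leftover hours; minutes and seconds are fixed at zero.
  printf '%02d:%02d:00:00\n' $(( hours / 24 )) $(( hours % 24 ))
}

# A 36-hour job is one day plus twelve hours.
to_walltime 36
```

The result can be pasted after `walltime=` in the first directive.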
DragonsTooth compute nodes can be shared by multiple jobs. Resources can be requested by specifying the number of nodes, processes per node (ppn), cores, memory, etc. See example resource requests below:
Request 2 nodes with 24 cores each
#PBS -l nodes=2:ppn=24
Request 4 cores (on any number of nodes)
#PBS -l procs=4
Request 12 cores with 20gb memory per core
#PBS -l procs=12,pmem=20gb
This shell script provides a template for submission of jobs on DragonsTooth. The comments in the script include notes about how to request resources, load modules, submit MPI jobs, etc.
To utilize this script template, create your own copy and edit as described here.
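As a rough illustration of how the basic flags fit together (this is not the ARC template itself), the bash sketch below prints a job-script header from four parameters. The function name `pbs_header` and the allocation name `myAllocation` are placeholders, and skipping the `-A` line when no allocation is given is an assumption based on open_q not requiring one.

```bash
#!/usr/bin/env bash
# Sketch: print the basic PBS header for a DragonsTooth job script.
# Usage: pbs_header WALLTIME RESOURCES QUEUE [ALLOCATION]
# (pbs_header is an illustrative helper, not an ARC-provided tool.)
pbs_header() {
  local walltime=$1 resources=$2 queue=$3 allocation=${4:-}
  printf '#!/bin/bash\n'
  printf '#PBS -l walltime=%s\n' "$walltime"
  printf '#PBS -l %s\n' "$resources"
  printf '#PBS -q %s\n' "$queue"
  # Assumption: the -A line is left out when no allocation is supplied,
  # as might be the case for open_q jobs.
  if [ -n "$allocation" ]; then
    printf '#PBS -A %s\n' "$allocation"
  fi
  printf '#PBS -W group_list=dragonstooth\n'
}

# "myAllocation" is a placeholder, not a real allocation name.
pbs_header 00:02:00:00 nodes=2:ppn=24 normal_q myAllocation
```

Redirecting the output to a file and appending module loads and the program's run command gives a starting point that can be compared against the official template.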