digits-user-guide

DIGITS wraps several popular deep learning tools in an easy-to-use web interface. To open the DIGITS interface, first start an instance of the DIGITS server by submitting a batch job that launches digits-devserver on one of the compute nodes. The provided digits-devserver.sh script starts the DIGITS server on a compute node with 24 hours of walltime. Launch the job by typing:

sbatch digits-devserver.sh

Type squeue to identify which compute node the job is running on (shown in the NODELIST column). Once the server is running on the compute node, you can load DIGITS through an SSH tunnel or X11 forwarding.
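
When several jobs are queued, the full squeue listing can be noisy. The sketch below narrows it to the node list of your own running jobs. The function name digits_nodes is hypothetical, and the sketch assumes the DIGITS job is the only job you have running; otherwise it prints one line per running job.

```shell
# Hypothetical helper: print the node list of the current user's running jobs.
# -u limits output to one user, -h drops the header line,
# -t RUNNING keeps only running jobs, -o "%N" prints only the node list.
digits_nodes() {
    squeue -u "${USER:-$(id -un)}" -h -t RUNNING -o "%N"
}
```

Calling digits_nodes after the job has started should print the node name to use in the steps that follow.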

Via SSH tunnel

ssh -L 5000:hu001:5000 huckleberry1.arc.vt.edu -N
(This assumes the compute node running DIGITS is hu001. If squeue reports a different node, replace hu001 with that node name.)

Then point your browser to http://localhost:5000
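
Because the node name changes from job to job, it can help to generate the tunnel command from the node name rather than editing it by hand. The following is a minimal sketch; the function name digits_tunnel_cmd and the example node hu003 are hypothetical, and the port defaults to 5000 as in the command above.

```shell
# Hypothetical helper: print the SSH tunnel command for a given compute node.
# $1 = compute node name (as reported by squeue), $2 = port (defaults to 5000).
digits_tunnel_cmd() {
    node="$1"
    port="${2:-5000}"
    printf 'ssh -L %s:%s:%s huckleberry1.arc.vt.edu -N\n' "$port" "$node" "$port"
}

# Example with an assumed node name:
digits_tunnel_cmd hu003
# prints: ssh -L 5000:hu003:5000 huckleberry1.arc.vt.edu -N
```

The function only prints the command, so you can check it before running it on your local machine.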

Via X11 forwarding

Open an SSH connection with X11 forwarding enabled:
ssh -X huckleberry1.arc.vt.edu

To start Firefox from the login node, type:

firefox --no-remote &

If your job is running on compute node hu001, point your browser at http://hu001:5000 to open the DIGITS interface (if your job is running on another compute node, enter that node name instead of hu001). DIGITS essentially provides a portal to control the jobs that run on the compute node.

To train a basic model, a good starting point is the set of basic examples included in DIGITS. The input data has already been downloaded to the ARC filesystem. A local copy can be obtained by running:

tar xvzf /home/TRAINING/mnist.tar.gz
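
The command above unpacks the archive into the current directory. If you would rather keep the data in a dedicated directory, a sketch along the following lines may help. The function name extract_dataset and the destination $HOME/digits-data are hypothetical choices, not part of the ARC setup.

```shell
# Hypothetical helper: extract a gzip-compressed tar archive into a chosen directory.
# $1 = path to the archive, $2 = destination directory (created if missing).
extract_dataset() {
    archive="$1"
    dest="$2"
    mkdir -p "$dest" || return 1
    # -C switches to the destination directory before extracting.
    tar -xzf "$archive" -C "$dest"
}

# Example (assumed destination directory):
# extract_dataset /home/TRAINING/mnist.tar.gz "$HOME/digits-data"
```

Whichever directory you choose is the one to give DIGITS as the location of the training data.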

Once you have a local copy of the data, you can train a model by following the steps described at https://github.com/NVIDIA/DIGITS/blob/master/docs/GettingStarted.md