TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API. TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google’s Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.
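To make the data flow graph idea concrete, here is a toy sketch in plain Python. It is not how TensorFlow is implemented; the `Node` class and the `constant` and `evaluate` functions are hypothetical names used only to illustrate nodes as operations and edges as the values passed between them.

```python
class Node:
    """A graph node: an operation plus the nodes that feed it."""

    def __init__(self, op, inputs=()):
        self.op = op                # callable that performs the computation
        self.inputs = list(inputs)  # incoming edges (upstream nodes)


def constant(value):
    # A node with no inputs that always produces the same value.
    return Node(lambda: value)


def evaluate(node):
    # Evaluate upstream nodes first, then apply this node's operation
    # to the values that arrive along its incoming edges.
    args = [evaluate(n) for n in node.inputs]
    return node.op(*args)


a = constant([1.0, 2.0])
b = constant([3.0, 4.0])
# Element-wise addition, with plain lists standing in for tensors.
total = Node(lambda x, y: [xi + yi for xi, yi in zip(x, y)], [a, b])
result = evaluate(total)  # [4.0, 6.0]
```

Building the graph and evaluating it are separate steps here, which mirrors how TensorFlow 1.x defines operations first and runs them later in a session.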
The TensorFlow website is https://www.tensorflow.org/
- Getting started https://www.tensorflow.org/get_started/
- Tutorials https://www.tensorflow.org/tutorials/
Before you read this section, please review the Huckleberry user guide for details on how to create and manage jobs on Huckleberry.
To use TensorFlow, load the CUDA module (for GPU training) and source the TensorFlow package. A sample submission file would look like this:
#!/bin/bash
#SBATCH -J TFTestScript
#SBATCH -p normal_q
#SBATCH -N 1
#SBATCH -t 10:00
module load cuda
where HellowWorld.py contains Python code that uses the TensorFlow package. A hello world example would be:
from __future__ import print_function
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
# Start a TensorFlow session (TensorFlow 1.x API)
sess = tf.Session()
# Run the op in the session and print the result
print(sess.run(hello))
Note that some of the tutorials try to download data sets from the Internet, but the compute nodes do not have Internet access. Download any data you need from the login node before submitting your job, then modify your script to point to the local copy of the data set instead of downloading it.
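Pointing a script at a local copy could look like the minimal sketch below. The `TF_DATA_DIR` environment variable, the `~/datasets` default, and the helper name `local_dataset_path` are all hypothetical, chosen for illustration; they are not part of TensorFlow or of the Huckleberry setup.

```python
import os


def local_dataset_path(filename, data_dir=None):
    """Return the path to a data set file that was downloaded beforehand.

    Assumption: the file was fetched on the login node into `data_dir`
    (hypothetical default: $TF_DATA_DIR, falling back to ~/datasets).
    """
    if data_dir is None:
        data_dir = os.environ.get(
            "TF_DATA_DIR", os.path.join(os.path.expanduser("~"), "datasets")
        )
    path = os.path.join(data_dir, filename)
    if not os.path.isfile(path):
        # Fail early with a clear message instead of attempting a download,
        # which cannot succeed on a compute node without Internet access.
        raise FileNotFoundError(
            "%s not found; download it on the login node first" % path
        )
    return path
```

A training script would then pass the returned path to whatever loader it uses, in place of the tutorial's download call.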