MATLAB

Introduction

MATLAB handles a range of computing tasks in engineering and science, from data acquisition and analysis to application development. The MATLAB environment integrates mathematical computing, visualization, and a powerful technical language. It is especially well-suited to vectorized calculations and has a Parallel Computing Toolbox (not included in all licenses) that streamlines parallelization of code.

Availability

MATLAB is available on several ARC systems, but the primary one is Ithaca, which has a separate queue devoted solely to Parallel MATLAB and licenses for up to 224 parallel workers at a time. Departments can purchase a concurrent MATLAB license with the Parallel Computing Toolbox through the University's Departmental Software Distribution program. Students can purchase a license with the Parallel Computing Toolbox through the University's Student Software Distribution program (note that MATLAB is also included in some of the Student Bundles).

Interface

There are two types of environments in which the MATLAB application can be used:

Windows Interface

If you are using MATLAB on a Mac, on a Windows PC, or under the X Window System on UNIX, you can start MATLAB by double-clicking the MATLAB icon. When MATLAB starts, you will typically see a Command Window pane for entering MATLAB commands and a Current Folder pane for manipulating files. You may also see a Workspace pane listing the variables that have been defined and a Command History pane listing the commands that have been entered.

MATLAB commands are executed as soon as the Enter (Return) key is pressed. Multiple commands may be entered on one line by separating them with semicolons (;); a semicolon also suppresses the display of that command's result. See this MATLAB Reference (PDF) for examples of a variety of basic MATLAB commands. You can get help on a command by typing help <command> or via the Help menu.
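As a brief illustration of entering commands, the following sketch places several commands on one line and then asks for help on a built-in function. The variable names a, b, and c are arbitrary choices made for this example.

```matlab
% Three commands on one line, separated by semicolons.
% Each semicolon also suppresses the display of that command's result.
a = 3; b = 4; c = sqrt(a^2 + b^2);

% Omitting the trailing semicolon displays the value of c.
c

% Display the help text for a command (here, the built-in sqrt).
help sqrt
```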

Command Line Interface

You can also start MATLAB from the command line on Unix systems where MATLAB is installed. Note that the command line runs on the login node, so large jobs should not be run via the command line interface; they should instead be submitted through the queuing system, typically via remote batch submission.

To start MATLAB in this mode, first load the appropriate module and then enter matlab at a system prompt. You can also pass a MATLAB command using the -r flag: for example, matlab -r myscript starts MATLAB and runs the script found in myscript.m. After MATLAB loads, some messages will be displayed and the ">>" prompt will appear, indicating that MATLAB has started and is waiting for you to enter commands. When you are ready to exit MATLAB, simply type quit.
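For illustration, a script run with the -r flag might look like the sketch below. The contents of myscript.m are hypothetical and chosen only to give the script something small to compute. Note that MATLAB stays open after the script ends unless it is told to quit, which the second invocation in the comments does.

```matlab
% myscript.m (hypothetical contents, for illustration only)
%
% Run from a system prompt, after loading the MATLAB module, with:
%     matlab -r myscript
% or, to return to the system prompt when the script finishes:
%     matlab -r "myscript; quit"

x = linspace(0, 1, 11);                 % 11 evenly spaced points on [0, 1]
y = x.^2;                               % elementwise square
fprintf('Mean of y = %g\n', mean(y));   % print a one-line summary
```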

Parallel Computing in MATLAB

There are two primary means of obtaining parallelism in MATLAB:

  • parfor: Replacing a for loop with a parfor loop splits the loop iterations among a group of processors. This requires that the loop iterations be independent of each other.
  • spmd: Single program multiple data (spmd) allows multiple processors to execute a single program (similar to MPI).

Slides and example programs for both parfor and spmd are available in the Resources section.
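A minimal sketch of the two approaches follows; it is not taken from the slides or example programs. It assumes the Parallel Computing Toolbox is available. The names labindex and numlabs are those used by the R2013a-era releases referenced on this page, and newer releases may prefer different names. If no pool of workers is open, the loops may simply run on a single process.

```matlab
n = 100;

% Serial loop: sum of squares 1^2 + 2^2 + ... + n^2.
serial_total = 0;
for i = 1:n
    serial_total = serial_total + i^2;
end

% parfor version: the iterations are independent of each other, and
% parallel_total is a reduction variable, so MATLAB combines the
% partial sums computed by the workers.
parallel_total = 0;
parfor i = 1:n
    parallel_total = parallel_total + i^2;
end

% Both loops compute the same value.
assert(serial_total == parallel_total);

% spmd version: every worker runs the same block on its own share of
% the data, identified by its index (labindex) out of numlabs workers.
spmd
    my_share = labindex:numlabs:n;      % strided share of 1..n
    local_total = sum(my_share.^2);     % this worker's partial sum
    fprintf('Worker %d of %d: partial sum = %d\n', ...
            labindex, numlabs, local_total);
end
```

The parfor form suits loops whose iterations do not depend on one another; spmd is the better fit when each worker needs to know its own identity and manage its own portion of the data.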

Submitting Remote Batch Jobs

In order to run large jobs on ARC's systems (e.g. Ithaca), you will need to submit your job to that system's queue. This section describes how to submit a remote batch job from your MATLAB desktop application.

Note: In order for this submission to work, you will need to create an Ithaca profile on your machine.

Batch jobs are submitted via the batch command. They can be submitted to the local profile, which runs the job on the desktop where MATLAB is running, or via another profile that points to a remote machine (e.g. Ithaca). The batch command is called as follows:

For the local profile:

job = batch ( 'script_name', 'Profile', 'local', 'Matlabpool', n );
Where 'script_name' is the name of the script (a file listing a series of MATLAB commands) that you want MATLAB to execute and n is the number of parallel workers that you want to use (up to 12 for the local profile).

For a remote profile:

job = batch ( 'script_name', 'Profile', 'profile_name', 'AttachedFiles', { 'function1', 'function2' }, 'CurrentFolder', 'remote_dir', 'Matlabpool', n );
Where:

  • 'script_name' is the name of the script (a file listing a series of MATLAB commands) that you want MATLAB to execute.
  • 'function1' and 'function2' are the names of any other functions that are needed for your script to execute.
  • 'profile_name' is the name of the cluster profile (remote configuration) that you want to use (e.g. 'ithaca_R2013a').
  • 'remote_dir' indicates the remote folder in which the script should execute. A default for this is set in your Ithaca profile when it is set up, so this can simply be set to '.' if you want to use that default directory.
  • n is the size of the worker pool, which is one less than the total number of workers used, because one additional worker runs the script itself. So if you request 15 in your batch command, MATLAB will actually run with 16 workers. See Ithaca's policies for guidance on how many workers to request. More workers will (presumably) make your script run faster but may also force it to wait in the queue longer before it can execute.
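The worker arithmetic described in the last item can be sketched as follows. The variable names are hypothetical; only the relationship between the two numbers comes from the text above.

```matlab
% Total number of workers you want the job to occupy (example value).
total_workers = 16;

% Value to pass as the 'Matlabpool' argument: one worker is used to
% run the script itself, so the pool is one smaller than the total.
pool_size = total_workers - 1;

fprintf('Pass ''Matlabpool'', %d to run on %d workers in total.\n', ...
        pool_size, total_workers);
```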

The batch command has many different potential parameters; use the command help batch to get a full list of possibilities and a description of each.

Note: You may also want to adjust the walltime of your job before submitting to Ithaca (by default, jobs have a walltime of one hour). To do so, add the following command prior to calling the batch command (adjust the walltime as appropriate - this example uses 12 hours):

    PBSClusterInfo.setExtraParameter('-l walltime=12:00:00');

Once batch has been called, there are a number of other commands that can follow:

  • get (e.g. get(job,'State')) gets the status of the job (e.g. queued, running, or finished). Alternatively, you can use wait (e.g. wait(job)) to pause the MATLAB session until the job completes.
  • diary (e.g. diary(job)) displays any messages printed during execution. (This is not available if you set the 'CaptureDiary' batch parameter to false. The default value is true.)
  • load (e.g. load(job)) makes the script's workspace available. So, for example, if your script creates the variable result describing the result of the run, load will make that variable available in your MATLAB workspace, and typing result will then cause MATLAB to print it. You can also load just a single output variable by specifying its name: load(job, 'total'). (Called with an output argument, as in s = load(job, 'total'), load instead returns a structure with a field named total.)
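Putting these commands together, a complete submit-and-retrieve session might look like the sketch below. It assumes a script named my_script.m (a hypothetical name) that creates a variable called result, and it uses the local profile so that it can be tried without a cluster. The 'Matlabpool' argument name follows the R2013a-era usage shown on this page; newer MATLAB releases may expect a different argument name.

```matlab
% Submit the hypothetical script my_script.m to the local profile with
% a pool of 3, so that 4 workers are used in total.
job = batch('my_script', 'Profile', 'local', 'Matlabpool', 3);

% Pause this MATLAB session until the job completes.
wait(job);

% Check how the job ended (e.g. 'finished').
state = get(job, 'State');
fprintf('Job state: %s\n', state);

% Show any messages the script printed while it ran.
diary(job);

% Copy only the variable result from the job into this workspace.
load(job, 'result');
disp(result);
```

To submit the same script to Ithaca instead, replace 'local' with the name of your Ithaca profile and add the 'AttachedFiles' and 'CurrentFolder' arguments described above.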

There are a few examples of some simple batch scripts in the Examples section below. It is recommended that you test one or more of them before you try to create your own.

Checking Remote Jobs

If you do not want to lose your command prompt while your job is running (i.e. you don't want to use the wait command above) or you need to shut down your MATLAB desktop before your job is complete, you can recover your job information using the following steps:

  1. Get a cluster/scheduler object: pc = parcluster('ithaca_R2013a');
  2. Get a list of jobs submitted to that cluster, including the size and status: jobs = findJob(pc)
  3. Pick out a particular job: job = jobs(3)
  4. You can then get information about a given job using that index. For example, to get the diary for job, use diary(job). Similarly, to load the workspace for the third job, use load(jobs(3)).
  5. You can also get information about the tasks in a given job: tasks = findTask(jobs(3))
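The steps above can be combined into a short sketch that lists every job on the cluster and then displays the output of those that have finished. The profile name is the one used in the steps above and should be replaced with the name of your own Ithaca profile; running this requires that profile to be set up.

```matlab
% Get a cluster object for the (assumed) Ithaca profile.
pc = parcluster('ithaca_R2013a');

% Retrieve all jobs that this cluster object knows about.
jobs = findJob(pc);

% Print the index, ID, and state of each job.
for k = 1:numel(jobs)
    fprintf('jobs(%d): ID = %d, State = %s\n', ...
            k, jobs(k).ID, jobs(k).State);
end

% Display the diary of every job that has finished.
for k = 1:numel(jobs)
    if strcmp(jobs(k).State, 'finished')
        diary(jobs(k));
    end
end
```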

Examples

  • This simple example runs a batch script to count the prime numbers between 1 and 10,000,000. (The correct answer is 664,579.) It is recommended that you test this script before you try to create your own. To run it:
    1. Download the zip file and unzip it somewhere on your machine (where MATLAB is running).
    2. Open MATLAB and set the current folder (in the box at the top of the screen) to the location where you unzipped the files. You should see the files (prime_fun.m, prime_batch.m, etc.) appear in the Current Folder pane. If the Current Folder pane is not open, open it by checking Current Folder in the Desktop menu.
    3. Run a local batch job:
      1. Double-click on prime_batch_local.m to view its contents. Notice that the batch command on line 28 uses the local configuration and that it calls the script file prime_script.m, which in turn calls the function prime_fun.m with the parameter 10,000,000.
      2. Switch back to the command window and run the script by either typing run prime_batch_local at the command line or right-clicking on prime_batch_local.m and selecting run. (Or, with prime_batch_local.m open in the editor, you can select Run in the Debug menu or simply hit F5.)
      3. The script should print some messages describing what it's doing (e.g. PRIME_BATCH_LOCAL Run PRIME_SCRIPT locally.) and then print the result: Total number of primes = 664579
    4. Run a remote batch job on Ithaca. Note that you will need to have an Ithaca configuration set up on your computer for this step to work:
      1. Double-click on prime_batch_ithaca.m to view its contents.
      2. In the batch command on line 36, change the configuration name 'ithaca_R2013a' to the name of your Ithaca configuration. (To check the name of your configurations, go to the command window and select Manage Cluster Profiles under the Parallel menu. You should see an Ithaca configuration in addition to the 'local' configuration. If you do not see an Ithaca configuration, click here to set one up on your computer.)
      3. Save the file.
      4. Switch back to the command window and run the script by either typing run prime_batch_ithaca at the command line or right-clicking on prime_batch_ithaca.m and selecting run. (Or, with prime_batch_ithaca.m open in the editor, you can select Run in the Debug menu or simply hit F5.)
      5. The script should print some messages describing what it's doing (e.g. PRIME_BATCH_ITHACA Run PRIME_SCRIPT on Ithaca.) and then print the result: Total number of primes = 664579
  • This simple example submits a Parallel MATLAB job using the local configuration (up to 7 workers) to the normal Ithaca queue. The job will run like any other job submitted to the general Ithaca queue; for more information, click here. To run this example:
    1. Unzip the files and put them in a folder on Ithaca. See here for information on how to transfer files to and from ARC's systems.
    2. Log in to Ithaca and navigate to the folder where you saved the files.
    3. Run it using the following command: qsub ./prime_qsub.sh (since prime_qsub.sh is the name of the submission script)
      1. This runs the submission script prime_qsub.sh.
      2. prime_qsub.sh calls the MATLAB script prime_batch_local.m.
      3. prime_batch_local.m calls the MATLAB script prime_script.m.
      4. prime_script.m calls the MATLAB function prime_fun.m with the parameter 10,000,000.
    4. Once the run is complete, the MATLAB output is printed to the file "prime_qsub.sh.o#####", where "#####" is your job number. An example: prime_qsub.sh.o43794.
    5. For more on submitting jobs to the Ithaca scheduler, click here.
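For readers who want a sense of what a parallel prime count involves before downloading the examples, here is a minimal sketch. It is not the prime_fun.m file from the zip archive, whose contents are not reproduced on this page; the function name prime_count_sketch is hypothetical, and the approach (testing each integer with the built-in isprime inside a parfor loop) is simply one straightforward way to do it.

```matlab
function total = prime_count_sketch(n)
% PRIME_COUNT_SKETCH  Count the primes between 1 and n.
%   Hypothetical illustration only; not the prime_fun.m used in the
%   examples above.

total = 0;
parfor i = 2:n
    % isprime returns true (1) or false (0), so adding it to the
    % reduction variable total counts the primes.
    total = total + isprime(i);
end
end
```

Calling prime_count_sketch(100) returns 25. According to the first example above, a count up to 10,000,000 should give 664,579, although testing one integer at a time in this way will be slow for an n that large.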

Resources

The following resources should be helpful to users trying to get started with MATLAB or Parallel MATLAB. Please also see our tutorial for configuring desktop MATLAB for remote batch submission to Ithaca.

  • There is a Virginia Tech MATLAB listserv used to distribute information about workshops, special events, and other issues affecting MATLAB users. To subscribe, send an email to listserv@listserv.vt.edu. The body of the message should simply be: subscribe mathworks firstname lastname
  • PARFOR and SPMD slides and examples from the Spring 2014 NLI (FDI) MATLAB Course offered by the Interdisciplinary Center for Applied Mathematics (ICAM) and ARC are posted here.
  • Slides and examples from Dr. Gene Cliff's (of ICAM) Spring 2013 FDI course on MATLAB-based Optimization:
  • MathWorks provides a number of excellent tutorials on MATLAB here. Their Getting Started with MATLAB video provides an excellent introduction to using MATLAB in just over five minutes. Their Parallel and GPU computing tutorials and sample codes were updated with MATLAB version 2014a. See also their webinars for more in-depth looks at more advanced topics.
  • A simple PDF reference sheet for a variety of MATLAB commands: Matlab Reference
  • FDI also provides a self-paced MATLAB course here.
  • MathWorks provides some MATLAB resources via Scholar here.
