Matlab
Contents
 Summary and Version Information
 Running a MATLAB script from the command line
 MATLAB and HPC
 Builtin multithreaded functions
 MATLAB Parallel Computing Toolbox
 MATLAB Distributed Computing Server
Summary and Version Information
Package: Matlab
Description: Matlab
Categories: Numerical Analysis

Version | Module tag | Availability* | GPU Ready
2009b | matlab/2009b | Non-HPC Glue systems (All OSes) | Y
2010b | matlab/2010b | Non-HPC Glue systems; Evergreen HPCC (Linux) | Y
2011a | matlab/2011a | Non-HPC Glue systems; Bswift HPCC (Linux) | Y
2011b | matlab/2011b | Non-HPC Glue systems; Evergreen HPCC, Bswift HPCC (Linux) | Y
2012b | matlab/2012b | Non-HPC Glue systems; Bswift HPCC (Linux) | Y
2013b | matlab/2013b | Non-HPC Glue systems (RedHat 6) | Y
2014a | matlab/2014a | Non-HPC Glue systems (RedHat 6) | Y
2014b | matlab/2014b | Non-HPC Glue systems; Deepthought HPCC, Bswift HPCC, Deepthought2 HPCC (RedHat 6) | Y
2015a | matlab/2015a | Non-HPC Glue systems (RedHat 6) | Y
2015b | matlab/2015b | Non-HPC Glue systems; Deepthought HPCC, Deepthought2 HPCC (RedHat 6) | Y
2016a | matlab/2016a | Non-HPC Glue systems; Deepthought HPCC, Deepthought2 HPCC (RedHat 6) | Y
2016b | matlab/2016b | Non-HPC Glue systems; Deepthought HPCC, Deepthought2 HPCC (RedHat 6) | Y
2017a | matlab/2017a | Non-HPC Glue systems; Deepthought HPCC, Deepthought2 HPCC (RedHat 6) | Y
Notes:
^{*}: A package labelled as "available" on an HPC cluster can be used on the compute nodes of that cluster. Even software not listed as available on an HPC cluster is generally available on the login nodes of the cluster (assuming it is available for the appropriate OS version, e.g. RedHat Linux 6 for the two Deepthought clusters). This is because the compute nodes do not use AFS and instead have local copies of the AFS software tree, onto which we only install packages as requested. Contact us if you need a version listed as not available on one of the clusters.
In general, you need to prepare your Unix environment to be able to use this software. To do this, run either:

tap TAPFOO

or

module load MODFOO

where TAPFOO and MODFOO are one of the tags in the tap and module columns above, respectively. The tap command will print a short usage text (use the -q option to suppress this; this is needed in startup dot files); you can get a similar text with module help MODFOO. See the documentation on the tap and module commands for more information.
For packages which are libraries which other codes get built against, see the section on compiling codes for more help.
Tap/module commands listed with a version of "current" will set up what we consider the most current stable and tested version of the package installed on the system. The exact version is subject to change with little if any notice, and might be platform dependent. Versions labelled "new" represent a newer version of the package which is still being tested by users; if stability is not a primary concern, you are encouraged to use it. Versions listed as "old" set up an older version of the package; you should only use these if the newer versions are causing issues. Old versions may be dropped after a while. Again, the exact versions are subject to change with little if any notice.
In general, you can abbreviate the module tags. If no version is given, the default current version is used. For packages with compiler/MPI/etc. dependencies, if a compiler module or MPI library was previously loaded, the module command will try to load the build of the package matching that compiler/MPI combination. If you specify the compiler/MPI dependency explicitly, it will attempt to load the corresponding compiler/MPI library for you if needed.
Running a MATLAB script from the command line
While most people use MATLAB interactively, there are times when you might wish to run a MATLAB script from the command line, or from within a shell script. Usually in this situation you have a file containing MATLAB commands, one command per line, and you want to start MATLAB, run the commands in that file, and save the output to another file, all without the MATLAB GUI starting up (often the process will be running somewhere without a screen readily available to display the GUI).

If you are running Matlab jobs on one of the Deepthought high-performance computing clusters, please include a

#SBATCH -L matlab

directive near the top of your job script. This is because we have been having issues with HPC users depleting the campus Matlab license pool. The above directive asks Slurm for a matlab license, which will be used to throttle the number of simultaneous Matlab jobs running on the clusters. If all the matlab users on the cluster abide by this policy, hopefully there will be no more issues with license depletion. If such an issue occurs, we will regrettably have to kill some matlab jobs (starting with those that did NOT request a license) to free up licenses. We are hoping in the next several months to obtain a truly unlimited matlab license on campus, but until then we ask that HPC users include the above directive in their matlab jobs.

This can be broken down into several distinct parts:
 Get MATLAB to run without the GUI, etc.
 Get MATLAB to start running your script, and exit when your script is done.
 Get the output of the MATLAB command saved to a file.
The first part is handled with the following options passed to the MATLAB command: -nodisplay and -nosplash. The first disables the GUI; the latter disables the MATLAB splash screen that gets displayed before the GUI starts up.
The second step is handled using the -r option, which specifies a command which MATLAB should run when it starts up. You can give it any valid MATLAB command, but typically you just want to tell it to read commands from your file, and then tell it to exit; otherwise it will just sit at the prompt waiting for additional commands. One reason to keep the command simple is that the command string has to be quoted to keep the Unix shell from interpreting it, and that can get tricky for complicated commands.
Typically, you would give an argument like

matlab -r "run('./myscript.m'); exit"

(and you would include the -nodisplay and -nosplash arguments before the -r if you wanted to disable the GUI as well), where myscript.m is your script file, located in the current working directory. The exit causes MATLAB to exit once the script completes.
The third part is handled with standard Unix file redirection.
Putting it all together, if you had a script myscript.m in the directory ~/mymatlabstuff, and you want to run it from a shell script putting the output in myscript.out in the same directory, you could do something like

#!/bin/tcsh
module load matlab
cd ~/mymatlabstuff
matlab -nodisplay -nosplash -r "run('./myscript.m'); exit" > ./myscript.out
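Since the matlab binary is only available on cluster or Glue systems, here is a minimal sketch of the same invocation pattern with echo standing in for matlab, just to show how the quoting and redirection behave (myscript.m and myscript.out are the same hypothetical names used above):

```shell
#!/bin/sh
# Stand-in sketch: "echo" plays the role of the matlab binary so the
# quoting/redirection pattern can be tested anywhere; on a cluster this
# line would start with just "matlab".
MATLAB="echo matlab"
$MATLAB -nodisplay -nosplash -r "run('./myscript.m'); exit" > ./myscript.out
# The whole -r argument survives as a single word thanks to the quotes;
# this prints the command line matlab would have received:
cat ./myscript.out
```

Note that without the double quotes, the shell would split the -r argument at the space and treat the semicolon as a command separator.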
MATLAB and HPC
Mathworks currently provides two products to help with parallelization:
 Parallel Computing Toolbox (PCT): This provides support for parallel for loops (the parfor command), as well as some CUDA support for using GPUs. However, without the MATLAB Distributed Computing Server, there are limits on the number of workers that can be created, and all workers must be on the same node.
 MATLAB Distributed Computing Server (MDCS): This extends MATLAB desktop workflows to the cluster hardware, and allows you to submit MATLAB jobs to the cluster without having to learn anything about the cluster command line interface.
In addition, some of the builtin linear algebra and numerical functions are multithreaded as well.

As noted above, if you are running Matlab jobs on one of the Deepthought high-performance computing clusters, please include a

#SBATCH -L matlab

directive near the top of your job script, so that Slurm can throttle the number of simultaneous Matlab jobs on the clusters and avoid depleting the campus Matlab license pool. (This is NOT needed for Matlab MDCS jobs.)

Builtin multithreaded functions
A number of the Matlab builtin functions, especially linear algebra and numerical functions, are multithreaded and will automatically parallelize in that way.
This parallelization is shared memory, via threads, and so is restricted to within a single compute node. So normally your job submission scripts should explicitly specify that you want all your cores on a single node.
For example, if your matlab code is in the file myjob.m, you might use a job submission script like:

#!/bin/bash
#SBATCH -t 2:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --mem-per-cpu=1024
#SBATCH -L matlab

. ~/.profile
module load matlab
matlab -nodisplay -nosplash -r "run('myjob.m'); exit" > myjob.out
and your matlab script should contain the line
maxNumCompThreads(12);
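Rather than hard-coding the 12 in both the job script and the matlab script, you could derive the thread count from Slurm's environment inside the job script; a minimal sketch (SLURM_NTASKS is set by Slurm inside a job to the value of #SBATCH -n, and the default of 1 lets the snippet run elsewhere; echo stands in for actually launching matlab):

```shell
#!/bin/sh
# Derive the thread count from Slurm so it stays in sync with #SBATCH -n.
# SLURM_NTASKS is set by Slurm inside a job; default to 1 outside one.
NCPUS="${SLURM_NTASKS:-1}"
MATLAB_CMD="maxNumCompThreads(${NCPUS}); run('myjob.m'); exit"
# On the cluster you would run matlab directly with this -r argument;
# here we just display the command line that would be executed:
echo matlab -nodisplay -nosplash -r "${MATLAB_CMD}"
```

With this approach, changing #SBATCH -n in one place is enough; there is no second constant to forget to update in the matlab script.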
MATLAB Parallel Computing Toolbox
The MATLAB Parallel Computing Toolbox allows you to parallelize your MATLAB jobs, to take advantage of multiple CPUs on either your desktop or on an HPC cluster. This toolbox provides parallel-optimized builtin MATLAB functions, including the parfor parallel loop command.
A simple example matlab script would be

% Allocate a pool of workers.
% We use the default pool, which will consist of all cores on your current
% node (up to 12 for MATLAB versions before R2014a)
parpool
% For MATLAB versions before R2013b, use "matlabpool open" instead
% Preallocate a vector
A = zeros(1,100000);
xfactor = 1/100;
% Assign values in a parallel for loop
parfor i = 1:length(A)
    A(i) = xfactor*i*sin(xfactor*i);
end
Assuming the above MATLAB script is in a file ptest1.m in the directory /lustre/payerle/matlabtests, we can submit it with the following script to sbatch:
#!/bin/tcsh
#SBATCH -n 20
#SBATCH -N 1
#SBATCH -L matlab

module load matlab
matlab -nodisplay -nosplash \
  -r "run('/lustre/payerle/matlabtests/ptest1.m'); exit" \
  > /lustre/payerle/matlabtests/ptest1.out
You would probably want to add directives to specify other job submission parameters, such as the walltime and memory requirements of the job.
NOTE: It is important that you specify a single node in all of the above, as without using Matlab Distributed Computing Server the parallelization above is restricted to a single node.
MATLAB Distributed Computing Server
The MATLAB Distributed Computing Server (MDCS) allows you to extend your MATLAB workflows from your desktop to an HPC cluster without having to learn the details of submitting jobs to the cluster.
The initial documentation from the Mathworks consultant is below:
 MDCS at UMD: A one-page overview of MDCS.
 Getting Started with Serial and Parallel MATLAB on Deepthought2: Instructions on how to set up your workstation to submit MATLAB jobs to Deepthought2.
Before using the Matlab Distributed Computing Server (Matlab DCS) for the first time on your computer, you will need to perform the following steps. These steps are needed once on each computer from which you plan to run Matlab and submit jobs via Matlab DCS to the Deepthought2 cluster, and must be repeated on a computer if you intend to use Matlab DCS with a new version of Matlab. It should only be necessary to do this once per system/Matlab version; however, it should not hurt anything to repeat the process.
 Most of the configuration is contained in one of these two files:
 umd_deepthought2.tar.gz Required support files in a tarball
 umd_deepthought2.zip Required support files as a zipfile
 Determine the userpath directory for Matlab on your workstation. To do this, run the userpath command in Matlab. Typically, this will be one of My Documents/MATLAB or Documents/MATLAB on Windows systems, or ~/Documents/MATLAB or $matlab/toolbox/local on Linux systems.
 Untar/unzip the tarball/zipfile downloaded above and place the contents in the userpath directory determined above.
 You will also need a profile settings file. You need to select the one that matches the version of Matlab running on your workstation. You can download/install multiple settings files if desired (which might be useful if you run different versions of Matlab on the same system). Depending on your browser, you probably need to do something like right-click and "Save link as ..." to save these as a file. If your version of Matlab is not listed, you can contact system staff to see if Matlab DCS will work with that version of Matlab.
 Copy the settings file from above (deepthought2_remote_MATLAB_VERSION.settings) into the userpath directory as obtained previously. You can have multiple settings files in that directory.
Using Matlab DCS with the DT2 Cluster
The following is a quick guide to using Matlab DCS to submit jobs to the DT2 cluster.
 To start, from the matlab prompt, run the command configCluster. This will do some basic setup, ask for your username on the Deepthought2 cluster, and default jobs to running on the cluster instead of locally. If multiple profile settings for the version of Matlab you are running on your workstation are found, you will be prompted to select one. If no profile settings for the version of Matlab you are running are found, it will offer you a list of what was found, but chances are they will not work; go to the previous section, download a cluster profile settings file for the correct Matlab version, and try again.
 You now need to define a "cluster" to submit jobs to. This holds the information about the parallel workers, etc. For most cases, it will suffice to enter a command like:

>> c = parcluster;

You can choose whatever variable you like instead of c, but if so be sure to change it in the following examples as well.
 You can then create and submit jobs to be run on the remote cluster.
The following is a simple example:
>> j = c.batch(@pwd, 1, {});

additionalSubmitArgs =

--ntasks=1 --licenses=mdcs:1

>> j.wait
>> j.fetchOutputs{:}

ans =

/a/fs3/export/home/deepthought2/mltrain

>> j.delete
The variable j holds the "job"; you can use whatever variable you like. In this case, the "job" is created when we create a batch job on our parcluster c. For this example, we are simply running the builtin pwd command; in most cases you would instead give a string with the name of a user-defined function (e.g. the name of a "*.m" file without the ".m" extension). The 1 in the batch command means that the function is expected to return 1 argument. The braces {} contain a list of input values for the function; in this case, pwd does not take input arguments, so we do not provide any.

The submission scripts will print the additionalSubmitArgs string. These are the arguments that will be provided to the Slurm sbatch command; the web documentation on submitting jobs has more information. As you gain experience with the system, you may wish to examine this string to ensure that the job is being submitted correctly.

The first time you submit a job to Deepthought2 in a particular Matlab session, a popup message will be displayed asking if you wish to "Use an identity file to login to login.deepthought2.umd.edu?". If you answer "No", you will be prompted for your password on the Deepthought2 cluster; this is the recommended response for new users. Answering "Yes" requires one to set up RSA public key authentication on the Deepthought2 login nodes; you will be prompted to provide the location of the identity file and asked if the file requires a passphrase. In either case, Matlab will remember this information (your password, or the location and/or passphrase of the identity file) for the remainder of your Matlab session.
When you issue the batch command, a job is submitted to the scheduler to run on the Deepthought2 compute nodes. Depending on how busy the cluster is, the job might or might not start immediately, and even if it starts immediately, it will generally (except in overly simple test cases such as this) take a while to run. The j.wait will not return until the job is completed. You might instead wish to use the c.Jobs command to see the state of all of your jobs. Although you can submit other jobs (be sure to store the jobs in different variables) and perform other calculations while your job(s) are pending/running, you cannot examine their output until they complete.

To examine the results of a job (after it has completed), you can use the j.fetchOutputs{:} function as shown in the example. In the above example, you can see that it returned the path to the home directory of the Matlab test login account that it was run from. If the job does not finish successfully, you probably will not be able to get anything useful from the fetchOutputs function. In such cases, you should look at the error logs (which can be lengthy) using the getDebugLog function. There are separate logs for each worker in the job, so you will need to do something like:

>> j.Parent.getDebugLog(j.Tasks(1))
Note: The fetchOutputs function will only return the values returned from the function you called; data which has been written to files will not be returned. For such data, you will need to manually log into the Deepthought2 cluster to retrieve the information.

The above example is unrealistically simple. In practice, you will generally need to set some more job parameters; although Matlab DCS hides some of the complexity of submitting jobs to an HPC cluster from the user, it cannot hide all of it. In general, the settings for your job will be obtained from the ClusterInfo object in Matlab. You can use the command ClusterInfo.state() to see all of the current settings, and in general the commands ClusterInfo.getFOO() and ClusterInfo.setFOO(VALUE) can be used to query the value of a particular setting FOO, or set it to VALUE. Notable fields are:
 WallTime: this sets the maximum wall time for the job. If not set, the default is 15 minutes, which is probably too short for real jobs. This can be given using one of the following formats:
  MINUTES
  DAYS-HOURS:MINUTES
  DAYS-HOURS
  HOURS:MINUTES:SECONDS
 MemUsage: this sets the memory per CPU core/task to be reserved. This should be given as a number of MB per core.
 ProjectName: this specifies the allocation account to which the job will be charged. Your default allocation account will be charged if none is specified.
 QueueName: this specifies the partition the job should run on. Normally you will not wish to set this unless you wish to run on the debug or scavenger partitions.
 UseGpu: If you wish for your job to use GPUs, you should set this to the number of GPUs to use. That will cause Slurm to schedule your job on a node with GPUs; additional work may be needed to get Matlab to actually use the GPUs.
 EmailAddress: if set, it will cause Slurm to send email to the address provided on all job state changes. The default is not to send any email.
 Reservation: if set, the job will use the specified reservation.
 UserDefinedOptions: This is a catch-all for any other options you need to provide to Slurm for your job. You should just present sbatch flags as you would on the command line. E.g., to specify that you wish to allow other jobs to run on the same node as your job, you can provide the value --share. You can provide multiple Slurm arguments in this string by just putting spaces between the arguments.
The following example shows how to set a walltime of 4 hours and request 4 GB/core (4096 MB/core):
>> ClusterInfo.setWallTime('4:00:00')
>> ClusterInfo.setMemUsage('4096')
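For orientation, these two ClusterInfo settings correspond roughly to sbatch options on the command line; a hypothetical sketch (the WallTime -> -t and MemUsage -> --mem-per-cpu mapping is an assumption based on the descriptions above, "job.sh" is a made-up script name, and echo is used so the snippet can run off-cluster):

```shell
#!/bin/sh
# Hypothetical sketch: rough sbatch equivalent of the ClusterInfo
# settings above (assumed mapping: WallTime -> -t, MemUsage ->
# --mem-per-cpu). "job.sh" is a made-up job script name; we only
# display the command line rather than invoking sbatch.
WALLTIME="4:00:00"        # ClusterInfo.setWallTime('4:00:00')
MEM_PER_CORE="4096"       # ClusterInfo.setMemUsage('4096'), MB per core
CMD="sbatch -t $WALLTIME --mem-per-cpu=$MEM_PER_CORE -L matlab job.sh"
echo "$CMD"
```

Seeing the equivalent sbatch flags can also help when comparing against the additionalSubmitArgs string that Matlab DCS prints at submission time.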