A guide to using SLURM to run GPU jobs on pat
We now have a queueing system set up for use with some of our GPU machines. The head node is pat.ch.private.cam.ac.uk and all jobs should be submitted to the queue from there. More detailed information on setup and usage can be found at: http://www.ch.cam.ac.uk/computing/abc-cluster
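For example, to log in to the head node ('myusername' is a placeholder for your own user ID):

ssh myusername@pat.ch.private.cam.ac.uk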
Cards
Currently, pat has 15 GeForce GTX TITAN Black GPUs, 12 Tesla K20m GPUs and 16 GeForce GTX 980 GPUs (Maxwell architecture) on its nodes. (As of 19/12/16, only 8 of the GTX 980s are available, as the racks are being used for extra Titan Black GPUs bought from eBay.) The Titan and Tesla cards should only be used for applications which require double precision, such as CUDAGMIN and CUDAOPTIM. The Maxwell cards are designed for single precision applications, such as AMBER's pmemd.cuda in its default mode. Please do not use the double precision cards for running AMBER.
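In practice, this means matching the --constraint line in your submission script to the precision your application needs. The constraint names below are the ones used in the full example scripts later in this guide:

#SBATCH --constraint=titanblack   # double precision, e.g. CUDAGMIN or CUDAOPTIM
#SBATCH --constraint=teslak20     # double precision, Tesla K20m
#SBATCH --constraint=maxwell      # single precision, e.g. AMBER's pmemd.cuda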
Queueing system
The queueing system on pat is SLURM. Detailed information on using SLURM can be found in the official SLURM documentation. The current maximum walltime is seven days. As on sinister, jobs should be run on local /scratch on the nodes, rather than on the NFS-mounted /home and /sharedscratch. The progress of your job can be viewed by sshing into the appropriate node.
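For example, to check on a running job ('node1' is a placeholder; use the name shown in the NODELIST column of squeue):

squeue -u $USER             # find your job ID and the node it is running on
ssh node1                   # log in to that node
ls /scratch/$USER/<jobid>   # job files are kept in local scratch, named after the job ID (see the scripts below)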
Example SLURM submission script for a job running on a single GPU
The following example script can be submitted to the queue by typing 'sbatch scriptname' at the terminal.
#!/bin/bash

# Request 1 TITAN Black GPU - use '--constraint=teslak20' for a Tesla or
# '--constraint=maxwell' to request a Maxwell GPU for single precision runs
#SBATCH --constraint=titanblack
#SBATCH --job-name=mytestjob
#SBATCH --gres=gpu:1
#SBATCH --mail-type=FAIL

hostname
echo "Time: `date`"

source /etc/profile.d/modules.sh

# Load the appropriate compiler modules on the node - should be the same as
# those used to compile the executable on pat
module add cuda/7.0
module add icc/64/2015/3/187
module add anaconda/python2/2.2.0 # Needed for python networkx module - must be python 2, not 3

# Set the GPU to exclusive process mode
sudo nvidia-smi -i $CUDA_VISIBLE_DEVICES -c 3

# Make a temporary directory on the node, copy job files there and change to that directory
TMP=/scratch/$USER/$SLURM_JOB_ID
mkdir -p $TMP
cp ${SLURM_SUBMIT_DIR}/{atomgroups,coordsinirigid,coords.inpcrd,coords.prmtop,data,min.in,rbodyconfig} $TMP
cd $TMP

# Run the executable in the local node scratch directory
/home/$USER/svn/GMIN/build/CUDAGMIN

# Copy all files back to the original submission directory
cp * $SLURM_SUBMIT_DIR
STATUS=$?
echo "$STATUS"
if [ $STATUS == 0 ]; then
    echo "No error in cp"
    cd $SLURM_SUBMIT_DIR
    rm -rf $TMP
fi

echo Finished at `date`
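Assuming the script above is saved as myjob.sh (a placeholder name), a typical session might look like:

sbatch myjob.sh   # prints 'Submitted batch job <jobid>'
squeue -u $USER   # check whether the job is pending or running
scancel <jobid>   # cancel the job if necessary, using the ID reported by sbatch/squeue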
Example SLURM submission script for a PATHSAMPLE job using CUDAOPTIM
With SLURM, you must request the same number of GPUs on each node you are using. CUDAOPTIM also requires a CPU for each GPU being used, so --ntasks-per-node must be set to the same number as the GPUs requested per node. The example script below will run eight simultaneous CUDAOPTIM jobs, four on each node. The number of GPUs per node can be between one and four. There are currently six nodes available that can be used to run CUDAOPTIM jobs.
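For instance, to run six simultaneous CUDAOPTIM jobs as three per node on two nodes instead, the resource requests would change to (illustrative values only, keeping --ntasks-per-node equal to the GPUs per node):

#SBATCH --nodes=2
#SBATCH --gres=gpu:3
#SBATCH --ntasks-per-node=3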
Also, note that you must have each node you are using set up such that you can ssh into any other node without having to type a password. To do that, first generate a public/private RSA key pair using:
ssh-keygen -t rsa
Do not enter a passphrase when prompted (just press enter). Then type:
cat .ssh/id_rsa.pub >> .ssh/authorized_keys
Test whether this has worked by sshing into one of the nodes and then another directly from there. All the nodes see the same home directory, so if it's working for one node then it should work for all the rest.
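For example (the node names are placeholders; neither step should prompt for a password):

ssh node1   # from pat: should log you in without a password
ssh node2   # from node1 directly to a second node: should also work without a password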
IMPORTANT: your pathdata file must contain the keywords SLURM and CUDA. It should not contain SSH or PBS. Using ssh for job submission allows your jobs to use GPUs that were not allocated to them by the queueing system, so it is really important to use the SLURM keyword to avoid crashing other people's jobs!
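For reference, a minimal excerpt of the relevant pathdata lines might look like this (one keyword per line; all your other pathdata keywords are unaffected and omitted here):

SLURM
CUDA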
#!/bin/bash

#SBATCH --job-name=mypathsamplejob
#SBATCH --nodes=2                          # Specify the number of nodes you want to run on
#SBATCH --gres=gpu:4                       # Specify the number of GPUs you want per node
#SBATCH --ntasks-per-node=4                # Specify a number of CPUs equal to the number of GPUs requested per node
#SBATCH --constraint='teslak20|titanblack' # Use either Titan or Tesla nodes or some combination
#SBATCH --requeue                          # Requeue job in the case of node failure
#SBATCH --mail-type=FAIL                   # Receive an email if your job fails

echo "Time: `date`"

source /etc/profile.d/modules.sh

# Load the appropriate compiler modules on the nodes - should be the same as
# those used to compile the executables on pat
module add cuda/6.5
module add icc/64/2013_sp1/4/211
module add anaconda/python2/2.2.0 # Needed for python networkx module - must be python 2, not 3

echo "Setting GPUs to exclusive process mode on: "; srun hostname
srun -l sudo nvidia-smi -i $CUDA_VISIBLE_DEVICES -c 3

# Count the GPUs visible on this node, then multiply by the number of nodes
gpuspernode=0
visibledevices=$CUDA_VISIBLE_DEVICES
for i in $(echo $visibledevices | sed "s/,/ /g")
do
    gpuspernode=$(( gpuspernode + 1 ))
done
totalnumgpus=$(( gpuspernode * $SLURM_JOB_NUM_NODES ))
echo "Total number of GPUs requested: $totalnumgpus"

# Write the node information needed by PATHSAMPLE to nodes.info
echo $totalnumgpus > nodes.info
srun hostname >> nodes.info
echo $USER >> nodes.info
pwd >> nodes.info

# If using a cluster other than pat and your slurm version is 14 or lower,
# prefix the executable with 'srun -N1 -n1'
/home/$USER/svn/PATHSAMPLE/build/PATHSAMPLE > output

echo Finished at `date`
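As before, submit with sbatch. PATHSAMPLE's output is redirected to the file 'output' in the submission directory, so its progress can be followed there ('mypathsamplejob.sh' is a placeholder filename):

sbatch mypathsamplejob.sh
squeue -u $USER   # the NODELIST column shows both allocated nodes
tail -f output    # follow the PATHSAMPLE output as it is written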