# Getting Started with cerebro

## Overview
cerebro is a compute cluster. Unlike on the workstations, jobs are not run directly from the command line (e.g. just doing `qchem file.in file.out`). Instead, jobs are sent to a queue which is managed by SLURM. To submit a job, you write a submit file with information about your job and then submit it to the queue using the `sbatch` command. Each job creates a `slurm-<jobid>.out` file containing its terminal output. If you think something might have gone wrong with a job (e.g. it crashed or ran out of time), the slurm file is usually a good place to start looking for the issue.
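For example, to skim a job's output for signs of trouble, standard shell tools are enough (the job ID below is made up, and nothing here is cerebro-specific):

```bash
# Show the last lines of the job's terminal output (job ID 123456 is hypothetical)
tail -n 50 slurm-123456.out

# Search the whole file for common failure keywords
grep -iE "error|killed|time limit" slurm-123456.out
```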
General information about cerebro is available on the department website here.
## Helpful Commands

### sbatch

`sbatch` is used to submit a job to the queue, e.g.

```bash
sbatch $old $long submit_file
```
You can specify whether a job runs on the old nodes (12 CPUs max) or the new nodes (16 CPUs max) using `$old` or `$new`, respectively.

You can also specify the partition on which the job will run using one of four options (see the example after the table):
| Partition | sbatch Option | Time Limit | Other Notes |
|---|---|---|---|
| TEST | `$test` | 4 hours | highest priority partition - jobs should run right away |
| LONG | `$long` | 48 hours | default partition |
| XLONG | `$xlong` | 7 days | 96 core limit |
| XXLONG | `$xxlong` | 30 days | 56 core limit |
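A node-type option and a partition option can be combined in a single submission. A hypothetical example (this assumes `$new` and `$xlong` are defined for you, e.g. via `~/.slurmrc`):

```bash
# Run on the new nodes (16 CPUs max) in the XLONG partition (7-day limit)
sbatch $new $xlong submit_file
```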
More information about SLURM on cerebro is available on the department website here and here, and general information about SLURM can be found here.
### squeue

To see all the jobs in the queue, do

```bash
squeue
```

To see all the jobs that you have queued or running, do

```bash
squeue -u <your-crsid>
```
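A couple of standard SLURM/coreutils variations can also be handy (nothing here is cerebro-specific):

```bash
# Check the status of one particular job (job ID 123456 is made up)
squeue -j 123456

# Refresh your own job list every 10 seconds (Ctrl-C to stop)
watch -n 10 "squeue -u <your-crsid>"
```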
### scancel

To cancel a job, do

```bash
scancel <JOBID>
```

Make sure you have the right job ID!
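If you really do want to cancel everything, standard scancel also accepts a user filter; use this with care, since it kills all of your queued and running jobs at once:

```bash
# Cancel ALL jobs belonging to you
scancel -u <your-crsid>
```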
More information on queuing is available here.
## Using Q-Chem

If you have not used Q-Chem on cerebro before, first check whether you have access by running `which qchem`. If that does not return anything, you'll need to add a few lines to your `.bashrc` file. For Q-Chem 5.3, add

```bash
# QChem
export QC=trunk
source /home/maf63/code/qcsetup-general.bash
source ~/.slurmrc
```

to your `.bashrc` file, then do `source ~/.bashrc`. After that, `which qchem` should give `/sharedscratch/maf63/qchem-general/bin/qchem`.
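As a quick sanity check, you can run the two steps together; the expected path is the one quoted above:

```bash
source ~/.bashrc
which qchem   # should print /sharedscratch/maf63/qchem-general/bin/qchem
```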
Once you have Q-Chem set up, you'll be able to submit jobs. Here is a generic submit file for Q-Chem on cerebro:
```bash
#!/bin/bash

# Set default output file and scratch directory if not defined
outfile=${outfile:-qchem.out}
scratch=${scratch:-qchem.scratch}

export OMP_NUM_THREADS=$SLURM_CPUS_ON_NODE

# Run the calculation
echo "Using Q-Chem: " $QC
rm -rf $QCSCRATCH/$scratch
cp -r $scratch $QCSCRATCH/$scratch
qchem -nt $SLURM_CPUS_ON_NODE -save $infile $outfile $scratch

# Recover the scratch directory (optional)
# cp -r $QCSCRATCH/$scratch/* $scratch/
```
To use this submit file, copy it to `submit.qchem`, then do

```bash
export infile=<your_qchem_input_file>
export outfile=<name_of_qchem_output_file>
export scratch=<name_of_a_scratch_directory>
sbatch <your options> submit.qchem
```
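As a concrete (hypothetical) example, for an input file named water.in the full sequence might look like this:

```bash
# File names here are placeholders; $new and $long come from ~/.slurmrc
export infile=water.in
export outfile=water.out
export scratch=water.scratch
sbatch $new $long submit.qchem
```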
## Using QCMagic
If you have not used QCMagic on cerebro before, you will need to install it first - instructions are available here.
Here is a submit file template for a QCMagic job:
```bash
#!/bin/bash

export OMP_NUM_THREADS=$SLURM_CPUS_ON_NODE

# Run the calculation
runscanSurface.py -p 12 -L --read-minima=1 etc...
```
If you copy this information to a file called `submit.qcmagic`, you can then submit it with

```bash
sbatch $old $test submit.qcmagic
```
You can include multiple commands in a submit file, e.g.:

```bash
#!/bin/bash

export OMP_NUM_THREADS=$SLURM_CPUS_ON_NODE

# Run the calculation
runscanSurface.py -p 12 -L --read-minima=4 --read-only output_file.out state_4_read > term_read
runqcSDExtract.py -p 12 -L --reconverge --rem="SCF_CONVERGENCE 10" state_4_read.sd state_4 > term_reconv
runcombineSDXC.py -i state_1.sd state_2.sd state_3.sd state_4.sd -o states_1234 > term_sdxc
runrunSDXC.py -p 12 -L --template=template.in states_1234.sd states_1234 > term_new
```
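Note that these commands run sequentially, and by default the script keeps going even if an earlier step fails. If you would rather have the job abort at the first failure, one common bash idiom is to enable exit-on-error mode near the top of the submit file:

```bash
#!/bin/bash
set -e   # abort the job as soon as any command exits non-zero
export OMP_NUM_THREADS=$SLURM_CPUS_ON_NODE
```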
#!/bin/bash export OMP_NUM_THREADS=$SLURM_CPUS_ON_NODE # Run the calculation runscanSurface.py -p 12 -L --read-minima=4 --read-only output_file.out state_4_read > term_read runqcSDExtract.py -p 12 -L --reconverge --rem="SCF_CONVERGENCE 10" state_4_read.sd state_4 > term_reconv runcombineSDXC.py -i state_1.sd state_2.sd state_3.sd state_4.sd -o states_1234 > term_sdxc runrunSDXC.py -p 12 -L --template=template.in states_1234.sd states_1234 > term_new