The KU Community Cluster uses SLURM (Simple Linux Utility for Resource Management) for job scheduling.
The cluster uses your KU Online ID and password.
- SSH: Use an SSH2 client to connect to `hpc.crc.ku.edu`, with your KU Online ID as the username, and then authenticate with your KU Online ID password (see the example command after this list). Alternatively, you can set up public-key authentication. SSH connections to `hpc.crc.ku.edu` resolve to one of the cluster's login nodes.
- X2Go: X2Go is software which allows you to access the cluster using a graphical desktop window. This allows you to open GUI applications such as MATLAB on the cluster.
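A minimal SSH connection from a terminal looks like this (replace `username` with your KU Online ID):

```
$ ssh username@hpc.crc.ku.edu
```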
If you are connecting from any of the University of Kansas' campuses, you may connect as the instructions above show.
If you wish to connect to the KU Community Cluster from off campus, you must first connect through the KU Anywhere VPN. If you have multiple VPN entitlements, any one of them will work. After a successful VPN connection, you may connect as instructed above.
Maximum number of jobs: The maximum number of jobs a user can have submitted at one time is 5000.
Batch jobs: To run a job in batch mode, use your favorite text editor to create a file, called a submission script, which contains SLURM options and the instructions for running your job. All SLURM options are prefaced with `#SBATCH`. It is necessary to specify the partition you wish to run in. After your script is complete, you can submit the job to the cluster with the `sbatch` command.
A submission script is simply a text file that contains your job parameters and the commands you wish to execute as part of your job. You can also load modules, set environment variables, or perform other tasks inside your submission script.
You may also submit simple jobs from the command line with `srun`:

```
srun --partition=sixhour echo Hello World!
```
Command-line options: Command-line options will override SLURM options in your job script.
Interactive jobs: An interactive job allows you to open a shell on the compute node as if you had ssh'd into it. It is usually used for debugging purposes.
To submit an interactive job, use the `srun` command. Again, you must specify which `--partition` you wish your job to run in:
```
srun --time=4:00:00 --ntasks=1 --nodes=1 --partition=sixhour --pty /bin/bash -l
```
In the example above, the job has requested:
- `--time=4:00:00`: 4 hours for the job to run
- `--ntasks=1`: 1 task. By default, 1 core is given to each task.
- `--partition=sixhour`: job to run in the sixhour partition
- `--pty /bin/bash`: an interactive terminal running the /bin/bash shell

`--time`, `--ntasks`, and `--nodes` are called options.
If you have ssh'd to the submit nodes with X11 forwarding enabled and wish to have X11 for an interactive job, supply the `--x11` flag:

```
srun --time=4:00:00 --ntasks=4 --nodes=1 --partition=sixhour --x11 --pty /bin/bash -l
```
To run a job in batch mode on a high-performance computing system using SLURM, first prepare a job script that specifies the application you want to run and the resources required to run it, and then submit the script to SLURM using the `sbatch` command.
A very basic job script might contain just a bash or tcsh shell script. However, SLURM job scripts most commonly contain at least one executable command preceded by a list of options that specify resources and other attributes needed to execute the command (e.g., wall-clock time, the number of nodes and processors, and filenames for job output and errors). These options are prefaced with the `#SBATCH` directive, which should precede any executable lines in your job script.
Additionally, your SLURM job script (which will be executed under your preferred login shell) should begin with a line that specifies the command interpreter under which it should run.
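As a minimal sketch (the partition is required; the time value here is only a placeholder):

```bash
#!/bin/bash
#SBATCH --partition=sixhour   # Partition to run in (required)
#SBATCH --time=0:10:00        # 10-minute time limit

echo "Hello from $(hostname)"
```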
Default Options: If no SLURM options are given, default options are applied.
Slurm is very explicit in how one requests cores and nodes. While extremely powerful, the three flags `--ntasks`, `--nodes`, and `--cpus-per-task` can be a bit confusing at first.

The term task in this context can be thought of as a process. Therefore, a multi-process program (e.g. MPI) is comprised of multiple tasks, while a multi-threaded program is comprised of a single task, which can in turn use multiple CPUs. In SLURM, tasks are requested with the `--ntasks` flag. CPUs, for multi-threaded programs, are requested with the `--cpus-per-task` flag.

The `--mem` option can be used to request the appropriate amount of memory for your job. Please make sure to test your application and set this value to a reasonable number based on actual memory use.

The `%j` in the `--output` line tells SLURM to substitute the job ID into the name of the output file. You can also add `--error` with an error file name to separate output and error logs.
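For instance (the file names here are illustrative):

```bash
#SBATCH --output=myjob_%j.out   # becomes e.g. myjob_12345.out for job 12345
#SBATCH --error=myjob_%j.err    # standard error written to a separate file
```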
```bash
#!/bin/bash
#SBATCH --job-name=serial_job_test    # Job name
#SBATCH --partition=sixhour           # Partition Name (Required)
#SBATCH --mail-type=END,FAIL          # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=firstname.lastname@example.org   # Where to send mail
#SBATCH --ntasks=1                    # Run on a single CPU
#SBATCH --mem=1gb                     # Job memory request
#SBATCH --time=0-00:05:00             # Time limit days-hrs:min:sec
#SBATCH --output=serial_test_%j.log   # Standard output and error log

pwd; hostname; date

module load python/3.6

echo "Running python script"
python /path/to/your/python/script/script.py

date
```
This script can serve as a template for applications that are capable of using multiple processors on a single server or physical computer. These applications are commonly referred to as threaded, OpenMP, PTHREADS, or shared-memory applications. While they can use multiple processors, they cannot make use of multiple servers: all the processors must be on the same node.
Because these applications require shared memory and can only run on one node, it is important to remember the following:
- You must set `--ntasks=1`, and then set `--cpus-per-task` to the number of threads you wish to use.
- You must make the application aware of how many processors to use. How that is done depends on the application:
  - For some applications, set OMP_NUM_THREADS to a value less than or equal to the number of cores requested with `--cpus-per-task`.
  - For some applications, use a command-line option when calling that application.
```bash
#!/bin/bash
#SBATCH --job-name=parallel_job       # Job name
#SBATCH --partition=sixhour           # Partition Name (Required)
#SBATCH --mail-type=END,FAIL          # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=email@example.com # Where to send mail
#SBATCH --ntasks=1                    # Run a single task
#SBATCH --cpus-per-task=4             # Number of CPU cores per task
#SBATCH --mem-per-cpu=2gb             # Job memory request
#SBATCH --time=0-00:05:00             # Time limit days-hrs:min:sec
#SBATCH --output=parallel_%j.log      # Standard output and error log

pwd; hostname; date

echo "Running on $SLURM_CPUS_PER_TASK cores"
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

module load compiler/gcc/6.3
/path/to/your/program
```
These are applications that can use multiple processors that may, or may not, be on multiple compute nodes. In SLURM, the `--ntasks` flag specifies the number of MPI tasks created for your job. Note that, even within the same job, multiple tasks do not necessarily run on a single node. Therefore, requesting the same number of CPUs as above, but with the `--ntasks` flag, could result in those CPUs being allocated on several distinct compute nodes.
For many users, differentiating between `--ntasks` and `--cpus-per-task` is sufficient. However, for more control over how SLURM lays out your job, you can add the `--nodes` and `--ntasks-per-node` flags. `--nodes` specifies how many nodes to allocate to your job. SLURM will allocate your requested number of cores to a minimal number of nodes on the cluster, so it is extremely likely that if you request a small number of tasks, they will all be allocated on the same node. However, to ensure they are on the same node, set `--nodes=1` (obviously this is contingent on the number of CPUs on a node, and requesting too many may result in a job that will never run). Conversely, if you would like to ensure a specific layout, such as one task per node for memory, I/O, or other reasons, you can also set `--ntasks-per-node=1`. Note that the following must be true:
```
ntasks-per-node * nodes >= ntasks
```
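For instance (illustrative values):

```bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --ntasks=16   # satisfied, since 8 * 2 >= 16
```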
The job below requests 16 tasks per node, with 2 nodes. By default, each task gets 1 core, so this job uses 32 cores. If the `--ntasks=16` option were used instead, it would only use 16 cores, which could be on any of the nodes in the partition, even split between multiple nodes.
```bash
#!/bin/bash
#SBATCH --partition=sixhour      # Partition Name (Required)
#SBATCH --ntasks-per-node=16     # 16 tasks per node with each task given 1 core
#SBATCH --nodes=2                # Run across 2 nodes
#SBATCH --constraint=ib          # Only nodes with Infiniband (ib)
#SBATCH --mem-per-cpu=4gb        # Job memory request
#SBATCH --time=0-06:00:00        # Time limit days-hrs:min:sec
#SBATCH --output=mpi_%j.log      # Standard output and error log

echo "Running on $SLURM_JOB_NODELIST nodes using $SLURM_CPUS_ON_NODE cores on each node"
mpirun /path/to/program
```
GPU and MIC (Intel Xeon Phi) nodes can be requested using the general consumable resource option (`--gres=gpu` or `--gres=mic`). There are 3 different types of GPU cards in the KU Community Cluster, set up as constraints. To run on a specific card, such as a V100 GPU, combine `--gres` with the matching constraint.
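A minimal sketch of such a request (assuming the constraint name `v100`, matching the constraints table below):

```bash
#SBATCH --gres=gpu            # request 1 GPU
#SBATCH --constraint=v100     # only nodes with V100 cards
```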
Multiple GPUs: You may request multiple GPUs by changing the count in the `--gres` option, e.g. `--gres=gpu:2`. Note that this value is per node: requesting 2 nodes together with `--gres=gpu:2` will give you 2 GPUs on each node, for a total of 4 GPUs, as sketched below.
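```bash
#SBATCH --nodes=2      # 2 nodes
#SBATCH --gres=gpu:2   # 2 GPUs per node: 4 GPUs total
```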
The job below requests a single GPU node in the sixhour partition:
```bash
#!/bin/bash
#SBATCH --partition=sixhour   # Partition Name (Required)
#SBATCH --ntasks=1            # 1 task
#SBATCH --time=0-06:00:00     # Time limit days-hrs:min:sec
#SBATCH --gres=gpu            # 1 GPU
#SBATCH --output=gpu_%j.log   # Standard output and error log

module load singularity
CONTAINERS=/panfs/pfs.local/software/install/singularity/containers
singularity exec --nv $CONTAINERS/tensorflow-gpu-1.9.0.img python ./models/tutorials/image/mnist/convolutional.py
```
Submitting the Job
Submitting the SLURM job is done with the `sbatch` command. SLURM will read the submit file and schedule the job according to the description in the submit file.
Submitting the job described above:
```
$ sbatch example.sh
Submitted batch job 62
```
Checking Job Status
To check the status of your job, use the `squeue` command. It will provide information such as:
- The State (ST) of the job:
- R - Running
- PD - Pending - Job is awaiting resource allocation.
- Additional codes are available on the squeue page.
- Job Name
- Run Time
- Nodes running the job
To check the status of jobs owned by a specific username, use the `-u` option:
```
$ squeue -u <username>
  JOBID PARTITION     NAME       USER ST   TIME  NODES NODELIST(REASON)
     65   sixhour hello-wo <username>  R   0:56      1 g004
```
Additionally, if you want to see the status of a specific partition, for example one you are a part of, you can use the `-p` option to `squeue`:
```
$ squeue -p sixhour
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
  73435   sixhour MyRandom  jayhawk  R   10:35:20      1 r10r29n1
  73436   sixhour MyRandom  jayhawk  R   10:35:20      1 r10r29n1
  73735   sixhour SW2_driv   bigjay  R   10:14:11      1 r31r29n1
  73736   sixhour SW2_driv   bigjay  R   10:14:11      1 r31r29n1
```
Checking Job Start
You may view the start time of your job with the `squeue --start` command. The output will show the expected start time of the jobs.
```
$ squeue --start --user jayhawk
  JOBID PARTITION     NAME     USER ST          START_TIME  NODES NODELIST(REASON)
   5822   sixhour  Jobname   bigjay PD 2018-08-24T00:05:09      3 (Priority)
   5823   sixhour  Jobname   bigjay PD 2018-08-24T00:07:39      3 (Priority)
   5824   sixhour  Jobname   bigjay PD 2018-08-24T00:09:09      3 (Priority)
   5825   sixhour  Jobname   bigjay PD 2018-08-24T00:12:09      3 (Priority)
   5826   sixhour  Jobname   bigjay PD 2018-08-24T00:12:39      3 (Priority)
   5827   sixhour  Jobname   bigjay PD 2018-08-24T00:12:39      3 (Priority)
   5828   sixhour  Jobname   bigjay PD 2018-08-24T00:12:39      3 (Priority)
   5829   sixhour  Jobname   bigjay PD 2018-08-24T00:13:09      3 (Priority)
   5830   sixhour  Jobname   bigjay PD 2018-08-24T00:13:09      3 (Priority)
   5831   sixhour  Jobname   bigjay PD 2018-08-24T00:14:09      3 (Priority)
   5832   sixhour  Jobname   bigjay PD                 N/A      3 (Priority)
```
The output shows the expected start time of the jobs, as well as the reason that the jobs are currently idle (in this case, low priority of the user due to running numerous jobs already).
Removing the Job
Removing a job is done with the `scancel` command. The only argument to the `scancel` command is the job id:
```
$ scancel 2234
```
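`scancel` can also cancel all of your jobs at once with its user filter (a sketch, with `<username>` as a placeholder):

```
$ scancel -u <username>
```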
`sacct` can be used to display currently running jobs and their usage, as well as previous job usage. It can be customized with various options:
```
$ sacct -u <user>
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
         170 parallel_+    sixhour        crc          4  COMPLETED      0:0
   170.batch      batch                   crc          4  COMPLETED      0:0
         171 parallel_+    sixhour        crc          4 CANCELLED+      0:0
   171.batch      batch                   crc          4  CANCELLED     0:15
```
Show all job information starting from a specific date:
```
$ sacct --starttime 2014-07-01
```
Show job accounting information for a specific job:

```
$ sacct -j <jobid>
$ sacct -j <jobid> -l
```
Features are requested with the `--constraint` option. Because the cluster is a consortium of hardware, constraints allow the user to specify which type of node they wish to use (e.g. ib, edr_ib, intel):

```
#SBATCH --constraint "intel"
#SBATCH --constraint "intel&ib"
```
|Constraint|Description|
|---|---|
|`ib`|At least FDR Infiniband connections|
|`edr_ib`|EDR Infiniband connections|
|`noib`|Without Infiniband connections|
|`k40`|NVIDIA K40 GPUs. Must also request `--gres=gpu`.|
|`k80`|NVIDIA K80 GPUs. Must also request `--gres=gpu`.|
|`v100`|NVIDIA V100 GPUs. Must also request `--gres=gpu`.|
Each owner group has their own partition (e.g. bi, compbio, crmda). You can view the partitions you can submit to by running `sinfo`.
- 60-00:00:00 (60 days): Max walltime of owner partitions
Job Partition: You must specify `--partition` for your job. There is no default partition.
Other than the owner group partitions, there is a `sixhour` partition. This partition allows your jobs to run across all IDLE nodes in the cluster, but is limited to a walltime of 6 hours.
To run in the sixhour partition, specify `#SBATCH --partition sixhour` in your job script.
All options below are prefixed with `#SBATCH`. For example:

```
#SBATCH --partition=sixhour
#SBATCH --job-name=Jobname
```
This is a brief list of the most commonly used SLURM options. All options can be found in the SLURM documentation.
Option Abbreviation: Almost all options have a single-letter abbreviation.
|Option|Description|
|---|---|
|`--array`|Submit a job array, multiple jobs to be executed with identical parameters.|
|`--cpus-per-task`|Advise the Slurm controller that ensuing job steps will require ncpus number of processors per task. Without this option, the controller will just try to allocate one processor per task.|
|`--constraint`|Request which features the job requires.|
|`--dependency`|Defer the start of this job until the specified dependencies have been satisfied.|
|`--chdir`|Set the working directory of the batch script to directory before it is executed.|
|`--error`|Instruct Slurm to connect the batch script's standard error directly to the file name specified in the "filename pattern". By default both standard output and standard error are directed to the same file.|
|`--export`|Identify which environment variables are propagated to the launched application; by default all are propagated. Multiple environment variable names should be comma separated.|
|`--gres`|Specifies a comma-delimited list of generic consumable resources. The format of each entry on the list is "name[:count]". Example: "--gres=gpu:2"|
|`--job-name`|Specify a name for the job allocation.|
|`--mail-type`|Notify user by email when certain event types occur. Valid type values are NONE, BEGIN, END, FAIL, REQUEUE, ALL.|
|`--mail-user`|User to receive email notification of state changes as defined by --mail-type.|
|`--mem`|Specify the real memory required per node. Default units are megabytes. See Memory Limits.|
|`--mem-per-cpu`|Minimum memory required per allocated CPU. Default units are megabytes. See Memory Limits.|
|`--ntasks`|sbatch does not launch tasks; it requests an allocation of resources and submits a batch script. This option advises the Slurm controller that job steps run within the allocation will launch a maximum of this number of tasks and to provide sufficient resources. The default is one task per node, but note that the --cpus-per-task option will change this default.|
|`--ntasks-per-node`|Request that this many tasks be invoked on each node.|
|`--nodes`|Request that a minimum of minnodes nodes be allocated to this job. A maximum node count may also be specified with maxnodes. If only one number is specified, this is used as both the minimum and maximum node count.|
|`--output`|Instruct Slurm to connect the batch script's standard output directly to the file name specified in the "filename pattern". By default both standard output and standard error are directed to the same file.|
|`--partition`|Request a specific partition for the resource allocation. If the job can use more than one partition, specify their names in a comma-separated list. Required.|
|`--time`|Set a limit on the total run time of the job allocation. Acceptable time formats include "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds".|
If some options are not specified in the submission script, default values will be set: `--time=8:00:00` (8 hours) for owner partitions, where the max is 60 days, and `--time=1:00:00` (1 hour) for the `sixhour` partition, where the max is 6 hours.
We reserve a chunk of memory per node for system services to prevent the node from crashing. This varies with the amount of memory reported by the server.
This also applies to `--mem-per-cpu`: take the number of cores requested per node, multiply it by your `--mem-per-cpu` value, and make sure the result does not go above the allowed limit.
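As a worked sketch, using values from the MPI example above and the table below:

```bash
#SBATCH --ntasks-per-node=16   # 16 cores per node
#SBATCH --mem-per-cpu=4gb      # 4 GB per core
# Memory requested per node: 16 * 4 GB = 64 GB
# Fits a 128 GB node (125 GB allowed); too large for a 64 GB node (61 GB allowed)
```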
|Total amount of memory on node|Amount allowed to request|
|---|---|
|32 GB|30 GB|
|64 GB|61 GB|
|128 GB|125 GB|
|192 GB|186 GB|
|256 GB|250 GB|
|384 GB|376 GB|
|512 GB|503 GB|
|768 GB|754 GB|
Below are some common, useful SLURM commands:
|Command|Description|
|---|---|
|`sacct`|Used to report job or job step accounting information about active or completed jobs.|
|`sinfo`|Reports the state of partitions and nodes managed by SLURM. It has a wide variety of filtering, sorting, and formatting options.|
|`srun`|Used to submit a job for execution or initiate job steps in real time. srun has a wide variety of options to specify resource requirements, including: minimum and maximum node count, processor count, specific nodes to use or not use, and specific node characteristics (so much memory, disk space, certain required features, etc.). A job can contain multiple job steps executing sequentially or in parallel on independent or shared nodes within the job's node allocation.|
|`squeue`|Reports the state of jobs or job steps. It has a wide variety of filtering, sorting, and formatting options. By default, it reports the running jobs in priority order and then the pending jobs in priority order.|
|`squeue -u <username>`|Display the jobs submitted by the specified user.|
|`squeue -p <partition>`|Display the jobs in the specified partition.|
|`scontrol show job <jobid>`|Check the status of a job.|
|`squeue --start -j <jobid>`|Show an estimate of when your job will start.|
|`scontrol show node <nodename>`|Check the status of a node.|
|`scancel <jobid>`|Cancel a job.|