Snakemake with Slurm¶
This page describes how to use Snakemake with Slurm.
- This assumes that you have Miniconda properly setup with Bioconda.
- Also it assumes that you have already activated the Miniconda base environment with
We first create a new environment
snakemake-slurm and activate it.
We need the
snakemake package for this.
host:~$ conda create -y -n snakemake-slurm snakemake [...] # # To activate this environment, use # # $ conda activate snakemake-slurm # # To deactivate an active environment, use # # $ conda deactivate host:~$ conda activate snakemake-slurm (snakemake-slurm) host:~$
Snakemake Workflow Setup¶
We create a workflow and ensure that it works properly with multi-threaded Snakemake (no cluster submission here!)
host:~$ mkdir -p snake-slurm host:~$ cd snake-slurm host:snake-slurm$ cat >Snakefile <<"EOF" rule default: input: "the-result.txt" rule mkresult: output: "the-result.txt" shell: r"sleep 1m; touch the-result.txt" EOF host:snake-slurm$ snakemake --cores=1 [...] host:snake-slurm$ ls Snakefile the-result.txt host:snake-slurm$ rm the-result.txt
Snakemake and Slurm¶
You have two options:
- Simply use
snakemake --profile=cubi-v1and the Snakemake resource configuration as shown below. STRONGLY PREFERRED
- Use the
snakemake --cluster='sbatch ...'command.
Note that we sneaked in a
sleep 1m? In a second terminal session, we can see that the job has been submitted to SLURM indeed.
host:~$ squeue -u holtgrem_c JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON) 325 debug snakejob holtgrem R 0:47 1 med0127
Threads & Resources¶
cubi-v1 profile (stored in
/etc/xdg/snakemake/cubi-v1 on all cluster nodes) supports the following specification in your Snakemake rule:
threads: the number of threads to execute the job on
- memory in a syntax understood by Slurm, EITHER
resources.mem_mb: the memory to allocate for the whole job, OR
resources.mem_per_thread: the memory to allocate for each thread.
resources.time: the running time of the rule, in a syntax supported by Slurm, e.g.
resources.partition: the partition to submit your job into (Slurm will pick a fitting partition for you by default)
resources.nodes: the number of nodes to schedule your job on (defaults to
1and you will want to keep that value unless you want to use MPI)
You will need Snakemake >=7.0.2 for this.
Here is how to call Snakemake:
# snakemake --profile=cubi-v1 -j1
To set rule-specific resources:
rule myrule: threads: 1 resources: mem='8G', time='04:00:00', input: # ... output: # ... shell: # ...
You can combine this with Snakemake resource callables, of course:
def myrule_mem(wildcards, attempt): mem = 2 * attempt return '%dG' % mem rule snps: threads: 1 resources: mem=myrule_mem, time='04:00:00', input: # ... output: # ... shell: # ...
Custom logging directory¶
By default, slurm will write log files into the working directory of snakemake, which will look like
To change this behaviour, the environment variable
SBATCH_DEFAULTS can be set to re-route the
--output parameter. If you want to write your files into
slurm_logs with a filename pattern of
$name-$jobid for instance, consider the following snippet for your submission script:
#!/bin/bash # #SBATCH --job-name=snakemake_main_job #SBATCH --ntasks=1 #SBATCH --nodes=1 #SBATCH --time=48:10:00 #SBATCH --mem-per-cpu=300M #SBATCH --output=slurm_logs/%x-%j.log mkdir -p slurm_logs export SBATCH_DEFAULTS=" --output=slurm_logs/%x-%j.log" date srun snakemake --use-conda -j1 --profile=cubi-v1 date
The name of the snakemake slurm job will be
snakemake_main_job, the name of the jobs spawned from it will be called after the rule name in the Snakefile.