Quick Start
Installation
Requirements
- Check that nextflow is installed on your cluster and load it. Preferred version: 21.10 or greater.
- Singularity has been renamed Apptainer, but most clusters still run it as singularity. We’ll need singularity to pull the gencall docker container from our dockerhub account as a singularity image file. Check that singularity is installed on your cluster and load it. Preferred version: 3.5 or greater.
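As a quick sanity check before installing, the sketch below reports whether each tool is on your PATH and compares a version string against the preferred minimum. The helper names version_ge and check_tool are made up for illustration, and sort -V assumes GNU coreutils, which most Linux clusters provide.

```shell
#!/usr/bin/env bash
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Prints whether a command is available on PATH (illustrative helper).
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found (load it as a module or ask your cluster admin)"
    fi
}

check_tool nextflow
check_tool singularity

# Compare a version you read from `nextflow -version` or
# `singularity --version` against the preferred minimum.
if version_ge "23.04.3" "21.10"; then
    echo "nextflow 23.04.3 meets the 21.10 minimum"
fi
```

Replace the example version with whatever your own installation reports.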
Install
Navigate to the path where you would like to run the workflow from and run the code below.
wget https://github.com/GeneMAP-Research/genemapngs/archive/refs/tags/v1.0.0.tar.gz
tar zxvf v1.0.0.tar.gz
rm v1.0.0.tar.gz
cd genemapngs-1.0.0
Without the main workflow configuration file (nextflow.config), nextflow cannot run at all. Copy the system.config file to nextflow.config.
cp system.config nextflow.config
Then edit the nextflow.config file with the correct parameter values.
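If you re-run the setup later, a plain cp would overwrite a nextflow.config you have already edited. The sketch below is one cautious way to do the copy; init_config is a hypothetical helper, not part of the workflow.

```shell
#!/usr/bin/env bash
# Copy system.config to nextflow.config inside directory $1 (default: .),
# but refuse to overwrite a nextflow.config that already exists.
init_config() {
    local dir="${1:-.}"
    if [ ! -f "$dir/system.config" ]; then
        echo "system.config not found in $dir" >&2
        return 1
    fi
    if [ -e "$dir/nextflow.config" ]; then
        echo "nextflow.config already exists, leaving it untouched"
        return 0
    fi
    cp "$dir/system.config" "$dir/nextflow.config"
    echo "created $dir/nextflow.config"
}
```

From inside genemapngs-1.0.0, calling init_config . would behave like the cp command on a fresh install and do nothing on later runs.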
Things to note before running the workflow
The workflow runs on the concept of nextflow profiles.
There are three profile categories:
- executors: there are three executors based on where you are working
  - local: this can be used anywhere; your computer (laptop) or any cluster (slurm, pbspro, etc).
  - slurm: cluster running a slurm job workload manager/scheduler.
  - pbspro: cluster running a pbspro job workload manager/scheduler.
- containers: there are three containers
  - apptainer: formerly singularity. Some clusters might not run it yet.
  - singularity: now apptainer. Most clusters still run it.
  - docker: Most clusters do not run docker for security reasons. It can be used on local computers.
- references: there are three references
  - hg19: human reference build 37 or GRCh37
  - hg38: human reference build 38 or GRCh38
  - t2t: human Telomere-to-Telomere reference (T2T-CHM13)
The workflow command line is built as follows.
nextflow run <workflow script> -profile <executor>,<container>,<reference> -w <work directory>
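To make the pattern concrete, the sketch below assembles the command from one choice per profile category and rejects names that are not in the lists above. The helpers is_one_of and build_command, and the script name my_workflow.nf, are placeholders for illustration and are not taken from the workflow.

```shell
#!/usr/bin/env bash
# Succeeds when $1 equals one of the remaining arguments.
is_one_of() {
    local value="$1"; shift
    local option
    for option in "$@"; do
        [ "$value" = "$option" ] && return 0
    done
    return 1
}

# Print the nextflow command for a script, executor, container, reference,
# and work directory, after validating the three profile names.
build_command() {
    local script="$1" executor="$2" container="$3" reference="$4" workdir="$5"
    is_one_of "$executor" local slurm pbspro \
        || { echo "unknown executor: $executor" >&2; return 1; }
    is_one_of "$container" apptainer singularity docker \
        || { echo "unknown container: $container" >&2; return 1; }
    is_one_of "$reference" hg19 hg38 t2t \
        || { echo "unknown reference: $reference" >&2; return 1; }
    echo "nextflow run $script -profile $executor,$container,$reference -w $workdir"
}

# my_workflow.nf is a placeholder script name.
build_command my_workflow.nf slurm singularity hg38 workdir
# prints: nextflow run my_workflow.nf -profile slurm,singularity,hg38 -w workdir
```

Printing the command instead of running it lets you inspect the profile combination before submitting anything to the cluster.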
Test the installation
This example will pull and use a plink2 image (which is light-weight) using singularity.
First, make sure nextflow and singularity are loaded and working. Check how these are loaded on your system. This should be done on a worker node via an interactive job.
module load <your nextflow version>
Most systems have singularity automatically loaded on worker nodes. Check by simply running singularity. Otherwise, load singularity:
module load <your singularity version>
./genemapngs.sh test
nextflow -c test.config run test.nf -w workdir -profile singularity
If all goes well, you should see something like this, although the details can differ based on your plink2 version:
N E X T F L O W ~ version 23.04.3
Launching `test.nf` [condescending_hopper] DSL2 - revision: 943b38a3a4
ALIGNMENT WORKFLOW: TEST
pe=true
aligner=BWA
ftype=FASTQ
input_dir=/path/to/fastq/
output_dir=/path/to/save/results/
output_prefix=my-ngs
single_caller=gatk
exome=true
joint_caller=gatk
gvcf_dir=NULL
spark=false
threads=1
njobs=1
executor > local (1)
[52/6e9e57] process > plink (processing ...) [100%] 1 of 1 ✔
PLINK2 is used for the test as it is light-weight and easily pulled from docker hub
PLINK v2.00a3.3LM 64-bit Intel (3 Jun 2022) www.cog-genomics.org/plink/2.0/
(C) 2005-2022 Shaun Purcell, Christopher Chang GNU General Public License v3
--pedmap <prefix> : Specify .ped + .map filename prefix.
--ped <filename> : Specify full name of .ped file.
--map <filename> : Specify full name of .map file.
Workflow completed at: 2024-06-09T06:47:43.261741+02:00
Execution status: OK
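If you would rather check the result in a script than read the output by eye, one option is to capture the output and look for the final status line. The sketch below assumes the run prints "Execution status: OK" as shown above; run_succeeded and the file name test.log are illustrative choices.

```shell
#!/usr/bin/env bash
# Succeeds when the captured output in file $1 contains the
# "Execution status: OK" line printed at the end of the test run.
run_succeeded() {
    grep -q 'Execution status: OK' "$1"
}

# Demonstration with a small stand-in log. In practice, capture the real
# output first, for example:
#   nextflow -c test.config run test.nf -w workdir -profile singularity | tee test.log
log="$(mktemp)"
printf '%s\n' 'Workflow completed at: 2024-06-09T06:47:43.261741+02:00' \
              'Execution status: OK' > "$log"

if run_succeeded "$log"; then
    echo "test run finished OK"
else
    echo "test run did not report OK; inspect the log"
fi
rm -f "$log"
```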
Test installation without selecting profile
The workflow will be executed locally and nextflow will expect all tools to be already loaded. So, we must load plink2 for the test.
module load <your version of plink2>
./genemapngs.sh test #you don't need to run this again if you already did
nextflow -c test.config run test.nf -w workdir
# The result should be the same as above
Final setup
Central to a nextflow workflow are configuration (config) files. They tell nextflow where to look for inputs and where to place outputs. In addition to the main config file, there are a few more config files that we need to set up for the workflow to run smoothly. We will do this step by step.
- …
You might see a few warnings:
- regarding echo and debug. These are caused by different versions of nextflow and do not pose any issues.
- regarding singularity cache directory. As long as you set a value for containers_dir in your nextflow.config file, it should be no problem. If the containers directory is not set, the workflow will create one in your work directory.
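To confirm that containers_dir has a value before a long run, a small check like the one below may help. It is only a sketch: it assumes the parameter is written as a plain name = value line (possibly indented inside a block), which may not match how your nextflow.config is laid out, and has_containers_dir is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Succeeds when file $1 has a non-commented line that assigns a
# non-empty value to containers_dir (assumed "name = value" syntax).
has_containers_dir() {
    grep -E '^[[:space:]]*containers_dir[[:space:]]*=' "$1" \
        | grep -Evq "=[[:space:]]*(''|\"\"|null)?[[:space:]]*\$"
}

if [ -f nextflow.config ]; then
    if has_containers_dir nextflow.config; then
        echo "containers_dir is set"
    else
        echo "containers_dir is not set; images will go to the work directory"
    fi
fi
```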
under development