Merge branch 'dev' into nf-core-template-merge-2.13
nvnieuwk committed Feb 22, 2024
2 parents 4c397c8 + c6ba2ad commit 5d4effb
Showing 148 changed files with 7,410 additions and 794 deletions.
21 changes: 14 additions & 7 deletions README.md
@@ -14,7 +14,7 @@
[![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/)
[![Launch on Nextflow Tower](https://img.shields.io/badge/Launch%20%F0%9F%9A%80-Nextflow%20Tower-%234256e7)](https://tower.nf/launch?pipeline=https://github.com/nf-core/variantbenchmarking)

[![Get help on Slack](http://img.shields.io/badge/slack-nf--core%20%23variantbenchmarking-4A154B?labelColor=000000&logo=slack)](https://nfcore.slack.com/channels/variantbenchmarking)[![Follow on Twitter](http://img.shields.io/badge/twitter-%40nf__core-1DA1F2?labelColor=000000&logo=twitter)](https://twitter.com/nf_core)[![Follow on Mastodon](https://img.shields.io/badge/mastodon-nf__core-6364ff?labelColor=FFFFFF&logo=mastodon)](https://mstdn.science/@nf_core)[![Watch on YouTube](http://img.shields.io/badge/youtube-nf--core-FF0000?labelColor=000000&logo=youtube)](https://www.youtube.com/c/nf-core)
[![Get help on Slack](http://img.shields.io/badge/slack-nf--core%20%23benchmark-4A154B?labelColor=000000&logo=slack)](https://nfcore.slack.com/channels/variantbenchmarking)[![Follow on Twitter](http://img.shields.io/badge/twitter-%40nf__core-1DA1F2?labelColor=000000&logo=twitter)](https://twitter.com/nf_core)[![Follow on Mastodon](https://img.shields.io/badge/mastodon-nf__core-6364ff?labelColor=FFFFFF&logo=mastodon)](https://mstdn.science/@nf_core)[![Watch on YouTube](http://img.shields.io/badge/youtube-nf--core-FF0000?labelColor=000000&logo=youtube)](https://www.youtube.com/c/nf-core)

## Introduction

@@ -30,14 +30,22 @@
workflows use the "tube map" design for that. See https://nf-co.re/docs/contributing/design_guidelines#examples for examples. -->
<!-- TODO nf-core: Fill in short bullet-pointed list of the default steps in the pipeline -->

1. Read QC ([`FastQC`](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/))
2. Present QC for raw reads ([`MultiQC`](http://multiqc.info/))
1. Standardization of SVs in test VCF files
2. Normalization of SVs in test VCF files
3. Normalization of SVs in truth VCF files
4. SV stats and histograms
5. Germline benchmarking of SVs
6. Somatic benchmarking of SVs
7. Final report and comparisons

## Usage

> [!NOTE]
> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.
Supported SV callers: Manta, SVaba, Dragen, Delly, Lumpy ..
Available Truth samples: HG002, SEQC2

<!-- TODO nf-core: Describe the minimum required steps to execute the pipeline, e.g. how to prepare samplesheets.
Explain what rows and columns represent. For instance (please edit as appropriate):
@@ -46,12 +54,11 @@ First, prepare a samplesheet with your input data that looks as follows:
`samplesheet.csv`:
```csv
sample,fastq_1,fastq_2
CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz
caller,test_vcf
caller1,test1.vcf.gz
caller2,test2.vcf
```
Each row represents a fastq file (single-end) or a pair of fastq files (paired end).
-->

Now, you can run the pipeline using:
6 changes: 3 additions & 3 deletions assets/samplesheet.csv
@@ -1,3 +1,3 @@
sample,fastq_1,fastq_2
SAMPLE_PAIRED_END,/path/to/fastq/files/AEG588A1_S1_L002_R1_001.fastq.gz,/path/to/fastq/files/AEG588A1_S1_L002_R2_001.fastq.gz
SAMPLE_SINGLE_END,/path/to/fastq/files/AEG588A4_S4_L003_R1_001.fastq.gz,
test_vcf,caller,vartype
"https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test2_haplotc.vcf.gz",mutect,sv
"https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/vcf/sv_query.vcf.gz",unknown,sv
5 changes: 5 additions & 0 deletions assets/samplesheet_HG002.csv
@@ -0,0 +1,5 @@
test_vcf,caller,vartype
"/Users/w620-admin/Desktop/nf-core/dataset/hg37/dragen_paper/HG002_delly_SV_hg19.vcf.gz",delly,sv
"/Users/w620-admin/Desktop/nf-core/dataset/hg37/dragen_paper/HG002_lumpy_SV_hg19.sorted.vcf.gz",lumpy,sv
"/Users/w620-admin/Desktop/nf-core/dataset/hg37/dragen_paper/HG002_manta_SV_hg19_genotype2.vcf",manta,sv
"/Users/w620-admin/Desktop/nf-core/dataset/hg37/Broad_svaba_05052017/full.svaba.germline.sv.vcf",svaba,sv
7 changes: 7 additions & 0 deletions assets/samplesheet_HG002_hg38.csv
@@ -0,0 +1,7 @@
test_vcf,caller,vartype
"/Users/w620-admin/Desktop/nf-core/dataset/hg38/GIAB_GRCh38_SVs_06252018/ajtrio.lumpy.svtyper.HG002.md.sorted.recal.vcf.gz",lumpy,sv
"/Users/w620-admin/Desktop/nf-core/dataset/hg38/GIAB_GRCh38_SVs_06252018/manta.HG002.vcf.gz",manta,sv
"/Users/w620-admin/Desktop/nf-core/dataset/hg38/Ashkenazim_unnanotated/Ashkenazim_HG002.filtered.sv.vcf.gz",merged,sv
"/Users/w620-admin/Desktop/nf-core/dataset/hg38/HG002_DRAGEN_SV_hg19.vcf.gz",dragen,sv


3 changes: 3 additions & 0 deletions assets/samplesheet_SEQC2.csv
@@ -0,0 +1,3 @@
test_vcf,caller
"/Users/w620-admin/Desktop/nf-core/dataset/hg38/SEQC_somatic_mutation_truth/test/WGS.bwa.dedup-IL_T_1_vs_IL_N_1-Strelka.indel.vcf.gz",strelka
"/Users/w620-admin/Desktop/nf-core/dataset/hg38/SEQC_somatic_mutation_truth/test/WGS.bwa.dedup-IL_T_1_vs_IL_N_1-MuTect2.vcf.gz",mutect2
58 changes: 27 additions & 31 deletions assets/schema_input.json
@@ -1,33 +1,29 @@
{
"$schema": "http://json-schema.org/draft-07/schema",
"$id": "https://raw.githubusercontent.com/nf-core/variantbenchmarking/master/assets/schema_input.json",
"title": "nf-core/variantbenchmarking pipeline - params.input schema",
"description": "Schema for the file provided with params.input",
"type": "array",
"items": {
"type": "object",
"properties": {
"sample": {
"type": "string",
"pattern": "^\\S+$",
"errorMessage": "Sample name must be provided and cannot contain spaces",
"meta": ["id"]
},
"fastq_1": {
"type": "string",
"format": "file-path",
"exists": true,
"pattern": "^\\S+\\.f(ast)?q\\.gz$",
"errorMessage": "FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'"
},
"fastq_2": {
"type": "string",
"format": "file-path",
"exists": true,
"pattern": "^\\S+\\.f(ast)?q\\.gz$",
"errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'"
}
},
"required": ["sample", "fastq_1"]
}
"$schema": "http://json-schema.org/draft-07/schema",
"$id": "https://raw.githubusercontent.com/nf-core/variantbenchmarking/master/assets/schema_input.json",
"title": "nf-core/variantbenchmarking pipeline - params.input schema",
"description": "Schema for the file provided with params.input",
"type": "array",
"items": {
"type": "object",
"properties": {
"test_vcf": {
"type": "string",
"pattern": "",
"errorMessage": "Test VCF must be provided, cannot contain spaces and must have extension '.vcf.gz'"
},
"caller": {
"type": "string",
"pattern": "^\\S+$",
"errorMessage": "Name of the variant caller used to generate test file"
},
"vartype": {
"type": "string",
"pattern": "^\\S+$",
"errorMessage": "Variant type to benchmark"
}

},
"required": ["test_vcf","caller","vartype"]
}
}
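
As a rough illustration of what the new `schema_input.json` asks of a samplesheet, the sketch below checks the three required columns and the `^\S+$` patterns using only the Python standard library. It is not the validation code the pipeline actually runs; the function name `validate_samplesheet` and the wording of the messages are hypothetical. Because `test_vcf` has an empty `pattern` in the schema, it is only checked for presence here.

```python
import csv
import io
import re

# Required columns and patterns copied from the new schema_input.json.
# test_vcf has an empty pattern in the schema, so only its presence is checked.
REQUIRED = ["test_vcf", "caller", "vartype"]
PATTERNS = {"caller": r"^\S+$", "vartype": r"^\S+$"}


def validate_samplesheet(text):
    """Return a list of problems; an empty list means the sheet passed (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    problems = [f"missing column: {col}" for col in REQUIRED if col not in header]
    if problems:
        return problems
    for row_number, row in enumerate(reader, start=2):  # line 1 is the header
        for col in REQUIRED:
            value = (row[col] or "").strip()
            if not value:
                problems.append(f"line {row_number}: {col} is empty")
            elif col in PATTERNS and not re.match(PATTERNS[col], value):
                problems.append(f"line {row_number}: {col} must not contain whitespace")
    return problems
```

Under this reading of the schema, a sheet whose header is only `test_vcf,caller`, as in `assets/samplesheet_SEQC2.csv`, would be reported as missing the `vartype` column.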
Empty file added assets/svync/default.yaml
Empty file.
69 changes: 69 additions & 0 deletions assets/svync/delly.yaml
@@ -0,0 +1,69 @@
id: delly_$INFO/SVTYPE
alt:
BND: TRA
info:
CALLER:
value: delly
number: 1
type: string
description: The caller used to determine this variant
SVLEN:
value: ~sub:$INFO/END,$POS
number: 1
type: integer
description: The length of the structural variant
alts:
DEL: -~sub:$INFO/END,$POS
INS: $INFO/SVLEN
TRA: 1
CIEND:
value: $INFO/CIEND
number: 2
type: integer
description: PE confidence interval around END
CIPOS:
value: $INFO/CIPOS
number: 2
type: integer
description: PE confidence interval around POS
SVTYPE:
value: $INFO/SVTYPE
number: 1
type: string
description: Type of structural variant
CHR2:
value:
number: 1
type: string
description: Chromosome for second position
alts:
TRA: $INFO/CHR2
END:
value: $INFO/END
number: 1
type: integer
description: End position of the structural variant
alts:
TRA: $INFO/POS2
IMPRECISE:
value: $INFO/IMPRECISE
number: 0
type: flag
description: Imprecise structural variation
format:
GT:
value: $FORMAT/GT
number: 1
type: string
description: Genotype
PE:
value: $FORMAT/DR,$FORMAT/DV
number: 2
type: integer
description: Paired-read support for the ref and alt alleles in the order listed
SR:
value: $FORMAT/RR,$FORMAT/RV
number: 2
type: integer
description: Split-read support for the ref and alt alleles in the order listed
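
To make the `SVLEN` entry of `delly.yaml` easier to follow, here is a small Python sketch of the arithmetic it appears to describe. It rests on assumptions that the diff itself does not confirm: that `~sub:$INFO/END,$POS` means `END - POS`, that the `alts` entries override the default value per SV type, and that `alt: BND: TRA` renames `BND` records to `TRA` before the lookup. The function `delly_svlen` is hypothetical and is not part of svync.

```python
def delly_svlen(svtype, pos, end, info_svlen=None):
    """Mirror the SVLEN entry of delly.yaml for one record (assumed semantics)."""
    # "alt: BND: TRA" is read here as renaming BND records to TRA first.
    if svtype == "BND":
        svtype = "TRA"
    if svtype == "DEL":
        return -(end - pos)  # DEL: -~sub:$INFO/END,$POS
    if svtype == "INS":
        return info_svlen    # INS: $INFO/SVLEN
    if svtype == "TRA":
        return 1             # TRA: 1
    return end - pos         # default value: ~sub:$INFO/END,$POS


# A deletion spanning POS=100 to END=350 gets a negative length.
print(delly_svlen("DEL", 100, 350))
```

With these assumptions, deletions carry a negative length, translocations a fixed placeholder of 1, and every other type the plain `END - POS` distance.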

65 changes: 65 additions & 0 deletions assets/svync/gridss.yaml
@@ -0,0 +1,65 @@
id: gridss_$INFO/SVTYPE
info:
CALLER:
value: gridss
number: 1
type: string
description: The caller used to determine this variant
SVLEN:
value: ~sub:$INFO/END,$POS
number: 1
type: integer
description: The length of the structural variant
alts:
BND:
TRA: 0
DEL: -~sub:$INFO/END,$POS
CIEND:
value: $INFO/CIRPOS
number: 2
type: integer
description: PE confidence interval around END
CIPOS:
value: $INFO/CIPOS
number: 2
type: integer
description: PE confidence interval around POS
SVTYPE:
value: $INFO/SVTYPE
number: 1
type: string
description: Type of structural variant
CHR2:
value:
number: 1
type: string
description: Chromosome for second position
alts:
TRA: $INFO/CHR2
END:
value: $INFO/END
number: 1
type: integer
description: End position of the structural variant
IMPRECISE:
value: $INFO/IMPRECISE
number: 0
type: flag
description: Imprecise structural variation
format:
GT:
value: $FORMAT/GT
number: 1
type: string
description: Genotype
PE:
value: $FORMAT/REFPAIR,$FORMAT/RP
number: 2
type: integer
description: Paired-read support for the ref and alt alleles in the order listed
SR:
value: .,$FORMAT/SR
number: 2
type: integer
description: Split-read support for the ref and alt alleles in the order listed

66 changes: 66 additions & 0 deletions assets/svync/manta.yaml
@@ -0,0 +1,66 @@
id: manta_$INFO/SVTYPE
info:
CALLER:
value: manta
number: 1
type: string
description: The caller used to determine this variant
SVLEN:
value: $INFO/SVLEN
number: 1
type: integer
description: The length of the structural variant
alts:
INS: ~sum:~len:LEFT_SVINSSEQ,~len:RIGHT_SVINSSEQ
TRA: 1
CIEND:
value: $INFO/CIEND
number: 2
type: integer
description: PE confidence interval around END
CIPOS:
value: $INFO/CIPOS
number: 2
type: integer
description: PE confidence interval around POS
SVTYPE:
value: $INFO/SVTYPE
number: 1
type: string
description: Type of structural variant
CHR2:
value:
number: 1
type: string
description: Chromosome for second position
alts:
TRA: $INFO/CHR2
END:
value: $INFO/END
number: 1
type: integer
description: End position of the structural variant
alts:
TRA: $INFO/POS2
IMPRECISE:
value: $INFO/IMPRECISE
number: 0
type: flag
description: Imprecise structural variation
format:
GT:
value: $FORMAT/GT
number: 1
type: string
description: Genotype
PE:
value: $FORMAT/PR
number: 2
type: integer
description: Paired-read support for the ref and alt alleles in the order listed
SR:
value: $FORMAT/SR
number: 2
type: integer
description: Split-read support for the ref and alt alleles in the order listed
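
The `SVLEN` rule in `manta.yaml` differs from the Delly one for insertions: `~sum:~len:LEFT_SVINSSEQ,~len:RIGHT_SVINSSEQ` seems to add the lengths of the two partially assembled insertion sequences. The Python sketch below shows that reading. It assumes that `~len` returns a string length, that `~sum` adds its arguments, that both names refer to INFO fields, and that a missing field counts as length 0; none of this is stated in the diff. The function `manta_svlen` is hypothetical.

```python
def manta_svlen(svtype, info):
    """Mirror the SVLEN entry of manta.yaml for one record (assumed semantics)."""
    if svtype == "INS":
        # INS: ~sum:~len:LEFT_SVINSSEQ,~len:RIGHT_SVINSSEQ
        left = info.get("LEFT_SVINSSEQ", "")
        right = info.get("RIGHT_SVINSSEQ", "")
        return len(left) + len(right)
    if svtype == "TRA":
        return 1                # TRA: 1
    return info.get("SVLEN")    # default value: $INFO/SVLEN


# An insertion with 4 known bases on the left and 3 on the right.
print(manta_svlen("INS", {"LEFT_SVINSSEQ": "ACGT", "RIGHT_SVINSSEQ": "GGA"}))
```

Under these assumptions the result for an insertion is a lower bound on its true length, since only the assembled flanks are counted.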
