diff --git a/README.md b/README.md index 39daff6..e68c04e 100644 --- a/README.md +++ b/README.md @@ -39,31 +39,32 @@ This initial step ensures consistent formatting and alignment of variants in tes 4. Rename sample names in test and truth VCF files ([bcftools reheader](https://samtools.github.io/bcftools/bcftools.html#reheader)) 5. Splitting multi-allelic variants in test and truth VCF files ([bcftools norm](https://samtools.github.io/bcftools/bcftools.html#norm)) 6. Deduplication of variants in test and truth VCF files ([bcftools norm](https://samtools.github.io/bcftools/bcftools.html#norm)) -7. Use prepy in order to normalize test files. This option is only applicable for happy benchmarking of germline analysis ([prepy](https://github.com/Illumina/hap.py/tree/master)) -8. Split SNVs and indels if the given test VCF contains both. This is only applicable for somatic analysis ([bcftools view](https://samtools.github.io/bcftools/bcftools.html#view)) +7. Left aligning of variants in test and truth VCF files ([bcftools norm](https://samtools.github.io/bcftools/bcftools.html#norm)) +8. Use prepy in order to normalize test files. This option is only applicable for happy benchmarking of germline analysis ([prepy](https://github.com/Illumina/hap.py/tree/master)) +9. Split SNVs and indels if the given test VCF contains both. This is only applicable for somatic analysis ([bcftools view](https://samtools.github.io/bcftools/bcftools.html#view)) ### Filtering options: Applying filtering on the process of benchmarking itself might makes it impossible to compare different benchmarking strategies. Therefore, for whom like to compare benchmarking methods this subworkflow aims to provide filtering options for variants. -9. Filtration of contigs ([bcftools view](https://samtools.github.io/bcftools/bcftools.html#view)) -10. Include or exclude SNVs and INDELs ([bcftools filter](https://samtools.github.io/bcftools/bcftools.html#filter)) -11. 
Size and quality filtering for SVs ([SURVIVOR filter](https://github.com/fritzsedlazeck/SURVIVOR/wiki)) +10. Filtration of contigs ([bcftools view](https://samtools.github.io/bcftools/bcftools.html#view)) +11. Include or exclude SNVs and INDELs ([bcftools filter](https://samtools.github.io/bcftools/bcftools.html#filter)) +12. Size and quality filtering for SVs ([SURVIVOR filter](https://github.com/fritzsedlazeck/SURVIVOR/wiki)) ### Liftover of vcfs: -This sub-workflow provides option to convert genome coordinates of truth VCF and high confidence BED file to a new assembly. Golden standard truth files are build upon specific reference genomes which makes the necessity of lifting over depending on the test VCF in query. Lifting over one or more test vcfs is also possible. +This sub-workflow provides option to convert genome coordinates of truth VCF and test VCFs and high confidence BED file to a new assembly. Golden standard truth files are build upon specific reference genomes which makes the necessity of lifting over depending on the test VCF in query. Lifting over one or more test VCFs is also possible. -12. Create sequence dictionary for the reference ([picard CreateSequenceDictionary](https://gatk.broadinstitute.org/hc/en-us/articles/360037068312-CreateSequenceDictionary-Picard)). This file can be saved and reused. -13. Lifting over truth variants ([picard LiftoverVcf](https://gatk.broadinstitute.org/hc/en-us/articles/360037060932-LiftoverVcf-Picard)) -14. Lifting over high confidence coordinates ([UCSC liftover](http://hgdownload.cse.ucsc.edu/admin/exe)) +13. Create sequence dictionary for the reference ([picard CreateSequenceDictionary](https://gatk.broadinstitute.org/hc/en-us/articles/360037068312-CreateSequenceDictionary-Picard)). This file can be saved and reused. +14. Lifting over VCFs ([picard LiftoverVcf](https://gatk.broadinstitute.org/hc/en-us/articles/360037060932-LiftoverVcf-Picard)) +15. 
Lifting over high confidence coordinates ([UCSC liftover](http://hgdownload.cse.ucsc.edu/admin/exe)) ### Statistical inference of input test and truth variants: This step provides insights into the distribution of variants before benchmarking. -15. Get statistics of SNVs, INDELs and complex variants ([bcftools stats](https://samtools.github.io/bcftools/bcftools.html#stats)) -16. Get statistics of SVs by type ([SURVIVOR stats](https://github.com/fritzsedlazeck/SURVIVOR/wiki)) +16. Get statistics of SNVs, INDELs and complex variants ([bcftools stats](https://samtools.github.io/bcftools/bcftools.html#stats)) +17. Get statistics of SVs by type ([SURVIVOR stats](https://github.com/fritzsedlazeck/SURVIVOR/wiki)) ### Benchmarking of variants: @@ -71,34 +72,35 @@ Actual benchmarking of variants are split between SVs and small variants: Available methods for SVs: -17. Germline and somatic variant benchmarking using Truvari ([truvari bench](https://github.com/acenglish/truvari/wiki/bench)) -18. Germline and somatic variant benchmarking using SVanalyzer ([svanalyzer benchmark](https://github.com/nhansen/SVanalyzer/blob/master/docs/svbenchmark.rst)) +18. Germline and somatic variant benchmarking using Truvari ([truvari bench](https://github.com/acenglish/truvari/wiki/bench)) +19. Germline and somatic variant benchmarking using SVanalyzer ([svanalyzer benchmark](https://github.com/nhansen/SVanalyzer/blob/master/docs/svbenchmark.rst)) Available methods for CNVs: -19. Germline and somatic variant benchmarking using Wittyer ([witty.er](https://github.com/Illumina/witty.er/tree/master)) +20. Germline and somatic variant benchmarking using Wittyer ([witty.er](https://github.com/Illumina/witty.er/tree/master)) Available methods for SNVs and INDELs: -20. Germline variant benchmarking using RTG-tools ([rtg vcfeval](https://realtimegenomics.com/products/rtg-tools)) -21. 
Germline variant benchmarking using Happy tools ([hap.py](https://github.com/Illumina/hap.py/blob/master/doc/happy.md)) -22. Somatic variant benchmarking using Sompy ([som.py](https://github.com/Illumina/hap.py/tree/master?tab=readme-ov-file#sompy)) +21. Germline variant benchmarking using RTG-tools ([rtg vcfeval](https://realtimegenomics.com/products/rtg-tools)) +22. Germline variant benchmarking using Happy tools ([hap.py](https://github.com/Illumina/hap.py/blob/master/doc/happy.md)) +23. Somatic variant benchmarking using Sompy ([som.py](https://github.com/Illumina/hap.py/tree/master?tab=readme-ov-file#sompy)) ### Comparison of benchmarking results per TP, FP and FN files It is essential to compare benchmarking results in order to infer uniquely or commonly seen TPs, FPs and FNs. -23. Merging TP, FP and FN results for happy, rtgtools and sompy ([bcftools merge](https://samtools.github.io/bcftools/bcftools.html#merge)) -24. Merging TP, FP and FN results for Truvari and SVanalyzer ([SURVIVOR merge](https://github.com/fritzsedlazeck/SURVIVOR/wiki)) -25. Conversion of VCF files to CSV to infer common and unique variants per caller (python script) +24. Merging TP, FP and FN results for happy, rtgtools and sompy ([bcftools merge](https://samtools.github.io/bcftools/bcftools.html#merge)) +25. Merging TP, FP and FN results for Truvari and SVanalyzer ([SURVIVOR merge](https://github.com/fritzsedlazeck/SURVIVOR/wiki)) +26. Conversion of VCF files to CSV to infer common and unique variants per caller (python script) ### Reporting of benchmark results The generation of comprehensive report that consolidates all benchmarking results. -26. Merging summary statistics per benchmarking tool (python script) -27. Plotting benchmark metrics per benchmarking tool (R script) -28. Create visual HTML report for the integration of NCBENCH ([datavzrd](https://datavzrd.github.io/docs/index.html)) +27. Merging summary statistics per benchmarking tool (python script) +28. 
Plotting benchmark metrics per benchmarking tool (R script) +29. Create visual HTML report for the integration of NCBENCH ([datavzrd](https://datavzrd.github.io/docs/index.html)) +30. Apply MultiQC to visualize results @@ -121,7 +123,7 @@ test3,test3.vcf.gz,cnvkit Each row represents a vcf file (test-query file). For each vcf file and variant calling method (caller) have to be defined. -User has to define or provide truth vcf in config files. There are readily available vcf files for benchmarking from Genome in a bottle and SEQC2 studies which can be used readily. Please find detailed information about truth files [here](https://nf-co.re/variantbenchmarking/truth) +User _has to provide truth vcf in config files_. There are readily available vcf files for benchmarking from Genome in a bottle and SEQC2 studies which can be used readily. Please find detailed information about truth files [here](https://nf-co.re/variantbenchmarking/truth) For more details and further functionality, please refer to the [usage documentation](https://nf-co.re/variantbenchmarking/usage) and the [parameter documentation](https://nf-co.re/variantbenchmarking/parameters). @@ -133,8 +135,9 @@ nextflow run nf-core/variantbenchmarking \ --input samplesheet.csv \ --outdir <OUTDIR> \ --genome GRCh37 \ - --sample HG002 - --analysis germline + --analysis germline \ + --truth_id HG002 \ + --truth_vcf truth.vcf.gz ``` > [!WARNING] diff --git a/conf/modules.config b/conf/modules.config index ea28690..9c9888d 100644 --- a/conf/modules.config +++ b/conf/modules.config @@ -25,73 +25,167 @@ process { saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } ] } - // standardization and normalization tools - withName: "BCFTOOLS_NORM" { - ext.prefix = { vcf.baseName - ".vcf" + ".norm"} - ext.args = {"--output-type z -m-any -c w" } + + // subsample_vcf test + + withName: BCFTOOLS_SORT { + ext.prefix = { vcf.baseName - ".vcf" + ".sort"} + ext.args = {"--output-type z --write-index=tbi" } + publishDir = [ + enabled: false + ] } - withName: "VARIANT_EXTRACTOR" { - ext.prefix = { input.baseName - ".vcf" } + + withName: BCFTOOLS_VIEW_SUBSAMPLE { + ext.prefix = { vcf.baseName - ".vcf" + ".subsample" } + ext.args = {"--output-type z -s ${meta.subsample}" } publishDir = [ - path: {"${params.outdir}/${params.variant_type}/${meta.id}/preprocess"}, - pattern: "*{.vcf.gz}", - mode: params.publish_dir_mode + enabled: false + ] + } + + withName: BCFTOOLS_VIEW_FILTERMISSING { + ext.prefix = { vcf.baseName - ".vcf" + ".filtermissing" } + ext.args = {"--output-type z -e 'AC=0'" } + publishDir = [ + enabled: false + ] + } + + // sv_vcf_conversions + + withName: VARIANT_EXTRACTOR { + ext.prefix = { input.baseName - ".vcf" + ".variantextract" } + publishDir = [ + enabled: false ] } + withName: SVYNC { - ext.prefix = {"${meta.id}.${meta.caller}.svync"} + ext.prefix = {vcf.baseName - ".vcf" + ".svync"} publishDir = [ - path: {"${params.outdir}/${params.variant_type}/${meta.id}/preprocess"}, - pattern: "*{.vcf.gz,vcf.gz.tbi}", + path: { "${params.outdir}/test" }, + enabled: false + ] + } + + withName: BGZIP_TABIX { + publishDir = [ + enabled: false + ] + } + //// prepare_vcfs //// + + // liftover_vcfs + + withName: PICARD_CREATESEQUENCEDICTIONARY { + publishDir = [ + path: {"${params.outdir}/references/dictionary"}, + pattern: "*{.dict}", mode: params.publish_dir_mode ] } - withName: "BCFTOOLS_DEDUP" { - ext.prefix = { vcf.baseName - ".vcf" + ".dedup"} - ext.args = {"--output-type z --rm-du exact -c w" } + + withName: PICARD_LIFTOVERVCF { + ext.prefix = {input_vcf.baseName - ".vcf"} + ext.args = {"--WARN_ON_MISSING_CONTIG 
true"} + publishDir = [ + enabled: false + ] } - withName: "BCFTOOLS_SORT" { - ext.prefix = { vcf.baseName - ".vcf" + ".sort"} - ext.args = {"--output-type z" } + withName: BCFTOOLS_RENAME_CHR { + ext.prefix = {input.baseName - ".vcf" + ".renamechr"} + ext.args = {"--output-type z"} publishDir = [ - path: {"${params.outdir}/${params.variant_type}/${meta.id}/preprocess"}, + path: {params.liftover == "truth"? "${params.outdir}/${params.variant_type}/${params.truth_id}/preprocess/liftover" : "${params.outdir}/${params.variant_type}/${meta.id}/preprocess/liftover"}, pattern: "*{.vcf.gz}", mode: params.publish_dir_mode ] } - withName: "BCFTOOLS_REHEADER" { + withName: UCSC_LIFTOVER { + ext.prefix = {bed.baseName - ".bed"} + publishDir = [ + enabled: false + ] + } + withName: SORT_BED { + ext.prefix = {bed.baseName - ".bed" + ".sort"} + publishDir = [ + enabled: false + ] + } + withName: BEDTOOLS_MERGE { + ext.prefix = {bed.toString() - ".bed" + ".merged" } + publishDir = [ + path: {params.liftover == "truth"? "${params.outdir}/${params.variant_type}/${params.truth_id}/preprocess/liftover" : "${params.outdir}/${params.variant_type}/${meta.id}/preprocess/liftover"}, + pattern: "*{.bed}", + mode: params.publish_dir_mode + ] + } + withName: REFORMAT_HEADER { + ext.prefix ={["${meta.id}", + (meta.tag != null) ? ".${meta.tag}" : '' + ].join('').trim() + } + publishDir = [ + enabled: false + ] + + } + + withName: "BCFTOOLS_REHEADER*" { beforeScript = {[ "echo ${meta.id}", (meta.caller != null )? 
".${meta.caller}" : "", " > ${meta.id}.txt" ].join('').trim()} ext.args = { "--samples ${meta.id}.txt" } - ext.args2 = {"--output-type z" } - ext.prefix = { vcf.baseName - ".vcf" + ".rh"} + ext.args2 = {"--output-type z --write-index=tbi" } + ext.prefix = { vcf.baseName - ".vcf" + ".reheader"} + publishDir = [ + enabled: false + ] } - // splitting tools - withName: BCFTOOLS_VIEW_SUBSAMPLE { - ext.prefix = { vcf.baseName - ".vcf" + ".subsample" } - ext.args = {"--output-type z -s ${meta.subsample}" } + + // filtering contigs + withName: BCFTOOLS_VIEW_CONTIGS { + ext.prefix = { vcf.baseName - ".vcf.gz" + ".filtercontigs" } + ext.args = {[ + "--output-type z --write-index=tbi", + (params.genome.contains("38"))? "-r chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrX,chrY" : "-r 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y" + ].join(' ').trim() } + publishDir = [ + enabled: false + ] } - withName: BCFTOOLS_VIEW_SNV { - ext.prefix = { vcf.baseName - ".vcf" + ".snv" } - ext.args = {"--output-type v --types snps" } + + // bcftools normalize + withName: BCFTOOLS_NORM { + ext.prefix = { vcf.baseName - ".vcf" + ".norm"} + ext.args = {"--output-type z -c w --write-index=tbi" } publishDir = [ - path: {"${params.outdir}/${params.variant_type}/${meta.id}/preprocess"}, - pattern: "*{.vcf}", - mode: params.publish_dir_mode + enabled: false ] } - withName: BCFTOOLS_VIEW_INDEL { - ext.prefix = { vcf.baseName - ".vcf" + ".indel" } - ext.args = {"--output-type v --types indels" } + + //bcftools split multi allelics + withName: BCFTOOLS_SPLIT_MULTI { + ext.prefix = { vcf.baseName - ".vcf" + ".split"} + ext.args = {"--output-type z -m-any -c w --write-index=tbi" } publishDir = [ - path: {"${params.outdir}/${params.variant_type}/${meta.id}/preprocess"}, - pattern: "*{.vcf}", - mode: params.publish_dir_mode + enabled: false ] } + + // bcftools deduplicate variants + withName: BCFTOOLS_DEDUP { + 
ext.prefix = { vcf.baseName - ".vcf" + ".dedup"} + ext.args = {"--output-type z --rm-du exact -c w" } + publishDir = [ + enabled: false + ] + } + // filtering tools withName: BCFTOOLS_FILTER { ext.prefix = { vcf.baseName - ".vcf" + ".filter"} @@ -101,39 +195,51 @@ process { (params.exclude_expression != null )? "--exclude '$params.exclude_expression'" : "" ].join(' ').trim() } publishDir = [ - path: {"${params.outdir}/${params.variant_type}/${meta.id}/preprocess/"}, - pattern: "*{.vcf}", - mode: params.publish_dir_mode + enabled: false + ] + } + + withName: TABIX_TABIX { + publishDir = [ + enabled: false ] } + withName: SURVIVOR_FILTER { ext.prefix = { vcf_file.baseName - ".vcf" + ".filter"} publishDir = [ - path: {"${params.outdir}/${params.variant_type}/${meta.id}/preprocess"}, - pattern: "*{.vcf}", - mode: params.publish_dir_mode + enabled: false ] } - withName: BCFTOOLS_VIEW_FILTERMISSING { - ext.prefix = { vcf.baseName - ".vcf" + ".filtermiss" } - ext.args = {"--output-type z -e 'AC=0'" } + + // split_small_variants_test + + withName: BCFTOOLS_VIEW_SNV { + ext.prefix = { vcf.baseName - ".vcf" + ".snv" } + ext.args = {"--output-type v --types snps" } publishDir = [ - path: { "${params.outdir}/test" }, enabled: false ] } - withName: BCFTOOLS_VIEW_CONTIGS { - ext.prefix = { vcf.baseName - ".vcf" + ".nocontigs" } - ext.args = {[ - "--output-type v", - (params.genome.contains("38"))? 
"-r chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrX,chrY" : "-r 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y" - ].join(' ').trim() } + + withName: BCFTOOLS_VIEW_INDEL { + ext.prefix = { vcf.baseName - ".vcf" + ".indel" } + ext.args = {"--output-type v --types indels" } publishDir = [ - path: { "${params.outdir}/test" }, enabled: false ] } - // Variant stats + + withName: 'PUBLISH_PROCESSED_VCF' { + publishDir = [ + path: {"${params.outdir}/${params.variant_type}/${meta.id}/preprocess"}, + pattern: "*{.vcf.gz,vcf.gz.tbi}", + mode: params.publish_dir_mode + ] + } + + // report_vcf_statistics + withName: SURVIVOR_STATS { ext.prefix ={["${meta.id}", (meta.caller != null) ? ".${meta.caller}_mqc" : '_mqc' @@ -145,6 +251,7 @@ process { mode: params.publish_dir_mode ] } + withName: BCFTOOLS_STATS { ext.prefix ={["${meta.id}", (meta.caller != null) ? ".${meta.caller}" : '' @@ -156,7 +263,9 @@ process { mode: params.publish_dir_mode ] } - // benchmark tools + + //// benchmarking // + withName: "RTGTOOLS_FORMAT" { publishDir = [ path: {"${params.outdir}/references/rtgtools"}, @@ -164,6 +273,7 @@ process { mode: params.publish_dir_mode ] } + withName: "RTGTOOLS_VCFEVAL" { ext.prefix = {"${meta.id}.${params.truth_id}.${meta.caller}"} //ext.args = {"--ref-overlap --all-record --output-mode ga4gh"} @@ -173,6 +283,7 @@ process { mode: params.publish_dir_mode ] } + withName: "HAPPY_HAPPY" { ext.prefix = {"${meta.id}.${params.truth_id}.${meta.caller}"} //ext.args = {""} @@ -182,6 +293,7 @@ process { mode: params.publish_dir_mode ] } + withName: "HAPPY_SOMPY" { ext.prefix = {"${meta.id}.${params.truth_id}.${meta.caller}"} ext.args = { meta.caller.contains("strelka") || meta.caller.contains("varscan") || meta.caller.contains("pisces") ? 
"--feature-table hcc.${meta.caller}.${params.variant_type} --bin-afs" : "--feature-table generic" } @@ -191,15 +303,15 @@ process { mode: params.publish_dir_mode ] } + withName: "HAPPY_PREPY" { ext.prefix = {"${meta.id}.${params.truth_id}.${meta.caller}.prepy"} ext.args = {"--fixchr --filter-nonref --bcftools-norm"} publishDir = [ - path: {"${params.outdir}/${params.variant_type}/${meta.id}/preprocess"}, - pattern: "*{vcf.gz}", - mode: params.publish_dir_mode + enabled: false ] } + withName: "TRUVARI_BENCH" { ext.prefix = {"${meta.id}.${params.truth_id}.${meta.caller}"} ext.args = {[ @@ -218,6 +330,7 @@ process { mode: params.publish_dir_mode ] } + withName: SVANALYZER_SVBENCHMARK { ext.prefix = {"${meta.id}.${params.truth_id}.${meta.caller}"} ext.args = {[ @@ -232,6 +345,7 @@ process { mode: params.publish_dir_mode ] } + withName: WITTYER { ext.prefix = {"${meta.id}.${params.truth_id}.${meta.caller}"} ext.args = {[ @@ -248,6 +362,19 @@ process { mode: params.publish_dir_mode ] } + + withName: "TABIX_BGZIP*"{ + ext.prefix = {input.toString() - ".vcf.gz"} + publishDir = [ + enabled: false + ] + } + withName: "TABIX_BGZIPTABIX*"{ + ext.prefix = { input.baseName - ".vcf" + ".bgzip"} + publishDir = [ + enabled: false + ] + } withName: BAMSURGEON_EVALUATOR { ext.prefix = {"${meta.id}.${params.truth_id}.${meta.caller}"} publishDir = [ @@ -256,7 +383,9 @@ process { mode: params.publish_dir_mode ] } - // summary reports + + // report_benchmark_statistics + withName: MERGE_REPORTS { ext.prefix = {"${meta.benchmark_tool}"} publishDir = [ @@ -265,6 +394,7 @@ process { mode: params.publish_dir_mode ] } + withName: PLOTS { ext.prefix = {"${meta.benchmark_tool}_mqc"} publishDir = [ @@ -273,6 +403,13 @@ process { mode: params.publish_dir_mode ] } + + withName: CREATE_DATAVZRD_INPUT { + publishDir = [ + enabled: false + ] + } + withName: DATAVZRD { ext.prefix = {"${meta.id}"} publishDir = [ @@ -281,17 +418,31 @@ process { mode: params.publish_dir_mode ] } - // compare vcf results 
- withName: "TABIX_BGZIP*"{ + + // compare_benchmark_results + + withName: TABIX_BGZIP_UNZIP{ ext.prefix = {input.toString() - ".vcf.gz"} + publishDir = [ + enabled: false + ] } + withName: SURVIVOR_MERGE { ext.prefix = {"${meta.id}.${meta.tag}"} + publishDir = [ + enabled: false + ] } + withName: BCFTOOLS_MERGE { ext.prefix = {"${meta.id}.${meta.tag}"} ext.args = {"--output-type v --force-samples --force-single"} + publishDir = [ + enabled: false + ] } + withName: VCF_TO_CSV { ext.prefix = {"${meta.id}.${meta.tag}"} publishDir = [ @@ -300,56 +451,5 @@ process { mode: params.publish_dir_mode ] } - withName: REFORMAT_HEADER { - ext.prefix ={["${meta.id}", - (meta.tag != null) ? ".${meta.tag}" : '' - ].join('').trim() - } - } - // liftOver - withName: PICARD_CREATESEQUENCEDICTIONARY { - publishDir = [ - path: {"${params.outdir}/references/dictionary"}, - pattern: "*{.dict}", - mode: params.publish_dir_mode - ] - } - withName: PICARD_LIFTOVERVCF { - ext.prefix = {"${meta.id}"} - ext.args = {"--WARN_ON_MISSING_CONTIG true"} - } - withName: BCFTOOLS_RENAME_CHR { - ext.prefix = {"${meta.id}.renamechr"} - ext.args = {"--output-type z"} - publishDir = [ - path: {"${params.outdir}/${params.variant_type}/${params.truth_id}/liftover"}, - pattern: "*{.vcf.gz}", - mode: params.publish_dir_mode - ] - } - withName: UCSC_LIFTOVER { - ext.prefix = {"${meta.id}.liftover"} - } - withName: SORT_BED { - ext.prefix = {"${meta.id}.sort"} - } - withName: BEDTOOLS_MERGE { - ext.prefix = {bed.toString() - ".bed" + ".merged" } - publishDir = [ - path: {"${params.outdir}/${params.variant_type}/${params.truth_id}/liftover"}, - pattern: "*{.bed}", - mode: params.publish_dir_mode - ] - } -} -// -// Don't publish results for these processes -// -process { - withName: 
'TABIX_TABIX|TABIX_BGZIP|TABIX_BGZIPTABIX|BGZIP_TABIX|SURVIVOR_MERGE|BCFTOOLS_MERGE|REFORMAT_HEADER|BCFTOOLS_NORM|BCFTOOLS_DEDUP|BCFTOOLS_REHEADER|SORT_BED|UCSC_LIFTOVER|PICARD_LIFTOVERVCF|BCFTOOLS_VIEW_SUBSAMPLE|CREATE_DATAVZRD_INPUT' { - publishDir = [ - path: { "${params.outdir}/test" }, - enabled: false - ] - } + } diff --git a/conf/test.config b/conf/test.config index 5c09764..4715b0e 100644 --- a/conf/test.config +++ b/conf/test.config @@ -33,8 +33,8 @@ params { analysis = 'germline' variant_type = "small" method = 'happy,rtgtools' - preprocess = "normalization,deduplication,prepy" - include_expression = 'FILTER="."' + preprocess = "normalize,deduplicate,prepy" + // truth information truth_id = "HG002" diff --git a/conf/tests/germline_small.config b/conf/tests/germline_small.config index 871751a..1644408 100644 --- a/conf/tests/germline_small.config +++ b/conf/tests/germline_small.config @@ -33,7 +33,7 @@ params { analysis = 'germline' variant_type = "small" method = 'happy,rtgtools' - preprocess = "normalization,deduplication,prepy" + preprocess = "normalize,deduplicate,prepy" include_expression = '(ILEN >= -5 && ILEN <= 5)' // truth information diff --git a/conf/tests/germline_sv.config b/conf/tests/germline_sv.config index dc724f3..18d2729 100644 --- a/conf/tests/germline_sv.config +++ b/conf/tests/germline_sv.config @@ -33,7 +33,7 @@ params { analysis = 'germline' variant_type = "structural" method = 'svanalyzer,wittyer,truvari' - preprocess = "normalization,deduplication,filter_contigs" + preprocess = "normalize,deduplicate,filter_contigs" sv_standardization = "svync,homogenize" min_sv_size = 30 truth_id = "HG002" diff --git a/conf/tests/liftover_truth.config b/conf/tests/liftover_truth.config index 3ab7d01..e16bd88 100644 --- a/conf/tests/liftover_truth.config +++ b/conf/tests/liftover_truth.config @@ -29,7 +29,7 @@ params { truth_id = "HG002" variant_type = "small" method = 'rtgtools,happy' - preprocess = "normalization,deduplication,filter_contigs" + 
preprocess = "split_multiallelic,deduplicate,filter_contigs,normalize" truth_vcf = "https://raw.githubusercontent.com/kubranarci/benchmark_datasets/main/SV_testdata/hg37/truth/HG002_GRCh37_1_22_v4.2.1_highconf.chr21.vcf.gz" regions_bed = "https://raw.githubusercontent.com/kubranarci/benchmark_datasets/main/SV_testdata/hg37/truth/HG002_GRCh37_1_22_v4.2.1_highconf.bed" diff --git a/docs/output.md b/docs/output.md index 8d8771c..40fe887 100644 --- a/docs/output.md +++ b/docs/output.md @@ -33,7 +33,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
Output files -- `preprocesses/` +- `preprocess/` - `*.vcf.gz`: The standardized and normalized VCF files
@@ -47,7 +47,7 @@ Outputs from standardization, normalization and filtration processes saved. When - ## `liftover/` -- `liftover/` +- `preprocess/liftover/` - `*.vcf.gz`: Lifted over variants - `*.bed`: Lifted over regions diff --git a/docs/usage.md b/docs/usage.md index 7471a45..9c7e7d5 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -73,12 +73,24 @@ Consistent formatting and alignment of variants in test and truth VCF files for - `homogenize`: makes use of [variant-extractor](https://github.com/EUCANCan/variant-extractor) - `svync`: makes use of [svync](https://github.com/nvnieuwk/svync) -- `--preprocesses`: The preprocessing steps to perform on the input files. Should be a comma-separated list of one or more of the following options: `normalization,deduplication,prepy,filter_contigs` - - `normalization`: Splits multi-allelic variants in test and truth VCF files ([bcftools norm](https://samtools.github.io/bcftools/bcftools.html#norm)) - - `deduplication`: Deduplicates variants in test and truth VCF files ([bcftools norm](https://samtools.github.io/bcftools/bcftools.html#norm)) +- `--preprocesses`: The preprocessing steps to perform on the input files. Should be a comma-separated list of one or more of the following options: `split_multiallelic,normalize,deduplicate,prepy,filter_contigs` + - `split_multiallelic`: Splits multi-allelic variants in test and truth VCF files ([bcftools norm](https://samtools.github.io/bcftools/bcftools.html#norm)) + - `normalize`: Left aligns variants in test and truth VCF files ([bcftools norm](https://samtools.github.io/bcftools/bcftools.html#norm)) + - `deduplicate`: Deduplicates variants in test and truth VCF files ([bcftools norm](https://samtools.github.io/bcftools/bcftools.html#norm)) - `prepy`: Uses prepy in order to normalize test files. This option is only applicable for happy benchmarking of germline analysis ([prepy](https://github.com/Illumina/hap.py/tree/master)) - `filter_contigs`: Filter out extra contigs. 
It is common for truth files not to include extra contigs. +Filtration of test variants is controlled through the following parameters: + +- `exclude_expression`: Use [bcftools expressions](https://samtools.github.io/bcftools/bcftools.html#expressions) to exclude variants +- `include_expression`: Use [bcftools expressions](https://samtools.github.io/bcftools/bcftools.html#expressions) to include variants +- `min_sv_size`: Minimum SV size of variants to benchmark. Uses ([SURVIVOR filter](https://github.com/fritzsedlazeck/SURVIVOR/wiki)) +- `max_sv_size`: Maximum SV size of variants to benchmark. Uses ([SURVIVOR filter](https://github.com/fritzsedlazeck/SURVIVOR/wiki)) +- `min_allele_freq`: Minimum allele frequency of variants to benchmark for SVs. Uses ([SURVIVOR filter](https://github.com/fritzsedlazeck/SURVIVOR/wiki)) +- `min_num_reads`: Minimum number of reads supporting variants to benchmark for SVs. Uses ([SURVIVOR filter](https://github.com/fritzsedlazeck/SURVIVOR/wiki)) + +_tip_: One can use _exclude_expression_ or _include_expression_ to limit indel or SV variant size as well. + ## Using multi-sample vcf inputs If the input test vcf contains more than one sample, then user has to define which sample name to use. 
`subsample` will added to the samplesheet as an additional column as follows: diff --git a/main.nf b/main.nf index 2e63537..d84ea2b 100644 --- a/main.nf +++ b/main.nf @@ -35,7 +35,7 @@ include { PIPELINE_COMPLETION } from './subworkflows/local/utils_nfcore_vari ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ -// WORKFLOW: Run main nf-core/sarek analysis pipeline +// WORKFLOW: Run main nf-core/variantbenchmarking analysis pipeline workflow NFCORE_VARIANTBENCHMARKING { take: diff --git a/modules.json b/modules.json index e43cf69..ac2a4b9 100644 --- a/modules.json +++ b/modules.json @@ -22,7 +22,7 @@ }, "bcftools/norm": { "branch": "master", - "git_sha": "44096c08ffdbc694f5f92ae174ea0f7ba0f37e09", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "bcftools/query": { @@ -32,12 +32,12 @@ }, "bcftools/reheader": { "branch": "master", - "git_sha": "44096c08ffdbc694f5f92ae174ea0f7ba0f37e09", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "bcftools/sort": { "branch": "master", - "git_sha": "487d92367b4d7bb9f1ca694bf72736be90720b15", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "bcftools/stats": { @@ -47,7 +47,7 @@ }, "bcftools/view": { "branch": "master", - "git_sha": "1013101da4252623fd7acf19cc581bae91d4f839", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "bedtools/merge": { diff --git a/modules/local/custom/publish_processed_vcf/environment.yml b/modules/local/custom/publish_processed_vcf/environment.yml new file mode 100644 index 0000000..203c32c --- /dev/null +++ b/modules/local/custom/publish_processed_vcf/environment.yml @@ -0,0 +1,8 @@ +name: publish_processed_vcf +channels: + - conda-forge + - bioconda + - defaults +dependencies: + - bioconda::tabix=1.11 + - bioconda::htslib=1.19.1 diff --git a/modules/local/custom/publish_processed_vcf/main.nf 
b/modules/local/custom/publish_processed_vcf/main.nf new file mode 100644 index 0000000..d13313c --- /dev/null +++ b/modules/local/custom/publish_processed_vcf/main.nf @@ -0,0 +1,35 @@ +process PUBLISH_PROCESSED_VCF { + tag "$meta.id" + label 'process_single' + + conda "" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/tabix:1.11--hdfd78af_0' : + 'biocontainers/tabix:1.11--hdfd78af_0' }" + + input: + tuple val(meta), path(vcf), path(index) + + output: + tuple val(meta),path("*.vcf.gz"), path("*.vcf.gz.tbi"), emit: gz_tbi + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + + """ + cp $vcf ${prefix}.vcf.gz + cp $index ${prefix}.vcf.gz.tbi + + """ + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + """ + touch ${prefix}.vcf.gz + touch ${prefix}.vcf.gz.tbi + + """ +} diff --git a/modules/nf-core/bcftools/norm/environment.yml b/modules/nf-core/bcftools/norm/environment.yml index fe80e4e..5c00b11 100644 --- a/modules/nf-core/bcftools/norm/environment.yml +++ b/modules/nf-core/bcftools/norm/environment.yml @@ -1,7 +1,5 @@ -name: bcftools_norm channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::bcftools=1.18 + - bioconda::bcftools=1.20 diff --git a/modules/nf-core/bcftools/norm/main.nf b/modules/nf-core/bcftools/norm/main.nf index 47d3dab..bd7a250 100644 --- a/modules/nf-core/bcftools/norm/main.nf +++ b/modules/nf-core/bcftools/norm/main.nf @@ -4,16 +4,18 @@ process BCFTOOLS_NORM { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/bcftools:1.18--h8b25389_0': - 'biocontainers/bcftools:1.18--h8b25389_0' }" + 'https://depot.galaxyproject.org/singularity/bcftools:1.20--h8b25389_0': + 'biocontainers/bcftools:1.20--h8b25389_0' }" input: tuple val(meta), path(vcf), path(tbi) tuple val(meta2), path(fasta) output: - tuple val(meta), path("*.{vcf,vcf.gz,bcf,bcf.gz}") , emit: vcf - path "versions.yml" , emit: versions + tuple val(meta), path("*.{vcf,vcf.gz,bcf,bcf.gz}"), emit: vcf + tuple val(meta), path("*.tbi") , emit: tbi, optional: true + tuple val(meta), path("*.csi") , emit: csi, optional: true + path "versions.yml" , emit: versions when: task.ext.when == null || task.ext.when @@ -30,7 +32,7 @@ process BCFTOOLS_NORM { """ bcftools norm \\ --fasta-ref ${fasta} \\ - --output ${prefix}.${extension}\\ + --output ${prefix}.${extension} \\ $args \\ --threads $task.cpus \\ ${vcf} @@ -49,8 +51,16 @@ process BCFTOOLS_NORM { args.contains("--output-type z") || args.contains("-Oz") ? "vcf.gz" : args.contains("--output-type v") || args.contains("-Ov") ? "vcf" : "vcf.gz" + def index = args.contains("--write-index=tbi") || args.contains("-W=tbi") ? "tbi" : + args.contains("--write-index=csi") || args.contains("-W=csi") ? "csi" : + args.contains("--write-index") || args.contains("-W") ? "csi" : + "" + def create_cmd = extension.endsWith(".gz") ? "echo '' | gzip >" : "touch" + def create_index = extension.endsWith(".gz") && index.matches("csi|tbi") ? 
"touch ${prefix}.${extension}.${index}" : "" + """ - touch ${prefix}.${extension} + ${create_cmd} ${prefix}.${extension} + ${create_index} cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/bcftools/norm/meta.yml b/modules/nf-core/bcftools/norm/meta.yml index 1f3e1b6..b6edeb4 100644 --- a/modules/nf-core/bcftools/norm/meta.yml +++ b/modules/nf-core/bcftools/norm/meta.yml @@ -13,46 +13,70 @@ tools: documentation: http://www.htslib.org/doc/bcftools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:bcftools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: | - The vcf file to be normalized - e.g. 'file1.vcf' - pattern: "*.{vcf,vcf.gz}" - - tbi: - type: file - description: | - An optional index of the VCF file (for when the VCF is compressed) - pattern: "*.vcf.gz.tbi" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fasta: - type: file - description: FASTA reference file - pattern: "*.{fasta,fa}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: | + The vcf file to be normalized + e.g. 'file1.vcf' + pattern: "*.{vcf,vcf.gz}" + - tbi: + type: file + description: | + An optional index of the VCF file (for when the VCF is compressed) + pattern: "*.vcf.gz.tbi" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fasta: + type: file + description: FASTA reference file + pattern: "*.{fasta,fa}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - vcf: - type: file - description: One of uncompressed VCF (.vcf), compressed VCF (.vcf.gz), compressed BCF (.bcf.gz) or uncompressed BCF (.bcf) normalized output file - pattern: "*.{vcf,vcf.gz,bcf,bcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.{vcf,vcf.gz,bcf,bcf.gz}": + type: file + description: One of uncompressed VCF (.vcf), compressed VCF (.vcf.gz), compressed + BCF (.bcf.gz) or uncompressed BCF (.bcf) normalized output file + pattern: "*.{vcf,vcf.gz,bcf,bcf.gz}" + - tbi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tbi": + type: file + description: Alternative VCF file index + pattern: "*.tbi" + - csi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.csi": + type: file + description: Default VCF file index + pattern: "*.csi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@abhi18av" - "@ramprasadn" diff --git a/modules/nf-core/bcftools/norm/tests/main.nf.test b/modules/nf-core/bcftools/norm/tests/main.nf.test new file mode 100644 index 0000000..dbc4150 --- /dev/null +++ b/modules/nf-core/bcftools/norm/tests/main.nf.test @@ -0,0 +1,563 @@ +nextflow_process { + + name "Test Process BCFTOOLS_NORM" + script "../main.nf" + process "BCFTOOLS_NORM" + + tag "modules" + tag "modules_nfcore" + tag "bcftools" + tag "bcftools/norm" + + test("sarscov2 - [ vcf, [] ], fasta") { + + config "./nextflow.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + [] + ] + input[1] = [ + [ id:'genome' 
], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - [ vcf, [] ], fasta - vcf_gz_index") { + + config "./vcf_gz_index.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + [] + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.csi.collect { it.collect { it instanceof Map ? it : file(it).name } } + ).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - [ vcf, [] ], fasta - vcf_gz_index_csi") { + + config "./vcf_gz_index_csi.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + [] + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.csi.collect { it.collect { it instanceof Map ? 
it : file(it).name } } + ).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - [ vcf, [] ], fasta - vcf_gz_index_tbi") { + + config "./vcf_gz_index_tbi.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + [] + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.csi.collect { it.collect { it instanceof Map ? it : file(it).name } } + ).match() }, + { assert process.out.tbi[0][1].endsWith(".tbi") } + ) + } + + } + + test("sarscov2 - [ vcf, tbi ], fasta") { + + config "./nextflow.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - [ vcf, tbi ], fasta - vcf output") { + + config "./nextflow.vcf.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 
'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - [ vcf, tbi ], fasta - vcf_gz output") { + + config "./nextflow.vcf.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.csi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.tbi.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, + process.out.versions + ).match() } + ) + } + + } + + test("sarscov2 - [ vcf, tbi ], fasta - bcf output") { + + config "./nextflow.bcf.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - [ vcf, tbi ], fasta - bcf_gz output") { + + config "./nextflow.bcf_gz.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - [ vcf, [] ], fasta - stub") { + + config "./nextflow.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + [] + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } 
+ ) + } + + } + + test("sarscov2 - [ vcf, tbi ], fasta -stub") { + + config "./nextflow.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - [ vcf, tbi ], fasta - vcf output -stub") { + + config "./nextflow.vcf.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - [ vcf, tbi ], fasta - vcf_gz output - stub") { + + config "./nextflow.vcf.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + 
assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - [ vcf, tbi ], fasta - bcf output - stub") { + + config "./nextflow.bcf.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - [ vcf, tbi ], fasta - bcf_gz output - stub") { + + config "./nextflow.bcf_gz.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - [ vcf, [] ], fasta - vcf_gz_index - stub") { + + config "./vcf_gz_index.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + [] + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + 
then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - [ vcf, [] ], fasta - vcf_gz_index_csi - stub") { + + config "./vcf_gz_index_csi.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + [] + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - [ vcf, [] ], fasta - vcf_gz_index_tbi - stub") { + + config "./vcf_gz_index_tbi.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + [] + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.tbi[0][1].endsWith(".tbi") } + ) + } + + } + + +} \ No newline at end of file diff --git a/modules/nf-core/bcftools/norm/tests/main.nf.test.snap b/modules/nf-core/bcftools/norm/tests/main.nf.test.snap new file mode 100644 index 0000000..3be5211 --- /dev/null +++ b/modules/nf-core/bcftools/norm/tests/main.nf.test.snap @@ -0,0 +1,758 @@ +{ + "sarscov2 - [ vcf, tbi ], fasta - vcf_gz output - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_norm.vcf:md5,63e5adbaf3dd94550e9e3d7935dd28db" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + 
"versions.yml:md5,ff760495922469e56d0fc3372773000d" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_norm.vcf:md5,63e5adbaf3dd94550e9e3d7935dd28db" + ] + ], + "versions": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-04T14:38:42.639095032" + }, + "sarscov2 - [ vcf, [] ], fasta - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_norm.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_norm.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-04T14:38:05.448449893" + }, + "sarscov2 - [ vcf, tbi ], fasta - vcf output": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_norm.vcf:md5,63e5adbaf3dd94550e9e3d7935dd28db" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_norm.vcf:md5,63e5adbaf3dd94550e9e3d7935dd28db" + ] + ], + "versions": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-04T14:37:12.741719961" + }, + "sarscov2 - [ vcf, [] ], fasta - vcf_gz_index - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ], + "csi": [ + [ + { 
+ "id": "test" + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-04T14:39:22.875147941" + }, + "sarscov2 - [ vcf, tbi ], fasta - vcf_gz output": { + "content": [ + [ + [ + { + "id": "test" + }, + "test_norm.vcf:md5,63e5adbaf3dd94550e9e3d7935dd28db" + ] + ], + [ + + ], + [ + + ], + [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T08:15:23.38765384" + }, + "sarscov2 - [ vcf, [] ], fasta": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_norm.vcf.gz:md5,63e5adbaf3dd94550e9e3d7935dd28db" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_norm.vcf.gz:md5,63e5adbaf3dd94550e9e3d7935dd28db" + ] + ], + "versions": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-04T14:36:21.519977754" + }, + "sarscov2 - [ vcf, tbi ], fasta - vcf output -stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_norm.vcf:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_norm.vcf:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-04T14:38:27.8230994" + }, + 
"sarscov2 - [ vcf, tbi ], fasta - bcf_gz output": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_norm.bcf:md5,f35545c26a788b5eb697d9c0490339d9" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_norm.bcf:md5,f35545c26a788b5eb697d9c0490339d9" + ] + ], + "versions": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-04T14:37:53.942403192" + }, + "sarscov2 - [ vcf, [] ], fasta - vcf_gz_index_csi - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ], + "csi": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T13:56:05.3799488" + }, + "sarscov2 - [ vcf, [] ], fasta - vcf_gz_index_tbi": { + "content": [ + [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,63e5adbaf3dd94550e9e3d7935dd28db" + ] + ], + [ + + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T13:53:28.356741947" + }, + "sarscov2 - [ vcf, tbi ], fasta": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_norm.vcf.gz:md5,63e5adbaf3dd94550e9e3d7935dd28db" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ], + "csi": [ + + ], + "tbi": 
[ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_norm.vcf.gz:md5,63e5adbaf3dd94550e9e3d7935dd28db" + ] + ], + "versions": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-04T14:36:58.39445154" + }, + "sarscov2 - [ vcf, tbi ], fasta -stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_norm.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_norm.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-04T14:38:16.259516142" + }, + "sarscov2 - [ vcf, tbi ], fasta - bcf_gz output - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_norm.bcf:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_norm.bcf:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-04T14:39:10.503208929" + }, + "sarscov2 - [ vcf, [] ], fasta - vcf_gz_index": { + "content": [ + [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,63e5adbaf3dd94550e9e3d7935dd28db" + ] + ], + [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi" + ] + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T07:52:58.381931979" + }, + "sarscov2 - [ vcf, tbi ], fasta - bcf output - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, 
+ "test_norm.bcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_norm.bcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-04T14:38:59.121377258" + }, + "sarscov2 - [ vcf, [] ], fasta - vcf_gz_index_tbi - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ], + "csi": [ + + ], + "tbi": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T13:56:16.404380471" + }, + "sarscov2 - [ vcf, [] ], fasta - vcf_gz_index_csi": { + "content": [ + [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,63e5adbaf3dd94550e9e3d7935dd28db" + ] + ], + [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi" + ] + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T13:53:09.808834237" + }, + "sarscov2 - [ vcf, tbi ], fasta - bcf output": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_norm.bcf.gz:md5,638c3c25bdd495c90ecbccb69ee77f07" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + 
"id": "test" + }, + "test_norm.bcf.gz:md5,638c3c25bdd495c90ecbccb69ee77f07" + ] + ], + "versions": [ + "versions.yml:md5,ff760495922469e56d0fc3372773000d" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-04T14:37:42.141945244" + } +} \ No newline at end of file diff --git a/modules/nf-core/bcftools/norm/tests/nextflow.bcf.config b/modules/nf-core/bcftools/norm/tests/nextflow.bcf.config new file mode 100644 index 0000000..b79af86 --- /dev/null +++ b/modules/nf-core/bcftools/norm/tests/nextflow.bcf.config @@ -0,0 +1,4 @@ +process { + ext.args = '-m -any --output-type b --no-version' + ext.prefix = "test_norm" +} diff --git a/modules/nf-core/bcftools/norm/tests/nextflow.bcf_gz.config b/modules/nf-core/bcftools/norm/tests/nextflow.bcf_gz.config new file mode 100644 index 0000000..f36f397 --- /dev/null +++ b/modules/nf-core/bcftools/norm/tests/nextflow.bcf_gz.config @@ -0,0 +1,4 @@ +process { + ext.args = '-m -any --output-type u --no-version' + ext.prefix = "test_norm" +} diff --git a/modules/nf-core/bcftools/norm/tests/nextflow.config b/modules/nf-core/bcftools/norm/tests/nextflow.config new file mode 100644 index 0000000..510803b --- /dev/null +++ b/modules/nf-core/bcftools/norm/tests/nextflow.config @@ -0,0 +1,4 @@ +process { + ext.args = '-m -any --no-version' + ext.prefix = "test_norm" +} diff --git a/modules/nf-core/bcftools/norm/tests/nextflow.vcf.config b/modules/nf-core/bcftools/norm/tests/nextflow.vcf.config new file mode 100644 index 0000000..10bf93e --- /dev/null +++ b/modules/nf-core/bcftools/norm/tests/nextflow.vcf.config @@ -0,0 +1,4 @@ +process { + ext.args = '-m -any --output-type v --no-version' + ext.prefix = "test_norm" +} diff --git a/modules/nf-core/bcftools/norm/tests/nextflow.vcf_gz.config b/modules/nf-core/bcftools/norm/tests/nextflow.vcf_gz.config new file mode 100644 index 0000000..b31dd2d --- /dev/null +++ b/modules/nf-core/bcftools/norm/tests/nextflow.vcf_gz.config @@ -0,0 +1,4 @@ +process 
{ + ext.args = '-m -any --output-type z --no-version' + ext.prefix = "test_norm" +} diff --git a/modules/nf-core/bcftools/norm/tests/tags.yml b/modules/nf-core/bcftools/norm/tests/tags.yml new file mode 100644 index 0000000..f6f5e35 --- /dev/null +++ b/modules/nf-core/bcftools/norm/tests/tags.yml @@ -0,0 +1,2 @@ +bcftools/norm: + - "modules/nf-core/bcftools/norm/**" diff --git a/modules/nf-core/bcftools/norm/tests/vcf_gz_index.config b/modules/nf-core/bcftools/norm/tests/vcf_gz_index.config new file mode 100644 index 0000000..7dd696e --- /dev/null +++ b/modules/nf-core/bcftools/norm/tests/vcf_gz_index.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args = "--output-type z --write-index --no-version" +} diff --git a/modules/nf-core/bcftools/norm/tests/vcf_gz_index_csi.config b/modules/nf-core/bcftools/norm/tests/vcf_gz_index_csi.config new file mode 100644 index 0000000..aebffb6 --- /dev/null +++ b/modules/nf-core/bcftools/norm/tests/vcf_gz_index_csi.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args = "--output-type z --write-index=csi --no-version" +} diff --git a/modules/nf-core/bcftools/norm/tests/vcf_gz_index_tbi.config b/modules/nf-core/bcftools/norm/tests/vcf_gz_index_tbi.config new file mode 100644 index 0000000..b192ae7 --- /dev/null +++ b/modules/nf-core/bcftools/norm/tests/vcf_gz_index_tbi.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args = "--output-type z --write-index=tbi --no-version" +} diff --git a/modules/nf-core/bcftools/query/environment.yml b/modules/nf-core/bcftools/query/environment.yml deleted file mode 100644 index d8c4f4e..0000000 --- a/modules/nf-core/bcftools/query/environment.yml +++ /dev/null @@ -1,7 +0,0 @@ -name: bcftools_query -channels: - conda-forge - bioconda - defaults -dependencies: - bioconda::bcftools=1.20 diff --git a/modules/nf-core/bcftools/query/main.nf b/modules/nf-core/bcftools/query/main.nf deleted file mode 100644 index 
58019f4..0000000 --- a/modules/nf-core/bcftools/query/main.nf +++ /dev/null @@ -1,56 +0,0 @@ -process BCFTOOLS_QUERY { - tag "$meta.id" - label 'process_single' - - conda "${moduleDir}/environment.yml" - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/bcftools:1.20--h8b25389_0': - 'biocontainers/bcftools:1.20--h8b25389_0' }" - - input: - tuple val(meta), path(vcf), path(tbi) - path regions - path targets - path samples - - output: - tuple val(meta), path("*.${suffix}"), emit: output - path "versions.yml" , emit: versions - - when: - task.ext.when == null || task.ext.when - - script: - def args = task.ext.args ?: '' - def prefix = task.ext.prefix ?: "${meta.id}" - suffix = task.ext.suffix ?: "txt" - def regions_file = regions ? "--regions-file ${regions}" : "" - def targets_file = targets ? "--targets-file ${targets}" : "" - def samples_file = samples ? "--samples-file ${samples}" : "" - """ - bcftools query \\ - $regions_file \\ - $targets_file \\ - $samples_file \\ - $args \\ - $vcf \\ - > ${prefix}.${suffix} - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//') - END_VERSIONS - """ - - stub: - def prefix = task.ext.prefix ?: "${meta.id}" - suffix = task.ext.suffix ?: "txt" - """ - touch ${prefix}.${suffix} \\ - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//') - END_VERSIONS - """ -} diff --git a/modules/nf-core/bcftools/query/meta.yml b/modules/nf-core/bcftools/query/meta.yml deleted file mode 100644 index 303ef61..0000000 --- a/modules/nf-core/bcftools/query/meta.yml +++ /dev/null @@ -1,63 +0,0 @@ -name: bcftools_query -description: Extracts fields from VCF or BCF files and outputs them in user-defined format. 
-keywords: - - query - - variant calling - - bcftools - - VCF -tools: - - query: - description: | - Extracts fields from VCF or BCF files and outputs them in user-defined format. - homepage: http://samtools.github.io/bcftools/bcftools.html - documentation: http://www.htslib.org/doc/bcftools.html - doi: 10.1093/bioinformatics/btp352 - licence: ["MIT"] -input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: | - The vcf file to be qeuried. - pattern: "*.{vcf.gz, vcf}" - - tbi: - type: file - description: | - The tab index for the VCF file to be inspected. - pattern: "*.tbi" - - regions: - type: file - description: | - Optionally, restrict the operation to regions listed in this file. - - targets: - type: file - description: | - Optionally, restrict the operation to regions listed in this file (doesn't rely upon index files) - - samples: - type: file - description: | - Optional, file of sample names to be included or excluded. - e.g. 'file.tsv' -output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - output: - type: file - description: BCFTools query output file - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" -authors: - - "@abhi18av" - - "@drpatelh" -maintainers: - - "@abhi18av" - - "@drpatelh" diff --git a/modules/nf-core/bcftools/query/tests/main.nf.test b/modules/nf-core/bcftools/query/tests/main.nf.test deleted file mode 100644 index e9ea5a9..0000000 --- a/modules/nf-core/bcftools/query/tests/main.nf.test +++ /dev/null @@ -1,101 +0,0 @@ -nextflow_process { - - name "Test Process BCFTOOLS_QUERY" - script "../main.nf" - process "BCFTOOLS_QUERY" - - tag "modules" - tag "modules_nfcore" - tag "bcftools" - tag "bcftools/query" - - config "./nextflow.config" - - test("sarscov2 - [vcf, tbi], [], [], []") { - - when { - process { - """ - input[0] = [ - [ id:'out' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_vcf_gz_tbi'], checkIfExists: true) - ] - input[1] = [] - input[2] = [] - input[3] = [] - """ - } - } - - then { - assertAll( - { assert process.success }, - { assert snapshot( - process.out.output, - process.out.versions - ).match() } - ) - } - - } - - test("sarscov2 - [vcf, tbi], vcf, tsv, []") { - - when { - process { - """ - input[0] = [ - [ id:'out' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_vcf_gz_tbi'], checkIfExists: true) - ] - input[1] = file(params.test_data['sarscov2']['illumina']['test3_vcf_gz'], checkIfExists: true) - input[2] = file(params.test_data['sarscov2']['illumina']['test2_vcf_targets_tsv_gz'], checkIfExists: true) - input[3] = [] - """ - } - } - - then { - assertAll( - { assert process.success }, - { assert snapshot( - process.out.output, - process.out.versions - ).match() } - ) - } - - } - - test("sarscov2 - [vcf, tbi], [], [], [] - 
stub") { - - when { - process { - """ - input[0] = [ - [ id:'out' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_vcf_gz_tbi'], checkIfExists: true) - ] - input[1] = [] - input[2] = [] - input[3] = [] - """ - } - } - - then { - assertAll( - { assert process.success }, - { assert snapshot( - file(process.out.output[0][1]).name, - process.out.versions - ).match() } - ) - } - - } - -} diff --git a/modules/nf-core/bcftools/query/tests/main.nf.test.snap b/modules/nf-core/bcftools/query/tests/main.nf.test.snap deleted file mode 100644 index 3ead1f2..0000000 --- a/modules/nf-core/bcftools/query/tests/main.nf.test.snap +++ /dev/null @@ -1,55 +0,0 @@ -{ - "sarscov2 - [vcf, tbi], vcf, tsv, []": { - "content": [ - [ - [ - { - "id": "out" - }, - "out.txt:md5,75a6bd0084e2e1838cf7baba11b99d19" - ] - ], - [ - "versions.yml:md5,3d93ea9cd5d314743254618b49e4bd16" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-05-31T15:15:44.916249758" - }, - "sarscov2 - [vcf, tbi], [], [], [] - stub": { - "content": [ - "out.txt", - [ - "versions.yml:md5,3d93ea9cd5d314743254618b49e4bd16" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-05-31T15:15:49.932359271" - }, - "sarscov2 - [vcf, tbi], [], [], []": { - "content": [ - [ - [ - { - "id": "out" - }, - "out.txt:md5,87a2ab194e1ee3219b44e58429ec3307" - ] - ], - [ - "versions.yml:md5,3d93ea9cd5d314743254618b49e4bd16" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-05-31T15:15:39.930697926" - } -} \ No newline at end of file diff --git a/modules/nf-core/bcftools/query/tests/nextflow.config b/modules/nf-core/bcftools/query/tests/nextflow.config deleted file mode 100644 index da81c2a..0000000 --- a/modules/nf-core/bcftools/query/tests/nextflow.config +++ /dev/null @@ -1,3 +0,0 @@ -process { - ext.args = "-f '%CHROM 
%POS %REF %ALT[%SAMPLE=%GT]'" -} \ No newline at end of file diff --git a/modules/nf-core/bcftools/query/tests/tags.yml b/modules/nf-core/bcftools/query/tests/tags.yml deleted file mode 100644 index fb9455c..0000000 --- a/modules/nf-core/bcftools/query/tests/tags.yml +++ /dev/null @@ -1,2 +0,0 @@ -bcftools/query: - - "modules/nf-core/bcftools/query/**" diff --git a/modules/nf-core/bcftools/reheader/environment.yml b/modules/nf-core/bcftools/reheader/environment.yml index aab0dc9..5c00b11 100644 --- a/modules/nf-core/bcftools/reheader/environment.yml +++ b/modules/nf-core/bcftools/reheader/environment.yml @@ -1,7 +1,5 @@ -name: bcftools_reheader channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::bcftools=1.18 + - bioconda::bcftools=1.20 diff --git a/modules/nf-core/bcftools/reheader/main.nf b/modules/nf-core/bcftools/reheader/main.nf index 8252716..9cf6d0d 100644 --- a/modules/nf-core/bcftools/reheader/main.nf +++ b/modules/nf-core/bcftools/reheader/main.nf @@ -4,8 +4,8 @@ process BCFTOOLS_REHEADER { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/bcftools:1.18--h8b25389_0': - 'biocontainers/bcftools:1.18--h8b25389_0' }" + 'https://depot.galaxyproject.org/singularity/bcftools:1.20--h8b25389_0': + 'biocontainers/bcftools:1.20--h8b25389_0' }" input: tuple val(meta), path(vcf), path(header), path(samples) @@ -13,6 +13,7 @@ process BCFTOOLS_REHEADER { output: tuple val(meta), path("*.{vcf,vcf.gz,bcf,bcf.gz}"), emit: vcf + tuple val(meta), path("*.{csi,tbi}") , emit: index, optional: true path "versions.yml" , emit: versions when: @@ -59,8 +60,16 @@ process BCFTOOLS_REHEADER { args2.contains("--output-type z") || args2.contains("-Oz") ? "vcf.gz" : args2.contains("--output-type v") || args2.contains("-Ov") ? "vcf" : "vcf" + def index = args2.contains("--write-index=tbi") || args2.contains("-W=tbi") ? 
"tbi" : + args2.contains("--write-index=csi") || args2.contains("-W=csi") ? "csi" : + args2.contains("--write-index") || args2.contains("-W") ? "csi" : + "" + def create_cmd = extension.endsWith(".gz") ? "echo '' | gzip >" : "touch" + def create_index = extension.endsWith(".gz") && index.matches("csi|tbi") ? "touch ${prefix}.${extension}.${index}" : "" + """ - touch ${prefix}.${extension} + ${create_cmd} ${prefix}.${extension} + ${create_index} cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/bcftools/reheader/meta.yml b/modules/nf-core/bcftools/reheader/meta.yml index 690d4ea..47e5344 100644 --- a/modules/nf-core/bcftools/reheader/meta.yml +++ b/modules/nf-core/bcftools/reheader/meta.yml @@ -12,47 +12,60 @@ tools: documentation: http://samtools.github.io/bcftools/bcftools.html#reheader doi: 10.1093/gigascience/giab008 licence: ["MIT"] + identifier: biotools:bcftools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: VCF/BCF file - pattern: "*.{vcf.gz,vcf,bcf}" - - header: - type: file - description: New header to add to the VCF - pattern: "*.{header.txt}" - - samples: - type: file - description: File containing sample names to update (one sample per line) - pattern: "*.{samples.txt}" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fai: - type: file - description: Fasta index to update header sequences with - pattern: "*.{fai}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - vcf: + type: file + description: VCF/BCF file + pattern: "*.{vcf.gz,vcf,bcf}" + - header: + type: file + description: New header to add to the VCF + pattern: "*.{header.txt}" + - samples: + type: file + description: File containing sample names to update (one sample per line) + pattern: "*.{samples.txt}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fai: + type: file + description: Fasta index to update header sequences with + pattern: "*.{fai}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - vcf: - type: file - description: VCF with updated header, bgzipped per default - pattern: "*.{vcf,vcf.gz,bcf,bcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.{vcf,vcf.gz,bcf,bcf.gz}": + type: file + description: VCF with updated header, bgzipped per default + pattern: "*.{vcf,vcf.gz,bcf,bcf.gz}" + - index: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.{csi,tbi}": + type: file + description: Index of VCF with updated header + pattern: "*.{csi,tbi}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@bjohnnyd" - "@jemten" diff --git a/modules/nf-core/bcftools/reheader/tests/main.nf.test b/modules/nf-core/bcftools/reheader/tests/main.nf.test index f3200cb..96c1b7b 100644 --- a/modules/nf-core/bcftools/reheader/tests/main.nf.test +++ b/modules/nf-core/bcftools/reheader/tests/main.nf.test @@ -17,13 +17,13 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:false ], - file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), [], [] ] - input[1] = [ + input[1] = [ [ id:'genome' ], // meta map - file(params.test_data['sarscov2']['genome']['genome_fasta_fai'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true) ] """ } @@ -47,13 +47,13 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:false ], - file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), [], [] ] - input[1] = [ + input[1] = [ [ id:'genome' ], // meta map - file(params.test_data['sarscov2']['genome']['genome_fasta_fai'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true) ] """ } @@ -68,6 +68,111 @@ nextflow_process { } + test("sarscov2 - [vcf, [], []], fai - vcf.gz output - index") { + + config "./vcf_gz_index.config" + when { + + process { + """ + input[0] = [ + [ id:'test', single_end:false ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', 
checkIfExists: true), + [], + [] + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.index.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.index[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - [vcf, [], []], fai - vcf.gz output - csi index") { + + config "./vcf_gz_index_csi.config" + when { + + process { + """ + input[0] = [ + [ id:'test', single_end:false ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + [], + [] + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.index.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.index[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - [vcf, [], []], fai - vcf.gz output - tbi index") { + + config "./vcf_gz_index_tbi.config" + when { + + process { + """ + input[0] = [ + [ id:'test', single_end:false ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + [], + [] + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.index.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.index[0][1].endsWith(".tbi") } + ) + } + + } + test("sarscov2 - [vcf, [], []], fai - bcf output") { config "./bcf.config" @@ -77,13 +182,13 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:false ], - file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), [], [] ] - input[1] = [ + input[1] = [ [ id:'genome' ], // meta map - file(params.test_data['sarscov2']['genome']['genome_fasta_fai'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true) ] """ } @@ -107,11 +212,11 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:false ], - file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_vcf'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true), [] ] - input[1] = [ + input[1] = [ [ id:'genome' ], // meta map [] ] @@ -137,15 +242,15 @@ nextflow_process { """ ch_no_samples = Channel.of([ [ id:'test', single_end:false ], - file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), [] ]) ch_samples = Channel.of(["samples.txt", "new_name"]) .collectFile(newLine:true) input[0] = ch_no_samples.combine(ch_samples) - input[1] = [ + input[1] = [ [ id:'genome' ], // meta map - file(params.test_data['sarscov2']['genome']['genome_fasta_fai'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: 
true) ] """ } @@ -170,13 +275,13 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:false ], - file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), [], [] ] - input[1] = [ + input[1] = [ [ id:'genome' ], // meta map - file(params.test_data['sarscov2']['genome']['genome_fasta_fai'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true) ] """ } @@ -193,5 +298,97 @@ nextflow_process { } } + test("sarscov2 - [vcf, [], []], fai - vcf.gz output - index -stub") { + + options "-stub" + config "./vcf_gz_index.config" + when { + + process { + """ + input[0] = [ + [ id:'test', single_end:false ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + [], + [] + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - [vcf, [], []], fai - vcf.gz output - csi index -stub") { + + options "-stub" + config "./vcf_gz_index_csi.config" + when { + + process { + """ + input[0] = [ + [ id:'test', single_end:false ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + [], + [] + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - [vcf, [], []], fai - vcf.gz output - tbi index -stub") { + + options "-stub" + config "./vcf_gz_index_tbi.config" + when 
{ + + process { + """ + input[0] = [ + [ id:'test', single_end:false ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + [], + [] + ] + input[1] = [ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } } diff --git a/modules/nf-core/bcftools/reheader/tests/main.nf.test.snap b/modules/nf-core/bcftools/reheader/tests/main.nf.test.snap index 112736a..87a3654 100644 --- a/modules/nf-core/bcftools/reheader/tests/main.nf.test.snap +++ b/modules/nf-core/bcftools/reheader/tests/main.nf.test.snap @@ -1,4 +1,140 @@ { + "sarscov2 - [vcf, [], []], fai - vcf.gz output - tbi index": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz:md5,8e722884ffb75155212a3fc053918766" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.tbi" + ] + ], + [ + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-09-03T10:09:05.955833763" + }, + "sarscov2 - [vcf, [], []], fai - vcf.gz output - index -stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" + ], + "index": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "vcf": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + 
"versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-09-03T09:52:41.444952182" + }, + "sarscov2 - [vcf, [], []], fai - vcf.gz output - tbi index -stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" + ], + "index": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "vcf": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-09-03T09:53:04.314827944" + }, "sarscov2 - [vcf, [], []], fai - vcf output": { "content": [ { @@ -12,7 +148,13 @@ ] ], "1": [ - "versions.yml:md5,fbf8ac8da771b6295a47392003f983ce" + + ], + "2": [ + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" + ], + "index": [ + ], "vcf": [ [ @@ -24,11 +166,15 @@ ] ], "versions": [ - "versions.yml:md5,fbf8ac8da771b6295a47392003f983ce" + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" ] } ], - "timestamp": "2023-11-29T13:05:44.058376693" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-09-03T09:50:41.983008108" }, "sarscov2 - [vcf, [], []], fai - bcf output": { "content": [ @@ -39,11 +185,17 @@ "id": "test", "single_end": false }, - "tested.bcf.gz:md5,c31d9afd8614832c2a46d9a55682c97a" + "tested.bcf.gz:md5,c8a304c8d2892039201154153c8cd536" ] ], "1": [ - "versions.yml:md5,fbf8ac8da771b6295a47392003f983ce" + + ], + "2": [ + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" + ], + 
"index": [ + ], "vcf": [ [ @@ -51,15 +203,19 @@ "id": "test", "single_end": false }, - "tested.bcf.gz:md5,c31d9afd8614832c2a46d9a55682c97a" + "tested.bcf.gz:md5,c8a304c8d2892039201154153c8cd536" ] ], "versions": [ - "versions.yml:md5,fbf8ac8da771b6295a47392003f983ce" + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" ] } ], - "timestamp": "2023-11-29T13:06:03.793372514" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-09-03T09:51:43.072513252" }, "sarscov2 - [vcf, [], []], fai - vcf.gz output": { "content": [ @@ -70,11 +226,17 @@ "id": "test", "single_end": false }, - "tested.vcf.gz:md5,a1e45fe6d2b386fc2611766e5d2937ee" + "tested.vcf.gz:md5,8e722884ffb75155212a3fc053918766" ] ], "1": [ - "versions.yml:md5,fbf8ac8da771b6295a47392003f983ce" + + ], + "2": [ + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" + ], + "index": [ + ], "vcf": [ [ @@ -82,24 +244,145 @@ "id": "test", "single_end": false }, - "tested.vcf.gz:md5,a1e45fe6d2b386fc2611766e5d2937ee" + "tested.vcf.gz:md5,8e722884ffb75155212a3fc053918766" ] ], "versions": [ - "versions.yml:md5,fbf8ac8da771b6295a47392003f983ce" + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" ] } ], - "timestamp": "2023-11-29T13:05:53.954090441" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-09-03T09:50:53.055630152" + }, + "sarscov2 - [vcf, [], []], fai - vcf.gz output - index": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz:md5,8e722884ffb75155212a3fc053918766" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.csi" + ] + ], + [ + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-09-03T10:08:37.999924355" + }, + "sarscov2 - [vcf, [], []], fai - vcf.gz output - csi index -stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + 
"test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" + ], + "index": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "vcf": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-09-03T09:52:52.512269206" }, "sarscov2 - [vcf, [], []], fai - stub": { "content": [ "tested.vcf", [ - "versions.yml:md5,fbf8ac8da771b6295a47392003f983ce" + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" ] ], - "timestamp": "2023-11-29T13:06:33.549685303" + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-05-31T15:16:36.337112514" + }, + "sarscov2 - [vcf, [], []], fai - vcf.gz output - csi index": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz:md5,8e722884ffb75155212a3fc053918766" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.csi" + ] + ], + [ + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-09-03T10:08:55.434831174" }, "sarscov2 - [vcf, [], samples], fai": { "content": [ @@ -114,7 +397,13 @@ ] ], "1": [ - "versions.yml:md5,fbf8ac8da771b6295a47392003f983ce" + + ], + "2": [ + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" + ], + "index": [ + ], "vcf": [ [ @@ -126,11 +415,15 @@ ] ], "versions": [ - "versions.yml:md5,fbf8ac8da771b6295a47392003f983ce" + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" ] } ], - "timestamp": "2023-11-29T13:06:23.474745156" + "meta": { + 
"nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-09-03T09:52:12.216002665" }, "sarscov2 - [vcf, header, []], []": { "content": [ @@ -145,7 +438,13 @@ ] ], "1": [ - "versions.yml:md5,fbf8ac8da771b6295a47392003f983ce" + + ], + "2": [ + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" + ], + "index": [ + ], "vcf": [ [ @@ -157,10 +456,14 @@ ] ], "versions": [ - "versions.yml:md5,fbf8ac8da771b6295a47392003f983ce" + "versions.yml:md5,486e3d4ebc1dbf5c0a4dfaebae12ea34" ] } ], - "timestamp": "2023-11-29T13:06:13.841648691" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-09-03T09:51:54.062386022" } } \ No newline at end of file diff --git a/modules/nf-core/bcftools/reheader/tests/vcf_gz_index.config b/modules/nf-core/bcftools/reheader/tests/vcf_gz_index.config new file mode 100644 index 0000000..1e050ec --- /dev/null +++ b/modules/nf-core/bcftools/reheader/tests/vcf_gz_index.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args2 = "--output-type z --write-index --no-version" +} diff --git a/modules/nf-core/bcftools/reheader/tests/vcf_gz_index_csi.config b/modules/nf-core/bcftools/reheader/tests/vcf_gz_index_csi.config new file mode 100644 index 0000000..536e4b4 --- /dev/null +++ b/modules/nf-core/bcftools/reheader/tests/vcf_gz_index_csi.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args2 = "--output-type z --write-index=csi --no-version" +} diff --git a/modules/nf-core/bcftools/reheader/tests/vcf_gz_index_tbi.config b/modules/nf-core/bcftools/reheader/tests/vcf_gz_index_tbi.config new file mode 100644 index 0000000..91a80db --- /dev/null +++ b/modules/nf-core/bcftools/reheader/tests/vcf_gz_index_tbi.config @@ -0,0 +1,5 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args2 = "--output-type z --write-index=tbi --no-version" + +} diff --git a/modules/nf-core/bcftools/sort/environment.yml b/modules/nf-core/bcftools/sort/environment.yml index 
89cf911..5c00b11 100644 --- a/modules/nf-core/bcftools/sort/environment.yml +++ b/modules/nf-core/bcftools/sort/environment.yml @@ -1,7 +1,5 @@ -name: bcftools_sort channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::bcftools=1.18 + - bioconda::bcftools=1.20 diff --git a/modules/nf-core/bcftools/sort/main.nf b/modules/nf-core/bcftools/sort/main.nf index c982944..7d4c9b8 100644 --- a/modules/nf-core/bcftools/sort/main.nf +++ b/modules/nf-core/bcftools/sort/main.nf @@ -4,15 +4,17 @@ process BCFTOOLS_SORT { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/bcftools:1.18--h8b25389_0': - 'biocontainers/bcftools:1.18--h8b25389_0' }" + 'https://depot.galaxyproject.org/singularity/bcftools:1.20--h8b25389_0': + 'biocontainers/bcftools:1.20--h8b25389_0' }" input: tuple val(meta), path(vcf) output: - tuple val(meta), path("*.{vcf,vcf.gz,bcf,bcf.gz}") , emit: vcf - path "versions.yml" , emit: versions + tuple val(meta), path("*.{vcf,vcf.gz,bcf,bcf.gz}"), emit: vcf + tuple val(meta), path("*.tbi") , emit: tbi, optional: true + tuple val(meta), path("*.csi") , emit: csi, optional: true + path "versions.yml" , emit: versions when: task.ext.when == null || task.ext.when @@ -49,9 +51,16 @@ process BCFTOOLS_SORT { args.contains("--output-type z") || args.contains("-Oz") ? "vcf.gz" : args.contains("--output-type v") || args.contains("-Ov") ? "vcf" : "vcf" + def index = args.contains("--write-index=tbi") || args.contains("-W=tbi") ? "tbi" : + args.contains("--write-index=csi") || args.contains("-W=csi") ? "csi" : + args.contains("--write-index") || args.contains("-W") ? "csi" : + "" + def create_cmd = extension.endsWith(".gz") ? "echo '' | gzip >" : "touch" + def create_index = extension.endsWith(".gz") && index.matches("csi|tbi") ? 
"touch ${prefix}.${extension}.${index}" : "" """ - touch ${prefix}.${extension} + ${create_cmd} ${prefix}.${extension} + ${create_index} cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/bcftools/sort/meta.yml b/modules/nf-core/bcftools/sort/meta.yml index 84747c6..f7a6eff 100644 --- a/modules/nf-core/bcftools/sort/meta.yml +++ b/modules/nf-core/bcftools/sort/meta.yml @@ -12,30 +12,53 @@ tools: tool_dev_url: https://github.com/samtools/bcftools doi: "10.1093/bioinformatics/btp352" licence: ["MIT"] + identifier: biotools:bcftools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: The VCF/BCF file to be sorted - pattern: "*.{vcf.gz,vcf,bcf}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: The VCF/BCF file to be sorted + pattern: "*.{vcf.gz,vcf,bcf}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - vcf: - type: file - description: Sorted VCF file - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.{vcf,vcf.gz,bcf,bcf.gz}": + type: file + description: Sorted VCF file + pattern: "*.{vcf.gz}" + - tbi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tbi": + type: file + description: Alternative VCF file index + pattern: "*.tbi" + - csi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.csi": + type: file + description: Default VCF file index + pattern: "*.csi" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@Gwennid" maintainers: diff --git a/modules/nf-core/bcftools/sort/tests/main.nf.test b/modules/nf-core/bcftools/sort/tests/main.nf.test index fec59cf..b9bdd76 100644 --- a/modules/nf-core/bcftools/sort/tests/main.nf.test +++ b/modules/nf-core/bcftools/sort/tests/main.nf.test @@ -9,13 +9,125 @@ nextflow_process { tag "bcftools" tag "bcftools/sort" - test("SarsCov2 VCF") { + test("sarscov2 - vcf") { when { process { """ input[0] = [ [ id:'test' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_vcf'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match("vcf") } + ) + } + + } + + test("sarscov2 - vcf_gz_index") { + + config "./vcf_gz_index.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.csi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.tbi.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - vcf_gz_index_csi") { + + config "./vcf_gz_index_csi.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.csi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.tbi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - vcf_gz_index_tbi") { + + config "./vcf_gz_index_tbi.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.csi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.tbi.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.tbi[0][1].endsWith(".tbi") } + ) + } + + } + + test("sarscov2 - vcf - stub") { + options "-stub" + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) ] """ } @@ -29,4 +141,82 @@ nextflow_process { } } -} + + test("sarscov2 - vcf_gz_index - stub") { + + config "./vcf_gz_index.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - vcf_gz_index_csi - stub") { + + config "./vcf_gz_index_csi.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - vcf_gz_index_tbi - stub") { + + config "./vcf_gz_index_tbi.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.tbi[0][1].endsWith(".tbi") } + ) + } + + } +} \ No newline at end of file diff --git a/modules/nf-core/bcftools/sort/tests/main.nf.test.snap b/modules/nf-core/bcftools/sort/tests/main.nf.test.snap index 7f59955..f38272c 100644 --- 
a/modules/nf-core/bcftools/sort/tests/main.nf.test.snap +++ b/modules/nf-core/bcftools/sort/tests/main.nf.test.snap @@ -1,5 +1,60 @@ { - "SarsCov2 VCF": { + "sarscov2 - vcf_gz_index_tbi - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ], + "csi": [ + + ], + "tbi": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:06:05.201680777" + }, + "vcf": { "content": [ { "0": [ @@ -11,7 +66,19 @@ ] ], "1": [ - "versions.yml:md5,622bd32d4ff0fac3360cd534ae0f0168" + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ], + "csi": [ + + ], + "tbi": [ + ], "vcf": [ [ @@ -22,14 +89,262 @@ ] ], "versions": [ - "versions.yml:md5,622bd32d4ff0fac3360cd534ae0f0168" + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" ] } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.04.2" + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:04:43.889971134" + }, + "sarscov2 - vcf_gz_index": { + "content": [ + [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,8e722884ffb75155212a3fc053918766" + ] + ], + [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi" + ] + ], + [ + + ], + [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:04:55.385964497" + }, + "sarscov2 - vcf_gz_index_csi": { + "content": [ + [ + [ + { + "id": "test" + }, + 
"test_vcf.vcf.gz:md5,8e722884ffb75155212a3fc053918766" + ] + ], + [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi" + ] + ], + [ + + ], + [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:05:06.662818922" + }, + "sarscov2 - vcf_gz_index - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ], + "csi": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:05:40.012912381" + }, + "sarscov2 - vcf_gz_index_csi - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ], + "csi": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:05:52.405673587" + }, + "sarscov2 - vcf - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + 
"test.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:05:29.117946461" + }, + "sarscov2 - vcf_gz_index_tbi": { + "content": [ + [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,8e722884ffb75155212a3fc053918766" + ] + ], + [ + + ], + [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.tbi" + ] + ], + [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" }, - "timestamp": "2024-03-18T12:50:10.340362246" + "timestamp": "2024-06-05T12:05:17.217274984" } } \ No newline at end of file diff --git a/modules/nf-core/bcftools/sort/tests/vcf_gz_index.config b/modules/nf-core/bcftools/sort/tests/vcf_gz_index.config new file mode 100644 index 0000000..aacd134 --- /dev/null +++ b/modules/nf-core/bcftools/sort/tests/vcf_gz_index.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args = "--output-type z --write-index" +} diff --git a/modules/nf-core/bcftools/sort/tests/vcf_gz_index_csi.config b/modules/nf-core/bcftools/sort/tests/vcf_gz_index_csi.config new file mode 100644 index 0000000..640eb0b --- /dev/null +++ b/modules/nf-core/bcftools/sort/tests/vcf_gz_index_csi.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args = "--output-type z --write-index=csi" +} diff --git a/modules/nf-core/bcftools/sort/tests/vcf_gz_index_tbi.config b/modules/nf-core/bcftools/sort/tests/vcf_gz_index_tbi.config new file mode 100644 index 0000000..589a50c --- /dev/null +++ b/modules/nf-core/bcftools/sort/tests/vcf_gz_index_tbi.config @@ -0,0 +1,4 @@ +process { + 
ext.prefix = { "${meta.id}_vcf" } + ext.args = "--output-type z --write-index=tbi" +} diff --git a/modules/nf-core/bcftools/view/environment.yml b/modules/nf-core/bcftools/view/environment.yml index 8937c6d..5c00b11 100644 --- a/modules/nf-core/bcftools/view/environment.yml +++ b/modules/nf-core/bcftools/view/environment.yml @@ -1,7 +1,5 @@ -name: bcftools_view channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::bcftools=1.18 + - bioconda::bcftools=1.20 diff --git a/modules/nf-core/bcftools/view/main.nf b/modules/nf-core/bcftools/view/main.nf index 5237adc..7fe4303 100644 --- a/modules/nf-core/bcftools/view/main.nf +++ b/modules/nf-core/bcftools/view/main.nf @@ -4,8 +4,8 @@ process BCFTOOLS_VIEW { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/bcftools:1.18--h8b25389_0': - 'biocontainers/bcftools:1.18--h8b25389_0' }" + 'https://depot.galaxyproject.org/singularity/bcftools:1.20--h8b25389_0': + 'biocontainers/bcftools:1.20--h8b25389_0' }" input: tuple val(meta), path(vcf), path(index) @@ -15,6 +15,8 @@ process BCFTOOLS_VIEW { output: tuple val(meta), path("*.{vcf,vcf.gz,bcf,bcf.gz}"), emit: vcf + tuple val(meta), path("*.tbi") , emit: tbi, optional: true + tuple val(meta), path("*.csi") , emit: csi, optional: true path "versions.yml" , emit: versions when: @@ -55,8 +57,16 @@ process BCFTOOLS_VIEW { args.contains("--output-type z") || args.contains("-Oz") ? "vcf.gz" : args.contains("--output-type v") || args.contains("-Ov") ? "vcf" : "vcf" + def index = args.contains("--write-index=tbi") || args.contains("-W=tbi") ? "tbi" : + args.contains("--write-index=csi") || args.contains("-W=csi") ? "csi" : + args.contains("--write-index") || args.contains("-W") ? "csi" : + "" + def create_cmd = extension.endsWith(".gz") ? 
"echo '' | gzip >" : "touch" + def create_index = extension.endsWith(".gz") && index.matches("csi|tbi") ? "touch ${prefix}.${extension}.${index}" : "" + """ - touch ${prefix}.${extension} + ${create_cmd} ${prefix}.${extension} + ${create_index} cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/bcftools/view/meta.yml b/modules/nf-core/bcftools/view/meta.yml index 6baa34a..aa7785f 100644 --- a/modules/nf-core/bcftools/view/meta.yml +++ b/modules/nf-core/bcftools/view/meta.yml @@ -1,5 +1,6 @@ name: bcftools_view -description: View, subset and filter VCF or BCF files by position and filtering expression. Convert between VCF and BCF +description: View, subset and filter VCF or BCF files by position and filtering expression. + Convert between VCF and BCF keywords: - variant calling - view @@ -13,51 +14,74 @@ tools: documentation: http://www.htslib.org/doc/bcftools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:bcftools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: | - The vcf file to be inspected. - e.g. 'file.vcf' - - index: - type: file - description: | - The tab index for the VCF file to be inspected. - e.g. 'file.tbi' - - regions: - type: file - description: | - Optionally, restrict the operation to regions listed in this file. - e.g. 'file.vcf' - - targets: - type: file - description: | - Optionally, restrict the operation to regions listed in this file (doesn't rely upon index files) - e.g. 'file.vcf' - - samples: - type: file - description: | - Optional, file of sample names to be included or excluded. - e.g. 'file.tsv' + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: | + The vcf file to be inspected. + e.g. 
'file.vcf' + - index: + type: file + description: | + The tab index for the VCF file to be inspected. + e.g. 'file.tbi' + - - regions: + type: file + description: | + Optionally, restrict the operation to regions listed in this file. + e.g. 'file.vcf' + - - targets: + type: file + description: | + Optionally, restrict the operation to regions listed in this file (doesn't rely upon index files) + e.g. 'file.vcf' + - - samples: + type: file + description: | + Optional, file of sample names to be included or excluded. + e.g. 'file.tsv' output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - vcf: - type: file - description: VCF normalized output file - pattern: "*.{vcf,vcf.gz,bcf,bcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.{vcf,vcf.gz,bcf,bcf.gz}": + type: file + description: VCF normalized output file + pattern: "*.{vcf,vcf.gz,bcf,bcf.gz}" + - tbi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tbi": + type: file + description: Alternative VCF file index + pattern: "*.tbi" + - csi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.csi": + type: file + description: Default VCF file index + pattern: "*.csi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@abhi18av" maintainers: diff --git a/modules/nf-core/bcftools/view/tests/main.nf.test b/modules/nf-core/bcftools/view/tests/main.nf.test index c285674..1e60c50 100644 --- a/modules/nf-core/bcftools/view/tests/main.nf.test +++ b/modules/nf-core/bcftools/view/tests/main.nf.test @@ -9,17 +9,17 @@ nextflow_process { tag "bcftools" tag "bcftools/view" - config "./nextflow.config" - test("sarscov2 - [vcf, tbi], [], [], []") { + config "./nextflow.config" + when { process { """ input[0] = [ [ id:'out', single_end:false ], // meta map - file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_vcf_gz_tbi'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) ] input[1] = [] input[2] = [] @@ -40,18 +40,122 @@ nextflow_process { } + test("sarscov2 - [vcf, tbi], [], [], [] - vcf_gz_index") { + + config "./vcf_gz_index.config" + + when { + process { + """ + input[0] = [ + [ id:'out', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [] + input[2] = [] + input[3] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.csi.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, + process.out.tbi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - [vcf, tbi], [], [], [] - vcf_gz_index_csi") { + + config "./vcf_gz_index_csi.config" + + when { + process { + """ + input[0] = [ + [ id:'out', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [] + input[2] = [] + input[3] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.csi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.tbi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - [vcf, tbi], [], [], [] - vcf_gz_index_tbi") { + + config "./vcf_gz_index_tbi.config" + + when { + process { + """ + input[0] = [ + [ id:'out', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [] + input[2] = [] + input[3] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.csi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.tbi.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.tbi[0][1].endsWith(".tbi") } + ) + } + + } + test("sarscov2 - [vcf, tbi], vcf, tsv, []") { + config "./nextflow.config" + when { process { """ input[0] = [ [ id:'out', single_end:false ], // meta map - file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_vcf_gz_tbi'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) ] - input[1] = file(params.test_data['sarscov2']['illumina']['test3_vcf_gz'], checkIfExists: true) - input[2] = file(params.test_data['sarscov2']['illumina']['test2_vcf_targets_tsv_gz'], checkIfExists: true) + input[1] = file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test3.vcf.gz', checkIfExists: true) + input[2] = file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.targets.tsv.gz', checkIfExists: true) input[3] = [] """ } @@ -71,6 +175,7 @@ nextflow_process { test("sarscov2 - [vcf, tbi], [], [], [] - stub") { + config "./nextflow.config" options "-stub" when { @@ -78,8 +183,8 @@ nextflow_process { """ input[0] = [ [ id:'out', single_end:false ], // meta map - file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_vcf_gz_tbi'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) ] input[1] = [] input[2] = [] @@ -100,4 +205,94 @@ nextflow_process { } -} + test("sarscov2 - [vcf, tbi], [], [], [] - vcf_gz_index - stub") { + + config "./vcf_gz_index.config" + options 
"-stub" + + when { + process { + """ + input[0] = [ + [ id:'out', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [] + input[2] = [] + input[3] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - [vcf, tbi], [], [], [] - vcf_gz_index_csi - stub") { + + config "./vcf_gz_index_csi.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'out', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [] + input[2] = [] + input[3] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - [vcf, tbi], [], [], [] - vcf_gz_index_tbi - stub") { + + config "./vcf_gz_index_tbi.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'out', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [] + input[2] = [] + input[3] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.tbi[0][1].endsWith(".tbi") } + ) + } + + } + +} \ No newline at end of file diff --git 
a/modules/nf-core/bcftools/view/tests/main.nf.test.snap b/modules/nf-core/bcftools/view/tests/main.nf.test.snap index b59be93..fec22e3 100644 --- a/modules/nf-core/bcftools/view/tests/main.nf.test.snap +++ b/modules/nf-core/bcftools/view/tests/main.nf.test.snap @@ -1,4 +1,214 @@ { + "sarscov2 - [vcf, tbi], [], [], [] - vcf_gz_index_csi - stub": { + "content": [ + { + "0": [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + "versions.yml:md5,241125d00357804552689c37bbabe1f5" + ], + "csi": [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,241125d00357804552689c37bbabe1f5" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:14:38.717458272" + }, + "sarscov2 - [vcf, tbi], [], [], [] - vcf_gz_index_tbi": { + "content": [ + [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz:md5,8e722884ffb75155212a3fc053918766" + ] + ], + [ + + ], + [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz.tbi" + ] + ], + [ + "versions.yml:md5,241125d00357804552689c37bbabe1f5" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:13:44.760671384" + }, + "sarscov2 - [vcf, tbi], [], [], [] - vcf_gz_index - stub": { + "content": [ + { + "0": [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], 
+ "3": [ + "versions.yml:md5,241125d00357804552689c37bbabe1f5" + ], + "csi": [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,241125d00357804552689c37bbabe1f5" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-04T16:06:21.669668533" + }, + "sarscov2 - [vcf, tbi], [], [], [] - vcf_gz_index_tbi - stub": { + "content": [ + { + "0": [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,241125d00357804552689c37bbabe1f5" + ], + "csi": [ + + ], + "tbi": [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "vcf": [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,241125d00357804552689c37bbabe1f5" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:14:53.026083914" + }, "sarscov2 - [vcf, tbi], vcf, tsv, []": { "content": [ [ @@ -11,19 +221,60 @@ ] ], [ - "versions.yml:md5,106d119dde844ec7fee1cdd30828bcdc" + "versions.yml:md5,241125d00357804552689c37bbabe1f5" ] ], - "timestamp": "2024-02-05T17:12:20.799849895" + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-05-31T15:15:14.663512924" }, "sarscov2 - [vcf, tbi], [], [], [] - stub": { "content": [ "out.vcf", [ - "versions.yml:md5,106d119dde844ec7fee1cdd30828bcdc" + "versions.yml:md5,241125d00357804552689c37bbabe1f5" ] ], - "timestamp": 
"2024-02-05T16:53:34.652746985" + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-05-31T15:15:19.723448323" + }, + "sarscov2 - [vcf, tbi], [], [], [] - vcf_gz_index": { + "content": [ + [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz:md5,8e722884ffb75155212a3fc053918766" + ] + ], + [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz.csi" + ] + ], + [ + + ], + [ + "versions.yml:md5,241125d00357804552689c37bbabe1f5" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T08:24:36.358469315" }, "sarscov2 - [vcf, tbi], [], [], []": { "content": [ @@ -37,9 +288,46 @@ ] ], [ - "versions.yml:md5,106d119dde844ec7fee1cdd30828bcdc" + "versions.yml:md5,241125d00357804552689c37bbabe1f5" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-05-31T15:15:09.588867653" + }, + "sarscov2 - [vcf, tbi], [], [], [] - vcf_gz_index_csi": { + "content": [ + [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz:md5,8e722884ffb75155212a3fc053918766" + ] + ], + [ + [ + { + "id": "out", + "single_end": false + }, + "out_vcf.vcf.gz.csi" + ] + ], + [ + + ], + [ + "versions.yml:md5,241125d00357804552689c37bbabe1f5" ] ], - "timestamp": "2024-02-05T17:12:14.247465409" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:13:33.834986869" } } \ No newline at end of file diff --git a/modules/nf-core/bcftools/view/tests/vcf_gz_index.config b/modules/nf-core/bcftools/view/tests/vcf_gz_index.config new file mode 100644 index 0000000..7dd696e --- /dev/null +++ b/modules/nf-core/bcftools/view/tests/vcf_gz_index.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args = "--output-type z --write-index --no-version" +} diff --git a/modules/nf-core/bcftools/view/tests/vcf_gz_index_csi.config b/modules/nf-core/bcftools/view/tests/vcf_gz_index_csi.config new file mode 100644 
index 0000000..aebffb6 --- /dev/null +++ b/modules/nf-core/bcftools/view/tests/vcf_gz_index_csi.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args = "--output-type z --write-index=csi --no-version" +} diff --git a/modules/nf-core/bcftools/view/tests/vcf_gz_index_tbi.config b/modules/nf-core/bcftools/view/tests/vcf_gz_index_tbi.config new file mode 100644 index 0000000..b192ae7 --- /dev/null +++ b/modules/nf-core/bcftools/view/tests/vcf_gz_index_tbi.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args = "--output-type z --write-index=tbi --no-version" +} diff --git a/nextflow.config b/nextflow.config index f1e969b..2badde6 100644 --- a/nextflow.config +++ b/nextflow.config @@ -32,7 +32,7 @@ params { sv_standardization = "" // Filtering parameters - min_sv_size = 30 + min_sv_size = 0 max_sv_size = -1 min_allele_freq = -1 min_num_reads = -1 diff --git a/nextflow_schema.json b/nextflow_schema.json index fc4b2f4..5dd3d26 100644 --- a/nextflow_schema.json +++ b/nextflow_schema.json @@ -41,7 +41,7 @@ "enum": ["germline", "somatic"], "pattern": "(germline|somatic)", "fa_icon": "fas fa-folder-open", - "errorMessage": "Analysis type has to be choosen: germline or somatic" + "errorMessage": "Analysis type has to be chosen: germline or somatic" }, "variant_type": { "type": "string", @@ -78,9 +78,9 @@ "preprocess": { "type": "string", - "description": "The preprocessing steps to perform on the input files. Should be a comma-separated list of one or more of the following options: normalization, deduplication, prepy, filter_contigs", - "errorMessage": "A wrong input has been detected. It should be a comma-separated list of on or more of these options: normalization, deduplication, prepy, filter_contigs", - "pattern": "^((normalization|deduplication|prepy|filter_contigs)?,?)*(? 
+ [ meta, file, [], [] ] + }, fai ) - versions = versions.mix(VCF_REHEADER_SAMPLENAME.out.versions.first()) - vcf_ch = VCF_REHEADER_SAMPLENAME.out.ch_vcf + versions = versions.mix(BCFTOOLS_REHEADER_QUERY.out.versions.first()) + + BCFTOOLS_REHEADER_QUERY.out.vcf.join(BCFTOOLS_REHEADER_QUERY.out.index) + .set{vcf_ch} if (params.preprocess.contains("filter_contigs")){ // filter out extra contigs! @@ -84,32 +92,25 @@ workflow PREPARE_VCFS_TEST { ) versions = versions.mix(BCFTOOLS_VIEW_CONTIGS.out.versions.first()) - TABIX_BGZIPTABIX( - BCFTOOLS_VIEW_CONTIGS.out.vcf - ) - versions = versions.mix(TABIX_BGZIPTABIX.out.versions.first()) - vcf_ch = TABIX_BGZIPTABIX.out.gz_tbi + BCFTOOLS_VIEW_CONTIGS.out.vcf.join(BCFTOOLS_VIEW_CONTIGS.out.tbi, by:0) + .set{vcf_ch} } - if (params.preprocess.contains("normalization")){ + if (params.preprocess.contains("split_multiallelic")){ // Split -any- multi-allelic variants - BCFTOOLS_NORM( + BCFTOOLS_SPLIT_MULTI( vcf_ch, fasta ) - versions = versions.mix(BCFTOOLS_NORM.out.versions.first()) + versions = versions.mix(BCFTOOLS_SPLIT_MULTI.out.versions.first()) - TABIX_TABIX( - BCFTOOLS_NORM.out.vcf - ) - versions = versions.mix(TABIX_TABIX.out.versions.first()) - BCFTOOLS_NORM.out.vcf.join(TABIX_TABIX.out.tbi, by:0) + BCFTOOLS_SPLIT_MULTI.out.vcf.join(BCFTOOLS_SPLIT_MULTI.out.tbi, by:0) .set{vcf_ch} } - if (params.include_expression != null | params.exclude_expression != null | params.min_sv_size > 0 | params.max_sv_size != -1 | params.min_allele_freq != -1 | params.min_num_reads != -1 ){ + if (params.include_expression != null || params.exclude_expression != null || params.min_sv_size > 0 || params.max_sv_size != -1 || params.min_allele_freq != -1 || params.min_num_reads != -1 ){ - // Filters variants and SVs with given paramaters + // Filters variants and SVs with given parameters VCF_VARIANT_FILTERING( vcf_ch ) @@ -117,7 +118,7 @@ workflow PREPARE_VCFS_TEST { versions = versions.mix(VCF_VARIANT_FILTERING.out.versions.first()) } - if 
(params.preprocess.contains("deduplication")){ + if (params.preprocess.contains("deduplicate")){ // Deduplicate variants at the same position test VCF_VARIANT_DEDUPLICATION( @@ -129,6 +130,19 @@ workflow PREPARE_VCFS_TEST { } + if (params.preprocess.contains("normalize")){ + + // Turn on left alignment and normalization + BCFTOOLS_NORM( + vcf_ch, + fasta + ) + versions = versions.mix(BCFTOOLS_NORM.out.versions.first()) + + BCFTOOLS_NORM.out.vcf.join(BCFTOOLS_NORM.out.tbi, by:0) + .set{vcf_ch} + } + if (params.analysis.contains("somatic")){ // somatic specific preparations @@ -142,6 +156,10 @@ workflow PREPARE_VCFS_TEST { } + PUBLISH_PROCESSED_VCF( + vcf_ch + ) + emit: vcf_ch // channel: [val(meta), vcf.gz, tbi] versions // channel: [versions.yml] diff --git a/subworkflows/local/prepare_vcfs_truth.nf b/subworkflows/local/prepare_vcfs_truth.nf index 1548741..bbf3834 100644 --- a/subworkflows/local/prepare_vcfs_truth.nf +++ b/subworkflows/local/prepare_vcfs_truth.nf @@ -3,11 +3,12 @@ // -include { BCFTOOLS_NORM } from '../../modules/nf-core/bcftools/norm' -include { TABIX_TABIX } from '../../modules/nf-core/tabix/tabix' -include { VCF_REHEADER_SAMPLENAME } from '../local/vcf_reheader_samplename' include { VCF_VARIANT_DEDUPLICATION } from '../local/vcf_variant_deduplication' include { LIFTOVER_VCFS } from '../local/liftover_vcfs' +include { BCFTOOLS_NORM } from '../../modules/nf-core/bcftools/norm' +include { PUBLISH_PROCESSED_VCF } from '../../modules/local/custom/publish_processed_vcf' +include { BCFTOOLS_NORM as BCFTOOLS_SPLIT_MULTI } from '../../modules/nf-core/bcftools/norm' +include { BCFTOOLS_REHEADER as BCFTOOLS_REHEADER_TRUTH} from '../../modules/nf-core/bcftools/reheader' workflow PREPARE_VCFS_TRUTH { @@ -41,32 +42,32 @@ workflow PREPARE_VCFS_TRUTH { } // Reheader sample name for truth file - using meta.caller - VCF_REHEADER_SAMPLENAME( - truth_ch, + // rename sample name + BCFTOOLS_REHEADER_TRUTH( + truth_ch.map{ meta, file -> + [ meta, file, [], [] ] + }, 
fai ) - versions = versions.mix(VCF_REHEADER_SAMPLENAME.out.versions.first()) - vcf_ch = VCF_REHEADER_SAMPLENAME.out.ch_vcf + versions = versions.mix(BCFTOOLS_REHEADER_TRUTH.out.versions.first()) - if (params.preprocess.contains("normalization")){ + BCFTOOLS_REHEADER_TRUTH.out.vcf.join(BCFTOOLS_REHEADER_TRUTH.out.index) + .set{vcf_ch} - // multi-allelic variants will be splitter. - BCFTOOLS_NORM( + if (params.preprocess.contains("split_multiallelic")){ + + // Split -any- multi-allelic variants + BCFTOOLS_SPLIT_MULTI( vcf_ch, fasta ) - versions = versions.mix(BCFTOOLS_NORM.out.versions.first()) - - // index vcf file - TABIX_TABIX( - BCFTOOLS_NORM.out.vcf - ) - versions = versions.mix(TABIX_TABIX.out.versions) + versions = versions.mix(BCFTOOLS_SPLIT_MULTI.out.versions.first()) - BCFTOOLS_NORM.out.vcf.join(TABIX_TABIX.out.tbi, by:0) + BCFTOOLS_SPLIT_MULTI.out.vcf.join(BCFTOOLS_SPLIT_MULTI.out.tbi, by:0) .set{vcf_ch} } - if (params.preprocess.contains("deduplication")){ + + if (params.preprocess.contains("deduplicate")){ // Deduplicates variants at the same position test VCF_VARIANT_DEDUPLICATION( @@ -77,6 +78,22 @@ workflow PREPARE_VCFS_TRUTH { versions = versions.mix(VCF_VARIANT_DEDUPLICATION.out.versions) } + if (params.preprocess.contains("normalize")){ + + // Turn on left alignment and normalization + BCFTOOLS_NORM( + vcf_ch, + fasta + ) + versions = versions.mix(BCFTOOLS_NORM.out.versions.first()) + + BCFTOOLS_NORM.out.vcf.join(BCFTOOLS_NORM.out.tbi, by:0) + .set{vcf_ch} + } + + PUBLISH_PROCESSED_VCF( + vcf_ch + ) emit: vcf_ch // channel: [val(meta), vcf, tbi] high_conf_ch // channel: [val(meta), bed] diff --git a/subworkflows/local/small_germline_benchmark.nf b/subworkflows/local/small_germline_benchmark.nf index 5e07217..c8c6ab6 100644 --- a/subworkflows/local/small_germline_benchmark.nf +++ b/subworkflows/local/small_germline_benchmark.nf @@ -2,14 +2,14 @@ // SMALL_GERMLINE_BENCHMARK: SUBWORKFLOW FOR SMALL GERMLINE VARIANTS // -include { RTGTOOLS_FORMAT }
from '../../modules/nf-core/rtgtools/format/main' -include { RTGTOOLS_VCFEVAL } from '../../modules/nf-core/rtgtools/vcfeval/main' -include { HAPPY_HAPPY } from '../../modules/nf-core/happy/happy/main' -include { HAPPY_PREPY } from '../../modules/nf-core/happy/prepy/main' -include { VCF_REHEADER_SAMPLENAME as VCF_REHEADER_SAMPLENAME_1 } from '../local/vcf_reheader_samplename' -include { VCF_REHEADER_SAMPLENAME as VCF_REHEADER_SAMPLENAME_2 } from '../local/vcf_reheader_samplename' -include { VCF_REHEADER_SAMPLENAME as VCF_REHEADER_SAMPLENAME_3 } from '../local/vcf_reheader_samplename' -include { VCF_REHEADER_SAMPLENAME as VCF_REHEADER_SAMPLENAME_4 } from '../local/vcf_reheader_samplename' +include { RTGTOOLS_FORMAT } from '../../modules/nf-core/rtgtools/format/main' +include { RTGTOOLS_VCFEVAL } from '../../modules/nf-core/rtgtools/vcfeval/main' +include { HAPPY_HAPPY } from '../../modules/nf-core/happy/happy/main' +include { HAPPY_PREPY } from '../../modules/nf-core/happy/prepy/main' +include { BCFTOOLS_REHEADER as BCFTOOLS_REHEADER_1 } from '../../modules/nf-core/bcftools/reheader' +include { BCFTOOLS_REHEADER as BCFTOOLS_REHEADER_2 } from '../../modules/nf-core/bcftools/reheader' +include { BCFTOOLS_REHEADER as BCFTOOLS_REHEADER_3 } from '../../modules/nf-core/bcftools/reheader' +include { BCFTOOLS_REHEADER as BCFTOOLS_REHEADER_4 } from '../../modules/nf-core/bcftools/reheader' workflow SMALL_GERMLINE_BENCHMARK { take: @@ -55,43 +55,55 @@ workflow SMALL_GERMLINE_BENCHMARK { summary_reports = summary_reports.mix(report) // reheader benchmarking results properly and tag meta - VCF_REHEADER_SAMPLENAME_1( - RTGTOOLS_VCFEVAL.out.fn_vcf, + BCFTOOLS_REHEADER_1( + RTGTOOLS_VCFEVAL.out.fn_vcf.map{ meta, vcf -> + [ meta, vcf, [], [] ] + }, fai ) - versions = versions.mix(VCF_REHEADER_SAMPLENAME_1.out.versions) + versions = versions.mix(BCFTOOLS_REHEADER_1.out.versions.first()) - VCF_REHEADER_SAMPLENAME_1.out.ch_vcf + BCFTOOLS_REHEADER_1.out.vcf + 
.join(BCFTOOLS_REHEADER_1.out.index) .map { _meta, file, index -> tuple([vartype: params.variant_type] + [tag: "FN"] + [id: "rtgtools"], file, index) } .set { vcf_fn } - VCF_REHEADER_SAMPLENAME_2( - RTGTOOLS_VCFEVAL.out.fp_vcf, + BCFTOOLS_REHEADER_2( + RTGTOOLS_VCFEVAL.out.fp_vcf.map{ meta, vcf -> + [ meta, vcf, [], [] ] + }, fai ) - versions = versions.mix(VCF_REHEADER_SAMPLENAME_2.out.versions) + versions = versions.mix(BCFTOOLS_REHEADER_2.out.versions) - VCF_REHEADER_SAMPLENAME_2.out.ch_vcf + BCFTOOLS_REHEADER_2.out.vcf + .join(BCFTOOLS_REHEADER_2.out.index) .map { _meta, file, index -> tuple([vartype: params.variant_type] + [tag: "FP"] + [id: "rtgtools"], file, index) } .set { vcf_fp } - VCF_REHEADER_SAMPLENAME_3( - RTGTOOLS_VCFEVAL.out.baseline_vcf, + BCFTOOLS_REHEADER_3( + RTGTOOLS_VCFEVAL.out.baseline_vcf.map{ meta, vcf -> + [ meta, vcf, [], [] ] + }, fai ) - versions = versions.mix(VCF_REHEADER_SAMPLENAME_3.out.versions) + versions = versions.mix(BCFTOOLS_REHEADER_3.out.versions) - VCF_REHEADER_SAMPLENAME_3.out.ch_vcf + BCFTOOLS_REHEADER_3.out.vcf + .join(BCFTOOLS_REHEADER_3.out.index) .map { _meta, file, index -> tuple([vartype: params.variant_type] + [tag: "TP_base"] + [id: "rtgtools"], file, index) } .set { vcf_tp_base } - VCF_REHEADER_SAMPLENAME_4( - RTGTOOLS_VCFEVAL.out.tp_vcf, + BCFTOOLS_REHEADER_4( + RTGTOOLS_VCFEVAL.out.tp_vcf.map{ meta, vcf -> + [ meta, vcf, [], [] ] + }, fai ) - versions = versions.mix(VCF_REHEADER_SAMPLENAME_4.out.versions) + versions = versions.mix(BCFTOOLS_REHEADER_4.out.versions) - VCF_REHEADER_SAMPLENAME_4.out.ch_vcf + BCFTOOLS_REHEADER_4.out.vcf + .join(BCFTOOLS_REHEADER_4.out.index) .map { _meta, file, index -> tuple([vartype: params.variant_type] + [tag: "TP_comp"] + [id: "rtgtools"], file, index) } .set { vcf_tp_comp } diff --git a/subworkflows/local/split_small_variants_test.nf b/subworkflows/local/split_small_variants_test.nf index 00f26fd..d595729 100644 --- a/subworkflows/local/split_small_variants_test.nf +++ 
b/subworkflows/local/split_small_variants_test.nf @@ -4,8 +4,8 @@ include { BCFTOOLS_VIEW as BCFTOOLS_VIEW_SNV } from '../../modules/nf-core/bcftools/view' include { BCFTOOLS_VIEW as BCFTOOLS_VIEW_INDEL } from '../../modules/nf-core/bcftools/view' -include { TABIX_BGZIPTABIX as TABIX_BGZIPTABIX_1 } from '../../modules/nf-core/tabix/bgziptabix' -include { TABIX_BGZIPTABIX as TABIX_BGZIPTABIX_2 } from '../../modules/nf-core/tabix/bgziptabix' +include { TABIX_BGZIPTABIX as TABIX_BGZIPTABIX_SNV } from '../../modules/nf-core/tabix/bgziptabix' +include { TABIX_BGZIPTABIX as TABIX_BGZIPTABIX_INDEL } from '../../modules/nf-core/tabix/bgziptabix' workflow SPLIT_SMALL_VARIANTS_TEST { take: @@ -25,12 +25,12 @@ workflow SPLIT_SMALL_VARIANTS_TEST { ) versions = versions.mix(BCFTOOLS_VIEW_SNV.out.versions.first()) - TABIX_BGZIPTABIX_1( + TABIX_BGZIPTABIX_SNV( BCFTOOLS_VIEW_SNV.out.vcf ) - versions = versions.mix(TABIX_BGZIPTABIX_1.out.versions.first()) + versions = versions.mix(TABIX_BGZIPTABIX_SNV.out.versions.first()) - TABIX_BGZIPTABIX_1.out.gz_tbi + TABIX_BGZIPTABIX_SNV.out.gz_tbi .map { meta, file, index -> tuple(meta + [vartype: "snv"], file, index) } .set{split_snv_vcf} out_vcf_ch = out_vcf_ch.mix(split_snv_vcf) @@ -43,11 +43,11 @@ workflow SPLIT_SMALL_VARIANTS_TEST { ) versions = versions.mix(BCFTOOLS_VIEW_INDEL.out.versions.first()) - TABIX_BGZIPTABIX_2( + TABIX_BGZIPTABIX_INDEL( BCFTOOLS_VIEW_INDEL.out.vcf ) - versions = versions.mix(TABIX_BGZIPTABIX_2.out.versions.first()) - TABIX_BGZIPTABIX_2.out.gz_tbi + versions = versions.mix(TABIX_BGZIPTABIX_INDEL.out.versions.first()) + TABIX_BGZIPTABIX_INDEL.out.gz_tbi .map { meta, file, index -> tuple(meta + [vartype: "indel"], file, index) } .set{split_indel_vcf} out_vcf_ch = out_vcf_ch.mix(split_indel_vcf) diff --git a/subworkflows/local/sv_germline_benchmark.nf b/subworkflows/local/sv_germline_benchmark.nf index 780265c..1ad0b3b 100644 --- a/subworkflows/local/sv_germline_benchmark.nf +++ 
b/subworkflows/local/sv_germline_benchmark.nf @@ -2,15 +2,15 @@ // SV_GERMLINE_BENCHMARK: SUBWORKFLOW FOR SV GERMLINE VARIANTS // -include { TRUVARI_BENCH } from '../../modules/nf-core/truvari/bench' -include { SVANALYZER_SVBENCHMARK } from '../../modules/nf-core/svanalyzer/svbenchmark' -include { WITTYER } from '../../modules/nf-core/wittyer' -include { TABIX_BGZIP as TABIX_BGZIP_QUERY } from '../../modules/nf-core/tabix/bgzip' -include { TABIX_BGZIP as TABIX_BGZIP_TRUTH } from '../../modules/nf-core/tabix/bgzip' -include { VCF_REHEADER_SAMPLENAME as VCF_REHEADER_SAMPLENAME_1 } from '../local/vcf_reheader_samplename' -include { VCF_REHEADER_SAMPLENAME as VCF_REHEADER_SAMPLENAME_2 } from '../local/vcf_reheader_samplename' -include { VCF_REHEADER_SAMPLENAME as VCF_REHEADER_SAMPLENAME_3 } from '../local/vcf_reheader_samplename' -include { VCF_REHEADER_SAMPLENAME as VCF_REHEADER_SAMPLENAME_4 } from '../local/vcf_reheader_samplename' +include { TRUVARI_BENCH } from '../../modules/nf-core/truvari/bench' +include { SVANALYZER_SVBENCHMARK } from '../../modules/nf-core/svanalyzer/svbenchmark' +include { WITTYER } from '../../modules/nf-core/wittyer' +include { TABIX_BGZIP as TABIX_BGZIP_QUERY } from '../../modules/nf-core/tabix/bgzip' +include { TABIX_BGZIP as TABIX_BGZIP_TRUTH } from '../../modules/nf-core/tabix/bgzip' +include { BCFTOOLS_REHEADER as BCFTOOLS_REHEADER_1 } from '../../modules/nf-core/bcftools/reheader' +include { BCFTOOLS_REHEADER as BCFTOOLS_REHEADER_2 } from '../../modules/nf-core/bcftools/reheader' +include { BCFTOOLS_REHEADER as BCFTOOLS_REHEADER_3 } from '../../modules/nf-core/bcftools/reheader' +include { BCFTOOLS_REHEADER as BCFTOOLS_REHEADER_4 } from '../../modules/nf-core/bcftools/reheader' workflow SV_GERMLINE_BENCHMARK { take: @@ -43,49 +43,61 @@ workflow SV_GERMLINE_BENCHMARK { summary_reports = summary_reports.mix(report) // reheader fn vcf files for tagged results - VCF_REHEADER_SAMPLENAME_1( - TRUVARI_BENCH.out.fn_vcf, + BCFTOOLS_REHEADER_1( 
+ TRUVARI_BENCH.out.fn_vcf.map{ meta, vcf -> + [ meta, vcf, [], [] ] + }, fai ) - versions = versions.mix(VCF_REHEADER_SAMPLENAME_1.out.versions) + versions = versions.mix(BCFTOOLS_REHEADER_1.out.versions) - VCF_REHEADER_SAMPLENAME_1.out.ch_vcf + BCFTOOLS_REHEADER_1.out.vcf + .join(BCFTOOLS_REHEADER_1.out.index) .map { _meta, file, _index -> tuple([vartype: params.variant_type] + [tag: "FN"] + [id: "truvari"], file) } .set { vcf_fn } // reheader fp vcf files for tagged results - VCF_REHEADER_SAMPLENAME_2( - TRUVARI_BENCH.out.fp_vcf, + BCFTOOLS_REHEADER_2( + TRUVARI_BENCH.out.fp_vcf.map{ meta, vcf -> + [ meta, vcf, [], [] ] + }, fai ) - versions = versions.mix(VCF_REHEADER_SAMPLENAME_2.out.versions) + versions = versions.mix(BCFTOOLS_REHEADER_2.out.versions) // add tag and to meta - VCF_REHEADER_SAMPLENAME_2.out.ch_vcf + BCFTOOLS_REHEADER_2.out.vcf + .join(BCFTOOLS_REHEADER_2.out.index) .map { _meta, file, _index -> tuple([vartype: params.variant_type] + [tag: "FP"] + [id: "truvari"], file) } .set { vcf_fp } // reheader base tp vcf files for tagged results - VCF_REHEADER_SAMPLENAME_3( - TRUVARI_BENCH.out.tp_base_vcf, + BCFTOOLS_REHEADER_3( + TRUVARI_BENCH.out.tp_base_vcf.map{ meta, vcf -> + [ meta, vcf, [], [] ] + }, fai ) - versions = versions.mix(VCF_REHEADER_SAMPLENAME_3.out.versions) + versions = versions.mix(BCFTOOLS_REHEADER_3.out.versions) // add tag and to meta - VCF_REHEADER_SAMPLENAME_3.out.ch_vcf + BCFTOOLS_REHEADER_3.out.vcf + .join(BCFTOOLS_REHEADER_3.out.index) .map { _meta, file, _index -> tuple([vartype: params.variant_type] + [tag: "TP_base"] + [id: "truvari"], file) } .set { vcf_tp_base } // reheader comp tp vcf files for tagged results - VCF_REHEADER_SAMPLENAME_4( - TRUVARI_BENCH.out.tp_comp_vcf, + BCFTOOLS_REHEADER_4( + TRUVARI_BENCH.out.tp_comp_vcf.map{ meta, vcf -> + [ meta, vcf, [], [] ] + }, fai ) - versions = versions.mix(VCF_REHEADER_SAMPLENAME_4.out.versions) + versions = versions.mix(BCFTOOLS_REHEADER_4.out.versions) // add tag and to 
meta - VCF_REHEADER_SAMPLENAME_4.out.ch_vcf + BCFTOOLS_REHEADER_4.out.vcf + .join(BCFTOOLS_REHEADER_4.out.index) .map { _meta, file, _index -> tuple([vartype: params.variant_type] + [tag: "TP_comp"] + [id: "truvari"], file) } .set { vcf_tp_comp } diff --git a/subworkflows/local/vcf_reheader_samplename.nf b/subworkflows/local/vcf_reheader_samplename.nf deleted file mode 100644 index 5d4384d..0000000 --- a/subworkflows/local/vcf_reheader_samplename.nf +++ /dev/null @@ -1,41 +0,0 @@ -// -// VCF_REHEADER_SAMPLENAME: reheader sample names when needed -// - -include { TABIX_TABIX } from '../../modules/nf-core/tabix/tabix' -include { BCFTOOLS_REHEADER } from '../../modules/nf-core/bcftools/reheader' - -workflow VCF_REHEADER_SAMPLENAME { - take: - vcf_ch // channel: [val(meta), vcf] - fai // reference channel [val(meta), ref.fai] - - main: - - versions = Channel.empty() - - // rename sample name - BCFTOOLS_REHEADER( - - vcf_ch.map{ meta, vcf -> - [ meta, vcf, [], [] ] - }, - fai - ) - versions = versions.mix(BCFTOOLS_REHEADER.out.versions.first()) - - TABIX_TABIX( - BCFTOOLS_REHEADER.out.vcf - ) - versions = versions.mix(TABIX_TABIX.out.versions.first()) - - BCFTOOLS_REHEADER.out.vcf - .join(TABIX_TABIX.out.tbi, failOnDuplicate:true, failOnMismatch:true) - .set{ch_vcf} - - - emit: - ch_vcf // channel: [val(meta), vcf, index ] - versions // channel: [versions.yml ] - -} diff --git a/subworkflows/local/vcf_variant_deduplication.nf b/subworkflows/local/vcf_variant_deduplication.nf index 81b7496..b6331e1 100644 --- a/subworkflows/local/vcf_variant_deduplication.nf +++ b/subworkflows/local/vcf_variant_deduplication.nf @@ -3,7 +3,6 @@ // include { BCFTOOLS_SORT } from '../../modules/nf-core/bcftools/sort' -include { TABIX_TABIX } from '../../modules/nf-core/tabix/tabix' include { BCFTOOLS_NORM as BCFTOOLS_DEDUP } from '../../modules/nf-core/bcftools/norm' workflow VCF_VARIANT_DEDUPLICATION { @@ -29,13 +28,7 @@ workflow VCF_VARIANT_DEDUPLICATION { ) versions = 
versions.mix(BCFTOOLS_SORT.out.versions.first()) - TABIX_TABIX( - BCFTOOLS_SORT.out.vcf - ) - versions = versions.mix(TABIX_TABIX.out.versions.first()) - - BCFTOOLS_SORT.out.vcf - .join(TABIX_TABIX.out.tbi, failOnDuplicate:true, failOnMismatch:true) + BCFTOOLS_SORT.out.vcf.join(BCFTOOLS_SORT.out.tbi) .set{ch_vcf} emit: diff --git a/tests/germline_small.nf.test.snap b/tests/germline_small.nf.test.snap index fabe72f..6e64de8 100644 --- a/tests/germline_small.nf.test.snap +++ b/tests/germline_small.nf.test.snap @@ -1,10 +1,10 @@ { "-stub": { "content": [ - 76, + 62, { "BCFTOOLS_DEDUP": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BCFTOOLS_FILTER": { "bcftools": 1.2 @@ -13,13 +13,28 @@ "bcftools": 1.2 }, "BCFTOOLS_NORM": { - "bcftools": 1.18 + "bcftools": 1.2 }, - "BCFTOOLS_REHEADER": { - "bcftools": 1.18 + "BCFTOOLS_REHEADER_1": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_2": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_3": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_4": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_QUERY": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_TRUTH": { + "bcftools": 1.2 }, "BCFTOOLS_SORT": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BCFTOOLS_STATS": { "bcftools": 1.18 @@ -48,9 +63,6 @@ "RTGTOOLS_VCFEVAL": { "rtg-tools": "3.12.1" }, - "TABIX_TABIX": { - "tabix": "1.19.1" - }, "VCF_TO_CSV": { "python": "3.12.4" }, @@ -67,7 +79,8 @@ "small", "small/HG002", "small/HG002/preprocess", - "small/HG002/preprocess/HG002_GRCh38_CMRG_smallvar_v1.00.rh.norm.dedup.sort.vcf.gz", + "small/HG002/preprocess/HG002.vcf.gz", + "small/HG002/preprocess/HG002.vcf.gz.tbi", "small/HG002/stats", "small/HG002/stats/bcftools", "small/HG002/stats/bcftools/HG002.bcftools_stats.txt", @@ -162,9 +175,8 @@ "small/test1/benchmarks/rtgtools/test1.HG002.strelka.tp.vcf.gz.tbi", "small/test1/benchmarks/rtgtools/test1.HG002.strelka.weighted_roc.tsv.gz", "small/test1/preprocess", - "small/test1/preprocess/HG002.strelka.variants.chr21.rh.norm.filter.vcf", - 
"small/test1/preprocess/test1.HG002.strelka.prepy.vcf.gz", - "small/test1/preprocess/test1.dedup.sort.vcf.gz", + "small/test1/preprocess/test1.vcf.gz", + "small/test1/preprocess/test1.vcf.gz.tbi", "small/test1/stats", "small/test1/stats/bcftools", "small/test1/stats/bcftools/test1.strelka.bcftools_stats.txt", @@ -197,9 +209,8 @@ "small/test2/benchmarks/rtgtools/test2.HG002.bcftools.tp.vcf.gz.tbi", "small/test2/benchmarks/rtgtools/test2.HG002.bcftools.weighted_roc.tsv.gz", "small/test2/preprocess", - "small/test2/preprocess/HG002.bcftools.chr21.rh.norm.filter.vcf", - "small/test2/preprocess/test2.HG002.bcftools.prepy.vcf.gz", - "small/test2/preprocess/test2.dedup.sort.vcf.gz", + "small/test2/preprocess/test2.vcf.gz", + "small/test2/preprocess/test2.vcf.gz.tbi", "small/test2/stats", "small/test2/stats/bcftools", "small/test2/stats/bcftools/test2.bcftools.bcftools_stats.txt" @@ -233,14 +244,14 @@ "nf-test": "0.9.2", "nextflow": "24.10.3" }, - "timestamp": "2025-01-13T13:40:17.22451061" + "timestamp": "2025-01-16T15:55:45.473583366" }, "Params: --analysis 'germline' --variant_type 'small' --method 'happy,rtgtools'": { "content": [ - 76, + 62, { "BCFTOOLS_DEDUP": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BCFTOOLS_FILTER": { "bcftools": 1.2 @@ -249,13 +260,28 @@ "bcftools": 1.2 }, "BCFTOOLS_NORM": { - "bcftools": 1.18 + "bcftools": 1.2 }, - "BCFTOOLS_REHEADER": { - "bcftools": 1.18 + "BCFTOOLS_REHEADER_1": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_2": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_3": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_4": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_QUERY": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_TRUTH": { + "bcftools": 1.2 }, "BCFTOOLS_SORT": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BCFTOOLS_STATS": { "bcftools": 1.18 @@ -284,9 +310,6 @@ "RTGTOOLS_VCFEVAL": { "rtg-tools": "3.12.1" }, - "TABIX_TABIX": { - "tabix": "1.19.1" - }, "VCF_TO_CSV": { "python": "3.12.4" }, @@ -324,7 +347,8 @@ "small", "small/HG002", 
"small/HG002/preprocess", - "small/HG002/preprocess/HG002_GRCh38_CMRG_smallvar_v1.00.rh.norm.dedup.sort.vcf.gz", + "small/HG002/preprocess/HG002.vcf.gz", + "small/HG002/preprocess/HG002.vcf.gz.tbi", "small/HG002/stats", "small/HG002/stats/bcftools", "small/HG002/stats/bcftools/HG002.bcftools_stats.txt", @@ -493,9 +517,8 @@ "small/test1/benchmarks/rtgtools/test1.HG002.strelka.tp.vcf.gz.tbi", "small/test1/benchmarks/rtgtools/test1.HG002.strelka.weighted_roc.tsv.gz", "small/test1/preprocess", - "small/test1/preprocess/HG002.strelka.variants.chr21.rh.norm.filter.vcf", - "small/test1/preprocess/test1.HG002.strelka.prepy.vcf.gz", - "small/test1/preprocess/test1.dedup.sort.vcf.gz", + "small/test1/preprocess/test1.vcf.gz", + "small/test1/preprocess/test1.vcf.gz.tbi", "small/test1/stats", "small/test1/stats/bcftools", "small/test1/stats/bcftools/test1.strelka.bcftools_stats.txt", @@ -528,9 +551,8 @@ "small/test2/benchmarks/rtgtools/test2.HG002.bcftools.tp.vcf.gz.tbi", "small/test2/benchmarks/rtgtools/test2.HG002.bcftools.weighted_roc.tsv.gz", "small/test2/preprocess", - "small/test2/preprocess/HG002.bcftools.chr21.rh.norm.filter.vcf", - "small/test2/preprocess/test2.HG002.bcftools.prepy.vcf.gz", - "small/test2/preprocess/test2.dedup.sort.vcf.gz", + "small/test2/preprocess/test2.vcf.gz", + "small/test2/preprocess/test2.vcf.gz.tbi", "small/test2/stats", "small/test2/stats/bcftools", "small/test2/stats/bcftools/test2.bcftools.bcftools_stats.txt" @@ -552,33 +574,33 @@ "suffixIndex0:md5,b7bb2ba061ab54c0bf07c0a941d0277a", "suffixdata0:md5,f2876dd730673cd49c4de191001f634e", "suffixpointer0:md5,468281ffb10d7dd934289af762a03781", - "HG002.bcftools_stats.txt:md5,b215fc0030c53bc8887e28b23b97efb6", - "test1.HG002.strelka.extended.csv:md5,4362260b357ac0221414095f4c5a8981", - "test1.HG002.strelka.roc.Locations.INDEL.PASS.csv.gz:md5,a5ba4044a89ae80fb0ddf95147b5ae4c", - "test1.HG002.strelka.roc.Locations.INDEL.csv.gz:md5,52edef7d20ac8a7e03771037f5c93fe4", - 
"test1.HG002.strelka.roc.Locations.SNP.PASS.csv.gz:md5,f71e697d7ebaf1d670e5dc2c0e0106d3", - "test1.HG002.strelka.roc.Locations.SNP.csv.gz:md5,c388d5a15ba2ae8dff709b030f1b4828", - "test1.HG002.strelka.roc.all.csv.gz:md5,59ebbe78bf428b5c2c78c8ff92f54545", - "test1.HG002.strelka.summary.csv:md5,60af846379cf4fe078fbed6b9d1e8178", - "test1.HG002.strelka.phasing.txt:md5,838e67ae5b9cd9e218095596c03fbee3", - "test1.HG002.strelka.summary.txt:md5,e79779d3faebe02bae943ddc17c4cf91", - "test1.strelka.bcftools_stats.txt:md5,492f42090004470e7e0ea7abc5f89bdf", - "test2.HG002.bcftools.extended.csv:md5,2f8ef20f46c821333ba970e3034a6ccd", - "test2.HG002.bcftools.roc.Locations.INDEL.PASS.csv.gz:md5,cdf3fdb7c5b4c54d9896e37a99dbf4f9", - "test2.HG002.bcftools.roc.Locations.INDEL.csv.gz:md5,9b16abcfe483356020c550e1292554ed", - "test2.HG002.bcftools.roc.Locations.SNP.PASS.csv.gz:md5,f71e697d7ebaf1d670e5dc2c0e0106d3", - "test2.HG002.bcftools.roc.Locations.SNP.csv.gz:md5,c388d5a15ba2ae8dff709b030f1b4828", - "test2.HG002.bcftools.roc.all.csv.gz:md5,e772fefec84f9a6e60c6979bac14cedc", - "test2.HG002.bcftools.summary.csv:md5,05722d23f523141fdce842a18f1d8aa2", + "HG002.bcftools_stats.txt:md5,25a566bfca26275fbdce5d80b4c43d45", + "test1.HG002.strelka.extended.csv:md5,07297e4480e0b765b290a10f8bddbfe8", + "test1.HG002.strelka.roc.Locations.INDEL.PASS.csv.gz:md5,0f68dbeccf04d6bbe942ae90a70da219", + "test1.HG002.strelka.roc.Locations.INDEL.csv.gz:md5,f15e3ec73cba5af3632bd276e9d93278", + "test1.HG002.strelka.roc.Locations.SNP.PASS.csv.gz:md5,aa2ccc4ceb4dcfc7173edcdc78b7b89b", + "test1.HG002.strelka.roc.Locations.SNP.csv.gz:md5,73928333c290a527d8faccf22bd834fc", + "test1.HG002.strelka.roc.all.csv.gz:md5,cbf4db06f9cbdd0b3b3a29ee4bfc362b", + "test1.HG002.strelka.summary.csv:md5,8a5fff1b336e93cee21b3a81f49ab290", + "test1.HG002.strelka.phasing.txt:md5,8ed2c215bc14dc461a10e923d0127347", + "test1.HG002.strelka.summary.txt:md5,537e7469d43f94647018e173928f088e", + 
"test1.strelka.bcftools_stats.txt:md5,32586b4f0851fb4f386534a29fe8658f", + "test2.HG002.bcftools.extended.csv:md5,9ecb6d4bd55bc7e1ac18bd6cddaa15d9", + "test2.HG002.bcftools.roc.Locations.INDEL.PASS.csv.gz:md5,8200b1ab6ab8d4665967ce24b28df103", + "test2.HG002.bcftools.roc.Locations.INDEL.csv.gz:md5,b223c143e1e534c0c8381c1d1bd130a7", + "test2.HG002.bcftools.roc.Locations.SNP.PASS.csv.gz:md5,aa2ccc4ceb4dcfc7173edcdc78b7b89b", + "test2.HG002.bcftools.roc.Locations.SNP.csv.gz:md5,73928333c290a527d8faccf22bd834fc", + "test2.HG002.bcftools.roc.all.csv.gz:md5,c8799bbce78b88317baccbde04d453b1", + "test2.HG002.bcftools.summary.csv:md5,1a1e3471f9df31365787f09dcafbcb79", "test2.HG002.bcftools.phasing.txt:md5,38920536b8c3e241e873c07ba61762e6", - "test2.HG002.bcftools.summary.txt:md5,a7bbcdf86cd3f1f7815ea9bc25b57b61", - "test2.bcftools.bcftools_stats.txt:md5,440fd66ee557b215bbacd05953215f7a" + "test2.HG002.bcftools.summary.txt:md5,b8b6ba8ea69ecf241de01b347c226386", + "test2.bcftools.bcftools_stats.txt:md5,648803535cd0f6f94f52a2ad4587be8b" ] ], "meta": { "nf-test": "0.9.2", "nextflow": "24.10.3" }, - "timestamp": "2025-01-13T14:00:31.553072869" + "timestamp": "2025-01-17T09:31:09.115891055" } } \ No newline at end of file diff --git a/tests/germline_sv.nf.test.snap b/tests/germline_sv.nf.test.snap index 84d6427..63b16d5 100644 --- a/tests/germline_sv.nf.test.snap +++ b/tests/germline_sv.nf.test.snap @@ -1,28 +1,43 @@ { "Params: --analysis 'germline' --variant_type 'structural' --method 'truvari,svbenchmark,wittyer'": { "content": [ - 144, + 121, { "BCFTOOLS_DEDUP": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BCFTOOLS_FILTER": { "bcftools": 1.2 }, "BCFTOOLS_NORM": { - "bcftools": 1.18 + "bcftools": 1.2 }, - "BCFTOOLS_REHEADER": { - "bcftools": 1.18 + "BCFTOOLS_REHEADER_1": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_2": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_3": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_4": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_QUERY": { + 
"bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_TRUTH": { + "bcftools": 1.2 }, "BCFTOOLS_SORT": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BCFTOOLS_STATS": { "bcftools": 1.18 }, "BCFTOOLS_VIEW_CONTIGS": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BGZIP_TABIX": { "tabix": 1.12 @@ -48,19 +63,13 @@ "SVYNC": { "svync": "0.1.2" }, - "TABIX_BGZIP": { - "tabix": "1.19.1" - }, - "TABIX_BGZIPTABIX": { - "tabix": "1.19.1" - }, "TABIX_BGZIP_QUERY": { "tabix": "1.19.1" }, "TABIX_BGZIP_TRUTH": { "tabix": "1.19.1" }, - "TABIX_TABIX": { + "TABIX_BGZIP_UNZIP": { "tabix": "1.19.1" }, "TRUVARI_BENCH": { @@ -85,7 +94,8 @@ "structural", "structural/HG002", "structural/HG002/preprocess", - "structural/HG002/preprocess/HG002_GRCh38_difficult_medical_gene_SV_benchmark_v0.01.chr21.rh.norm.dedup.sort.vcf.gz", + "structural/HG002/preprocess/HG002.vcf.gz", + "structural/HG002/preprocess/HG002.vcf.gz.tbi", "structural/HG002/stats", "structural/HG002/stats/bcftools", "structural/HG002/stats/bcftools/HG002.bcftools_stats.txt", @@ -263,12 +273,8 @@ "structural/test1/benchmarks/wittyer/test1.HG002.manta.vcf.gz", "structural/test1/benchmarks/wittyer/test1.HG002.manta.vcf.gz.tbi", "structural/test1/preprocess", - "structural/test1/preprocess/manta.HG002.chr21.norm.sort.vcf.gz", - "structural/test1/preprocess/manta.HG002.chr21.norm.vcf.gz", - "structural/test1/preprocess/test1.dedup.sort.vcf.gz", - "structural/test1/preprocess/test1.manta.svync.vcf.gz", - "structural/test1/preprocess/test1.norm.filter.filter.vcf", - "structural/test1/preprocess/test1.norm.filter.vcf", + "structural/test1/preprocess/test1.vcf.gz", + "structural/test1/preprocess/test1.vcf.gz.tbi", "structural/test1/stats", "structural/test1/stats/bcftools", "structural/test1/stats/bcftools/test1.manta.bcftools_stats.txt", @@ -297,11 +303,8 @@ "structural/test2/benchmarks/wittyer/test2.HG002.merged.vcf.gz", "structural/test2/benchmarks/wittyer/test2.HG002.merged.vcf.gz.tbi", "structural/test2/preprocess", - 
"structural/test2/preprocess/Ashkenazim_HG002.filtered.sv.chr21.norm.sort.vcf.gz", - "structural/test2/preprocess/Ashkenazim_HG002.filtered.sv.chr21.norm.vcf.gz", - "structural/test2/preprocess/test2.dedup.sort.vcf.gz", - "structural/test2/preprocess/test2.norm.filter.filter.vcf", - "structural/test2/preprocess/test2.norm.filter.vcf", + "structural/test2/preprocess/test2.vcf.gz", + "structural/test2/preprocess/test2.vcf.gz.tbi", "structural/test2/stats", "structural/test2/stats/bcftools", "structural/test2/stats/bcftools/test2.merged.bcftools_stats.txt", @@ -330,12 +333,8 @@ "structural/test3/benchmarks/wittyer/test3.HG002.dragen.vcf.gz", "structural/test3/benchmarks/wittyer/test3.HG002.dragen.vcf.gz.tbi", "structural/test3/preprocess", - "structural/test3/preprocess/HG002_DRAGEN_SV_hg19.chr21.norm.sort.vcf.gz", - "structural/test3/preprocess/HG002_DRAGEN_SV_hg19.chr21.norm.vcf.gz", - "structural/test3/preprocess/test3.dedup.sort.vcf.gz", - "structural/test3/preprocess/test3.dragen.svync.vcf.gz", - "structural/test3/preprocess/test3.norm.filter.filter.vcf", - "structural/test3/preprocess/test3.norm.filter.vcf", + "structural/test3/preprocess/test3.vcf.gz", + "structural/test3/preprocess/test3.vcf.gz.tbi", "structural/test3/stats", "structural/test3/stats/bcftools", "structural/test3/stats/bcftools/test3.dragen.bcftools_stats.txt", @@ -343,19 +342,19 @@ "structural/test3/stats/survivor/test3.dragen_mqc.stats" ], [ - "HG002.bcftools_stats.txt:md5,8294f172a72ca7219a32db9c27e2524c", + "HG002.bcftools_stats.txt:md5,06ee97dcd1b34ddca03a7e5f9af23e68", "HG002_mqc.stats:md5,68681df47b35e3193be03610f5c6e3d6", "test1.HG002.manta.distances:md5,346f18a5cbeece98716951c8fc2aaea4", "test1.HG002.manta.report:md5,4a53712a9d15fa6dfe6ddd5848ca691c", - "test1.manta.bcftools_stats.txt:md5,7d65792aa3a84de09675facf62135c93", + "test1.manta.bcftools_stats.txt:md5,9a82df362fe77db3330544f2b98b439e", "test1.manta_mqc.stats:md5,011ad66fec4287d32cb728c40e240c0b", 
"test2.HG002.merged.distances:md5,346f18a5cbeece98716951c8fc2aaea4", "test2.HG002.merged.report:md5,4a53712a9d15fa6dfe6ddd5848ca691c", - "test2.merged.bcftools_stats.txt:md5,1445742129b0ee67d8706af3dcf0ab2d", + "test2.merged.bcftools_stats.txt:md5,bbd579cedaf78b6199fb71e52acfc14d", "test2.merged_mqc.stats:md5,011ad66fec4287d32cb728c40e240c0b", "test3.HG002.dragen.distances:md5,346f18a5cbeece98716951c8fc2aaea4", "test3.HG002.dragen.report:md5,4a53712a9d15fa6dfe6ddd5848ca691c", - "test3.dragen.bcftools_stats.txt:md5,5d2b48ac5f194f5a2cf01b9623a28cce", + "test3.dragen.bcftools_stats.txt:md5,73648a9eb3b6cd5049f579ef861637b2", "test3.dragen_mqc.stats:md5,011ad66fec4287d32cb728c40e240c0b" ] ], @@ -363,32 +362,47 @@ "nf-test": "0.9.2", "nextflow": "24.10.3" }, - "timestamp": "2025-01-13T13:53:07.620717905" + "timestamp": "2025-01-17T09:36:16.512872909" }, "-stub": { "content": [ - 144, + 121, { "BCFTOOLS_DEDUP": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BCFTOOLS_FILTER": { "bcftools": 1.2 }, "BCFTOOLS_NORM": { - "bcftools": 1.18 + "bcftools": 1.2 }, - "BCFTOOLS_REHEADER": { - "bcftools": 1.18 + "BCFTOOLS_REHEADER_1": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_2": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_3": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_4": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_QUERY": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_TRUTH": { + "bcftools": 1.2 }, "BCFTOOLS_SORT": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BCFTOOLS_STATS": { "bcftools": 1.18 }, "BCFTOOLS_VIEW_CONTIGS": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BGZIP_TABIX": { "tabix": 1.12 @@ -414,19 +428,13 @@ "SVYNC": { "svync": "0.1.2" }, - "TABIX_BGZIP": { - "tabix": "1.19.1" - }, - "TABIX_BGZIPTABIX": { - "tabix": "1.19.1" - }, "TABIX_BGZIP_QUERY": { "tabix": "1.19.1" }, "TABIX_BGZIP_TRUTH": { "tabix": "1.19.1" }, - "TABIX_TABIX": { + "TABIX_BGZIP_UNZIP": { "tabix": "1.19.1" }, "TRUVARI_BENCH": { @@ -451,7 +459,8 @@ "structural", "structural/HG002", 
"structural/HG002/preprocess", - "structural/HG002/preprocess/HG002_GRCh38_difficult_medical_gene_SV_benchmark_v0.01.chr21.rh.norm.dedup.sort.vcf.gz", + "structural/HG002/preprocess/HG002.vcf.gz", + "structural/HG002/preprocess/HG002.vcf.gz.tbi", "structural/HG002/stats", "structural/HG002/stats/bcftools", "structural/HG002/stats/bcftools/HG002.bcftools_stats.txt", @@ -568,12 +577,8 @@ "structural/test1/benchmarks/wittyer/test1.HG002.manta.vcf.gz", "structural/test1/benchmarks/wittyer/test1.HG002.manta.vcf.gz.tbi", "structural/test1/preprocess", - "structural/test1/preprocess/manta.HG002.chr21.norm.sort.vcf.gz", - "structural/test1/preprocess/manta.HG002.chr21.norm.vcf.gz", - "structural/test1/preprocess/test1.dedup.sort.vcf.gz", - "structural/test1/preprocess/test1.manta.svync.vcf.gz", - "structural/test1/preprocess/test1.norm.filter.filter.vcf", - "structural/test1/preprocess/test1.norm.filter.vcf", + "structural/test1/preprocess/test1.vcf.gz", + "structural/test1/preprocess/test1.vcf.gz.tbi", "structural/test1/stats", "structural/test1/stats/bcftools", "structural/test1/stats/bcftools/test1.manta.bcftools_stats.txt", @@ -602,11 +607,8 @@ "structural/test2/benchmarks/wittyer/test2.HG002.merged.vcf.gz", "structural/test2/benchmarks/wittyer/test2.HG002.merged.vcf.gz.tbi", "structural/test2/preprocess", - "structural/test2/preprocess/Ashkenazim_HG002.filtered.sv.chr21.norm.sort.vcf.gz", - "structural/test2/preprocess/Ashkenazim_HG002.filtered.sv.chr21.norm.vcf.gz", - "structural/test2/preprocess/test2.dedup.sort.vcf.gz", - "structural/test2/preprocess/test2.norm.filter.filter.vcf", - "structural/test2/preprocess/test2.norm.filter.vcf", + "structural/test2/preprocess/test2.vcf.gz", + "structural/test2/preprocess/test2.vcf.gz.tbi", "structural/test2/stats", "structural/test2/stats/bcftools", "structural/test2/stats/bcftools/test2.merged.bcftools_stats.txt", @@ -635,12 +637,8 @@ "structural/test3/benchmarks/wittyer/test3.HG002.dragen.vcf.gz", 
"structural/test3/benchmarks/wittyer/test3.HG002.dragen.vcf.gz.tbi", "structural/test3/preprocess", - "structural/test3/preprocess/HG002_DRAGEN_SV_hg19.chr21.norm.sort.vcf.gz", - "structural/test3/preprocess/HG002_DRAGEN_SV_hg19.chr21.norm.vcf.gz", - "structural/test3/preprocess/test3.dedup.sort.vcf.gz", - "structural/test3/preprocess/test3.dragen.svync.vcf.gz", - "structural/test3/preprocess/test3.norm.filter.filter.vcf", - "structural/test3/preprocess/test3.norm.filter.vcf", + "structural/test3/preprocess/test3.vcf.gz", + "structural/test3/preprocess/test3.vcf.gz.tbi", "structural/test3/stats", "structural/test3/stats/bcftools", "structural/test3/stats/bcftools/test3.dragen.bcftools_stats.txt", @@ -668,6 +666,6 @@ "nf-test": "0.9.2", "nextflow": "24.10.3" }, - "timestamp": "2025-01-13T13:55:36.985456265" + "timestamp": "2025-01-16T15:43:53.130315988" } } \ No newline at end of file diff --git a/tests/liftover_test.nf.test.snap b/tests/liftover_test.nf.test.snap index 023d23d..44a77ed 100644 --- a/tests/liftover_test.nf.test.snap +++ b/tests/liftover_test.nf.test.snap @@ -1,19 +1,31 @@ { "Params: --analysis 'germline' --variant_type 'structural' --method 'truvari' --liftover 'test'": { "content": [ - 67, + 57, { - "BCFTOOLS_FILTER": { + "BCFTOOLS_REHEADER_1": { "bcftools": 1.2 }, - "BCFTOOLS_REHEADER": { - "bcftools": 1.18 + "BCFTOOLS_REHEADER_2": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_3": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_4": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_QUERY": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_TRUTH": { + "bcftools": 1.2 }, "BCFTOOLS_STATS": { "bcftools": 1.18 }, "BCFTOOLS_VIEW_CONTIGS": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BGZIP_TABIX": { "tabix": 1.12 @@ -39,10 +51,7 @@ "TABIX_BGZIP": { "tabix": "1.19.1" }, - "TABIX_BGZIPTABIX": { - "tabix": "1.19.1" - }, - "TABIX_TABIX": { + "TABIX_BGZIP_UNZIP": { "tabix": "1.19.1" }, "TRUVARI_BENCH": { @@ -63,8 +72,9 @@ "references/dictionary/genome.dict", "structural", 
"structural/HG002", - "structural/HG002/liftover", - "structural/HG002/liftover/test2.renamechr.vcf.gz", + "structural/HG002/preprocess", + "structural/HG002/preprocess/HG002.vcf.gz", + "structural/HG002/preprocess/HG002.vcf.gz.tbi", "structural/HG002/stats", "structural/HG002/stats/bcftools", "structural/HG002/stats/bcftools/HG002.bcftools_stats.txt", @@ -167,8 +177,8 @@ "structural/test1/benchmarks/truvari/test1.HG002.delly.tp-comp.vcf.gz", "structural/test1/benchmarks/truvari/test1.HG002.delly.tp-comp.vcf.gz.tbi", "structural/test1/preprocess", - "structural/test1/preprocess/test1.filter.filter.vcf", - "structural/test1/preprocess/test1.filter.vcf", + "structural/test1/preprocess/test1.vcf.gz", + "structural/test1/preprocess/test1.vcf.gz.tbi", "structural/test1/stats", "structural/test1/stats/bcftools", "structural/test1/stats/bcftools/test1.delly.bcftools_stats.txt", @@ -186,9 +196,12 @@ "structural/test2/benchmarks/truvari/test2.HG002.manta.tp-base.vcf.gz.tbi", "structural/test2/benchmarks/truvari/test2.HG002.manta.tp-comp.vcf.gz", "structural/test2/benchmarks/truvari/test2.HG002.manta.tp-comp.vcf.gz.tbi", + "structural/test2/preporcess", + "structural/test2/preporcess/liftover", + "structural/test2/preporcess/liftover/test2.reformatted.renamechr.vcf.gz", "structural/test2/preprocess", - "structural/test2/preprocess/test2.filter.filter.vcf", - "structural/test2/preprocess/test2.filter.vcf", + "structural/test2/preprocess/test2.vcf.gz", + "structural/test2/preprocess/test2.vcf.gz.tbi", "structural/test2/stats", "structural/test2/stats/bcftools", "structural/test2/stats/bcftools/test2.manta.bcftools_stats.txt", @@ -196,18 +209,18 @@ "structural/test2/stats/survivor/test2.manta_mqc.stats" ], [ - "HG002.bcftools_stats.txt:md5,7c007a87b5730787e570712e784d3cc3", + "HG002.bcftools_stats.txt:md5,142c1fab782c9eb2aa932440e1589461", "HG002_mqc.stats:md5,68681df47b35e3193be03610f5c6e3d6", - "test1.delly.bcftools_stats.txt:md5,aace5a23af8b02f56bcaf03722f9bb50", - 
"test1.delly_mqc.stats:md5,011ad66fec4287d32cb728c40e240c0b", - "test2.manta.bcftools_stats.txt:md5,1e700076bd303ee079b23658485eb7d4", - "test2.manta_mqc.stats:md5,011ad66fec4287d32cb728c40e240c0b" + "test1.delly.bcftools_stats.txt:md5,f204ab10d82b3cbc9ee51cb6ad548121", + "test1.delly_mqc.stats:md5,e140ad55975c767578b0dd6aff58ba29", + "test2.manta.bcftools_stats.txt:md5,0de1446278a5448b7dc5a02bcbf927a4", + "test2.manta_mqc.stats:md5,1d74b41c6b970992e3a39682b0a68e23" ] ], "meta": { "nf-test": "0.9.2", "nextflow": "24.10.3" }, - "timestamp": "2025-01-13T14:41:30.39094733" + "timestamp": "2025-01-17T10:22:18.275352814" } } \ No newline at end of file diff --git a/tests/liftover_truth.nf.test.snap b/tests/liftover_truth.nf.test.snap index b6d9288..98508cc 100644 --- a/tests/liftover_truth.nf.test.snap +++ b/tests/liftover_truth.nf.test.snap @@ -1,31 +1,46 @@ { "Params: --analysis 'germline' --variant_type 'small' --method 'happy,rtgtools' --liftover 'truth'": { "content": [ - 85, + 68, { "BCFTOOLS_DEDUP": { - "bcftools": 1.18 - }, - "BCFTOOLS_FILTER": { "bcftools": 1.2 }, "BCFTOOLS_MERGE": { "bcftools": 1.2 }, "BCFTOOLS_NORM": { - "bcftools": 1.18 + "bcftools": 1.2 }, - "BCFTOOLS_REHEADER": { - "bcftools": 1.18 + "BCFTOOLS_REHEADER_1": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_2": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_3": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_4": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_QUERY": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_TRUTH": { + "bcftools": 1.2 }, "BCFTOOLS_SORT": { - "bcftools": 1.18 + "bcftools": 1.2 + }, + "BCFTOOLS_SPLIT_MULTI": { + "bcftools": 1.2 }, "BCFTOOLS_STATS": { "bcftools": 1.18 }, "BCFTOOLS_VIEW_CONTIGS": { - "bcftools": 1.18 + "bcftools": 1.2 }, "DATAVZRD": { "datavzrd": "2.36.12" @@ -48,12 +63,6 @@ "RTGTOOLS_VCFEVAL": { "rtg-tools": "3.12.1" }, - "TABIX_BGZIPTABIX": { - "tabix": "1.19.1" - }, - "TABIX_TABIX": { - "tabix": "1.19.1" - }, "UCSC_LIFTOVER": { "ucsc": 377 }, @@ -95,11 +104,12 @@ 
"references/rtgtools/genome.sdf/summary.txt", "small", "small/HG002", - "small/HG002/liftover", - "small/HG002/liftover/HG002.renamechr.vcf.gz", - "small/HG002/liftover/HG002.sort.merged.bed", "small/HG002/preprocess", - "small/HG002/preprocess/HG002.renamechr.rh.norm.dedup.sort.vcf.gz", + "small/HG002/preprocess/HG002.vcf.gz", + "small/HG002/preprocess/HG002.vcf.gz.tbi", + "small/HG002/preprocess/liftover", + "small/HG002/preprocess/liftover/HG002.reformatted.renamechr.vcf.gz", + "small/HG002/preprocess/liftover/HG002_GRCh37_1_22_v4.2.1_highconf.liftedsort.merged.bed", "small/HG002/stats", "small/HG002/stats/bcftools", "small/HG002/stats/bcftools/HG002.bcftools_stats.txt", @@ -268,8 +278,8 @@ "small/test1/benchmarks/rtgtools/test1.HG002.strelka.tp.vcf.gz.tbi", "small/test1/benchmarks/rtgtools/test1.HG002.strelka.weighted_roc.tsv.gz", "small/test1/preprocess", - "small/test1/preprocess/test1.dedup.sort.vcf.gz", - "small/test1/preprocess/test1.norm.filter.vcf", + "small/test1/preprocess/test1.vcf.gz", + "small/test1/preprocess/test1.vcf.gz.tbi", "small/test1/stats", "small/test1/stats/bcftools", "small/test1/stats/bcftools/test1.strelka.bcftools_stats.txt", @@ -302,8 +312,8 @@ "small/test2/benchmarks/rtgtools/test2.HG002.bcftools.tp.vcf.gz.tbi", "small/test2/benchmarks/rtgtools/test2.HG002.bcftools.weighted_roc.tsv.gz", "small/test2/preprocess", - "small/test2/preprocess/test2.dedup.sort.vcf.gz", - "small/test2/preprocess/test2.norm.filter.vcf", + "small/test2/preprocess/test2.vcf.gz", + "small/test2/preprocess/test2.vcf.gz.tbi", "small/test2/stats", "small/test2/stats/bcftools", "small/test2/stats/bcftools/test2.bcftools.bcftools_stats.txt" @@ -325,18 +335,18 @@ "suffixIndex0:md5,b7bb2ba061ab54c0bf07c0a941d0277a", "suffixdata0:md5,f2876dd730673cd49c4de191001f634e", "suffixpointer0:md5,468281ffb10d7dd934289af762a03781", - "HG002.sort.merged.bed:md5,5e997133249c2227cb5363b314245636", - "HG002.bcftools_stats.txt:md5,07fafa430abc857969015ea6a41d1032", - 
"test1.HG002.strelka.extended.csv:md5,111a236ca6b747face70e8111c09c51f", - "test1.HG002.strelka.roc.Locations.INDEL.PASS.csv.gz:md5,5425b34f4688dda4d55f9f7a73667b18", - "test1.HG002.strelka.roc.Locations.INDEL.csv.gz:md5,eb9834ec5660d1e32ade0c856b9cfb7c", - "test1.HG002.strelka.roc.Locations.SNP.PASS.csv.gz:md5,778eb91f54cb830fb91fcddc688668e0", - "test1.HG002.strelka.roc.Locations.SNP.csv.gz:md5,516a4b58814cf32245a5b2f2c0445c8e", - "test1.HG002.strelka.roc.all.csv.gz:md5,a69fb6ae5cd051d4607da0224009f1d1", - "test1.HG002.strelka.summary.csv:md5,d29cba645e161dbdeb6247f57fce5780", - "test1.HG002.strelka.phasing.txt:md5,38920536b8c3e241e873c07ba61762e6", - "test1.HG002.strelka.summary.txt:md5,6c43a0d1ad065f851942828259d551a5", - "test1.strelka.bcftools_stats.txt:md5,7d65792aa3a84de09675facf62135c93", + "HG002_GRCh37_1_22_v4.2.1_highconf.liftedsort.merged.bed:md5,5e997133249c2227cb5363b314245636", + "HG002.bcftools_stats.txt:md5,db248c6e7e92253abd836c1d8a326371", + "test1.HG002.strelka.extended.csv:md5,fadf06276179c1cc388007e5651fb9af", + "test1.HG002.strelka.roc.Locations.INDEL.PASS.csv.gz:md5,a08665be8ea20855a14d6418d07521dc", + "test1.HG002.strelka.roc.Locations.INDEL.csv.gz:md5,8d4a20a5914379230952aeb252845e2d", + "test1.HG002.strelka.roc.Locations.SNP.PASS.csv.gz:md5,c546609a3e23b4c5c97ee0fc2e864e07", + "test1.HG002.strelka.roc.Locations.SNP.csv.gz:md5,8d2600c17665ed1299e2dd483705ddaa", + "test1.HG002.strelka.roc.all.csv.gz:md5,d0a4408438b2182cb82caec918c090f1", + "test1.HG002.strelka.summary.csv:md5,001fa2371777d1d2dc3eb0f4dcaca197", + "test1.HG002.strelka.phasing.txt:md5,0c9340e64e032b5f74d0ea3773464afd", + "test1.HG002.strelka.summary.txt:md5,4b775edb878759727ef66d5d5b2b9ee8", + "test1.strelka.bcftools_stats.txt:md5,869e7eac4d1bde15c33ac3598af3b58e", "test2.HG002.bcftools.extended.csv:md5,05c6efd85a9b9823a5ddfe6619be3323", "test2.HG002.bcftools.roc.Locations.INDEL.PASS.csv.gz:md5,463cee548b7bdf133c0ae9b5803a50ca", 
"test2.HG002.bcftools.roc.Locations.INDEL.csv.gz:md5,424e39c327ae6ef9001942bc895abdc9", @@ -346,13 +356,13 @@ "test2.HG002.bcftools.summary.csv:md5,38f3d8eb32c4d006a5623777ad265c39", "test2.HG002.bcftools.phasing.txt:md5,38920536b8c3e241e873c07ba61762e6", "test2.HG002.bcftools.summary.txt:md5,264eb2b064a9ee4b098470a13c2887c9", - "test2.bcftools.bcftools_stats.txt:md5,57aff2f0a6f830e920869b987502a343" + "test2.bcftools.bcftools_stats.txt:md5,2efc3e61c4aaa5bb2da48803c81d8fcc" ] ], "meta": { "nf-test": "0.9.2", "nextflow": "24.10.3" }, - "timestamp": "2025-01-13T14:48:00.444469734" + "timestamp": "2025-01-17T09:02:13.833941457" } } \ No newline at end of file diff --git a/tests/somatic_indel.nf.test.snap b/tests/somatic_indel.nf.test.snap index 5f59ce6..a446e99 100644 --- a/tests/somatic_indel.nf.test.snap +++ b/tests/somatic_indel.nf.test.snap @@ -1,28 +1,28 @@ { "Params: --analysis 'somatic' --variant_type 'indel' --method 'sompy'": { "content": [ - 27, + 21, { - "BCFTOOLS_FILTER": { + "BCFTOOLS_REHEADER_QUERY": { "bcftools": 1.2 }, - "BCFTOOLS_REHEADER": { - "bcftools": 1.18 + "BCFTOOLS_REHEADER_TRUTH": { + "bcftools": 1.2 }, "BCFTOOLS_SORT": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BCFTOOLS_STATS": { "bcftools": 1.18 }, "BCFTOOLS_VIEW_CONTIGS": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BCFTOOLS_VIEW_FILTERMISSING": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BCFTOOLS_VIEW_SUBSAMPLE": { - "bcftools": 1.18 + "bcftools": 1.2 }, "DATAVZRD": { "datavzrd": "2.36.12" @@ -36,9 +36,6 @@ "PLOTS": { "r-base": "4.3.1" }, - "TABIX_BGZIPTABIX": { - "tabix": "1.19.1" - }, "Workflow": { "nf-core/variantbenchmarking": "v1.0dev" } @@ -46,6 +43,9 @@ [ "indel", "indel/SEQC2", + "indel/SEQC2/preprocess", + "indel/SEQC2/preprocess/SEQC2.vcf.gz", + "indel/SEQC2/preprocess/SEQC2.vcf.gz.tbi", "indel/SEQC2/stats", "indel/SEQC2/stats/bcftools", "indel/SEQC2/stats/bcftools/SEQC2.bcftools_stats.txt", @@ -151,7 +151,8 @@ "indel/test1/benchmarks/sompy/test1.SEQC2.freebayes.metrics.json", 
"indel/test1/benchmarks/sompy/test1.SEQC2.freebayes.stats.csv", "indel/test1/preprocess", - "indel/test1/preprocess/test1.filter.vcf", + "indel/test1/preprocess/test1.vcf.gz", + "indel/test1/preprocess/test1.vcf.gz.tbi", "indel/test1/stats", "indel/test1/stats/bcftools", "indel/test1/stats/bcftools/test1.freebayes.bcftools_stats.txt", @@ -162,8 +163,8 @@ "indel/test2/benchmarks/sompy/test2.SEQC2.strelka.metrics.json", "indel/test2/benchmarks/sompy/test2.SEQC2.strelka.stats.csv", "indel/test2/preprocess", - "indel/test2/preprocess/HCC1395T_vs_HCC1395N.strelka.somatic_indels.sort.vcf.gz", - "indel/test2/preprocess/test2.filter.vcf", + "indel/test2/preprocess/test2.vcf.gz", + "indel/test2/preprocess/test2.vcf.gz.tbi", "indel/test2/stats", "indel/test2/stats/bcftools", "indel/test2/stats/bcftools/test2.strelka.bcftools_stats.txt", @@ -171,15 +172,15 @@ "pipeline_info/nf_core_pipeline_software_mqc_versions.yml" ], [ - "SEQC2.bcftools_stats.txt:md5,e530daf4f6a4923f1cd85c51893d5747", - "test1.freebayes.bcftools_stats.txt:md5,c5025b5c6256c1d808badee73a7c220c", - "test2.strelka.bcftools_stats.txt:md5,1e700076bd303ee079b23658485eb7d4" + "SEQC2.bcftools_stats.txt:md5,97beec3b6ce6471625018d91c75df08d", + "test1.freebayes.bcftools_stats.txt:md5,ca4f9a72e5cb852b78dbdf92142a4a6f", + "test2.strelka.bcftools_stats.txt:md5,a098fdaee64c05cd8e9efcc13c2b374d" ] ], "meta": { "nf-test": "0.9.2", "nextflow": "24.10.3" }, - "timestamp": "2025-01-13T14:32:00.898226798" + "timestamp": "2025-01-17T09:20:01.232442793" } } \ No newline at end of file diff --git a/tests/somatic_snv.nf.test.snap b/tests/somatic_snv.nf.test.snap index 45b7e8c..765e3f6 100644 --- a/tests/somatic_snv.nf.test.snap +++ b/tests/somatic_snv.nf.test.snap @@ -1,28 +1,28 @@ { "-stub": { "content": [ - 38, + 29, { - "BCFTOOLS_FILTER": { + "BCFTOOLS_REHEADER_QUERY": { "bcftools": 1.2 }, - "BCFTOOLS_REHEADER": { - "bcftools": 1.18 + "BCFTOOLS_REHEADER_TRUTH": { + "bcftools": 1.2 }, "BCFTOOLS_SORT": { - "bcftools": 1.18 + 
"bcftools": 1.2 }, "BCFTOOLS_STATS": { "bcftools": 1.18 }, "BCFTOOLS_VIEW_CONTIGS": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BCFTOOLS_VIEW_FILTERMISSING": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BCFTOOLS_VIEW_SUBSAMPLE": { - "bcftools": 1.18 + "bcftools": 1.2 }, "DATAVZRD": { "datavzrd": "2.36.12" @@ -36,9 +36,6 @@ "PLOTS": { "r-base": "4.3.1" }, - "TABIX_BGZIPTABIX": { - "tabix": "1.19.1" - }, "Workflow": { "nf-core/variantbenchmarking": "v1.0dev" } @@ -48,6 +45,9 @@ "pipeline_info/nf_core_pipeline_software_mqc_versions.yml", "snv", "snv/SEQC2", + "snv/SEQC2/preprocess", + "snv/SEQC2/preprocess/SEQC2.vcf.gz", + "snv/SEQC2/preprocess/SEQC2.vcf.gz.tbi", "snv/SEQC2/stats", "snv/SEQC2/stats/bcftools", "snv/SEQC2/stats/bcftools/SEQC2.bcftools_stats.txt", @@ -91,7 +91,8 @@ "snv/test1/benchmarks/sompy/test1.SEQC2.freebayes.metrics.json", "snv/test1/benchmarks/sompy/test1.SEQC2.freebayes.stats.csv", "snv/test1/preprocess", - "snv/test1/preprocess/test1.filter.vcf", + "snv/test1/preprocess/test1.vcf.gz", + "snv/test1/preprocess/test1.vcf.gz.tbi", "snv/test1/stats", "snv/test1/stats/bcftools", "snv/test1/stats/bcftools/test1.freebayes.bcftools_stats.txt", @@ -102,8 +103,8 @@ "snv/test2/benchmarks/sompy/test2.SEQC2.manta.metrics.json", "snv/test2/benchmarks/sompy/test2.SEQC2.manta.stats.csv", "snv/test2/preprocess", - "snv/test2/preprocess/HCC1395T_vs_HCC1395N.manta.somatic_sv.sort.vcf.gz", - "snv/test2/preprocess/test2.filter.vcf", + "snv/test2/preprocess/test2.vcf.gz", + "snv/test2/preprocess/test2.vcf.gz.tbi", "snv/test2/stats", "snv/test2/stats/bcftools", "snv/test2/stats/bcftools/test2.manta.bcftools_stats.txt", @@ -114,8 +115,8 @@ "snv/test3/benchmarks/sompy/test3.SEQC2.strelka.metrics.json", "snv/test3/benchmarks/sompy/test3.SEQC2.strelka.stats.csv", "snv/test3/preprocess", - "snv/test3/preprocess/HCC1395T_vs_HCC1395N.strelka.somatic_snvs.sort.vcf.gz", - "snv/test3/preprocess/test3.filter.vcf", + "snv/test3/preprocess/test3.vcf.gz", + 
"snv/test3/preprocess/test3.vcf.gz.tbi", "snv/test3/stats", "snv/test3/stats/bcftools", "snv/test3/stats/bcftools/test3.strelka.bcftools_stats.txt" @@ -131,6 +132,6 @@ "nf-test": "0.9.2", "nextflow": "24.10.3" }, - "timestamp": "2025-01-13T14:15:33.922876808" + "timestamp": "2025-01-17T09:17:06.41202116" } } \ No newline at end of file diff --git a/tests/somatic_sv.nf.test.snap b/tests/somatic_sv.nf.test.snap index f5cf4b6..c495cc5 100644 --- a/tests/somatic_sv.nf.test.snap +++ b/tests/somatic_sv.nf.test.snap @@ -1,28 +1,40 @@ { "Params: --analysis 'somatic' --variant_type 'structural' --method 'truvari,svbenchmark'": { "content": [ - 69, + 53, { - "BCFTOOLS_FILTER": { + "BCFTOOLS_REHEADER_1": { "bcftools": 1.2 }, - "BCFTOOLS_REHEADER": { - "bcftools": 1.18 + "BCFTOOLS_REHEADER_2": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_3": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_4": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_QUERY": { + "bcftools": 1.2 + }, + "BCFTOOLS_REHEADER_TRUTH": { + "bcftools": 1.2 }, "BCFTOOLS_SORT": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BCFTOOLS_STATS": { "bcftools": 1.18 }, "BCFTOOLS_VIEW_CONTIGS": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BCFTOOLS_VIEW_FILTERMISSING": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BCFTOOLS_VIEW_SUBSAMPLE": { - "bcftools": 1.18 + "bcftools": 1.2 }, "BGZIP_TABIX": { "tabix": 1.12 @@ -42,13 +54,7 @@ "SURVIVOR_STATS": { "survivor": "1.0.7" }, - "TABIX_BGZIP": { - "tabix": "1.19.1" - }, - "TABIX_BGZIPTABIX": { - "tabix": "1.19.1" - }, - "TABIX_TABIX": { + "TABIX_BGZIP_UNZIP": { "tabix": "1.19.1" }, "TRUVARI_BENCH": { @@ -66,6 +72,9 @@ "pipeline_info/nf_core_pipeline_software_mqc_versions.yml", "structural", "structural/SEQC2", + "structural/SEQC2/preprocess", + "structural/SEQC2/preprocess/SEQC2.vcf.gz", + "structural/SEQC2/preprocess/SEQC2.vcf.gz.tbi", "structural/SEQC2/stats", "structural/SEQC2/stats/bcftools", "structural/SEQC2/stats/bcftools/SEQC2.bcftools_stats.txt", @@ -74,6 +83,10 @@ 
"structural/multiqc", "structural/multiqc/multiqc_data", "structural/multiqc/multiqc_data/bcftools_stats_indel-lengths.txt", + "structural/multiqc/multiqc_data/bcftools_stats_vqc_Count_Indels.txt", + "structural/multiqc/multiqc_data/bcftools_stats_vqc_Count_SNP.txt", + "structural/multiqc/multiqc_data/bcftools_stats_vqc_Count_Transitions.txt", + "structural/multiqc/multiqc_data/bcftools_stats_vqc_Count_Transversions.txt", "structural/multiqc/multiqc_data/multiqc.log", "structural/multiqc/multiqc_data/multiqc_bcftools_stats.txt", "structural/multiqc/multiqc_data/multiqc_citations.txt", @@ -85,16 +98,28 @@ "structural/multiqc/multiqc_plots", "structural/multiqc/multiqc_plots/pdf", "structural/multiqc/multiqc_plots/pdf/bcftools_stats_indel-lengths.pdf", + "structural/multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Indels.pdf", + "structural/multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_SNP.pdf", + "structural/multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transitions.pdf", + "structural/multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transversions.pdf", "structural/multiqc/multiqc_plots/pdf/general_stats_table.pdf", "structural/multiqc/multiqc_plots/pdf/survivor-cnt.pdf", "structural/multiqc/multiqc_plots/pdf/survivor-pct.pdf", "structural/multiqc/multiqc_plots/png", "structural/multiqc/multiqc_plots/png/bcftools_stats_indel-lengths.png", + "structural/multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Indels.png", + "structural/multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_SNP.png", + "structural/multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transitions.png", + "structural/multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transversions.png", "structural/multiqc/multiqc_plots/png/general_stats_table.png", "structural/multiqc/multiqc_plots/png/survivor-cnt.png", "structural/multiqc/multiqc_plots/png/survivor-pct.png", "structural/multiqc/multiqc_plots/svg", "structural/multiqc/multiqc_plots/svg/bcftools_stats_indel-lengths.svg", + 
"structural/multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Indels.svg", + "structural/multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_SNP.svg", + "structural/multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transitions.svg", + "structural/multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transversions.svg", "structural/multiqc/multiqc_plots/svg/general_stats_table.svg", "structural/multiqc/multiqc_plots/svg/survivor-cnt.svg", "structural/multiqc/multiqc_plots/svg/survivor-pct.svg", @@ -152,9 +177,8 @@ "structural/test1/benchmarks/truvari/test1.SEQC2.tiddit.tp-comp.vcf.gz", "structural/test1/benchmarks/truvari/test1.SEQC2.tiddit.tp-comp.vcf.gz.tbi", "structural/test1/preprocess", - "structural/test1/preprocess/HCC1395T_vs_HCC1395N.tiddit_sv_merge.sort.vcf.gz", - "structural/test1/preprocess/test1.filter.filter.vcf", - "structural/test1/preprocess/test1.filter.vcf", + "structural/test1/preprocess/test1.vcf.gz", + "structural/test1/preprocess/test1.vcf.gz.tbi", "structural/test1/stats", "structural/test1/stats/bcftools", "structural/test1/stats/bcftools/test1.tiddit.bcftools_stats.txt", @@ -173,9 +197,8 @@ "structural/test2/benchmarks/truvari/test2.SEQC2.manta.tp-comp.vcf.gz", "structural/test2/benchmarks/truvari/test2.SEQC2.manta.tp-comp.vcf.gz.tbi", "structural/test2/preprocess", - "structural/test2/preprocess/HCC1395T_vs_HCC1395N.manta.somatic_sv.sort.vcf.gz", - "structural/test2/preprocess/test2.filter.filter.vcf", - "structural/test2/preprocess/test2.filter.vcf", + "structural/test2/preprocess/test2.vcf.gz", + "structural/test2/preprocess/test2.vcf.gz.tbi", "structural/test2/stats", "structural/test2/stats/bcftools", "structural/test2/stats/bcftools/test2.manta.bcftools_stats.txt", @@ -183,18 +206,18 @@ "structural/test2/stats/survivor/test2.manta_mqc.stats" ], [ - "SEQC2.bcftools_stats.txt:md5,a60de9c4d2f3db87d8ae404b73859bfa", + "SEQC2.bcftools_stats.txt:md5,4830a787124a30a79fb0ec3c1d8c6206", "SEQC2_mqc.stats:md5,a1327ec0cd3131f9e16cccf0024f61a1", - 
"test1.tiddit.bcftools_stats.txt:md5,aace5a23af8b02f56bcaf03722f9bb50", - "test1.tiddit_mqc.stats:md5,011ad66fec4287d32cb728c40e240c0b", - "test2.manta.bcftools_stats.txt:md5,1e700076bd303ee079b23658485eb7d4", - "test2.manta_mqc.stats:md5,011ad66fec4287d32cb728c40e240c0b" + "test1.tiddit.bcftools_stats.txt:md5,cbd62a9be5344b2d6223f8600483e6a2", + "test1.tiddit_mqc.stats:md5,0eb749429a6072f6bf63d377e37514d1", + "test2.manta.bcftools_stats.txt:md5,1926259ac4ad828a4a50a8e07e2b1d6f", + "test2.manta_mqc.stats:md5,73bc60cd2754202ddca8be6552e85ffa" ] ], "meta": { "nf-test": "0.9.2", "nextflow": "24.10.3" }, - "timestamp": "2025-01-13T14:18:00.311055864" + "timestamp": "2025-01-17T09:27:15.598502393" } } \ No newline at end of file