mrdetect

Workflow for MRdetect, detection of Minimal Residual Disease from paired tumor-plasma sample

Overview

Dependencies

Usage

Cromwell

java -jar cromwell.jar run mrdetect.wdl --inputs inputs.json

Inputs

Required workflow parameters:

Parameter	Value	Description
`tumorSampleName`	String	ID for WGS tumor sample, must match .vcf header
`tumorvcf`	File	tumor vcf file, bgzip
`tumorvcfindex`	File	tumor vcf index file
`reference`	String	genome reference build. Only hg38 supported
`instrument`	String	sequencing instrument used (Illumina NovaSeq X Plus or Illumina NovaSeq 6000)

Optional workflow parameters:

Parameter	Value	Default	Description
`plasmabam`	File?	None	plasma input .bam file
`plasmabai`	File?	None	plasma input .bai file
`plasmaSampleName`	String?	None	name for plasma sample (from bam)
`full_analysis_mode`	Boolean	true	Enable full analysis mode with this flag

Optional task parameters:

Parameter	Value	Default	Description
`filterVCF.tumorVCFfilter`	String	"FILTER~'haplotype'	FILTER~'clustered_events'
`filterVCF.tumorVAF`	String	"0.1"	Variant Allele Frequency for tumor VCF
`filterVCF.jobMemory`	Int	64	Memory allocated for this job (GB)
`filterVCF.threads`	Int	4	Requested CPU threads
`filterVCF.timeout`	Int	10	Hours before task timeout
`parseControls.jobMemory`	Int	4	Memory for this task in GB
`parseControls.timeout`	Int	12	Timeout in hours, needed to override imposed limits
`controlPullReads.jobMemory`	Int	64	Memory allocated for this job (GB)
`controlPullReads.threads`	Int	4	Requested CPU threads
`controlPullReads.timeout`	Int	20	Hours before task timeout
`controlPullReads.pullreadsScript`	String	"$MRDETECT_ROOT/bin/pull_reads"	pull_reads.py executable
`controlQualityScore.jobMemory`	Int	64	Memory allocated for this job (GB)
`controlQualityScore.threads`	Int	4	Requested CPU threads
`controlQualityScore.timeout`	Int	20	Hours before task timeout
`controlQualityScore.pickle`	String	"$MRDETECT_ROOT/bin/MRDetectSNV/trained_SVM.pkl"	trained pickle for detecting real tumor reads
`controlQualityScore.qualityscoreScript`	String	"$MRDETECT_ROOT/bin/quality_score"	quality_score.py executable
`controlDetectSNVs.tumorSampleName`	String	basename(tumorvcf,".vcf")	name for tumour sample (from vcf)
`controlDetectSNVs.jobMemory`	Int	64	Memory allocated for this job (GB)
`controlDetectSNVs.threads`	Int	4	Requested CPU threads
`controlDetectSNVs.timeout`	Int	20	Hours before task timeout
`controlDetectSNVs.blocklist`	String	"$PWGS_BLOCKLIST_ROOT/blocklist.vcf.gz"	list of sites to exclude from analysis, gzipped
`controlDetectSNVs.filterAndDetectScript`	String	"$MRDETECT_ROOT/bin/filterAndDetect"	filterAndDetect.py executable
`samplePullReads.jobMemory`	Int	64	Memory allocated for this job (GB)
`samplePullReads.threads`	Int	4	Requested CPU threads
`samplePullReads.timeout`	Int	20	Hours before task timeout
`samplePullReads.pullreadsScript`	String	"$MRDETECT_ROOT/bin/pull_reads"	pull_reads.py executable
`sampleQualityScore.jobMemory`	Int	64	Memory allocated for this job (GB)
`sampleQualityScore.threads`	Int	4	Requested CPU threads
`sampleQualityScore.timeout`	Int	20	Hours before task timeout
`sampleQualityScore.pickle`	String	"$MRDETECT_ROOT/bin/MRDetectSNV/trained_SVM.pkl"	trained pickle for detecting real tumor reads
`sampleQualityScore.qualityscoreScript`	String	"$MRDETECT_ROOT/bin/quality_score"	quality_score.py executable
`sampleDetectSNVs.tumorSampleName`	String	basename(tumorvcf,".vcf")	name for tumour sample (from vcf)
`sampleDetectSNVs.jobMemory`	Int	64	Memory allocated for this job (GB)
`sampleDetectSNVs.threads`	Int	4	Requested CPU threads
`sampleDetectSNVs.timeout`	Int	20	Hours before task timeout
`sampleDetectSNVs.blocklist`	String	"$PWGS_BLOCKLIST_ROOT/blocklist.vcf.gz"	list of sites to exclude from analysis, gzipped
`sampleDetectSNVs.filterAndDetectScript`	String	"$MRDETECT_ROOT/bin/filterAndDetect"	filterAndDetect.py executable
`snvDetectionSummary.pvalue`	String	"0.00001"	p-value for HBC error rate
`snvDetectionSummary.jobMemory`	Int	20	Memory allocated for this job (GB)
`snvDetectionSummary.threads`	Int	1	Requested CPU threads
`snvDetectionSummary.timeout`	Int	2	Hours before task timeout
`snvDetectionSummary.modules`	String	"mrdetect/1.1.1"	Required environment modules
`snvDetectionSummary.pwgtestscript`	String	"$MRDETECT_ROOT/bin/pwg_test"	executable of pwg_test.R

Outputs

Output	Type	Description	Labels
`snvDetectionResult`	File?	Result from SNV detection incl sample HBCs	vidarr_label: snvDetectionResult
`pWGS_svg`	File?	pWGS svg	vidarr_label: pWGS_svg
`snpcount`	File	number of SNPs in vcf after filtering	vidarr_label: snpcount
`snvDetectionVAF`	File?	VAF from SNV detection for sample	vidarr_label: snvDetectionVAF
`final_call`	File?	Final file of mrdetect results	vidarr_label: final_call
`filteredvcf`	File?	Filtered vcf

Commands

This section lists command(s) run by mrdetect workflow

Running mrdtect

		set -euo pipefail

		$BCFTOOLS_ROOT/bin/bcftools view -s ~{tumorSampleName} --regions-file ~{difficultRegions} ~{tumorvcf} |\
		$BCFTOOLS_ROOT/bin/bcftools norm --multiallelics - --fasta-ref ~{genome} |\
		$BCFTOOLS_ROOT/bin/bcftools filter -i "TYPE='snps'" |\
		$BCFTOOLS_ROOT/bin/bcftools filter -e "~{tumorVCFfilter}" |\
		$BCFTOOLS_ROOT/bin/bcftools filter -i "(FORMAT/AD[0:1])/(FORMAT/AD[0:0]+FORMAT/AD[0:1]) >= ~{tumorVAF}" > ~{tumorSampleName}.SNP.vcf

		awk '$1 !~ "#" {print}' ~{tumorSampleName}.SNP.vcf | wc -l > ~{tumorSampleName}.SNP.count.txt

		set -euo pipefail

		~{pullreadsScript} \
			--bam ~{plasmabam} \
			--vcf ~{tumorvcf} \
			--out PLASMA_VS_TUMOR.tsv

		set -euo pipefail

		~{qualityscoreScript} \
			--pickle-name ~{pickle} \
			--detections ~{snvDetectionReads} \
			--output_file PLASMA_VS_TUMOR.svm.tsv

		set -euo pipefail

		~{filterAndDetectScript} \
			--vcfid ~{tumorSampleName} --bamid ~{plasmaSampleName} \
			--svm ~{snvDetectionReadsScored} \
			--vcf ~{tumorvcf} \
			--output ./ \
			--blocklist ~{blocklist} 
		mv ~{plasmaSampleName}.mrdetect.results.csv ~{tumorSampleName}_~{plasmaSampleName}.mrdetect.results.csv
		mv ~{plasmaSampleName}.mrdetect.vaf.txt ~{tumorSampleName}_~{plasmaSampleName}.mrdetect.vaf.txt

		python <<CODE
		import os

		with open(os.environ.get("~{controlFileList}")) as f:
			for line in f:
				line = line.rstrip()
				tmp = line.split("\t")
				r = tmp[0] + "\t" + tmp[1]
				print(r)
		f.close()
		CODE

		set -euo pipefail

		cat ~{sep=' ' controlCalls} | awk '$1 !~ "BAM" {print}' > HBCs.csv

		cat ~{sampleCalls} HBCs.csv > ~{plasmaSampleName}.HBCs.csv

		~{pwgtestscript} \
			--sampleName ~{plasmaSampleName} \
			--results ~{plasmaSampleName}.HBCs.csv \
			--candidateSNVsCountFile ~{snpcount} \
			--vafFile ~{vafFile} \
			--pval ~{pvalue} 
		mv ~{plasmaSampleName}.pWGS.svg ~{tumorSampleName}_~{plasmaSampleName}.pWGS.svg
		mv ~{plasmaSampleName}.HBCs.csv ~{tumorSampleName}_~{plasmaSampleName}.HBCs.csv
		mv ~{plasmaSampleName}.mrdetect.txt ~{tumorSampleName}_~{plasmaSampleName}.mrdetect.txt

Support

For support, please file an issue on the Github project or send an email to [email protected] .

Generated with generate-markdown-readme (https://github.com/oicr-gsi/gsi-wdl-tools/)

Name		Name	Last commit message	Last commit date
Latest commit History 151 Commits
tests		tests
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
Jenkinsfile		Jenkinsfile
README.md		README.md
commands.txt		commands.txt
mrdetect.wdl		mrdetect.wdl
vidarrbuild.json		vidarrbuild.json
vidarrtest-regression.json.in		vidarrtest-regression.json.in

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

mrdetect

Overview

Dependencies

Usage

Cromwell

Inputs

Required workflow parameters:

Optional workflow parameters:

Optional task parameters:

Outputs

Commands

Support

About

Releases 6

Packages

Contributors 7

Languages

oicr-gsi/mrdetect

Folders and files

Latest commit

History

Repository files navigation

mrdetect

Overview

Dependencies

Usage

Cromwell

Inputs

Required workflow parameters:

Optional workflow parameters:

Optional task parameters:

Outputs

Commands

Support

About

Topics

Resources

Stars

Watchers

Forks

Releases 6

Packages 0

Contributors 7

Languages

Packages