How to specify ploidy for HaplotypeCaller in the nf-core/sarek pipeline? #1771

Open
koraykaplan88 opened this issue Jan 13, 2025 · 4 comments
Labels
question Further information is requested

Comments

koraykaplan88 commented Jan 13, 2025

Hello,

I am using the nf-core/sarek pipeline to analyze genomic data. The pipeline works perfectly fine for my diploid plant samples, and I was able to run it without any issues.

However, I am now analyzing polyploid species (specifically triploid and tetraploid species), and I couldn't find a direct way to specify the --ploidy parameter for GATK HaplotypeCaller in the pipeline.

My goal is to ensure that the VCF files generated by HaplotypeCaller reflect the triploid/tetraploid nature of my samples. I have tried adding --haplotypecaller_options '--ploidy 4', but the pipeline flagged this as an invalid parameter:
WARN: The following invalid input values have been detected:

  • --haplotypecaller_options: --ploidy 4

Here is the command I used to run the pipeline:
nextflow run nf-core/sarek -profile singularity -c nextflow.config -params-file params.yaml \
    --input 4n_input.csv --fasta my_reference_genome.fasta \
    --tools haplotypecaller --skip_tools baserecalibrator

The contents of my params.yaml file:
genome: null
dbsnp: null
known_indels: null
known_snps: null
germline_resource: null
sentieon_dnascope_model: null

Is there a way to specify the ploidy for HaplotypeCaller in the current version of nf-core/sarek?

@FriederikeHanssen
Contributor

You can specify tool parameters that are not exposed in the pipeline using a custom.config: https://nf-co.re/docs/usage/getting_started/configuration#customising-tool-arguments

I assume you already set something in the nextflow.config?

@koraykaplan88
Author

My config file is actually pretty simple:

process {
withSingularity = true
singularity.enabled = true
}

I just want to add haplotypecaller --ploidy 4 to it. Could you help with that? Where can I find the specific name of the haplotypecaller step in the workflow?

@FriederikeHanssen
Contributor

You can find it here:

You'd want something like:

process {
    withName: 'GATK4_HAPLOTYPECALLER' {
        ext.args   = { params.joint_germline ? "-ERC GVCF --ploidy 4" : "--ploidy 4" }
    }
}
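
For reference, a complete custom.config along these lines might look like the sketch below. It rests on a few assumptions that are not confirmed in this thread: GATK's documentation lists the ploidy option under the long name --sample-ploidy (short form -ploidy), so that spelling is used here; the value 4 is for a tetraploid run and would be 3 for triploid samples; and the -profile singularity flag in the original command should already enable Singularity, so no container settings are repeated.

```groovy
// custom.config: a minimal sketch under the assumptions stated above.
process {
    // Matches the HaplotypeCaller process by its simple name.
    withName: 'GATK4_HAPLOTYPECALLER' {
        // Keep "-ERC GVCF" when joint germline calling is enabled,
        // and append the ploidy option in both cases.
        // Assumption: 4 = tetraploid; change to 3 for a triploid run.
        ext.args = { params.joint_germline ? "-ERC GVCF --sample-ploidy 4" : "--sample-ploidy 4" }
    }
}
```

The file would be passed with -c, the same way nextflow.config is passed in the original command. Because triploid and tetraploid samples need different values, one option is to keep a separate config file per ploidy and run each sample set with its own file.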

FriederikeHanssen added the question label Jan 14, 2025
@koraykaplan88
Author

Yes, thank you very much!
