Should sites with no depth coverage be included in or excluded from VCF files when using whole-genome data for Dxy calculation? #119

Open
vincianem opened this issue Oct 25, 2024 · 1 comment
Labels
help wanted Extra attention is needed

Comments

@vincianem

Morning,

I have a general question related to the formatting of the input VCF file for dxy and pi calculation using Pixy.

I generated VCF files from whole-genome data that include invariant sites, variant sites, and positions without called genotypes. Do the positions without called genotypes (./.) have to be included in the VCFs for the calculations?

Thanks a lot for your answer!
Best,
Vinciane

@vincianem vincianem added the help wanted Extra attention is needed label Oct 25, 2024
@ksamuk
Owner

ksamuk commented Jan 24, 2025

Hi there, you can leave missing genotypes, including sites with all missing data, in your VCF if you'd like. You can also remove sites where all individuals are missing if that is easier for you, but it should not affect the calculations.
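
For anyone who prefers to drop the all-missing sites before running the calculation, here is a minimal Python sketch of that filtering step. It is not part of pixy; the function names `is_all_missing` and `drop_all_missing_sites` are hypothetical, and it assumes an uncompressed, tab-delimited VCF whose FORMAT column contains a `GT` key. A site is kept if at least one sample has at least one called allele.

```python
def is_all_missing(vcf_line: str) -> bool:
    """Return True if every sample genotype on a VCF data line is missing."""
    fields = vcf_line.rstrip("\n").split("\t")
    fmt_keys = fields[8].split(":")          # FORMAT column, e.g. "GT:DP"
    if "GT" not in fmt_keys:
        return False                         # no genotype field: keep the line as-is
    gt_index = fmt_keys.index("GT")
    for sample in fields[9:]:                # sample columns start at index 9
        parts = sample.split(":")
        gt = parts[gt_index] if gt_index < len(parts) else "."
        # Treat phased ("|") and unphased ("/") separators the same way.
        alleles = gt.replace("|", "/").split("/")
        if any(allele != "." for allele in alleles):
            return False                     # at least one called allele
    return True


def drop_all_missing_sites(lines):
    """Yield header lines unchanged, plus data lines with a called genotype."""
    for line in lines:
        if line.startswith("#") or not is_all_missing(line):
            yield line
```

Usage would look something like `kept = list(drop_all_missing_sites(open("input.vcf")))`, where `input.vcf` is a placeholder path. Per the answer above, running this filter is optional and should not change the results.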
