Systematic IFFY tag when indels are not trimmed #3
Comments
The problem here is that for those indels (and this is true for the majority of indels), it is not possible to tell which allele is the reference and which is the alternate by comparing the two alleles to the reference, because both alleles match the reference. This is one of the main reasons many summary statistics formats are flawed and should be replaced by something like the GWAS-VCF standard. Does your summary statistics file encode any information that would identify which allele is the reference allele and which is the alternate allele? If so, show me what it looks like and I will update the BCFtools/munge plugin to handle this correctly.
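To illustrate the ambiguity described above, here is a minimal Python sketch. It is not how the BCFtools/munge plugin is implemented; the function names, the toy reference sequence, and the example alleles are all hypothetical. It only shows that a SNP can be oriented by comparing its alleles to the reference, while an indel whose shorter allele is a prefix of the longer one can match the reference with both alleles.

```python
def alleles_matching_reference(ref_seq, pos, alleles):
    """Return the alleles that match ref_seq starting at 1-based position pos."""
    start = pos - 1
    return [a for a in alleles if ref_seq[start:start + len(a)] == a]


def classify(ref_seq, pos, allele1, allele2):
    """Try to decide which allele is the reference by sequence comparison alone."""
    matches = alleles_matching_reference(ref_seq, pos, (allele1, allele2))
    if len(matches) == 1:
        return f"reference allele is {matches[0]}"
    if len(matches) == 2:
        return "ambiguous: both alleles match the reference"
    return "neither allele matches the reference"


# Toy reference sequence (an assumption made purely for illustration).
ref = "GATTTC"

# SNP at position 2: only "A" matches the reference, so orientation is clear.
print(classify(ref, 2, "A", "G"))

# Indel at position 2: "A" and "AT" both match the reference, so comparing
# to the reference cannot tell a deletion (AT>A) from an insertion (A>AT).
print(classify(ref, 2, "A", "AT"))
```

In the second call, both alleles are consistent with the reference sequence, which is the situation that leaves the orientation undetermined without additional information.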
Thank you for your reply.
This problem is not related to left aligning. Those three IFFY variants in your example could correspond to multiple variants encoded by the VCF specification as follows:
These are all different variants. How do you figure out which ones match the variants in your summary statistics file?
@freeseek thank you for the reply. You are right that there is no obvious way to resolve this ambiguity. Only external information, such as reference allele frequency data, could potentially resolve it (provided the variants in the reference VCF file in that region are not themselves ambiguous). But this could be a very expensive solution, even if only 10% of the variants (mostly imputed indels) are affected.
Hi @freeseek,
Thanks a lot for the tools provided in this repo.
I have a question regarding the munging plugin: it appears that indel variants are systematically flagged when the alleles are not trimmed (examples attached).
Is this the expected behavior of the tool?
Thanks