You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I have 384 bam aligned samples from GBS technology. I ran the Thermofish Variant Caller (TVC) on them in order to get the genotypes of my variants. That seemed to work well but I wanted to use bcftools in order to compare both vcf files.
Here are 4 variants called with TVC at a specific locus for one single sample :
h2tg000001l 385471 A G 0/1
h2tg000001l 385474 A G 0/1
h2tg000001l 385508 G A 0/1
h2tg000001l 385517 A G 1/1
I usually use -q 10 -Q 30 to take in account only the robust variants, and --max-depth 1000 because some loci have high coverage. I got :
h2tg000001l 385471 A G 1/1
h2tg000001l 385517 A G 1/1
The differencies between TVC and bcftools :
4 variants detected by TVC, 2 by bcftools
2 are common. 1 out of the 2 has a different genotype
Here are the alignements at the locus :
There are the four potential variants (in the coverage track), and the black line representing the MAPQ10 threshold. bcftools removes the <MAPQ10 (-q 10) and then consider only the variants 1 and 4, as homozygous. I tried to trick the parameters of TVC to remove the <MAPQ10 alignments but nothing happens, and I got the same genotypes.
The variant 1 for example is given as 0/1 for TVC and 1/1 for bcftools. I did a test with -q 0 -Q 0 to consider all the reads. bcftools read the allele depth as 50%/50% for the variant 1 (then 0/1), but still outputed it as 1/1. Does bcftools do not consider the low mapq alignments even if I set up -q 10 ?
Also, in your opinion, do the soft-clipped reads could correspond to orthologs/paralogs that bcftools do not consider? Because we can see they could carry potentiel alleles at the variants 2 and 3.
Best
The text was updated successfully, but these errors were encountered:
Dear,
I have 384 bam aligned samples from GBS technology. I ran the Thermofish Variant Caller (TVC) on them in order to get the genotypes of my variants. That seemed to work well but I wanted to use
bcftools
in order to compare both vcf files.Here are 4 variants called with TVC at a specific locus for one single sample :
I ran bcftools on that sample with :
bcftools mpileup \ --threads 128 \ -q 10 -Q 30 --max-depth 1000 \ --skip-indels \ -f reference.fasta sample1.bam \ --annotate FORMAT/AD,FORMAT/DP,INFO/AD | \ bcftools call \ --threads 128 \ -mv -Oz -o sample1.vcf.gz
I usually use
-q 10 -Q 30
to take in account only the robust variants, and--max-depth 1000
because some loci have high coverage. I got :The differencies between
TVC
andbcftools
:Here are the alignements at the locus :
There are the four potential variants (in the coverage track), and the black line representing the MAPQ10 threshold.
bcftools
removes the <MAPQ10 (-q 10
) and then consider only the variants 1 and 4, as homozygous. I tried to trick the parameters of TVC to remove the <MAPQ10 alignments but nothing happens, and I got the same genotypes.The variant 1 for example is given as 0/1 for TVC and 1/1 for bcftools. I did a test with
-q 0 -Q 0
to consider all the reads.bcftools
read the allele depth as 50%/50% for the variant 1 (then 0/1), but still outputed it as 1/1. Does bcftools do not consider the low mapq alignments even if I set up-q 10
?Also, in your opinion, do the soft-clipped reads could correspond to orthologs/paralogs that bcftools do not consider? Because we can see they could carry potentiel alleles at the variants 2 and 3.
Best
The text was updated successfully, but these errors were encountered: