[pixy] pixy 1.2.10.beta2
[pixy] See documentation at https://pixy.readthedocs.io/en/latest/
/home/why/.local/lib/python3.8/site-packages/allel/io/vcf_read.py:1732: UserWarning: invalid INFO header: '##INFO=<ID=VDB,Number=1,Type=Float,Description="Variant Distance Bias for filtering splice-site artefacts in RNA-seq data (bigger is better)",Version="3">\n'
warnings.warn('invalid INFO header: %r' % header)
[pixy] Validating VCF and input parameters...
[pixy] Checking write access...OK
[pixy] Checking CPU configuration...OK
[pixy] Checking for invariant sites...OK
[pixy] Checking chromosome data...OK
[pixy] Checking intervals/sites...OK
[pixy] Checking sample data...OK
[pixy] All initial checks past!
[pixy] Preparing for calculation of summary statistics: pi, fst, dxy
[pixy] Using Weir and Cockerham (1984)'s estimator of FST.
[pixy] Data set contains 2 population(s), 1 chromosome(s), and 4 sample(s)
[pixy] Window size: 10000 bp
[pixy] Started calculations at 17:15:03 on 2024-07-19
[pixy] Using 4 out of 88 available CPU cores
[pixy] Processing chromosome/contig HiC_scaffold_39...
[pixy] Calculating statistics for region HiC_scaffold_39:1-173364...
/home/why/.local/lib/python3.8/site-packages/allel/io/vcf_read.py:1732: UserWarning: invalid INFO header: '##INFO=<ID=VDB,Number=1,Type=Float,Description="Variant Distance Bias for filtering splice-site artefacts in RNA-seq data (bigger is better)",Version="3">\n'
warnings.warn('invalid INFO header: %r' % header)
/home/why/.local/lib/python3.8/site-packages/allel/io/vcf_read.py:1732: UserWarning: invalid INFO header: '##INFO=<ID=VDB,Number=1,Type=Float,Description="Variant Distance Bias for filtering splice-site artefacts in RNA-seq data (bigger is better)",Version="3">\n'
warnings.warn('invalid INFO header: %r' % header)
/home/why/.local/lib/python3.8/site-packages/allel/io/vcf_read.py:1248: UserWarning: 'DP' FORMAT header not found
warnings.warn('%r FORMAT header not found' % name)
/home/why/.local/lib/python3.8/site-packages/allel/io/vcf_read.py:1248: UserWarning: 'DP' FORMAT header not found
warnings.warn('%r FORMAT header not found' % name)
[pixy] WARNING: pixy failed to find any valid gentoype data to calculate the following summary statistics: fst. No output file will be created for these statistics.
[pixy] All calculations complete at 17:15:14 on 2024-07-19
[pixy] Time elapsed: 00:00:10
[pixy] Output files written to: /home/why/Juihung/Candidia_barbatus_newAssembly2/gIMble/1_mapping/pixy.Zp2.Oe2/
[pixy] If you use pixy in your research, please cite the following paper:
[pixy] Korunes, KL and K Samuk. pixy: Unbiased estimation of nucleotide diversity and divergence in the presence of missing data. Mol Ecol Resour. 2021 Jan 16. doi: 10.1111/1755-0998.13326.
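The `'DP' FORMAT header not found` warning in the log above suggests checking which INFO and FORMAT fields the VCF header actually declares. The following is a minimal sketch of such a check using only the Python standard library. The function names and the example header are hypothetical and are not part of pixy or scikit-allel. Whether a missing FORMAT/DP declaration has anything to do with the NA values is not established here.

```python
import gzip
import re
from typing import Dict, List


def read_header(path: str) -> List[str]:
    """Return the header lines (those starting with '#') of a plain or gzip/bgzip VCF."""
    opener = gzip.open if path.endswith(".gz") else open
    lines = []
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.startswith("#"):
                break  # header ends at the first record
            lines.append(line.rstrip("\n"))
    return lines


def declared_header_ids(header_lines: List[str]) -> Dict[str, List[str]]:
    """Collect the IDs declared in ##INFO and ##FORMAT header lines."""
    ids: Dict[str, List[str]] = {"INFO": [], "FORMAT": []}
    pattern = re.compile(r"^##(INFO|FORMAT)=<ID=([^,>]+)")
    for line in header_lines:
        match = pattern.match(line)
        if match:
            ids[match.group(1)].append(match.group(2))
    return ids


# Hypothetical header for illustration only; it is not taken from the VCF in this post.
example_header = [
    "##fileformat=VCFv4.2",
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw read depth">',
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
    '##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Phred-scaled genotype likelihoods">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
]

found = declared_header_ids(example_header)
print("FORMAT fields:", found["FORMAT"])
print("FORMAT/DP declared:", "DP" in found["FORMAT"])
```

On a real file, `declared_header_ids(read_header("merged.vcf.gz"))` would report the fields declared in that header.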
Hi there,
I am using pixy to calculate dxy.
However, every avg_dxy value is NA. I suspect that I prepared my input file incorrectly.
Please forgive me if my question is too naive.
This is how I prepared my merged VCF:
bcftools mpileup -f ../ref.genome.fa -b ./BAM.list -r HiC_scaffold_39 | bcftools call -m -Oz -f GQ -o merged.vcf.gz
Based on the manual, I believe this generates an all-sites VCF file.
All BAM files were sorted, and I used only the HiC_scaffold_39 scaffold (length = 173364) for testing.
Next, I indexed my vcf.gz file with tabix:
tabix merged.Zp2.Oe2.vcf.gz
Here is my population file:
and here is the pixy command I ran:
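One quick sanity check on a populations file is to compare its sample IDs with the sample columns of the VCF. The sketch below assumes the populations file is headerless with two tab-separated columns (sample ID, then population). The sample names, population names, and function names are all hypothetical and are not taken from this post.

```python
from typing import Dict, List


def parse_populations(text: str) -> Dict[str, str]:
    """Map sample ID -> population from headerless, tab-separated text."""
    pops: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        parts = line.split("\t")
        if len(parts) < 2:
            raise ValueError(f"expected two tab-separated columns, got: {line!r}")
        pops[parts[0].strip()] = parts[1].strip()
    return pops


def vcf_sample_names(chrom_line: str) -> List[str]:
    """Sample names are the columns after FORMAT (the 10th column onward) on the #CHROM line."""
    return chrom_line.rstrip("\n").split("\t")[9:]


def compare_samples(pops: Dict[str, str], samples: List[str]) -> Dict[str, List[str]]:
    """Report sample IDs that appear in only one of the two inputs."""
    return {
        "only_in_populations_file": sorted(set(pops) - set(samples)),
        "only_in_vcf": sorted(set(samples) - set(pops)),
    }


# Hypothetical inputs: the last VCF sample name deliberately differs.
populations_text = "sample_1\tpop_A\nsample_2\tpop_A\nsample_3\tpop_B\nsample_4\tpop_B\n"
chrom_line = (
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t"
    "sample_1\tsample_2\tsample_3\tsample_4.bam"
)

report = compare_samples(parse_populations(populations_text), vcf_sample_names(chrom_line))
print(report)
```

Both lists being empty would mean the two sets of names agree exactly.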
After running the above command, I got the following output.
It seems that some lines are invalid, but I don't know how to fix this.
Here is the avg_dxy file; pixy did not find any differences:
This is what my VCF file looks like; I can't find any problems with it.
Using bcftools view, I can see that there are definitely some differences among the populations in my VCF file.
bcftools view -v snps merged.vcf.gz
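For reference, dxy over a window can be worked out by hand as the number of allele differences between the two populations divided by the number of allele comparisons, with missing genotypes left out of both counts. The following is a minimal sketch of that arithmetic, not pixy's implementation. The genotypes are made up, and the function names are hypothetical. The sketch returns `None` when there are no valid comparisons at all, which is one way an undefined value could arise; whether that is what happens with this VCF is not determined here.

```python
from typing import List, Optional, Sequence, Tuple

# A diploid genotype as two allele codes; None marks a missing allele.
Genotype = Tuple[Optional[int], Optional[int]]
Site = Tuple[Sequence[Genotype], Sequence[Genotype]]


def site_dxy_counts(pop_x: Sequence[Genotype], pop_y: Sequence[Genotype]) -> Tuple[int, int]:
    """Return (differences, comparisons) between all non-missing alleles of two populations."""
    alleles_x = [a for gt in pop_x for a in gt if a is not None]
    alleles_y = [a for gt in pop_y for a in gt if a is not None]
    comparisons = len(alleles_x) * len(alleles_y)
    differences = sum(1 for a in alleles_x for b in alleles_y if a != b)
    return differences, comparisons


def window_dxy(sites: Sequence[Site]) -> Optional[float]:
    """Sum differences and comparisons over all sites, then divide once."""
    total_diffs = 0
    total_comps = 0
    for pop_x, pop_y in sites:
        diffs, comps = site_dxy_counts(pop_x, pop_y)
        total_diffs += diffs
        total_comps += comps
    return total_diffs / total_comps if total_comps else None


# Made-up window of three sites, two diploid samples per population.
sites: List[Site] = [
    ([(0, 0), (0, 0)], [(0, 0), (0, 0)]),        # invariant site: 0 of 16 comparisons differ
    ([(0, 0), (0, 0)], [(1, 1), (1, 1)]),        # fixed difference: 16 of 16 differ
    ([(0, 1), (None, None)], [(0, 0), (0, 0)]),  # one missing sample: 4 of 8 differ
]

print(window_dxy(sites))  # 20 differences / 40 comparisons = 0.5
```

Invariant sites contribute comparisons but no differences, which is why an all-sites VCF matters for the denominator.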
Thank you for your time and support.
I look forward to a solution to this problem.
Best,
Jui-Hung