Overestimation of number of reads from nanopore data (flagstat) #31

rebeelouise · 2021-02-19T14:14:17Z

This is the same issue as reported for minimap2: lh3/minimap2#236 (comment)

For example, for nanopore reads aligned to the host transcriptome, the flagstat output is:

5953480 + 0 in total (QC-passed reads + QC-failed reads)
2961480 + 0 secondary
22696 + 0 supplementary
0 + 0 duplicates
4195469 + 0 mapped (70.47% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

However, the actual number of reads is 2969304, with read lengths of about 750 nt. I assume this over-reporting is due to the presence of long reads. Is there a more appropriate way to calculate the number of reads and the percentage of reads mapped from an alignment file? Can the reported percentage of reads mapped still be trusted?
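
One check worth noting: the counts above are consistent with flagstat counting alignment records rather than reads, since 5953480 - 2961480 - 22696 = 2969304, which is exactly the true read count. Below is a minimal Python sketch of that arithmetic. It assumes that every secondary and supplementary record is itself mapped, so those records can also be subtracted from the "mapped" line. The function names `parse_flagstat` and `primary_summary` are hypothetical helpers for illustration, not part of samtools.

```python
import re

# Relevant lines copied from the flagstat output quoted in this issue.
FLAGSTAT = """\
5953480 + 0 in total (QC-passed reads + QC-failed reads)
2961480 + 0 secondary
22696 + 0 supplementary
0 + 0 duplicates
4195469 + 0 mapped (70.47% : N/A)
"""


def parse_flagstat(text):
    """Collect the QC-passed (first column) count for the lines we need."""
    counts = {}
    pattern = re.compile(r"(\d+) \+ \d+ (in total|secondary|supplementary|mapped)\b")
    for line in text.splitlines():
        m = pattern.match(line)
        if m:
            counts[m.group(2)] = int(m.group(1))
    return counts


def primary_summary(counts):
    """Derive per-read figures by removing secondary/supplementary records.

    Assumption: secondary and supplementary records are always mapped,
    so they inflate both the total and the mapped count equally.
    """
    extra = counts["secondary"] + counts["supplementary"]
    primary = counts["in total"] - extra
    primary_mapped = counts["mapped"] - extra
    pct = 100.0 * primary_mapped / primary
    return primary, primary_mapped, pct


primary, primary_mapped, pct = primary_summary(parse_flagstat(FLAGSTAT))
print(primary)          # 2969304
print(primary_mapped)   # 1211293
print(f"{pct:.2f}")     # 40.79
```

Under that assumption, the per-read mapping rate here would be about 40.79% rather than the 70.47% that flagstat prints. The same read count can likely be cross-checked directly on the alignment file with `samtools view -c -F 0x900`, which excludes records flagged as secondary (0x100) or supplementary (0x800).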
