Use median instead of average values if possible #119
Comments
Hello and thanks so much for your feedback! You are right,
Hi @joerivandervelde, I am sorry, things are stacking up in my mailbox ... OK, so you fixed the first part already. For the rest, see https://thegenomefactory.blogspot.com/2013/08/paired-end-read-confusion-library.html

The field you keep in the ontology should describe how the sequencing library was prepared and how long the DNA fragments were, on average or, better, as a median. Unfortunately, people tend to distinguish fragment size from insert size, depending on whether adapters have already been added or not. In practice, different software tools calculate either the outer or the inner distance. I assume the goal of the catalogue is either to collect one of the two values or to gently push users to recalculate a single, intended value using the right software. See https://broadinstitute.github.io/picard/picard-metric-definitions.html#InsertSizeMetrics

In other words, this annotation term is supposedly about … While at it, you probably also want to add terms for https://broadinstitute.github.io/picard/picard-metric-definitions.html#JumpingLibraryMetrics .
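To make the outer versus inner distinction concrete, here is a minimal sketch of how the two values relate for one read pair. The function names are hypothetical, and it assumes adapters are already trimmed, so the outer distance spans both reads plus the unsequenced gap between them. Individual tools may define their metrics differently, so check the documentation of whichever one produced the number.

```python
def outer_distance(inner: int, read1_len: int, read2_len: int) -> int:
    """Outer mate distance: both reads plus the gap between them."""
    return read1_len + inner + read2_len


def inner_distance(outer: int, read1_len: int, read2_len: int) -> int:
    """Inner mate distance: the unsequenced gap between the two reads.

    A negative result means the two reads overlap.
    """
    return outer - read1_len - read2_len


# Hypothetical 2 x 100 bp pair with a 200 bp gap between the mates.
outer = outer_distance(200, 100, 100)

# A short 150 bp fragment sequenced 2 x 100 bp: the reads overlap.
overlap = inner_distance(150, 100, 100)
```

A catalogue field that stores only one number without saying which of the two it is cannot be compared across submitters, which is the reason for naming the term explicitly.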
Hello @mmokrejs, we've been having internal discussions on how to tackle this but haven't quite sorted it out. Could you perhaps clarify the change you are proposing? If the metrics are not generic across all sequencing platforms, we might also consider modeling it something like this; would that make sense?
Hi,
I incidentally poked over your project and I wonder why you keep track of `Average read depth` and of `Observed insert size`. The former would be better replaced with `Median read depth` and the latter probably called `Outer mate median distance` instead? It depends on the tool used to analyze the data. This seems too Illumina-technology oriented. How will this work for PacBio and Oxford Nanopore sequencing projects? And for older Roche/454 and IonTorrent-based projects, which used totally different types of library prep protocols (RF vs. FR read orientations, etc.)? Likewise, Sanger-based genome sequencing?
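As a small illustration of why the median is the safer summary for read depth, the sketch below uses made-up per-position depths (not taken from any real dataset): most positions sit near 30x, and two positions in a collapsed repeat pile up thousands of reads. The helper name `summarize_depth` is hypothetical.

```python
from statistics import mean, median

# Made-up per-position read depths: mostly ~30x, plus two extreme
# outliers such as one might see in a collapsed repeat region.
depths = [28, 30, 31, 29, 30, 32, 30, 29, 31, 30, 2500, 4000]


def summarize_depth(values):
    """Return both summaries so they can be compared side by side."""
    return {"mean": mean(values), "median": median(values)}


summary = summarize_depth(depths)
print(summary)
```

With these numbers the mean lands in the hundreds while the median stays at 30, so only the median reflects the depth a typical position actually received.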