Consistency between NCBI taxonomy versions in the kraken and the custom bowtie databases #153

Open
mikemc opened this issue Jan 22, 2025 · 3 comments
Labels
documentation Improvements or additions to documentation

Comments

@mikemc
Member

mikemc commented Jan 22, 2025

Since NCBI taxonomy evolves over time (e.g.), it's possible that the NCBI taxonomy at the time the index workflow is run will differ from the one used to build whichever Kraken database the index workflow pulls in.

I'd imagine handling this by making sure we're using a GenBank release that is compatible with whichever RefSeq release the Kraken database uses, and then making sure we have the correct NCBI taxonomy dump for those releases.

Alternatively, there are probably ways to determine what the changes are, but it might not be easy to figure out what to do with the differences.
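
As a rough illustration of "determining what the changes are", here is a minimal sketch that compares two versions of the NCBI taxdump. It assumes you have the contents of `nodes.dmp` from both the older and newer dump, plus `merged.dmp` from the newer one. The function names (`parse_dmp`, `load_parents`, `load_merged`, `diff_taxonomies`) are hypothetical and not part of the pipeline, and deciding what to do with the reported differences is still left open.

```python
def parse_dmp(text, n_fields):
    """Yield the first n_fields columns of each row of an NCBI *.dmp file.

    Rows are delimited by "\t|\t" and terminated by "\t|".
    """
    for line in text.splitlines():
        if not line.strip():
            continue
        row = line
        if row.endswith("\t|"):
            row = row[:-2]  # drop the row terminator
        yield [field.strip() for field in row.split("\t|\t")][:n_fields]


def load_parents(nodes_text):
    """Map each taxid to its parent taxid (first two columns of nodes.dmp)."""
    return {int(t): int(p) for t, p in parse_dmp(nodes_text, 2)}


def load_merged(merged_text):
    """Map each retired taxid to the taxid it was merged into (merged.dmp)."""
    return {int(old): int(new) for old, new in parse_dmp(merged_text, 2)}


def diff_taxonomies(old_nodes_text, new_nodes_text, new_merged_text):
    """Summarise how the newer taxonomy differs from the older one."""
    old = load_parents(old_nodes_text)
    new = load_parents(new_nodes_text)
    merged = load_merged(new_merged_text)

    missing = set(old) - set(new)
    return {
        # taxids that only exist in the newer dump
        "added": set(new) - set(old),
        # taxids retired in the newer dump, with their replacement
        "merged": {t: merged[t] for t in missing if t in merged},
        # taxids that vanished without a recorded replacement
        "deleted": {t for t in missing if t not in merged},
        # taxids present in both dumps but attached to a different parent
        "reparented": {
            t: (old[t], new[t])
            for t in set(old) & set(new)
            if old[t] != new[t]
        },
    }
```

To run it on real dumps, read each file with something like `pathlib.Path("old/nodes.dmp").read_text()` (paths are placeholders) and pass the three strings to `diff_taxonomies`. The full `nodes.dmp` has a few million rows, so expect this to take some time and memory.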

@willbradshaw, can you say something about how discrepancies are or aren't handled currently?

@willbradshaw
Contributor

I think this is essentially a documentation question, rather than a pipeline feature per se. It's up to the user generating the index to make sure that the versions of the various resources they're pulling in are compatible. I agree that the documentation should reflect this, though.

willbradshaw added the documentation label Jan 22, 2025
@jeffkaufman
Member

> It's up to the user generating the index

Is this something you and/or Harmon are currently doing when you create indexes?

@willbradshaw
Contributor

Not as diligently as I'd like. We'd probably do it better if a procedure was laid out properly in the documentation as suggested.


3 participants