Hi there! I wanted to start a conversation about implementing a batch getter for the MetaKB API.
In contexts like processing a large number of variant IDs from a VCF, it would be helpful to send multiple IDs in a single request. For reference, on my 2017 MacBook Pro (2.3 GHz dual-core Intel Core i5, 8 GB memory) with multiprocessing, I max out at ~25 calls per second. At that rate, working through a 16 MB VCF (one patient, one chromosome) with 330k variants would take ~3.5 hours. Compute doesn't seem to be the bottleneck, as I'm far from maxing out my CPU or memory. Since this is a relatively small VCF, scaling to larger, multi-patient VCFs covering all chromosomes could end up being lengthy regardless of the machine.
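To make the request a bit more concrete, here is a rough client-side sketch of how batching might be used. It is only an illustration: MetaKB does not currently expose a batch endpoint, so `fetch_batch` is a hypothetical callable standing in for whatever that call would look like, and `batch_size=100` is an arbitrary choice. The estimate helper also assumes a batched call takes about as long as a single-ID call, which may not hold in practice.

```python
from __future__ import annotations

import math
from itertools import islice
from typing import Callable, Iterable, Iterator


def chunked(ids: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield successive lists of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(ids)
    while batch := list(islice(it, size)):
        yield batch


def batch_lookup(
    ids: Iterable[str],
    fetch_batch: Callable[[list[str]], dict[str, object]],
    batch_size: int = 100,
) -> dict[str, object]:
    """Look up IDs in groups instead of one request per ID.

    `fetch_batch` is hypothetical: it takes a list of variant IDs and
    returns a mapping of ID -> result for that list.
    """
    results: dict[str, object] = {}
    for batch in chunked(ids, batch_size):
        results.update(fetch_batch(batch))
    return results


def estimated_hours(n_variants: int, calls_per_second: float, batch_size: int = 1) -> float:
    """Rough wall-clock estimate, assuming every call takes the same time
    regardless of how many IDs it carries (an optimistic assumption)."""
    n_calls = math.ceil(n_variants / batch_size)
    return n_calls / calls_per_second / 3600
```

Comparing `estimated_hours(330_000, 25)` with `estimated_hours(330_000, 25, batch_size=100)` gives a feel for how much the number of round trips dominates the runtime under these assumptions.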
I just wanted to share one use case, but I'm happy to provide more info or discuss this in more detail.
@quinnwai Thanks for creating an issue for this. MetaKB V2 is still in the early stages of development. This is a reasonable ask; I think the question is more about where it sits on the roadmap and which team would implement it. We can discuss this further at the MetaKB meeting on Tuesday and report back here afterwards.