Should relationship on GenBank IDs be "self" or "same individual as"? #4736
Replies: 10 comments
-
https://arctos.database.museum/info/ctDocumentation.cfm?table=CTID_REFERENCES
self ==> this specimen record
GenBank numbers are "self." What are the DwC implications of "same individual as"?
-
If we use this correctly in mapping, we can send the relationships to GBIF and iDigBio in a way that might end up making them actually discoverable and useful outside of Arctos. Mainly because of this:
I propose that we change ALL GenBank IDs to "same individual as" relationships so that we are making it clear that these IDs relate to some other specimen record. I also considered proposing a new relationship, something like "derivative", to show that the GenBank sequences were made from a part of the specimen described in the Arctos record.
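As a rough sketch of what such a mapping could look like, the snippet below turns one identifier row into a Darwin Core ResourceRelationship-style record. The row's field names (`guid`, `id_value`, `id_references`, `base_url`), the GUID, and the accession are assumptions made up for illustration, not the actual Arctos schema or export code; `resourceID`, `relatedResourceID`, and `relationshipOfResource` are Darwin Core terms.

```python
from typing import Optional


def to_resource_relationship(row: dict) -> Optional[dict]:
    """Map one identifier row to a DwC ResourceRelationship-style dict.

    Returns None for "self" identifiers, on the assumption that those are
    exported as plain identifiers rather than as relationships.
    """
    relationship = row["id_references"]
    if relationship == "self":
        return None
    return {
        "resourceID": row["guid"],
        "relatedResourceID": row["base_url"] + row["id_value"],
        "relationshipOfResource": relationship,
    }


# Hypothetical identifier row; none of these values come from a real record.
example = {
    "guid": "MSB:Mamm:12345",
    "id_value": "AB000001",
    "id_references": "same individual as",
    "base_url": "https://www.ncbi.nlm.nih.gov/nuccore/",
}
record = to_resource_relationship(example)
print(record)
```

Under this assumption, a GenBank ID left as "self" would produce no relationship record at all, which is the practical difference the proposal is about.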
-
This makes no sense to me, and therefore to any existing code. I don't think it can make sense to an "Organism resolver" either.
They don't though??!? Shall we add this to the AWG agenda?
-
I think they do - GenBank records are added by third parties and are prone to error.
Already added the tag.
-
AHA! (maybe...)
Mostly unrelated, you can control much of that - you can direct folks who use your specimens to use Arctos tools (which aren't susceptible to things like copy-paste errors), you can ask submitters to alter GB records, and failing all of that you can usually get GB to make changes on your behalf. It's technically trivial from my perspective to do much more there, but it would take some changes in GenBank - e.g., they'd need to periodically refresh from Arctos.
ANYWAY - I don't see GB records as specimen records. They're metadata (from the perspective of Arctos), and the system just happens to be a little cache-happy. From that viewpoint, the relationship is clearly "self" - we're pointing to data which refers to the same THING and just happens to live in another system, same as anything else with a base_url. Any sort of differences are not independent assertions, they're just lag in the cache (which is currently set to never update.....).
If GB records ARE independent assertions, then I don't think I see anything immediately and obviously fatal about viewing them as equal independent assertions, in which case a "same as" relationship probably does make more sense.
-
The difficulty in getting authors to submit properly formatted GUIDs (or
correct existing problems) is a global challenge that we are not going to
be able to resolve until GenBank forces them to use a template. This
template would have to accommodate different collections' preferences or
link to a unique organism ID, so we are a long way from that,
unfortunately. However, this is a topic of community discussion.
I agree with Teresa that we should try to map the relationships properly to
DwC. "Same individual as" is not inappropriate given that the sequence data
are third-party generated. We use "same individual as" for other
cross-institutional relationships, such as AMNH-MSB tissue/voucher
relationships, or UAM/MSB shared tissues. Arctos/NCBI is another
cross-institutional relationship.
The only problem I see is having the GenBank IDs in the relationship
dropdown instead of in the ID table. Visually that would make them harder
to find. We may need to adjust the interface.
On Mon, Jun 17, 2019 at 11:19 AM Teresa Mayfield-Meyer wrote:
GenBank IDs relate to some other specimen record.
They don't though??!?
I think they do - GenBank records are added by third parties and are prone
to error.
-
That is only a partial solution for any specimen with multiple parts. Arctos provides the capacity to do better, and loan agreements would be a simple way to push your users towards that. Organisms would lose an additional level of specificity; I'd see that as a long step backwards. I don't think you'll get GB to do what we'd like them to do, but I'll enthusiastically support it if you can.
Can you be more specific?
That's essentially the 'self' assertion. 'Same individual as' would imply that the locality, identification, etc. data in GenBank are as valid as those in Arctos. I don't think that's arguable for AMNH - they're (presumably) actively managing that stuff in the same way we are. I don't think that's true for GB, but perhaps there's something I don't know.
I don't know what this means. IDs that refer to the same specimen and IDs that refer to other specimens differ only in the references value.
-
You can, but we need to hire someone because we don't actually have time for that.
They are and will potentially be used in research as if they are primary data. Sequences are identified with a taxon, which may or may not be the same taxon we have recorded in Arctos.
Mariel is talking about this: when the other ID relationship is anything other than self, it gets moved into the "relationships" section, which can become crowded. We might take a look at using an expand function to skinny it down. Would need another issue I think.
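A minimal sketch of that display rule, assuming each identifier is a dict with an `id_references` key (a hypothetical shape, not the real Arctos UI code): anything marked "self" stays in the ID table, and everything else lands in the relationships section.

```python
def split_for_display(identifiers):
    """Partition identifiers into (id_table, relationships) by reference type."""
    id_table, relationships = [], []
    for ident in identifiers:
        # Only "self" identifiers stay in the ID table; all others are relationships.
        target = id_table if ident["id_references"] == "self" else relationships
        target.append(ident)
    return id_table, relationships


# Hypothetical identifiers for a single record.
identifiers = [
    {"id_type": "GenBank", "id_value": "AB000001", "id_references": "self"},
    {"id_type": "GenBank", "id_value": "AB000002", "id_references": "same individual as"},
    {"id_type": "collector number", "id_value": "123", "id_references": "self"},
]
id_table, relationships = split_for_display(identifiers)
print(len(id_table), len(relationships))
```

Switching every GenBank ID to "same individual as" would move all of them into the second list, which is where the crowding concern comes from.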
-
I am leaving this open, because it does seem that some of our GenBank IDs should have a relationship other than "self" (the virus examples in the GloBI issue are one example). This would be a good project for an intern.
-
I still don't understand the purpose of this discussion.
-
I bulk added a bunch of GenBank IDs to MSB records today, then had to add some manually. This led me to think about the relationship of "self", which we have been using with these IDs. Because the GenBank record is an external source and "same individual as" has implications in Darwin Core that "self" does not address, I think it would be better to use "same individual as" for the relationship.
Either way - I think it would pay for all Arctos collections to be consistent, which means good documentation and an explanation for our choice.
Thoughts?