add initiator methionine? #29
Comments
I would say yes, even though it is supported by default by most common search algorithms.
Eugene
On Wed, 31 Oct 2018 at 3:59 pm, Eric Deutsch wrote:
We do not currently have a term for initiator methionine. Is this an
oversight or intentional?
Do we want to add this term and be specific about initiator methionines in
the PEFF? I would guess yes, but what do you think?
Example in neXtProt:
https://www.nextprot.org/entry/P60484/sequence
|
There is a PSI-MOD term for this: MOD:01643. Alternatively: \ModResUnimod=(1|UNIMOD:765|Met-loss). Also valid is \VariantComplex=(1|1|), but it might not be as ideal: VariantComplex might be neglected by search engines, whereas ModResPsi is easier to handle (like phospho, oxidation, or other PTMs). I would not add a specific term (an exception) for this.
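To make the alternatives in this comment easier to compare, here is a minimal Python sketch that splits header annotations of this shape into their fields. The function name `parse_annotations` and the regular expressions are illustrative assumptions; they handle only the simple `\Key=(a|b|c)` form seen in this thread, not the full PEFF grammar.

```python
import re

# Simplified pattern: a backslash, a key, "=", then one or more "(...)" groups.
# Illustrative assumption only; this is not the full PEFF grammar.
_ENTRY = re.compile(r"\\(\w+)=((?:\([^)]*\))+)")
_ITEM = re.compile(r"\(([^)]*)\)")


def parse_annotations(header):
    """Return {key: [tuple_of_fields, ...]} for each annotation found."""
    result = {}
    for key, groups in _ENTRY.findall(header):
        # Each "(...)" group becomes one tuple of "|"-separated fields.
        items = [tuple(item.split("|")) for item in _ITEM.findall(groups)]
        result.setdefault(key, []).extend(items)
    return result


header = (
    r"\ModResUnimod=(1|UNIMOD:765|Met-loss) "
    r"\Processed=(2|542|PEFF:0001020|mature protein)"
)
for key, items in parse_annotations(header).items():
    print(key, items)
```

Parsed this way, the \ModResUnimod and \Processed forms both yield a position plus a controlled-vocabulary accession, while \VariantComplex=(1|1|) yields a range with an empty replacement field.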
Lydie wrote in an email on the topic:
I think that having initiator methionine would be redundant with having "Mature protein" starting at position 2.
There do seem to be several ways to do it, but we should pick one and recommend it in the spec.
I would vote against the \VariantComplex solution because it may not be implemented until late by many programs. And "variant" in my mind generally implies a genetic difference from the reference, although this is not strict.
The \ModResXxxx way seems a bit weird because it would set the precedent of displaying and listing peptides with sequences longer than what's there, with some masses set to 0. That is, if a search engine got a match for this, it would be tempted to display it with something like M[Met-loss]ELVIS, and sequence mapping software would map it with the M, but then it would presumably not map to another protein with .....KELVISR..... or .....PELVIS...... because there's no M. I don't think that's a good solution.
Dealing with it in \Processed seems like the right way to handle it. As Lydie said, it is already implicit with a \Processed=(2|542|PEFF:0001020|mature protein). For a search engine that wants to treat mature protein boundaries as fully tryptic, this is all it really needs to know. Also having (1|1|PEFF:000nnnn|initiator methionine) would be very explicit in the annotations, and would avoid annotator software needing to infer this information, but it is perhaps not necessary for the searching task. But, of course, by that argument, why bother having 'signal peptide' and 'propeptide' if search engines are just going to make inferences from 'mature protein'?
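As a rough illustration of treating \Processed boundaries as fully tryptic, the sketch below cuts a sequence after K/R (not before P) and additionally at each processed-range boundary. The function name `tryptic_peptides`, the toy sequence, and the simplified cleavage rule are assumptions made for this example, not part of PEFF or of any particular search engine.

```python
def tryptic_peptides(sequence, processed_ranges=()):
    """Split a sequence at tryptic sites and at processed-range boundaries.

    processed_ranges holds (start, end) pairs, 1-based and inclusive,
    as in a Processed annotation.
    """
    n = len(sequence)
    # A cut at index i means a new peptide begins at 0-based position i.
    cuts = {0, n}
    for i in range(1, n):
        # Simplified trypsin rule: cleave after K or R unless followed by P.
        if sequence[i - 1] in "KR" and sequence[i] != "P":
            cuts.add(i)
    for start, end in processed_ranges:
        cuts.add(start - 1)  # boundary just before the first residue
        cuts.add(end)        # boundary just after the last residue
    ordered = sorted(c for c in cuts if 0 <= c <= n)
    return [sequence[a:b] for a, b in zip(ordered, ordered[1:])]


toy = "MELVISKPELVISRAAK"  # hypothetical 17-residue protein
print(tryptic_peptides(toy))             # no annotation
print(tryptic_peptides(toy, [(2, 17)]))  # mature protein starts at position 2
```

With the mature protein annotated as starting at position 2, the first peptide begins at E rather than M, so it can be reported without the leading methionine.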
Based on the PTEN example (P60484 in neXtProt), I would say yes, we need an explicit "Initiator Met", which in many cases is tied to “Modified residue” since the 2nd residue is acetylated. I don’t think “Mature protein” covers all bases (correct me if I am wrong on this). Having specific boundaries that allow a protein sequence to be broken into parts would help intact protein analysis (no digestion). I would keep signal peptide, propeptide, etc., plus have mature protein under \Processed.
Eugene
(Sent by email in reply to Eric Deutsch's comment of 13 Dec 2018, 5:10 am, which appears in full as the preceding comment.)
|
Realistically, if you imagine writing search engine code, I suspect all the code needs to look at are the "mature protein" boundaries. Could everything else be ignored? Maybe not. In my experience, you never see "signal peptide" sequence, but you do sometimes see "propeptide" sequence. In the end, you can search all parts of the protein and just treat the boundaries as hard boundaries. I agree that including a term for init met makes it easier to annotate a sequence without making a hard-coded assumption (if (mature_protein.start == 2) then position_1 = init_met). Maybe we can voice a few more opinions and then have a vote...
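The difference between an explicit term and the hard-coded assumption could look roughly like the Python sketch below. The tuple layout, the function name `find_initiator_met`, and the matching on the term name are assumptions for illustration; the accession PEFF:000nnnn is the placeholder used in this thread, not an assigned identifier.

```python
def find_initiator_met(sequence, processed):
    """Report how an initiator methionine is known, if at all.

    processed holds (start, end, accession, name) tuples, with 1-based
    inclusive positions. Returns "explicit", "inferred", or None.
    """
    # Preferred: an explicit annotation on position 1.
    for start, end, _accession, name in processed:
        if name == "initiator methionine" and start == 1 and end == 1:
            return "explicit"
    # Fallback: the hard-coded assumption that a mature protein starting
    # at position 2, preceded by M, implies a removed initiator methionine.
    for start, _end, _accession, name in processed:
        if name == "mature protein" and start == 2 and sequence[:1] == "M":
            return "inferred"
    return None


mature_only = [(2, 6, "PEFF:0001020", "mature protein")]
with_term = mature_only + [(1, 1, "PEFF:000nnnn", "initiator methionine")]
print(find_initiator_met("MELVIS", mature_only))  # relies on the assumption
print(find_initiator_met("MELVIS", with_term))    # reads the annotation
```

An annotator working from the explicit term never reaches the fallback branch, which is the inference step the comment above would like to avoid.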
With all the comments above, I would also support a \Processed=(1|1|PEFF:nnn|initiator methionine) term. |