Transcript analysis (Grch37/38) - Log #31
Comments
This is great! Thank you so much! |
Thanks again @zhx828 ! A few questions
|
If you look at the *_info.txt files, the genes that do not have a protein length do not have pfam data either, so we only need to look at one of the two.
I just pulled the genes from the portal and ran them through both versions to see whether these Entrez genes are in cBioPortal. I didn't compare between 37 and 38, though; they might be identical. I think this is mainly for Ramya to finalize the portal gene table.
I did not check y, only x, to see whether they are the same. |
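For anyone who wants to double-check the "no protein length means no pfam" observation, here is a rough sketch. It assumes the *_info.txt files are tab-separated with a header row; the column names `hugo_symbol`, `protein_length` and `pfam` are placeholders I made up, not the real headers, so adjust them to whatever the files actually contain.

```python
import csv
import io


def inconsistent_genes(tsv_text, gene_col="hugo_symbol",
                       length_col="protein_length", pfam_col="pfam"):
    """Return genes where exactly one of protein length / pfam is missing.

    Column names are hypothetical defaults, not the real file headers.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    bad = []
    for row in reader:
        # Treat empty or whitespace-only cells as missing.
        has_length = bool((row.get(length_col) or "").strip())
        has_pfam = bool((row.get(pfam_col) or "").strip())
        if has_length != has_pfam:
            bad.append(row[gene_col])
    return bad


# Made-up rows purely for illustration.
sample = (
    "hugo_symbol\tprotein_length\tpfam\n"
    "GENE_A\t100\tPF00001\n"
    "GENE_B\t\t\n"
    "GENE_C\t50\t\n"
)
print(inconsistent_genes(sample))
```

If the observation holds for a real file, the returned list should be empty, which would confirm that checking one of the two columns is enough.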
Thanks so much @zhx828 !
I see, so there are a few corner cases where the length might not be the same even though the id matches. Yeah, it's weird 🙂. So it's good to check whether the length matches as well, at least for OncoKB-annotated genes for starters. |
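A minimal sketch of that extra check, assuming each source can be reduced to a mapping from gene symbol to a `(transcript_id, protein_length)` pair. The function name, the data layout and the sample values below are all made up for illustration and are not taken from Genome Nexus or OncoKB.

```python
def length_mismatches(genome_nexus, oncokb):
    """List genes whose transcript id matches but whose protein length differs.

    Both arguments are hypothetical dicts: gene -> (transcript_id, protein_length).
    """
    mismatches = []
    # Only genes present in both sources can be compared.
    for gene in sorted(genome_nexus.keys() & oncokb.keys()):
        gn_id, gn_len = genome_nexus[gene]
        ok_id, ok_len = oncokb[gene]
        if gn_id == ok_id and gn_len != ok_len:
            mismatches.append((gene, gn_id, gn_len, ok_len))
    return mismatches


# Placeholder ids and lengths, not real transcripts.
gn = {"GENE_A": ("TX_1", 300), "GENE_B": ("TX_2", 450), "GENE_C": ("TX_3", 120)}
ok = {"GENE_A": ("TX_1", 300), "GENE_B": ("TX_2", 452), "GENE_C": ("TX_9", 120)}
print(length_mismatches(gn, ok))
```

Genes whose transcript ids differ are skipped here on purpose, since those would already show up in a plain id comparison.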
Cool, will do. Thanks! |
This is related to genome-nexus/genome-nexus#306 |
biomart mapping file: genes that do not have an Entrez ID
genes not in cBioPortal
hugo symbol does not match with cBioPortal
Problem (mismatch) transcripts
problem_transcripts.txt
gene protein length check (genes without a protein length do not have pfam, and vice versa)
OncoKB issues
The good thing is that both 37 and 38 are using the same transcript.
But there are still two issues:
grch37_mismatch_gn_oncokb.txt