Locate public URL for gencode gtf file #3

Open
jpdna opened this issue Oct 1, 2016 · 2 comments

@jpdna
Contributor

jpdna commented Oct 1, 2016

The file gencode.v19.annotation.gtf.gz from solveBio S3 turns out not to be identical to:

ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_19/gencode.v19.annotation.gtf.gz

We need to determine whether this file was post-processed after download, or whether a matching version is available for download from the gencode project.

@dandanxu could you take a look at this?
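
One way to check whether the two files really differ in content, rather than only in their gzip wrappers, is to hash the decompressed bytes. The sketch below is only an illustration: the function name `sha256_of_gzip_payload` is hypothetical, and it assumes both files have already been downloaded to local paths.

```python
import gzip
import hashlib


def sha256_of_gzip_payload(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the decompressed contents of a .gz file.

    Hashing the payload instead of the raw file ignores gzip header
    differences (embedded file name, timestamp, compression level).
    """
    digest = hashlib.sha256()
    with gzip.open(path, "rb") as handle:
        # Read in chunks so a large annotation file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the two digests match, the files carry the same annotation data and only the compression differs. If they do not match, a line-level comparison is needed to see what changed.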

@dandanxu
Member

dandanxu commented Oct 3, 2016

Looking into this.

@dandanxu
Member

dandanxu commented Oct 3, 2016

To be honest, I'm still not sure. It looks like the spacing is occasionally different, but otherwise the data looks the same. In any case, I don't think this matters too much: we don't actually use these files anywhere, since we stopped actively supporting Ensembl transcripts in favor of RefSeq transcripts (at one point we supported both, but we haven't for quite a while, and it won't work without a bit of tweaking). Focusing on RefSeq is probably good for now, since that's the harder case, and hopefully UTA integration will take care of Ensembl =).
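
A rough way to confirm that spacing is the only difference is to compare the two files line by line after collapsing whitespace. This is a minimal sketch, not the check that was actually run here: the function names are hypothetical, it assumes both files are local gzipped text, and it treats tabs and spaces as interchangeable, which is looser than the GTF format itself.

```python
import gzip
from itertools import zip_longest


def normalize(line):
    # Collapse every run of whitespace (tabs, spaces, newline) to a single space.
    return " ".join(line.split())


def first_real_difference(path_a, path_b):
    """Return (line_number, line_a, line_b) for the first pair of lines that
    differ beyond whitespace, or None if the files match after normalization.
    """
    with gzip.open(path_a, "rt") as file_a, gzip.open(path_b, "rt") as file_b:
        pairs = zip_longest(file_a, file_b)  # pads the shorter file with None
        for number, (line_a, line_b) in enumerate(pairs, start=1):
            if line_a is None or line_b is None:
                return number, line_a, line_b  # one file has extra lines
            if normalize(line_a) != normalize(line_b):
                return number, line_a, line_b
    return None
```

A result of None would support the "spacing only" reading. Any returned tuple points at the first line worth inspecting by hand, including header comment lines, which this sketch does not skip.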
