Commit

issue #79 - make fake transcripts for RefSeq MT
davmlaw committed Aug 3, 2024
1 parent 40d9423 commit 6d037a0
Showing 2 changed files with 133 additions and 0 deletions.
5 changes: 5 additions & 0 deletions generate_transcript_data/gff_parser.py
@@ -322,6 +322,11 @@ def _get_transcript_accession(feature, version_key) -> Optional[str]:
             else:
                 # print(f"warning: Couldn't get out {version_key} from {feature.type=} {feature.attr=}")
                 transcript_accession = transcript_id
+        else:
+            # In RefSeq there are no transcript_ids for MT genes/mRNAs
+            # The proteins have "YP_" prefix (no corresp. NM_ transcript, so we will create fake ones
+            if feature.type == "mRNA":
+                transcript_accession = "fake-" + feature.attr.get("ID")
         return transcript_accession

     @staticmethod
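
The fallback added in this hunk can be sketched on its own as follows. This is a minimal illustration, not the repository's code: the function name `fake_mt_transcript_accession`, the `SimpleNamespace` stand-ins for parsed GFF features, and the ID value `rna-ND1` are all hypothetical. It also assumes a guard for a missing `ID` attribute, which the committed line does not have (there, `"fake-" + None` would raise a `TypeError`).

```python
from types import SimpleNamespace
from typing import Optional


def fake_mt_transcript_accession(feature) -> Optional[str]:
    """Build a placeholder accession for a feature that has no transcript_id.

    Hypothetical helper mirroring the new else-branch in the hunk above.
    """
    if feature.type != "mRNA":
        return None  # only mRNA features receive a placeholder accession
    feature_id = feature.attr.get("ID")
    if feature_id is None:
        return None  # assumption: skip rather than concatenate with None
    return "fake-" + feature_id


# Made-up features for illustration; real ones would come from the GFF parser.
mrna = SimpleNamespace(type="mRNA", attr={"ID": "rna-ND1"})
gene = SimpleNamespace(type="gene", attr={"ID": "gene-ND1"})

print(fake_mt_transcript_accession(mrna))
print(fake_mt_transcript_accession(gene))
```

Returning `None` for non-mRNA features matches the `Optional[str]` return type shown in the hunk header; whether the real method initialises `transcript_accession` to `None` before this branch is not visible in the diff.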
