
Vv ensembl dev susmi #660

Open
wants to merge 82 commits into
base: develop

Conversation

Peter-J-Freeman
Collaborator

@John-F-Wagstaff. The main part of this commit is handling variants like NW_011332691.1(NM_012234.6):c.335+1G>C. See issue #657.

This transcript is on the reverse strand and is working. I have added in some tests too.
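
For readers unfamiliar with this description style, the variant above names a genomic reference sequence, then the transcript in parentheses, then a `c.` position with an intronic offset (`335+1`). The following is a minimal sketch of splitting such a string into its parts. It is not VariantValidator's parser; the `split_description` function, the `SplitDescription` type, and the simplified accession pattern are assumptions made for illustration.

```python
import re
from typing import NamedTuple, Optional

# Simplified accession pattern (assumption): two letters, underscore, digits, version.
_ACCESSION = r"[A-Z]{2}_\d+\.\d+"

_PATTERN = re.compile(
    rf"^(?:(?P<genomic>{_ACCESSION})\((?P<tx_inner>{_ACCESSION})\)|(?P<tx_bare>{_ACCESSION}))"
    rf":(?P<kind>[cn])\.(?P<posedit>.+)$"
)


class SplitDescription(NamedTuple):
    genomic_ref: Optional[str]  # None when no genomic reference is given
    transcript_ref: str
    kind: str                   # "c" or "n"
    posedit: str                # position and edit, left unparsed


def split_description(description: str) -> SplitDescription:
    """Split 'GENOMIC(TRANSCRIPT):c.POSEDIT' or 'TRANSCRIPT:c.POSEDIT'."""
    match = _PATTERN.match(description.strip())
    if match is None:
        raise ValueError(f"Unrecognised description: {description!r}")
    transcript = match["tx_inner"] or match["tx_bare"]
    return SplitDescription(match["genomic"], transcript, match["kind"], match["posedit"])


parts = split_description("NW_011332691.1(NM_012234.6):c.335+1G>C")
print(parts.genomic_ref, parts.transcript_ref, parts.posedit)
```

The sketch only separates the components; mapping the intronic offset onto the genomic reference, including the reverse-strand handling mentioned above, is outside its scope.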


gitguardian bot commented Dec 5, 2024

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
| --- | --- | --- | --- | --- |
| 1081033 | Triggered | Generic Password | d08f447 | db_dockerfiles/vvta/Dockerfile |
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn here the best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

Peter-J-Freeman and others added 24 commits December 5, 2024 14:30
…3 prime UTRs in uncertain positions for the LOVD paper
This is used internally for validation as well as when presenting VCF
data to users.
We attempt to push gap variants both left and right when validating.
This used to result in up to 50 separate SeqRepo seq fetch calls; this
commit replaces them with a single 50-base SeqRepo call. In the cases
where we trigger gap movement attempts but only attempt 1 move,
micro-benchmarking puts the 50bp fetch at about 4% slower than the 1bp
fetch, but this commit will at least nearly halve the time spent in
SeqRepo seq fetch calls in every other case. The net effect on our input
tests is a ~22% decrease in time taken to run, so this is probably a net
gain.
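
The trade-off described in this commit message can be sketched as follows: fetch one window of bases up front and index into it, rather than issuing one fetch per base. The `CountingStore` class is a hypothetical stand-in that only counts calls; it is not the SeqRepo API, and the function names are assumptions.

```python
class CountingStore:
    """Hypothetical stand-in for a sequence store; records how often it is hit."""

    def __init__(self, sequence: str) -> None:
        self.sequence = sequence
        self.calls = 0

    def fetch(self, start: int, end: int) -> str:
        # Half-open, 0-based interval, purely for this illustration.
        self.calls += 1
        return self.sequence[start:end]


def bases_one_call_each(store: CountingStore, start: int, length: int) -> list:
    """Old pattern: one fetch per base while walking along the sequence."""
    return [store.fetch(pos, pos + 1) for pos in range(start, start + length)]


def bases_single_window(store: CountingStore, start: int, length: int) -> list:
    """New pattern: one fetch for the whole window, then plain indexing."""
    window = store.fetch(start, start + length)
    return list(window)


per_base_store = CountingStore("ACGT" * 25)
window_store = CountingStore("ACGT" * 25)
per_base = bases_one_call_each(per_base_store, 10, 50)
windowed = bases_single_window(window_store, 10, 50)
print(per_base == windowed, per_base_store.calls, window_store.calls)
```

The window version pays for all 50 bases even when only the first is needed, which matches the small slowdown the commit message reports for the single-move case.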
This reduces the calls to parse hgvs objects from strings in mappings,
by using the existing versions, where they exist already. First pass for
improving efficiency/performance of genomic to transcript and transcript
to genome mappings.
This will allow us to simplify the logic where we use report_hgvs2vcf
to generate output later.
We used to call report_hgvs2vcf once per valid mapping genome. This
change reduces this to just once and simplifies the logic around the
genomic variant mapping output at the same time. Normally this will
halve the number of report_hgvs2vcf calls.
This also includes features to use existing validated VCF data when
provided.
Don't double validate the same variant when doing fallback for RSG
mapping recovery during hgvs to vcf conversion.
This patch reduces parsing hgvs objects from text by creating them
directly from the available parts in the hgvs_to_delins_hgvs function
in hgvs_utils, replacing the vcfcp_to_hgvsstr function with a new
vcfcp_to_hgvs_obj function.
Add a new hgvs_delins_parts_to_hgvs_obj function and use it instead of
parse for the hgvs delins creation inside hgvs_utils.py. This is
particularly used in hgvs<->vcf functions.
This avoids going from object to text only to then go back to object
output almost all the time, allowing us to skip the unnecessary parser
step.
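
As a rough illustration of the object-from-parts idea in the commit messages above, the sketch below compares building a delins record directly from its parts with formatting a string and parsing it back. The `SimpleDelins` class and both functions are hypothetical and self-contained; they are not the hgvs library's classes or the helpers named in these commits.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleDelins:
    """Hypothetical, simplified delins record (not an hgvs library object)."""
    ac: str
    seq_type: str
    start: int
    end: int
    ref: str
    alt: str

    def __str__(self) -> str:
        pos = str(self.start) if self.start == self.end else f"{self.start}_{self.end}"
        return f"{self.ac}:{self.seq_type}.{pos}del{self.ref}ins{self.alt}"


_DELINS = re.compile(
    r"^(?P<ac>[^:]+):(?P<type>[gcn])\.(?P<start>\d+)(?:_(?P<end>\d+))?"
    r"del(?P<ref>[ACGT]*)ins(?P<alt>[ACGT]+)$"
)


def parse_delins(text: str) -> SimpleDelins:
    """Slow path: recover the parts from text with a regular expression."""
    m = _DELINS.match(text)
    if m is None:
        raise ValueError(f"Not a simple delins: {text!r}")
    start = int(m["start"])
    end = int(m["end"]) if m["end"] else start
    return SimpleDelins(m["ac"], m["type"], start, end, m["ref"], m["alt"])


def delins_from_parts(ac, seq_type, start, end, ref, alt) -> SimpleDelins:
    """Fast path: the parts are already known, so no text round trip is needed."""
    return SimpleDelins(ac, seq_type, start, end, ref, alt)


direct = delins_from_parts("NC_000001.11", "g", 100, 101, "AC", "T")
round_trip = parse_delins(str(direct))
print(direct == round_trip)
```

Both paths yield the same record; the direct path simply skips the formatting and parsing work, which is the saving these commits aim for.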
Upgrade delins creation in gapped_mapping.py by using helper functions
to directly create delins hgvs objects, as opposed to parsing the
objects from text, which has performance costs.
This function, hgvs_obj_from_existing_edit, is intended for when we are
re-creating a variant with a new location but the same edit.
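
A minimal sketch of the "new location, same edit" pattern, assuming simple immutable records. The `Edit` and `Variant` classes and the `variant_from_existing_edit` function are hypothetical; the real hgvs_obj_from_existing_edit signature is not shown in this conversation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Edit:
    """Hypothetical edit record: reference and alternate bases."""
    ref: str
    alt: str


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record: accession, interval, and edit."""
    ac: str
    start: int
    end: int
    edit: Edit


def variant_from_existing_edit(variant: Variant, ac: str, start: int, end: int) -> Variant:
    """Return a copy at a new location that reuses the existing edit object."""
    return replace(variant, ac=ac, start=start, end=end)


original = Variant("NM_000000.0", 100, 101, Edit("AC", "T"))  # made-up accession
moved = variant_from_existing_edit(original, "NC_000000.0", 205, 206)
print(moved.edit is original.edit)
```

Because the edit is carried over unchanged, nothing about it needs to be re-derived or re-parsed when only the coordinates move.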
Reduce the re-parsing of variants into hgvs delins objects via strings
in vvMixinConverters.py; use hgvs_delins_parts_to_hgvs_obj instead.
John-F-Wagstaff and others added 30 commits March 2, 2025 21:21
We need slightly different bracket handling on predicted proteins, and
different handling on */Ter= too. This allows us to fix the issue
centrally rather than re-applying the edits in multiple places.
As yet unused, needed for object handling of protein variant mappings.
Expanded repeat formatting needs to happen before the final hgvs string
to hgvs object conversion step. Move it in preparation for making this
conversion happen only once, inside the initial format conversion.
Input formatting that depends on having a text quibble from user input
needs to be completed before we parse into a hgvs object. Prepare for
centralising input parsing by moving this kind of formatting before the
intended hgvs string->hgvs object parsing point (at the end of the
initial_format_conversions function).
Methyl syntax handling needs to happen before the first conversion into
a hgvs object as the parser does not currently handle it.
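
One way to picture this ordering constraint: split any methylation annotation off the text first, parse the remainder, and carry the annotation alongside. The sketch below assumes the HGVS-style suffixes `|gom`, `|lom`, and `|met=`; the `split_methylation` function and the example coordinates are made up for illustration and are not taken from this PR.

```python
from typing import Optional, Tuple

# Assumed set of methylation suffixes (HGVS-style epigenetic annotations).
METHYL_SUFFIXES = ("|gom", "|lom", "|met=")


def split_methylation(description: str) -> Tuple[str, Optional[str]]:
    """Return (description without suffix, methylation tag or None)."""
    for suffix in METHYL_SUFFIXES:
        if description.endswith(suffix):
            # Drop the leading "|" from the tag that is returned.
            return description[: -len(suffix)], suffix[1:]
    return description, None


# Illustrative coordinates only.
base, tag = split_methylation("NC_000011.10:g.1000_2000|gom")
print(base, tag)
```

The remaining `base` string can then go to a parser that knows nothing about methylation, with `tag` stored separately on the resulting object.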
Harden some initial input handling functions to cope with objects
instead of strings, and add improved type aware posedit parsing.
Add required variables to handle methylation to the VVPosEdit, as well
as a simple PosEdit to VVPosEdit function.
2 participants