gamma b21 QC #1669

Open
wants to merge 10 commits into
base: main

clararehmann
Contributor

@clararehmann clararehmann commented Jan 23, 2025

Addresses #1649

I've replicated the Gamma DFE from Booker et al. (2021), using the shape and mean parameters from Table S2 and basing the neutral:negative proportion on the transition:transversion ratio in Table S1.

The authors use polyDFE to estimate the scaled selection coefficient:

Using polyDFE, the DFE is estimated in terms of the scaled selection coefficient for deleterious mutations, 2N_e s_d, where s_d is the reduction in fitness experienced by an individual homozygous for the mutation (which is assumed to be semi-dominant).

I divided the scaled mean of the gamma distribution by the effective population size quoted in the paper (420,000) to obtain 2s_d, then divided by two to get the s coefficient used for homozygotes in SLiM.
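The conversion described above can be sketched as follows. This is only an illustration of the arithmetic as stated in this comment, not the code in the PR: the function name `scaled_mean_to_slim_s` is hypothetical, the input value is made up rather than taken from Table S2, and the sign flip assumes the mean is stored as a negative number because the mutations are deleterious.

```python
def scaled_mean_to_slim_s(scaled_mean: float, n_e: float) -> float:
    """Convert a mean scaled coefficient (2 * Ne * s_d) into a mean s.

    Follows the two steps described in the comment: dividing by Ne gives
    2 * s_d, and halving that gives s_d. The result is returned as a
    negative number (an assumption of this sketch) because the mutations
    are deleterious.
    """
    if n_e <= 0:
        raise ValueError("n_e must be positive")
    two_s_d = scaled_mean / n_e  # (2 * Ne * s_d) / Ne = 2 * s_d
    s_d = two_s_d / 2            # halve to get s_d
    return -abs(s_d)


# Illustrative input only (NOT the Table S2 value): 2 * Ne * s_d = 840,
# with the effective population size of 420,000 quoted in the paper.
example_s = scaled_mean_to_slim_s(840.0, 420_000)
print(example_s)
```

Keeping the conversion in one small function makes it easy to recompute the value if the effective population size or the reported mean is revised.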

@clararehmann
Contributor Author

It seems like the tests are failing because I included more significant figures. Is there a standard for how many to use? I just copied and pasted the values from the provided supplementary tables.

@clararehmann clararehmann marked this pull request as ready for review January 23, 2025 01:49

codecov bot commented Jan 23, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.85%. Comparing base (237d101) to head (7a7adfc).

@@           Coverage Diff           @@
##             main    #1669   +/-   ##
=======================================
  Coverage   99.85%   99.85%           
=======================================
  Files         136      136           
  Lines        4690     4702   +12     
  Branches      470      470           
=======================================
+ Hits         4683     4695   +12     
  Misses          3        3           
  Partials        4        4           


@petrelharp
Contributor

Ah, I see: the second one is yours.

FAILED tests/test_dfes.py::TestMusMusGamma_B21::test_mutation_types_match - AssertionError: assert False
 +  where False = <function allclose at 0x7fc47dd367b0>([-0.0596, 0.186], [-0.05957688452380952, 0.18617976])

We don't have a standard. In general I'd say "as many digits as they report". But the third digit or so here is not going to make a difference, so what they had isn't wrong. I'd say change the main implementation to agree with yours, after checking that they agree up to rounding. The other consideration is whether the catalog looks ugly, but I think the values printed there are rounded? If not, maybe we should round?
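A quick way to check "agree up to rounding" for the two value lists in the failure above is sketched below. It uses a pure-Python stand-in for the elementwise test `|a - b| <= atol + rtol * |b|` with NumPy's default tolerances (`rtol=1e-05`, `atol=1e-08`); the helper names `allclose` and `round_sig` and the variable names are hypothetical, and rounding to three significant figures is an assumption based on the shorter values in the log.

```python
def allclose(a, b, rtol=1e-05, atol=1e-08):
    """Pure-Python stand-in for the check |a - b| <= atol + rtol * |b|."""
    return len(a) == len(b) and all(
        abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b)
    )


def round_sig(x, sig=3):
    """Round x to `sig` significant figures."""
    return float(f"{x:.{sig}g}")


# The two lists from the failing assertion in the test log.
rounded_values = [-0.0596, 0.186]
full_precision_values = [-0.05957688452380952, 0.18617976]

# With the default tolerances the lists differ in the third or fourth digit.
strict_match = allclose(rounded_values, full_precision_values)

# After rounding the longer values to three significant figures they agree.
match_after_rounding = allclose(
    rounded_values, [round_sig(v) for v in full_precision_values]
)

print(strict_match, match_after_rounding)
```

If the rounded comparison passes, the two implementations differ only in reported precision and either side can be updated to match the other.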

@petrelharp
Contributor

That's not the issue now:

=========================== short test summary info ============================
FAILED tests/test_dfes.py::TestMusMusGamma_B21::test_proporitions_match - assert False
 +  where False = <function allclose at 0x104c3b1b0>([0.334, 0.666], [np.float64(0.772), np.float64(0.228)])
 +    where <function allclose at 0x104c3b1b0> = np.allclose
===== 1 failed, 2601 passed, 53 skipped, 20 warnings in 278.46s (0:04:38) ======

@petrelharp
Contributor

If you look at your parameters again and still don't agree with what the original implementer put in, then ping them here (and maybe elsewhere too) to consult?

@clararehmann
Contributor Author

Okay, I figured out where the 0.33:0.66 ratio of neutral to negative mutations comes from: it's in the way the original code parses DFE info, at line 74 of https://github.com/TBooker/MuridRodentProject/blob/master/bin/SlimFunctions.py:

## Now, we read in a file that contains the DFE information for each of the genomic elements featured in your BED file
## This parser assumes that there are only deleterious mutations and that all the DFE are gamma distributions
## The parse also assumes that there are 1/3 neutral sites in the CDS class. However, this is a variable and you can change it if you wish

So I'll fix that up in my QC and things should be good!
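As a rough illustration of how the "1/3 neutral sites" assumption maps onto the proportions in the failing test, here is a minimal sketch. The function name `cds_proportions` is hypothetical, the `[neutral, deleterious]` ordering and the tolerance of 1e-3 are assumptions chosen for this example, and none of this is the parser's actual code.

```python
import math


def cds_proportions(neutral_fraction=1 / 3):
    """Return [neutral, deleterious] proportions for a given neutral fraction."""
    if not 0 <= neutral_fraction <= 1:
        raise ValueError("neutral_fraction must be between 0 and 1")
    return [neutral_fraction, 1 - neutral_fraction]


def close_enough(a, b, abs_tol=1e-3):
    """Elementwise comparison with a loose absolute tolerance."""
    return len(a) == len(b) and all(
        math.isclose(x, y, abs_tol=abs_tol) for x, y in zip(a, b)
    )


props = cds_proportions()

# The rounded pair from the test log is consistent with a 1/3 neutral fraction.
matches_one_third = close_enough(props, [0.334, 0.666])

# The other pair from the log is not, which is the mismatch the test reported.
matches_other = close_enough(props, [0.772, 0.228])

print(props, matches_one_third, matches_other)
```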


2 participants