-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add stability calculation #105
Conversation
This is a rough heuristic method for stability calculations. It is fast & intuitive to calculate but it may not provide the most elegant measure and it makes many simplifying assumptions. It can be useful for exploration, and it can provide a baseline for more advanced measures. It would be good to mention these reservations in the documentation. Reference should be calculated per feature if there is some idea how that should be done. By default, there is no information how reference point should be defined. In such case, median of the entire data matrix is as good as any other value. A good feature-specific reference is the median value. That seemed to approximate the unstable region in our experiments with HITChip Atlas in 2014 and before. User should be able to define the reference point per feature. This can also help to scan the stability landscape for rough differences regarding this. Left and right stability: the alternative basins of attraction might have different strengths. It is potentially interesting to see whether there is a difference between the directionality of stability. It can be ignored if there is a need to simplify but it can be interesting on its own right in some cases. Or one could calculate both left, right, and combined. |
Now one can also provide own reference values (single value or value for each feature). However, there is a problem:
It could be added but requires more thinking. Perhaps the easiest solution is to allow these reference samples through rowData. The biggest problem here is probably the added complexity from user-perspective. TODO:
|
Alright, now it should work so that user can also specify reference samples for each system separately, i.e.:
... although, not sure if there is any use-case for this,. But good thing is that it did not add much complexity to code, just couple of lines |
TODO:
|
This PR is now ready. |
Related to this: #24 and #63
This PR implements draft for stability analysis. I tried to follow this publication in the implementation:
Lahti, L., Salojärvi, J., Salonen, A. et al. Tipping elements in the human intestinal ecosystem. Nat Commun 5, 4344 (2014). https://doi.org/10.1038/ncomms5344
This is from the current documentation of the implementation in this PR:
The stability is calculated by first defining the reference point,$(R_{f})$ ,
for each feature. This is defined by taking the mean of the range of values:
where$(f)$ denotes a single feature and $(x)$ is the abundance.
Then, the difference between consecutive time points,$(\Delta_{f, x})$ ,
and the difference between the previous time point and the reference,$(\Delta_{f, R})$ , are calculated:
where$(t)$ denotes the time point.
Optionally, the time difference$(\Delta_{f, t})$ is calculated:
The stability coefficient$(s_{f})$ is calculated with the correlation:
Alternatively, the stability coefficient can be calculated using a linear model as follows:
microbiome::intermediate_stability.R seems to calculate global reference point, i.e, the same reference point is used for all features:
https://github.com/microbiome/microbiome/blob/497576a6380bbe2334c09086ae9fcc374da59fc7/R/intermediate_stability.R#L161
Is this meaningful as the abundances can differ for each feature? Should the reference be calculated for each feature individually (as the function in this PR is doing)?
The function in this PR does not yet allow user to specify reference point; should user be able to do that (as in
microbiome::intermediate_stability
) or can/should we standardize this?Moreover,
microbiome::intermediate_stability
seems to calculate the stability separately for values that are under and over the reference value: https://github.com/microbiome/microbiome/blob/497576a6380bbe2334c09086ae9fcc374da59fc7/R/intermediate_stability.R#L196What is the justification of this, should we implement this also?
@antagomir