-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
V0.4.2 #61
Conversation
zepid/causal/gformula/TimeFixed.py
Outdated
p=g[self._weights]/np.sum(g[self._weights])) | ||
else: # Conditional probabilities based on eval() | ||
# TODO Checking conditionals are mutually exclusive | ||
if self._weights is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I squint as these two pieces of code, they are about identical. I don't know if you have additional plans for these branches, but what about:
treated = []
for c, prop in zip(conditional, p):
gs = g.loc[eval(c)].copy()
p = gs[self._weights] / np.sum(gs[self._weights]) if self._weights is not None else None
tr = np.random.choice(gs.index, size=int(prop*gs.shape[0]), replace=False, p=p)
treated.extend(tr)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto with the conditional above too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh good call. I remember now that is how lifelines
is set up. It is much more condensed and avoids my needless repitition
zepid/causal/gformula/TimeFixed.py
Outdated
if self._weights is None: | ||
treated = [] | ||
for c, prop in zip(conditional, p): | ||
gs = g.loc[eval(c)].copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some thoughts on this: it's going to fail if your conditional doesn't use the magic "g"
variable. Maybe you want that? It's not obvious to the user though. An option to consider, if you haven't already, is the native query
method in Pandas. That has a) documentation that people can rely on b) probably some cleaner error handling/debugging for users.
There could be a even better solution that doesn't involve passing in strings, but nothing is obvious to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah so the magic-g is something baked into custom treatments for both TimeFixedGFormula
and TimeVaryGFormula
but I agree that it is not immediately obvious. It can get confusing in TimeVaryGFormula
(see the Keil et al 2014 replication for how the magic-g is consistently used across the process).
query
looks to be what I am looking for. I was previously unaware of its existence. I think that is useful for what I am doing
There could be a even better solution that doesn't involve passing in strings, but nothing is obvious to me.
I haven't found one especially since I would like to retain maximal flexibility in the potential interventions that could be used in the g-formula. For example, you should be able to specify a treatment that only is applied to females at least 30 years old.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've opened #63 for this issue. As it currently stands, all the custom treatments/stochastic treatments use this functionality. I plan on looking into better options. My bandage for now is to make the documentation better and try to handle errors better. pd.query
does not return exactly what I need. pd.eval
looks promising though
This would be updated as part of a more major update (maybe 0.5.0 if pd.eval
is what I want)
…edGFormula.fit_stochastic()
…llows for categorical variables
Standardized differences now calculate for all variables #59. However, the online docs and examples need to be updated.... |
Build failure is due to table 1 generator function. Not related to IPTW or other updates |
Updating to version 0.4.2
TMLE
calculates all implemented measures (RD, RR, OR) at once. Saves computation time if the user wants multiple measures and is fitting a time-intensive ML algorithm TMLE additions #39Fixing
standard_mean_difference
for categorical variables. Will need to use patsy to parse out categorical variables viaC(...)
, otherwise "I" cannot know which variables are categorical Update standardized difference calculator #59Adding stochastic treatments for binary exposures for
TimeFixedGFormula
viafit_stochastic()
Allows for both unconditional and conditional probablities Stochastic Treatment g-formula #58Add continuous outcomes todelaying this again... Support for continuous treatments in TimeFixedGFormula #49TimeFixedGFormula
but this time actually get it to work as intended. Need to think through the set up for the two potential interventions (threshold vs. shift)Also learn ReadTheDocs for the Reference papge (still not doing what I hoped it would)