Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow static covariates in BGNBDModel #1390
base: main
Are you sure you want to change the base?
Allow static covariates in BGNBDModel #1390
Changes from 21 commits
5f567eb
1897d62
042780e
bc83ddc
f0cb25e
59d6826
b33b56b
e74745c
ffced16
02b7c11
dc7be23
dedb36e
91d0168
4af6482
28c15c0
6ef492f
77dad3e
857c885
f796170
3877662
2e83e3a
8902431
187176a
b056f7a
d71d2b5
74ae6a1
cf8d67a
a8c1328
6186be0
d13da11
411df2f
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add an in-line citation for this reference in the top-level of the docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the
gamma1
suffix being used here?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible to use model coordinates instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we change these
_gamma%
suffixes to_alpha
and_beta
? Gamma is a confusing term because it pertains to the purchasing process in the research.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to follow the convention here
. There is no
beta
, buta
andb
(I can call them coefficient_a, and coefficient_b if you like)purchase_coefficient
,dropout_coefficient
to follow the implementation inParetoNBDModel
. But then thegamma2
andgamma3
coefficients must be equal. This is in fact how the implementation in R's CLVTools is done btw.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh my mistake - I meant
_a
and_b
.Are you saying CLVTools fixes these coefficients to be equal to each other? They share the same data, but this doesn't seem to align with the research note.
Also, which implementation is not helping with convergence?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is indeed the case. See here and here
Using the same implementation as CLVTools, fixing gamma2=gamma3. It did not help with the test issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My best guess as to why the CLVTools developers did this was for easier interpretability and/or to speed up model fits. What's weird is why they still went with it if convergence is negatively impacted. This could be a good selling point for
pymc-marketing
compared to other open-source tools.Explaining covariate impacts on overall dropout in terms of separate
a
andb
coefficients will be tricky, but not impossible. Generally ifa >b
, thenp
increases, and vice-versa. The greater both values are, the narrower the distribution.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you tested the phi/kappa hierarchical pooling for
a_scale
andb_scale
with covariates? Any major differences in results and/or fit times compared to specifying a prior for these params?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can play with it in #1430
docs/source/notebooks/clv/dev/bg_nbd_covariates.ipynb
. By removing the custom prior, and usingdefault_model_config
as in:It works. The model fits, but we get quite biased estimates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So specifying
a_prior
andb_prior
in the model config is recommended when using covariates? We should probably mention this somewhere.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is better if you introduce sensible priors with small dataset (which is the case in #1430) than the default
kappa
,gamma
priors. I did not try different priors forkappa
,gamma
.I would differ advising on particular priors until we study default priors in the newly opened #1496 issue.
Check warning on line 280 in pymc_marketing/clv/models/beta_geo.py
pymc_marketing/clv/models/beta_geo.py#L280
Check warning on line 286 in pymc_marketing/clv/models/beta_geo.py
pymc_marketing/clv/models/beta_geo.py#L286
Check warning on line 289 in pymc_marketing/clv/models/beta_geo.py
pymc_marketing/clv/models/beta_geo.py#L289
Check warning on line 292 in pymc_marketing/clv/models/beta_geo.py
pymc_marketing/clv/models/beta_geo.py#L292
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See previous comment.
This hierarchical pooling conditional block may be worth its own internal method because it appears in several models, but we can leave that as a separate PR.
Check warning on line 296 in pymc_marketing/clv/models/beta_geo.py
pymc_marketing/clv/models/beta_geo.py#L296
Check warning on line 299 in pymc_marketing/clv/models/beta_geo.py
pymc_marketing/clv/models/beta_geo.py#L299
Check warning on line 303 in pymc_marketing/clv/models/beta_geo.py
pymc_marketing/clv/models/beta_geo.py#L303
Check warning on line 306 in pymc_marketing/clv/models/beta_geo.py
pymc_marketing/clv/models/beta_geo.py#L306
Check warning on line 312 in pymc_marketing/clv/models/beta_geo.py
pymc_marketing/clv/models/beta_geo.py#L312
Check warning on line 320 in pymc_marketing/clv/models/beta_geo.py
pymc_marketing/clv/models/beta_geo.py#L320