-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature request: allow user to pass in a vector of values for the residual error in the gaussian outcomes case #76
Comments
@mikejacktzen: Thank you for your suggestion. It seems to me that the easiest way to do this would be to
Unfortunately, the second point (telling GPBoost to hold the error variance fixed and not estimate it) is currently not possible. This would require some software engineering. I might implement it in the future, but it is not at the top of my priority list. Contributions are welcome. As an alternative you might try my above-suggested approach (1.) without (2.) and check how large the estimated error variance is. Note that the \xi values themselves are anyway not estimated. Note on the above-suggested approach: even if you assume that xi is known exactly and that the only source of randomness are the random effects Zb, you usually need to add a small number to the diagonal of the random effects covariance matrix in order to ensure that the covariance matrix is "numerically" strictly positive definite and can be inverted with the Cholesky decomposition without numerical problems. And this approach is equivalent to setting the error variance to a very small known value. |
Thanks for the thoughtful reply. I understand to your agreeable points. Those numerical stability checks are always a fun surprise. I think I rushed my original post (excited about the possibility), instead of editing it (so it doesn't confuse other readers when they get to your reply), i'll continue here. My rushed OP confused xi (the residual terms) with the covariance term of its distribution. What I really mean is, for the normal outcomes case only, let the user provide the covariance matrix D of the residual errors xi y = F(X) + Zb + xi
Maybe this is the limitation your make in point (2) in your reply above, which sounds like what I need.
With this new corrected detail, do your points still apply? When xi ~ Normal(0,D) , Are the gpboost internal algorithms crucially reliant on working with D = scalar .* Identity, (how I understand the current set up) If not, I hope a easy next feature upgrade would allow for D = vector .* Identity. where the vector only has positive entries If so, I can continue this via email, with some very important use case contexts that I think is ripe for co authoring papers. |
Thanks for the clarification. Providing the covariance matrix of the error term instead of the error term itself makes much more sense. This could be implemented but requires some software engineering. I would be willing to work on this if the goal is a joint publication in a collaboration. Just write me an email. You can find my email address on my public university homepage. |
hi @fabsig
From the readme,
https://github.com/fabsig/GPBoost#gpboost-and-lagaboost-algorithms
Focusing on the specific case of gaussian outcomes, Using the convention
where xi is an independent error term and X are predictor variables (aka covariates or features).
Is it possible to enable a feature where the user could supply in the values of xi. Therefore xi is not estimated, the user passes in a vector of values 'xi = c(1, ...., 10)' where the length of xi is equal to the length of the outcome
as,
If you think this is do-able, I could elaborate further on the motivation
Thank you.
The text was updated successfully, but these errors were encountered: