Priors
GradVI can be used with a wide range of prior distributions. It is designed to obtain the prior family as an input module from the user. Before running any task, the user needs to specify which prior to use. They can use one of the priors already provided with the software, or they can define their own choice of prior.
A few prior distributions are already implemented in the software. The following sections describe how to use them and which options are available.
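The adaptive shrinkage (ash) prior was introduced by Matthew Stephens in this paper. It is defined as a scale mixture of normals, specifically,

$$p(b) = \sum_{k=1}^{K} w_k \, N(b \mid 0, \sigma_k^2), \qquad w_k \ge 0, \quad \sum_{k=1}^{K} w_k = 1.$$

Here, $w_k$ are the mixture proportions and $\sigma_k$ are the component scales. The grid of scales is fixed in advance and passed to the prior as `sk`: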
import numpy as np
from gradvi.priors import Ash
sk = np.abs(np.power(2.0, np.arange(20) / 20) - 1)
prior = Ash(sk)
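To make the scale mixture concrete, the following standard-library sketch rebuilds the same grid of scales and evaluates the density contributed by the non-zero components at a point. It assumes that `sk` holds component standard deviations and that the first entry, which is zero, acts as a point mass at zero. The helpers `scale_grid`, `normal_pdf` and `ash_slab_density`, and the uniform weights, are illustrative names and values, not part of GradVI.

```python
import math

def scale_grid(k=20, base=2.0):
    """Grid of component scales, using the same formula as the example above."""
    return [abs(base ** (i / k) - 1.0) for i in range(k)]

def normal_pdf(x, sd):
    """Density of N(0, sd^2) at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def ash_slab_density(b, wk, sk):
    """Density contributed by the components with non-zero scale.

    A zero scale is a point mass at zero and has no density, so it is skipped.
    Assumption: sk are standard deviations.
    """
    return sum(w * normal_pdf(b, s) for w, s in zip(wk, sk) if s > 0.0)

sk = scale_grid()
wk = [1.0 / len(sk)] * len(sk)  # uniform weights, purely for illustration
print(sk[0], round(sk[-1], 4))
print(ash_slab_density(0.1, wk, sk))
```

The grid starts at exactly zero and grows slowly, so most components describe small effects.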
There are several optional flags which can be used to modify the prior family, as discussed below in the respective subsections.
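Initialization. The solution may depend on the initialization of the mixture weights. Initial values can be supplied through the `wk` argument, for example by placing most of the weight on the first component: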
import numpy as np
from gradvi.priors import Ash
k = 20
wk = np.zeros(k)
wk[0] = 0.9
wk[1:(k-1)] = np.repeat((1 - wk[0])/(k-1), (k - 2))
wk[k-1] = 1 - np.sum(wk)  # last weight absorbs floating-point rounding error so that the weights sum to 1
sk = np.abs(np.power(2.0, np.arange(k) / k) - 1)
prior = Ash(sk, wk = wk)
The above recipe is very commonly used, and the above code can be replaced simply by setting the `sparsity` flag:
import numpy as np
from gradvi.priors import Ash
sk = np.abs(np.power(2.0, np.arange(20) / 20) - 1)
prior = Ash(sk, sparsity = 0.9)
In general, one can define any set of initial values and pass them through `wk`.
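The sparsity recipe above can be written as a small helper. This is a minimal sketch that mirrors the NumPy code shown earlier using only the standard library. Whether the `sparsity` flag performs exactly these steps internally is an assumption, and `init_weights` is a hypothetical name.

```python
def init_weights(k, sparsity):
    """Put `sparsity` on the first component and share the rest equally."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must lie in [0, 1]")
    if k < 2:
        raise ValueError("need at least two components")
    rest = (1.0 - sparsity) / (k - 1)
    wk = [sparsity] + [rest] * (k - 2)
    wk.append(1.0 - sum(wk))  # last weight absorbs rounding error
    return wk

wk = init_weights(20, 0.9)
print(wk[0], sum(wk))
```

Any other starting point works the same way, as long as the weights are non-negative and sum to one.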
Scaling.
While solving a multiple regression problem, the prior can be used in a scaled or an unscaled form. The choice is made with the `scaled` flag:
prior = Ash(sk, scaled = True) # for scaled prior
prior = Ash(sk, scaled = False) # for unscaled prior
By default, we use `scaled = True`. For regression, we suggest using the scaled prior, and for trend filtering, we suggest using the unscaled prior.
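As a rough illustration of the difference, the sketch below assumes a common convention for regression models of this kind: a scaled prior multiplies each component variance by the residual variance, while an unscaled prior uses the component variance as is. This convention is an assumption and is not confirmed here for GradVI. The names `component_variances` and `sigma2` are hypothetical.

```python
def component_variances(sk, sigma2, scaled=True):
    """Prior variance of each mixture component.

    Assumptions: sk are standard deviations, and scaling multiplies the
    variance by the residual variance sigma2.
    """
    factor = sigma2 if scaled else 1.0
    return [factor * s * s for s in sk]

sk = [0.0, 0.5, 1.0]
print(component_variances(sk, sigma2=4.0, scaled=True))   # [0.0, 1.0, 4.0]
print(component_variances(sk, sigma2=4.0, scaled=False))  # [0.0, 0.25, 1.0]
```

Under this convention, the scaled prior adapts to the noise level of the data, while the unscaled prior keeps the same grid regardless of it.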
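Softmax parametrization.
In order to impose the constraints $w_k \ge 0$ and $\sum_k w_k = 1$, the mixture weights are expressed through a softmax function, and the unconstrained softmax parameters are updated instead of the weights themselves. The base of the softmax can be changed with the `smbase` flag: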
prior = Ash(sk, smbase = 2.0)
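A softmax with a configurable base can be sketched as follows. The sketch assumes that `smbase` is the base of the exponential, so that each weight is proportional to `base ** a_k`. The function name `softmax` and the example values are illustrative and do not reflect GradVI internals.

```python
import math

def softmax(a, base=math.e):
    """Map unconstrained values to weights that are positive and sum to one."""
    m = max(a)  # shift by the maximum for numerical stability
    powers = [base ** (x - m) for x in a]
    total = sum(powers)
    return [p / total for p in powers]

a = [2.0, 0.0, -1.0]
w_e = softmax(a)            # natural base
w_2 = softmax(a, base=2.0)  # base 2 gives a flatter set of weights
print(w_e)
print(w_2)
```

Because any real vector maps to valid weights, gradient steps on the unconstrained values never violate the constraints. A smaller base makes the weights less sensitive to changes in those values.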
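The point-normal distribution is another popular choice of prior family in sparse variable selection, for example, see this paper. It is defined as,

$$p(b) = \pi_0 \, \delta_0(b) + (1 - \pi_0) \, N(b \mid 0, s^2),$$

where $\delta_0$ is a point mass at zero, $\pi_0$ is the sparsity (the prior probability that $b$ is exactly zero) and $s^2$ is the variance of the normal component. With default settings, the prior is created as: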
from gradvi.priors import PointNormal
prior = PointNormal()
Initialization.
By default, the software initializes the sparsity internally. A different initial value can be set with the `sparsity` flag, that is,
prior = PointNormal(sparsity = 0.9)
Similarly, the initial value of the variance of the normal component can be set with the `s2` flag, for example,
prior = PointNormal(sparsity = 0.9, s2 = 2.0)
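To show how these two parameters act on an estimate, here is a minimal sketch of the posterior mean in a normal means model, where an observation `y` is the true effect plus Gaussian noise of variance `sigma2`. It assumes that `sparsity` is the prior weight on exactly zero and that `s2` is the variance of the normal component. The function names are hypothetical and the code does not call GradVI.

```python
import math

def normal_pdf_var(x, var):
    """Density of N(0, var) at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)

def point_normal_posterior_mean(y, sigma2, sparsity, s2):
    """Posterior mean of b given y ~ N(b, sigma2) under a point-normal prior."""
    null_lik = sparsity * normal_pdf_var(y, sigma2)               # b is exactly zero
    slab_lik = (1.0 - sparsity) * normal_pdf_var(y, sigma2 + s2)  # b ~ N(0, s2)
    slab_prob = slab_lik / (null_lik + slab_lik)                  # P(b != 0 | y)
    return slab_prob * y * s2 / (s2 + sigma2)

for y in (0.5, 2.0, 5.0):
    print(y, point_normal_posterior_mean(y, sigma2=1.0, sparsity=0.9, s2=2.0))
```

Small observations are pulled strongly toward zero, while large ones are shrunk only by the factor `s2 / (s2 + sigma2)` of the normal component.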
The user can define and use any prior distribution if it is analytically tractable in the Normal Means model. If you are defining a prior family, please reach out to us so that we can test and include the prior in the software, saving time and resources for other users in the future. The Python class for the prior family primarily includes:
- The number of parameters that need to be estimated,
- A real bound for each of those parameters,
- A function for the gradient descent step for those parameters.
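The three ingredients listed above can be sketched as a toy class. This is a hypothetical skeleton meant only to show the shape of such a class. The class name `ToyPrior`, its attributes and the `gradient_step` method are assumptions and do not reproduce the interface that GradVI expects.

```python
from dataclasses import dataclass, field

@dataclass
class ToyPrior:
    """Hypothetical prior holding its parameters and one bound per parameter."""
    params: list
    bounds: list = field(default_factory=list)  # (low, high) pairs, None = unbounded

    def __post_init__(self):
        if not self.bounds:
            self.bounds = [(None, None)] * len(self.params)
        if len(self.bounds) != len(self.params):
            raise ValueError("one bound is required per parameter")

    @property
    def n_params(self):
        """Number of parameters that need to be estimated."""
        return len(self.params)

    def gradient_step(self, grad, step_size):
        """Move against the gradient, then clip each value to its bound."""
        updated = []
        for p, g, (lo, hi) in zip(self.params, grad, self.bounds):
            p = p - step_size * g
            if lo is not None:
                p = max(lo, p)
            if hi is not None:
                p = min(hi, p)
            updated.append(p)
        self.params = updated
        return updated

prior = ToyPrior(params=[0.5, 1.0], bounds=[(0.0, 1.0), (1e-8, None)])
prior.gradient_step(grad=[-2.0, 0.5], step_size=0.5)
print(prior.n_params, prior.params)
```

Keeping the bounds next to the parameters lets an optimizer enforce them after each step without knowing anything else about the prior.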