Basic implementation of a simple auto-differentiation engine in C/C++. Includes forward mode (using the `dual` type) and reverse mode (using the `Var` type).
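Conceptually, forward mode carries a derivative alongside every intermediate value, while reverse mode records each elementary operation on a tape and propagates gradients backwards through it. The following is a minimal, self-contained sketch of the dual-number idea behind forward mode; the `Dual` type and helpers here are illustrative only, and the `dual` type used in the examples below is the one provided by this code:

```cpp
#include <cmath>
#include <cstdio>

// Illustrative dual number: a value plus the derivative of that value
// with respect to one chosen input.
struct Dual {
    double val; // function value
    double dv;  // derivative w.r.t. the chosen input
};

// Product rule: (ab)' = a'b + ab'
Dual mul(Dual a, Dual b) { return {a.val * b.val, a.dv * b.val + a.val * b.dv}; }

// Chain rule for log: (log a)' = a'/a
Dual dlog(Dual a) { return {std::log(a.val), a.dv / a.val}; }

int main() {
    Dual x{3.0, 1.0};                    // seed dv = 1: differentiate w.r.t. x
    Dual y = dlog(mul(x, x));            // y = log(x * x)
    std::printf("%f %f\n", y.val, y.dv); // y.dv = 2/x ≈ 0.666667
    return 0;
}
```

Seeding `dv = 1.0` on exactly one input is what selects the variable being differentiated, which is the same convention used in the examples below.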
As a practical example to test our code, let us consider the log-likelihood function of a Normal (Gaussian) distribution. For an observed variable $x$ it is

$\log \mathcal{L}(x; \mu, \sigma) = -\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2 - \log\sigma - \frac{1}{2}\log(2\pi),$

where $\mu$ is the mean and $\sigma$ is the standard deviation.
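Differentiating this expression gives the partial derivatives that the examples below compute by automatic differentiation: $\frac{\partial f}{\partial \mu} = \frac{x - \mu}{\sigma^2}$, $\frac{\partial f}{\partial \sigma} = \frac{(x - \mu)^2}{\sigma^3} - \frac{1}{\sigma}$ and $\frac{\partial f}{\partial x} = -\frac{x - \mu}{\sigma^2}$; these closed forms are handy for checking the results.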
- Define the log-likelihood function in terms of elementary operations:
```c
dual log_norm_likelihood(dual x, dual mu, dual sigma) {
    dual a = dual_div(dual_sub(x, mu), sigma);
    dual b = dual_pow2(a);
    dual c = dual_mul(b, dual_new(-0.5, 0.0));
    dual d = dual_sub(c, dual_log(sigma));
    dual e = dual_sub(d, dual_new(0.5 * log(2 * M_PI), 0.0));
    return e;
}
```
- Initialize the variables of our function (note that in forward mode we set to 1 the directional derivative of the variable of interest, in this case $\mu$, as we wish to compute $\frac{\partial f}{\partial \mu}$):
```c
dual x = dual_new(10.0, 0.0);
dual mu = dual_new(5.0, 1.0);    // Variable of interest
dual sigma = dual_new(2.0, 0.0);
```
- Run the forward-mode auto-differentiation:
```c
dual f = log_norm_likelihood(x, mu, sigma);
```
- `f.val` contains the evaluation of the function at the point of interest, and `f.dv` contains $\frac{\partial f}{\partial \mu}$ (a quick numerical check follows this list).
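As a sanity check on the forward-mode result, the expected values are easy to compute by hand: with $x = 10$, $\mu = 5$ and $\sigma = 2$ we get $f = -\frac{1}{2}(2.5)^2 - \log 2 - \frac{1}{2}\log(2\pi) \approx -4.7371$ and $\frac{\partial f}{\partial \mu} = \frac{x - \mu}{\sigma^2} = 1.25$, so `f.val` should be close to $-4.7371$ and `f.dv` should equal $1.25$.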
- Define the log-likelihood function in terms of elementary operations (note that in this case, the usual operators have been overloaded for easier function definition; a sketch of how such overloads can look follows this list):
```cpp
dual log_normal_likelihood(dual x, dual mu, dual sigma) {
    dual a = (x - mu) / sigma;
    dual b = pow2(a);
    dual c = b * dual(-0.5);     // We don't need to specify null gradients :)
    dual d = c - log(sigma);
    dual e = d - dual(0.5 * log(2 * M_PI));
    return e;
}
```
- Initialize the variables of our function (note that in forward mode we set to 1 the directional derivative of the variable of interest, in this case $\mu$, as we wish to compute $\frac{\partial f}{\partial \mu}$):
```cpp
dual x = dual(10.0, "x");        // We don't need to include the gradient for the other variables :)
dual mu = dual(5.0, 1.0, "mu");  // Variable of interest
dual sigma = dual(2.0, "sigma");
```
- Run the forward-mode auto-differentiation:
```cpp
dual f = log_normal_likelihood(x, mu, sigma);
f.name = "log_normal_likelihood"; // Of course we can still name the dual :)
```
- `f.val` contains the evaluation of the function at the point of interest, and `f.dv` contains $\frac{\partial f}{\partial \mu}$.
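To illustrate what the overloaded operators are doing under the hood, here is a minimal, hypothetical sketch of how subtraction and division can propagate derivatives for a dual type; the actual class in this code (with its name member and constructors) may be implemented differently:

```cpp
// Hypothetical simplified dual type, for illustration only.
struct Dual {
    double val;
    double dv;
    Dual(double v, double d = 0.0) : val(v), dv(d) {}
};

// Subtraction: derivatives subtract.
Dual operator-(const Dual &a, const Dual &b) {
    return Dual(a.val - b.val, a.dv - b.dv);
}

// Division: quotient rule (a/b)' = (a'b - ab') / b^2.
Dual operator/(const Dual &a, const Dual &b) {
    return Dual(a.val / b.val, (a.dv * b.val - a.val * b.dv) / (b.val * b.val));
}

// With overloads like these, (x - mu) / sigma reads exactly like the math,
// while the derivative is carried along automatically.
```

A default derivative argument like `d = 0.0` is what allows constants such as `dual(-0.5)` to be written without an explicit zero gradient, as the example above relies on.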
- We initialize the gradient tape:
```c
init_tape();
```
- As in forward mode, we define the log-likelihood in terms of its elementary operations:
```c
Var* log_norm_likelihood_rev(Var *x, Var *mu, Var *sigma) {
    Var *a = var_div(var_sub(x, mu), sigma);
    Var *b = var_pow2(a);
    Var *c = var_mul(b, var_new(-0.5));
    Var *d = var_sub(c, var_log(sigma));
    Var *e = var_sub(d, var_new(0.5 * log(2 * M_PI)));
    return e;
}
```
- Initialize the variables of interest (optionally, we can name them):
```c
Var *x = var_new(10.0);
strcpy(x->name, "x");
Var *mu = var_new(5.0);
strcpy(mu->name, "mu");
Var *sigma = var_new(2.0);
strcpy(sigma->name, "sigma");
```
- Run the reverse-mode auto-differentiation:
```c
Var *f = log_norm_likelihood_rev(x, mu, sigma);
backward_pass(f); // Execute the backward pass
```
- The derivatives w.r.t. each variable are now stored in the variables themselves (expected values for this example are given after this list). For example, $\frac{\partial f}{\partial \mu}$ is in:
```c
mu->grad;
```
- Clear the tape and release the nodes in the expression graph:
```c
free_tape();
```
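For the inputs above ($x = 10$, $\mu = 5$, $\sigma = 2$), the expected results are `mu->grad` $= \frac{x-\mu}{\sigma^2} = 1.25$, `sigma->grad` $= \frac{(x-\mu)^2}{\sigma^3} - \frac{1}{\sigma} = 2.625$ and `x->grad` $= -1.25$. Conceptually, the forward evaluation records every elementary operation on the tape together with the local derivatives of its output with respect to its inputs, and `backward_pass` then walks the tape in reverse, accumulating gradients by the chain rule. The snippet below is a minimal, hypothetical sketch of that mechanism; it does not reproduce the actual `Var`/tape layout of this code:

```cpp
#include <cstdio>
#include <vector>

// Hypothetical minimal tape: each node remembers up to two parent nodes and
// the local derivative of its output with respect to each parent.
struct Node {
    int parent[2];
    double local[2];
    double grad;
};

static std::vector<Node> tape;

// An input (leaf) node has no parents. A real engine would also store values.
int leaf() {
    tape.push_back({{-1, -1}, {0.0, 0.0}, 0.0});
    return (int)tape.size() - 1;
}

// Record an operation with parents a and b and local derivatives da, db.
int record(int a, double da, int b, double db) {
    tape.push_back({{a, b}, {da, db}, 0.0});
    return (int)tape.size() - 1;
}

// Reverse sweep: seed the output with gradient 1 and push gradients back
// to the parents of every node via the chain rule.
void backward(int out) {
    tape[out].grad = 1.0;
    for (int i = out; i >= 0; --i)
        for (int k = 0; k < 2; ++k)
            if (tape[i].parent[k] >= 0)
                tape[tape[i].parent[k]].grad += tape[i].grad * tape[i].local[k];
}

int main() {
    // f(x, y) = x * y + y evaluated at x = 3, y = 4
    double xv = 3.0, yv = 4.0;
    int x = leaf(), y = leaf();
    int m = record(x, yv, y, xv);   // m = x * y: dm/dx = y, dm/dy = x
    int f = record(m, 1.0, y, 1.0); // f = m + y: df/dm = 1, df/dy = 1
    backward(f);
    std::printf("df/dx = %g, df/dy = %g\n", tape[x].grad, tape[y].grad); // 4 and 4
    return 0;
}
```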
A C++ version of reverse mode is not yet implemented.
- To build the `C` version:
```sh
make
```
- To build the `C++` version:
```sh
make cpp
```
- Implement a parser to generate the expression graph from any function definition (to overcome the lack of operator overloading in C).
- Build a Python interface for the library.