
improvements and typo fixes in docs
EetuReijonen committed Aug 21, 2024
1 parent df299d2 commit d0c2cdf
Showing 11 changed files with 58 additions and 128 deletions.
17 changes: 8 additions & 9 deletions docs/make.jl
@@ -3,24 +3,23 @@ using Gogeta

makedocs(
modules=[Gogeta],
authors="Eetu Reijonen",
authors="Eetu Reijonen, Milana Begantsova",
sitename="Gogeta.jl",
format=Documenter.HTML(),
pages=[
"Introduction" => "index.md",
"Tutorials" => [
"Features" => [
"Neural networks" => [
"Practicalities related to NNs" => "nns_introduction.md",
"Big-M formulation of NNs" => "neural_networks.md",
"Psplit formulation of NNs" => "psplit_nns.md",
"General" => "nns_introduction.md",
"Big-M formulation" => "neural_networks.md",
"Partition-based formulation" => "psplit_nns.md",
"Optimization" => "optimization.md",
"Neural networks in larger optimization problems" => "nns_in_larger.md",
"Input convex neural networks" => "icnns.md",
"Use as surrogates" => "nns_in_larger.md"
],
"CNNS" => "cnns.md",
"Input convex neural networks" => "icnns.md",
"Convolutional neural networks" => "cnns.md",
"Tree ensembles" => "tree_ensembles.md",
],
"Public API" => "api.md",
"Literature" => "literature.md",
"Reference" => "reference.md",
],
68 changes: 0 additions & 68 deletions docs/src/api.md

This file was deleted.

3 changes: 1 addition & 2 deletions docs/src/cnns.md
@@ -1,6 +1,5 @@
# Formulation of CNNs

With our library, you can also formulate CNNs.
The convolutional neural network requirements can be found in the [`CNN_formulate!`](@ref) documentation. See [this jupyter notebook](https://github.com/gamma-opt/Gogeta.jl/blob/main/examples/cnns/conv_neural_networks.ipynb) for a more detailed example.

First, create some kind of input (or load an image from your computer).
@@ -35,7 +34,7 @@ cnns = get_structure(CNN_model, input);
CNN_formulate!(jump, CNN_model, cnns)
```

Check that the `JuMP` model produces the same outputs as the `Flux.Chain`.
It can be checked that the `JuMP` model produces the same outputs as the `Flux.Chain`.

```julia
vec(CNN_model(input)) ≈ image_pass!(jump, input)
4 changes: 4 additions & 0 deletions docs/src/icnns.md
@@ -1,5 +1,9 @@
# Input convex neural networks (ICNNs)

!!! warning

ICNNs are an experimental feature. It is unclear when the LP formulation produces feasible solutions and when ICNNs are useful in general. Currently, the only way to check compatibility is to verify the feasibility of the solution after it has been obtained.

In input convex neural networks, the neuron weights are constrained to be nonnegative and weighted skip connections are added from the input layer to each layer. More details can be found in [Amos et al. (2017)](literature.md). These changes make the network output convex with respect to the inputs. A convex piecewise linear function can be formulated as a linear programming (LP) problem, which is much more computationally efficient than the MILP formulations of "regular" neural networks. This is the reason for implementing ICNN functionality in this package. ICNNs are a viable option when the data or function being modeled is approximately convex and/or some prediction accuracy can be sacrificed for computational performance.
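To make the LP claim concrete, below is a rough sketch of the problem that arises when an ICNN is minimized over a box of inputs. The notation is ours and the constraints actually generated by the package may differ; here ``W_k`` are the nonnegative layer weights, ``D_k`` the skip-connection weights from the input, ``b_k`` the biases and ``z_K`` the network output.

```math
\begin{aligned}
\min_{x,\,z} \quad & z_K \\
\text{s.t.} \quad & z_1 \ge W_0 x + b_0, \\
& z_k \ge W_{k-1} z_{k-1} + D_{k-1} x + b_{k-1}, \quad k = 2,\dots,K, \\
& z_k \ge 0, \quad k = 1,\dots,K-1, \\
& L \le x \le U.
\end{aligned}
```

Each ReLU equality is relaxed into the two inequalities above, so the relaxation is guaranteed to be tight only when the objective (or the rest of the model) pushes the activations down. This is also why the feasibility check mentioned in the warning above is needed.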

## Training
6 changes: 3 additions & 3 deletions docs/src/index.md
@@ -26,13 +26,13 @@ Replace `<branch-name>` with the name of the branch you want to add.

## How can this package be used?

Formulating trained machine learning (ML) models as mixed-integer linear programming (MILP) problems opens up multiple possibilities. Firstly, it allows for global optimization - finding the input that probably maximizes or minimizes the ML model output. Secondly, changing the objective function in the MILP formulation and/or adding additional constraints makes it possible to solve problems related to the ML model, such as finding adversarial inputs. Lastly, the MILP formulation of a ML model can be incorporated into a larger optimization problem. This is useful in a surrogate modeling context where an ML model can be trained to approximate a complex function that itself cannot be used in an optimization problem.
Formulating trained machine learning (ML) models as mixed-integer linear programming (MILP) problems opens up multiple possibilities. Firstly, it allows for global optimization - finding the input that provably maximizes or minimizes the ML model output. Secondly, changing the objective function in the MILP formulation and/or adding additional constraints makes it possible to solve problems related to the ML model, such as finding adversarial inputs. Lastly, the MILP formulation of an ML model can be embedded into a larger optimization problem. This is useful in a surrogate modeling context where an ML model is trained to approximate a complex function that itself cannot be used in an optimization problem.

Despite its usefulness, modeling ML models as MILP problems has significant limitations. The biggest limitation is the capability of MILP solvers, which restricts the ML model size. With neural networks, for example, only models with at most hundreds of neurons can be effectively formulated as MILPs and then optimized. In practice, formulating large modern ML models such as convolutional neural networks and transformer networks as MILPs and optimizing them is computationally infeasible. However, if small neural networks are sufficient for the specific application, the methods implemented in this package can be useful. Another limitation is that only piecewise linear ML models can be formulated as MILP problems. For example, with neural networks this entails using activation functions such as $ReLU$.

Input convex neural networks (ICNNs) are a special type of machine learning model that can be formulated as linear programming (LP) problems. The convexity limits the expressiveness of the ICNN, but the LP formulation enables fast optimization of even very large ICNNs. If the data or the function being modeled is approximately convex, ICNNs can provide similar accuracy to regular neural networks. If an ML model is used in some of the contexts mentioned in the first paragraph, ICNNs can be used instead of neural networks without the computational limitations of MILP models.

## Getting started

The following sections [Tree ensembles](tree_ensembles.md), [Neural networks](nns_introduction.md), [Neural networks in larger optimization problems](nns_in_larger.md) and [Input convex neural networks](icnns.md) give simple demonstrations on how to use the package.
Examples on multiprocessing features as well as more detailed code can be found in the `examples/`-folder of the [package repository](https://github.com/gamma-opt/Gogeta.jl).
The following Features section gives simple demonstrations of how to use the package.
More examples and detailed code can be found in the `examples/` folder of the [package repository](https://github.com/gamma-opt/Gogeta.jl).
1 change: 1 addition & 0 deletions docs/src/literature.md
@@ -23,6 +23,7 @@ The most important papers for our work are listed here. In these works, more in-

* *Tong, J., Cai, J., & Serra, T. (2024). Optimization Over Trained Neural Networks: Taking a Relaxing Walk. arXiv preprint arXiv:2401.03451.*
* **Partition-based formulation of NNs:**

* *Calvin Tsay, Jan Kronqvist, Alexander Thebelt, & Ruth Misener. (2021). Partition-based formulations for mixed-integer optimization of trained ReLU neural networks.*

* **Tree ensembles:**
27 changes: 14 additions & 13 deletions docs/src/neural_networks.md
@@ -1,8 +1,8 @@
# Formulation of NN with big-M approach
# Formulation of NNs with the big-M approach

The first way to formulate NN as a MIP is to use function [`NN_formulate!`](@ref). This formulation is based on the following paper: [Fischetti and Jo (2018)](literature.md). For more detailed information with examples, please see next [jupyter notebook](https://github.com/gamma-opt/Gogeta.jl/blob/main/examples/neural_networks/example_1_neural_networks.ipynb).
Neural networks can be formulated as MIPs using the function [`NN_formulate!`](@ref). The formulation is based on the following paper: [Fischetti and Jo (2018)](literature.md). For more detailed information with examples, please see the [jupyter notebook](https://github.com/gamma-opt/Gogeta.jl/blob/main/examples/neural_networks/example_1_neural_networks.ipynb).
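To recall what the big-M approach does, a single ReLU neuron with pre-activation ``w^\top x + b`` and precomputed bounds ``L \le w^\top x + b \le U`` can be encoded with one binary variable ``\sigma`` roughly as follows (the notation is ours and the exact constraints used in the package may differ slightly):

```math
\begin{aligned}
& z \ge w^\top x + b, \qquad z \ge 0, \\
& z \le w^\top x + b - L(1-\sigma), \qquad z \le U\sigma, \\
& \sigma \in \{0,1\}.
\end{aligned}
```

Setting ``\sigma = 1`` forces ``z = w^\top x + b`` and ``\sigma = 0`` forces ``z = 0``, so ``z = \max(0, w^\top x + b)`` in every feasible solution. The tighter the bounds ``L`` and ``U``, the stronger the formulation, which is why the bound-tightening options described below matter.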

Suppose you have a trained neural network `NN_model` with known boundaries for input variables (`init_U`, `init_L`), then a trained NN can be formulated as `JuMP` model:
Assuming you have a trained neural network `NN_model` with known bounds for the input variables (`init_U`, `init_L`), it can be formulated as a `JuMP` model:

```julia
using Flux
@@ -21,15 +21,15 @@ set_silent(jump_model) # set desired parameters
bounds_U, bounds_L = NN_formulate!(jump_model, NN_model, init_U, init_L; bound_tightening="standard", compress=true)
```

The function returns boundaries for each neuron and the `jump_model` is updated in the function. By default objective function of the `jump_model` is set to the dummy function *"Max 1"*.
The function returns bounds for each neuron, and the `jump_model` is updated in place. By default, the objective function of the `jump_model` is set to the dummy function *"Max 1"*.

This formulation enables compression by setting `compress=true`. Compression drops inactive neurons (or dead neurons) and decreases size of the MILP.
With this function, compression can be enabled by setting `compress=true`. Compression drops inactive (dead) neurons and thus decreases the size of the MILP.

Possible bound-tightening strategies include: `fast` (default), `standard`, `output`, and `precomputed`.

!!! note

When you use `precomputed` bound-tightening, you should also provide upper and loswer boundaries for the neurons (`U_bounds`, `L_bounds`) and nothing is returned.
When you use `precomputed` bound-tightening, you should also provide upper and lower bounds for the neurons (`U_bounds`, `L_bounds`), and nothing is returned.

```julia
NN_formulate!(jump_model, NN_model, init_U, init_L; bound_tightening="precomputed", U_bounds=bounds_U, L_bounds=bounds_L, compress=true)
@@ -42,26 +42,26 @@ Possible bound-tightening strategies include: `fast` (default), `standard`, `out
```julia
NN_formulate!(jump_model, NN_model, init_U, init_L; bound_tightening="output", U_out=U_out, L_out=L_out, compress=true)
```
## Compression of the NN using bounds
## Compression of the NN using precomputed bounds

Given lower and upper bounds (`bounds_U`, `bounds_L`) for neurons, the NN can be compressed. The function [`NN_compress`](@ref) will return modified compressed NN along with indexes of dropped neurons.
Given lower and upper bounds (`bounds_U`, `bounds_L`) for the neurons, the NN can be compressed. The function [`NN_compress`](@ref) returns the compressed NN along with the indices of the dropped neurons.

```julia
compressed, removed = NN_compress(NN_model, init_U, init_L, bounds_U, bounds_L)
```

## Calculation of the formulation output
## Calculation of the model output

When you have a ready formulation of the neural network, you can calculate the output of `JuMP` model with a function [`forward_pass!`](@ref)
When you have a ready formulation of the neural network, you can calculate the output of the `JuMP` model with the function [`forward_pass!`](@ref):

```julia
forward_pass!(jump_model, [-1.0, 0.0])
```
## Running the formulation in parallel
## Performing the formulation in parallel

!!! tip

If formulation with `standard` bound-tightening takes too slow, you can reduce computation time by running formulation in parallel. For this you need to innitialize 'workers' and set `parallel = true`. See next [jupyter notebook](https://github.com/gamma-opt/Gogeta.jl/tree/main/examples/neural_networks/example_2_neural_networks_parallel) for a more detailed explanation.
If formulation with `standard` bound-tightening is too slow, computation time can be reduced by running the formulation in parallel. For this, workers need to be initialized and the `parallel` argument set to `true`. See the [jupyter notebook](https://github.com/gamma-opt/Gogeta.jl/tree/main/examples/neural_networks/example_2_neural_networks_parallel) for a more detailed explanation.

```julia
# Create the workers
@@ -95,4 +95,5 @@ end
jump = NN_model()
@time U, L = NN_formulate!(jump_model, NN_model, init_U, init_L; bound_tightening="standard", silent=false, parallel=true);
```
In the next section, we will look at the Psplit formulation of NNs.

Here Gurobi is used. For other solvers this procedure might be simpler, since an environment doesn't have to be created for each of the workers.
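
As a point of comparison, a minimal sketch of the worker setup with an open-source solver such as HiGHS is shown below. This sketch is not taken from the package examples and assumes HiGHS is installed; adapt it to the solver you actually use.

```julia
using Distributed

# Spawn local worker processes before building the formulation.
addprocs(4)

# Load the required packages on every worker. With a solver like HiGHS,
# no solver environment has to be created per worker, which is the
# simplification referred to above.
@everywhere using Gogeta, JuMP, HiGHS
```

After this setup, the formulation can be run with `parallel=true` as in the example above.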
4 changes: 1 addition & 3 deletions docs/src/nns_in_larger.md
@@ -49,6 +49,4 @@ In addition to `Flux.Chain` neural networks, [`NN_incorporate!`](@ref) also acce
NN_incorporate!(jump_model, "folder/parameters.json", output, x, y; U_in=init_U, L_in=init_L)
```

Where "folder/parameters.json" is the relative path of the JSON file containing the neural network parameters.

In the next section, we look at the special case of the neural networks.
Where "folder/parameters.json" is the relative path of the JSON file containing the neural network parameters.