Commit

Merge pull request #26 from gamma-opt/relaxing-walk
Relaxing walk and new version
EetuReijonen authored Apr 22, 2024
2 parents d6feaae + 883d763 commit 7d4183f
Showing 23 changed files with 934 additions and 272 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -5,5 +5,6 @@
*/Manifest.toml
/docs/Manifest.toml
/docs/build/
/playground
.vscode/
.DS_Store
6 changes: 3 additions & 3 deletions CITATION.bib
@@ -1,8 +1,8 @@
@misc{Gogeta.jl,
author = {Nikita Belyak, Joonatan Linkola, Eetu Reijonen, Vilhelm Toivonen},
author = {Eetu Reijonen, Vilhelm Toivonen, Nikita Belyak, Joonatan Linkola},
title = {Gogeta.jl},
url = {https://github.com/gamma-opt/Gogeta.jl},
version = {v1.0.0-DEV},
version = {v0.2.0},
year = {2024},
month = {2}
month = {4}
}
4 changes: 2 additions & 2 deletions Project.toml
@@ -1,7 +1,7 @@
name = "Gogeta"
uuid = "8b0c908c-0e18-4590-8d3d-9c2483fd76bf"
authors = ["Nikita Belyak, Joonatan Linkola, Eetu Reijonen, Vilhelm Toivonen"]
version = "0.1.0"
authors = ["Eetu Reijonen, Vilhelm Toivonen, Nikita Belyak, Joonatan Linkola"]
version = "0.2.0"

[deps]
Distributed = "8ba89e20-285c-5b6f-9357-94700520ee1b"
19 changes: 2 additions & 17 deletions README.md
@@ -6,21 +6,6 @@

*"Gogeta was the result of the Saiyan warriors Son Goku and Vegeta successfully performing the Fusion Dance. Vegeta and Goku usually fused into Gogeta to counteract a significant threat, as Gogeta's power exponentially surpassed the sum of his parts."* [source](https://hero.fandom.com/wiki/Gogeta)

Gogeta.jl (pronounced "Go-gee-ta") enables the user to represent machine-learning models with mathematical programming, more specifically as mixed-integer optimization problems. This, in turn, allows for "fusing" the capabilities of mathematical optimisation solvers and machine learning models to solve problems that neither could solve on their own.
Gogeta.jl (pronounced "Go-gee-ta") enables the user to represent trained machine learning models with mathematical programming, more specifically as mixed-integer optimization problems. This, in turn, allows for "fusing" the capabilities of mathematical optimization solvers and machine learning models to solve problems that neither could solve on their own.

Currently supported models include neural networks and convolutional neural networks using ReLU activation and tree ensembles.

## Package features

### Tree ensembles
* **tree ensemble to MIP conversion** - obtain an integer optimization problem from a trained tree ensemble model
* **tree ensemble optimization** - optimize a trained decision tree model, i.e., find an input that maximizes the ensemble output

### Neural networks
* **neural network to MIP conversion** - formulate integer programming problem from a neural network
* **bound tightening** - improve computational feasibility by tightening bounds in the formulation according to input/output bounds
* **neural network compression** - reduce network size by removing inactive or stably active neurons
* **neural network optimization** - find the input that maximizes the neural network output

### Convolutional neural networks
* **convolutional neural network to MIP conversion** - formulate a mixed-integer programming problem from a convolutional neural network
Currently supported models are tree ensembles as well as neural networks and convolutional neural networks using ReLU activation.
3 changes: 2 additions & 1 deletion docs/make.jl
@@ -8,7 +8,8 @@ makedocs(
format= Documenter.HTML(),
pages=[
"Introduction" => "index.md",
"Bound tightening" => "bound_tightening.md",
"Tree ensembles" => "tree_ensembles.md",
"Neural networks" => "neural_networks.md",
"Public API" => "api.md",
"Literature" => "literature.md",
"Reference" => "reference.md",
9 changes: 8 additions & 1 deletion docs/src/api.md
@@ -31,6 +31,10 @@
### Forward pass
* [`forward_pass!`](@ref) - fix the input variables and optimize the model to get the output

### Sampling-based optimization
* [`optimize_by_sampling!`](@ref) - optimize the JuMP model by using a sampling-based approach
* [`optimize_by_walking!`](@ref) - optimize the JuMP model by using a more sophisticated sampling-based approach

## Convolutional neural networks

### Data structures
@@ -43,4 +43,7 @@
* [`CNN_formulate!`](@ref) - formulate a `JuMP` model from the CNN

### Forward pass
* [`image_pass!`](@ref) - fix the input variables and optimize the model to get the output

### Sampling-based optimization
* [`optimize_by_walking_CNN!`](@ref) - optimize the JuMP model by using a more sophisticated sampling-based approach
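
The walking-based entries above are the new "relaxing walk" heuristics added in this pull request. Below is a minimal usage sketch for the dense-network variant; the exact `optimize_by_walking!` argument list is an assumption modeled on the `optimize_by_sampling!` example further down in this commit, so it should be checked against the API reference.

```julia
using Flux, JuMP, Gurobi, Gogeta

# Toy ReLU network and input box (illustrative values only)
NN_model = Chain(Dense(2 => 10, relu), Dense(10 => 1))
init_U = [1.0, 1.0];
init_L = [-1.0, -1.0];

# Formulate the network as a MIP (keywords as shown in the package docs)
jump_model = Model(Gurobi.Optimizer)
set_silent(jump_model)
NN_formulate!(jump_model, NN_model, init_U, init_L; bound_tightening="fast");

# Maximize the network output, as in the sampling example
last_layer, _ = maximum(keys(jump_model[:x].data))
@objective(jump_model, Max, jump_model[:x][last_layer, 1])

# ASSUMED call signature: relaxing-walk heuristic over the formulated model
x_opt, optimum = optimize_by_walking!(jump_model, init_U, init_L)
```

The convolutional variant `optimize_by_walking_CNN!` is presumably applied in the same way to a model built with `CNN_formulate!`.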
12 changes: 0 additions & 12 deletions docs/src/bound_tightening.md

This file was deleted.

214 changes: 7 additions & 207 deletions docs/src/index.md
@@ -1,6 +1,6 @@
# Gogeta.jl

[Gogeta](https://gamma-opt.github.io/Gogeta.jl/) is a package that enables the user to formulate machine-learning models as mathematical programming problems.
[Gogeta](https://gamma-opt.github.io/Gogeta.jl/) is a package that enables the user to formulate trained machine learning models as mathematical optimization problems.

Currently supported models are `Flux.Chain` ReLU-activated neural networks (dense and convolutional) and `EvoTrees` tree ensemble models.

@@ -9,213 +9,13 @@
julia> Pkg.add("Gogeta")
```

## Getting started

The following sections give a very simple demonstration on how to use the package.
Multiprocessing examples and more detailed code can be found in the `examples/`-folder of the [package repository](https://github.com/gamma-opt/Gogeta.jl).

### Tree ensembles

First, one must create and train an `EvoTrees` tree ensemble model.

```julia
using EvoTrees

config = EvoTreeRegressor(nrounds=500, max_depth=5)
evo_model = fit_evotree(config; x_train, y_train)
```

Then the parameters can be extracted from the trained tree ensemble and used to create a `JuMP` model containing the tree ensemble MIP formulation.

```julia
using Gurobi
using Gogeta

# Extract data from EvoTrees model

universal_tree_model = extract_evotrees_info(evo_model)

# Create jump model and formulate
jump = Model(() -> Gurobi.Optimizer())
set_attribute(jump, "OutputFlag", 0) # JuMP or solver-specific attributes can be changed

TE_formulate!(jump, universal_tree_model, MIN_SENSE)
```

There are two ways of optimizing the JuMP model: either by 1) creating the full set of split constraints before optimizing, or 2) using lazy constraints to generate only the necessary ones during the solution process.

1\) Full set of constraints

```julia
add_split_constraints!(jump, universal_tree_model)
optimize!(jump)
```

2\) Lazy constraints

```julia
# Define callback function. For each solver this might be slightly different.
# See JuMP documentation or your solver's Julia interface documentation.
# Inside the callback 'tree_callback_algorithm' must be called.

function split_constraint_callback_gurobi(cb_data, cb_where::Cint)

    # Only run at integer solutions
    if cb_where != GRB_CB_MIPSOL
        return
    end

    Gurobi.load_callback_variable_primal(cb_data, cb_where)
    tree_callback_algorithm(cb_data, universal_tree_model, jump)

end

jump = direct_model(Gurobi.Optimizer())
TE_formulate!(jump, universal_tree_model, MIN_SENSE)

set_attribute(jump, "LazyConstraints", 1)
set_attribute(jump, Gurobi.CallbackFunction(), split_constraint_callback_gurobi)

optimize!(jump)
```

The optimal solution (minimum and maximum values for each of the input variables) can be queried after the optimization.

```julia
get_solution(jump, universal_tree_model)
objective_value(jump)
```

### Neural networks

With neural networks, the hidden layers must use the $ReLU$ activation function, and the output layer must use the identity activation.

These neural networks can be formulated into mixed-integer optimization problems.
Along with formulation, the neuron activation bounds can be calculated, which improves computational performance and enables compression.

The network is compressed by pruning neurons that are either stably active or inactive. The activation bounds are used to identify these neurons.

First, create a neural network model satisfying the requirements:

```julia
using Flux

model = Chain(
    Dense(2 => 10, relu),
    Dense(10 => 20, relu),
    Dense(20 => 5, relu),
    Dense(5 => 1)
)
```

Then define the bounds for the input variables. These will be used to calculate the activation bounds for the subsequent layers.

```julia
init_U = [-0.5, 0.5];
init_L = [-1.5, -0.5];
```

Now the neural network can be formulated into a MIP. Here optimization-based bound tightening is also used.

```julia
jump_model = Model(Gurobi.Optimizer)
set_silent(jump_model) # set desired parameters

bounds_U, bounds_L = NN_formulate!(jump_model, model, init_U, init_L; bound_tightening="standard")
```

Using these bounds, the model can be compressed.

```julia
compressed, removed = NN_compress(model, init_U, init_L, bounds_U, bounds_L)
```

Compression can also be done without precomputed bounds.

```julia
bounds_U, bounds_L = NN_formulate!(jump_model, model, init_U, init_L; bound_tightening="standard", compress=true)
```

Use the `JuMP` model to calculate a forward pass through the network (input at the center of the domain).

```julia
forward_pass!(jump_model, [-1.0, 0.0])
```

#### Sampling

Instead of just solving the MIP, the neural network can be optimized (i.e., the input that maximizes or minimizes the output can be found) by using a sampling-based approach.

```julia
using QuasiMonteCarlo

jump_model = Model(Gurobi.Optimizer)
set_silent(jump_model)
NN_formulate!(jump_model, model, init_U, init_L; bound_tightening="fast");

# set objective function as the last layer output
last_layer, _ = maximum(keys(jump_model[:x].data))
@objective(jump_model, Max, jump_model[:x][last_layer, 1])

samples = QuasiMonteCarlo.sample(1000, init_L, init_U, LatinHypercubeSample());
x_opt, optimum = optimize_by_sampling!(jump_model, samples);
```
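
Since the formulation is an ordinary `JuMP` model, the objective and constraints are not limited to plain output maximization. As a small illustration of the adversarial-input idea, one might look for an input close to a reference point whose output reaches a target level. This is only a sketch: it assumes the input neurons are exposed as `jump_model[:x][0, i]`, which should be verified against the API reference.

```julia
# Hypothetical adversarial-style query on the formulation built above.
# ASSUMPTION: input neurons are jump_model[:x][0, i] and the output neuron
# is jump_model[:x][last_layer, 1]; verify against the API reference.
x_ref  = [-1.0, 0.0]   # reference input (center of the domain)
target = 0.5           # required output level (illustrative)

# Require the network output to reach the target...
@constraint(jump_model, jump_model[:x][last_layer, 1] >= target)

# ...while staying as close to the reference input as possible (L1 distance)
@variable(jump_model, d[1:2] >= 0)
@constraint(jump_model, [i = 1:2], jump_model[:x][0, i] - x_ref[i] <= d[i])
@constraint(jump_model, [i = 1:2], x_ref[i] - jump_model[:x][0, i] <= d[i])
@objective(jump_model, Min, sum(d))

optimize!(jump_model)
adversarial_input = [value(jump_model[:x][0, i]) for i in 1:2]
```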

#### Convolutional neural networks

The convolutional neural network requirements can be found in the [`CNN_formulate!`](@ref) documentation.

First, create some kind of input (or load an image from your computer).

```julia
input = rand(Float32, 70, 50, 1, 1) # BW 70x50 image
```

Then, create a convolutional neural network model satisfying the requirements:

```julia
using Flux

CNN_model = Flux.Chain(
    Conv((4,3), 1 => 10, pad=(2, 1), stride=(3, 2), relu),
    MeanPool((5,3), pad=(3, 2), stride=(2, 2)),
    MaxPool((3,4), pad=(1, 3), stride=(3, 2)),
    Conv((4,3), 10 => 5, pad=(2, 1), stride=(3, 2), relu),
    MaxPool((3,4), pad=(1, 3), stride=(3, 2)),
    Flux.flatten,
    Dense(20 => 100, relu),
    Dense(100 => 1)
)
```

Then, create an empty `JuMP` model, extract the layer structure of the CNN model and finally formulate the MIP.

```julia
jump = Model(Gurobi.Optimizer)
set_silent(jump)
cnns = get_structure(CNN_model, input);
CNN_formulate!(jump, CNN_model, cnns)
```

Check that the `JuMP` model produces the same outputs as the `Flux.Chain`.

```julia
vec(CNN_model(input)) ≈ image_pass!(jump, input)
```

## How to use?

Using the tree ensemble optimization from this package is quite straightforward. The only parameter the user can change is the solution method: with initial constraints or with lazy constraints.
In our computational tests, we have seen that the lazy constraint generation almost invariably produces models that are computationally easier to solve.
Therefore we recommend primarily using it as the solution method, but depending on your use case, trying the initial constraints might also be worthwhile.

Conversely, the choice of the best neural network bound tightening and compression procedures depends heavily on your specific use case.
Based on some limited computational tests of our own as well as knowledge from the field, we can make the following general recommendations:

* Wide but shallow neural networks should be preferred. The bound tightening gets exponentially harder with deeper layers.
* For small neural network models, using the "fast" bound tightening option is probably best, since the resulting formulations are easy to solve even with loose bounds.
* For larger neural networks, "standard" bound tightening will produce tighter bounds but take more time. However, when using the `JuMP` model, the tighter bounds might make it more computationally feasible.
* For large neural networks where the output bounds are known, "output" bound tightening can be used. This bound tightening is very slow but might be necessary to increase the computational feasibility of the resulting `JuMP` model.
* If the model has many so-called "dead" neurons, creating the JuMP model by using compression is beneficial, since the formulation will have fewer constraints and the bound tightening will be faster, reducing total formulation time.
These are only general recommendations based on limited evidence, and the user should validate the performance of each bound tightening and compression procedure in relation to her own work.

## How can this package be used?

Formulating trained machine learning (ML) models as mixed-integer programming (MIP) problems opens up multiple possibilities. Firstly, it allows for global optimization: finding the input that provably maximizes or minimizes the ML model output. Secondly, changing the objective function in the MIP formulation and/or adding constraints makes it possible to solve problems related to the ML model, such as finding adversarial inputs. Lastly, the MIP formulation of an ML model can be included in a larger optimization problem. This is useful in surrogate contexts, where an ML model is trained to approximate a complicated function that could not otherwise be used in an optimization problem.

Despite its usefulness, modeling ML models as MIP problems has significant limitations. The biggest one is the capability of MIP solvers, which limits the ML model size: with neural networks, for example, only models with at most hundreds of neurons can be tackled effectively. In practice, formulating and optimizing large modern models such as convolutional neural networks and transformer networks as MIPs is computationally infeasible. However, if small neural networks suffice for the application at hand, the techniques implemented in this package can be useful. Furthermore, only piecewise-linear ML models can be formulated as MIP problems; for neural networks, this means using only ReLU as the activation function.
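
For intuition on the piecewise-linearity requirement, a single ReLU neuron $z = \max(0, w^\top x + b)$ with known pre-activation bounds $L \le w^\top x + b \le U$ is commonly encoded with one binary variable $\sigma$. This is the standard big-M formulation from the papers listed on the Literature page, shown here only as background rather than as this package's exact formulation:

```math
z \ge w^\top x + b, \qquad
z \le w^\top x + b - L(1 - \sigma), \qquad
z \le U\sigma, \qquad
z \ge 0, \qquad
\sigma \in \{0, 1\}.
```

Setting $\sigma = 1$ forces $z = w^\top x + b$ (active neuron), while $\sigma = 0$ forces $z = 0$ (inactive neuron).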
## Getting started

The following sections, [Tree ensembles](tree_ensembles.md) and [Neural networks](neural_networks.md), give a very simple demonstration of how to use the package.
Multiprocessing examples and more detailed code can be found in the `examples/`-folder of the [package repository](https://github.com/gamma-opt/Gogeta.jl).
4 changes: 3 additions & 1 deletion docs/src/literature.md
@@ -1,7 +1,7 @@
# Literature

The mathematical optimization methods implemented in this package are based on the work of many brilliant researchers.
The most important papers for our work are listed here. In these works, more in-depth information about the formulations and various algorithms we use in can also be found.
The most important papers for our work are listed here. In these works, more in-depth information about the formulations and various algorithms we use can also be found.

* **(Convolutional) neural network formulation:**

@@ -21,6 +21,8 @@

* *Perakis, G., & Tsiourvas, A. (2022). Optimizing Objective Functions from Trained ReLU Neural Networks via Sampling. arXiv preprint arXiv:2205.14189.*

* *Tong, J., Cai, J., & Serra, T. (2024). Optimization Over Trained Neural Networks: Taking a Relaxing Walk. arXiv preprint arXiv:2401.03451.*

* **Tree ensembles:**

* *Mišić, V. V. (2020). Optimization of tree ensembles. Operations Research, 68(5), 1605-1624.*

2 comments on commit 7d4183f

@EetuReijonen
Member Author


@JuliaRegistrator register

Release notes:
Added relaxing walk heuristic optimization algorithm and improved documentation.

@JuliaRegistrator


Registration pull request created: JuliaRegistries/General/105360

Tagging

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the GitHub interface, or via:

```
git tag -a v0.2.0 -m "<description of version>" 7d4183f45a5f8032a6c6d1259207365b2aa48aee
git push origin v0.2.0
```
