
improvements and typo fixes in docs
EetuReijonen committed Aug 21, 2024
1 parent df299d2 commit d0c2cdf
Showing 11 changed files with 58 additions and 128 deletions.
17 changes: 8 additions & 9 deletions docs/make.jl
@@ -3,24 +3,23 @@ using Gogeta

makedocs(
modules=[Gogeta],
authors="Eetu Reijonen",
authors="Eetu Reijonen, Milana Begantsova",
sitename="Gogeta.jl",
format=Documenter.HTML(),
pages=[
"Introduction" => "index.md",
"Tutorials" => [
"Features" => [
"Neural networks" => [
"Practicalities related to NNs" => "nns_introduction.md",
"Big-M formulation of NNs" => "neural_networks.md",
"Psplit formulation of NNs" => "psplit_nns.md",
"General" => "nns_introduction.md",
"Big-M formulation" => "neural_networks.md",
"Partition-based formulation" => "psplit_nns.md",
"Optimization" => "optimization.md",
"Neural networks in larger optimization problems" => "nns_in_larger.md",
"Input convex neural networks" => "icnns.md",
"Use as surrogates" => "nns_in_larger.md"
],
"CNNS" => "cnns.md",
"Input convex neural networks" => "icnns.md",
"Convolutional neural networks" => "cnns.md",
"Tree ensembles" => "tree_ensembles.md",
],
"Public API" => "api.md",
"Literature" => "literature.md",
"Reference" => "reference.md",
],
68 changes: 0 additions & 68 deletions docs/src/api.md

This file was deleted.

3 changes: 1 addition & 2 deletions docs/src/cnns.md
@@ -1,6 +1,5 @@
# Formulation of CNNs

With our library, you can also formulate CNNs.
The convolutional neural network requirements can be found in the [`CNN_formulate!`](@ref) documentation. See [this jupyter notebook](https://github.com/gamma-opt/Gogeta.jl/blob/main/examples/cnns/conv_neural_networks.ipynb) for a more detailed example.

First, create some kind of input (or load an image from your computer).
@@ -35,7 +34,7 @@ cnns = get_structure(CNN_model, input);
CNN_formulate!(jump, CNN_model, cnns)
```

Check that the `JuMP` model produces the same outputs as the `Flux.Chain`.
It can be checked that the `JuMP` model produces the same outputs as the `Flux.Chain`.

```julia
vec(CNN_model(input)) ≈ image_pass!(jump, input)
4 changes: 4 additions & 0 deletions docs/src/icnns.md
@@ -1,5 +1,9 @@
# Input convex neural networks (ICNNs)

!!! warning

ICNNs are an experimental feature. It is unclear when the LP formulation produces feasible solutions and when ICNNs are useful in general. Currently, the only way to check compatibility is to verify the feasibility of the solution after it has been obtained.

In input convex neural networks, the neuron weights are constrained to be nonnegative and weighted skip connections are added from the input layer to each layer. More details can be found in [Amos et al. (2017)](literature.md). These changes make the network output convex with respect to the inputs. A convex piecewise linear function can be formulated as a linear programming (LP) problem, which is much more computationally efficient than the MILP formulations of "regular" neural networks. This is the reason for implementing ICNN functionality in this package. ICNNs are a viable option when the data or function being modeled is approximately convex and/or some prediction accuracy can be sacrificed for computational performance.
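To make the LP claim concrete, below is a rough sketch of the problem that arises when an ICNN is minimized over a box of inputs. The notation is ours and the constraints actually generated by the package may differ; here ``W_k`` are the nonnegative layer weights, ``D_k`` the skip-connection weights from the input, ``b_k`` the biases and ``z_K`` the network output.

```math
\begin{aligned}
\min_{x,\,z} \quad & z_K \\
\text{s.t.} \quad & z_1 \ge W_0 x + b_0, \\
& z_k \ge W_{k-1} z_{k-1} + D_{k-1} x + b_{k-1}, \quad k = 2,\dots,K, \\
& z_k \ge 0, \quad k = 1,\dots,K-1, \\
& L \le x \le U.
\end{aligned}
```

Each ReLU equality is relaxed into the two inequalities above, so the relaxation is guaranteed to be tight only when the objective (or the rest of the model) pushes the activations down. This is also why the feasibility check mentioned in the warning above is needed.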

## Training
6 changes: 3 additions & 3 deletions docs/src/index.md
@@ -26,13 +26,13 @@ Replace `<branch-name>` with the name of the branch you want to add.

## How can this package be used?

Formulating trained machine learning (ML) models as mixed-integer linear programming (MILP) problems opens up multiple possibilities. Firstly, it allows for global optimization - finding the input that probably maximizes or minimizes the ML model output. Secondly, changing the objective function in the MILP formulation and/or adding additional constraints makes it possible to solve problems related to the ML model, such as finding adversarial inputs. Lastly, the MILP formulation of a ML model can be incorporated into a larger optimization problem. This is useful in a surrogate modeling context where an ML model can be trained to approximate a complex function that itself cannot be used in an optimization problem.
Formulating trained machine learning (ML) models as mixed-integer linear programming (MILP) problems opens up multiple possibilities. Firstly, it allows for global optimization - finding the input that provably maximizes or minimizes the ML model output. Secondly, changing the objective function in the MILP formulation and/or adding additional constraints makes it possible to solve problems related to the ML model, such as finding adversarial inputs. Lastly, the MILP formulation of an ML model can be embedded into a larger optimization problem. This is useful in a surrogate modeling context where an ML model is trained to approximate a complex function that itself cannot be used in an optimization problem.

Despite its usefulness, modeling ML models as MILP problems has significant limitations. The biggest limitation is the capability of MILP solvers, which restricts the ML model size. With neural networks, for example, only models with at most hundreds of neurons can be effectively formulated as MILPs and then optimized. In practice, formulating large modern ML models such as convolutional neural networks and transformer networks as MILPs and optimizing them is computationally infeasible. However, if small neural networks are sufficient for the specific application, the methods implemented in this package can be useful. Another limitation is that only piecewise linear ML models can be formulated as MILP problems. For example, with neural networks this entails using activation functions such as $ReLU$.

Input convex neural networks (ICNNs) are a special type of machine learning model that can be formulated as linear programming (LP) problems. The convexity limits the expressiveness of the ICNN, but the LP formulation enables fast optimization of even very large ICNNs. If the data or the function being modeled is approximately convex, ICNNs can provide similar accuracy to regular neural networks. If an ML model is used in some of the contexts mentioned in the first paragraph, ICNNs can be used instead of neural networks without the computational limitations of MILP models.

## Getting started

The following sections [Tree ensembles](tree_ensembles.md), [Neural networks](nns_introduction.md), [Neural networks in larger optimization problems](nns_in_larger.md) and [Input convex neural networks](icnns.md) give simple demonstrations on how to use the package.
Examples on multiprocessing features as well as more detailed code can be found in the `examples/`-folder of the [package repository](https://github.com/gamma-opt/Gogeta.jl).
The following Features section gives simple demonstrations of how to use the package.
More examples and detailed code can be found in the `examples/` folder of the [package repository](https://github.com/gamma-opt/Gogeta.jl).
1 change: 1 addition & 0 deletions docs/src/literature.md
@@ -23,6 +23,7 @@ The most important papers for our work are listed here. In these works, more in-

* *Tong, J., Cai, J., & Serra, T. (2024). Optimization Over Trained Neural Networks: Taking a Relaxing Walk. arXiv preprint arXiv:2401.03451.*
* **Partition-based formulation of NNs:**

* *Calvin Tsay, Jan Kronqvist, Alexander Thebelt, & Ruth Misener. (2021). Partition-based formulations for mixed-integer optimization of trained ReLU neural networks.*

* **Tree ensembles:**
27 changes: 14 additions & 13 deletions docs/src/neural_networks.md
@@ -1,8 +1,8 @@
# Formulation of NN with big-M approach
# Formulation of NNs with the big-M approach

The first way to formulate NN as a MIP is to use function [`NN_formulate!`](@ref). This formulation is based on the following paper: [Fischetti and Jo (2018)](literature.md). For more detailed information with examples, please see next [jupyter notebook](https://github.com/gamma-opt/Gogeta.jl/blob/main/examples/neural_networks/example_1_neural_networks.ipynb).
Neural networks can be formulated as MIPs using the function [`NN_formulate!`](@ref). The formulation is based on the following paper: [Fischetti and Jo (2018)](literature.md). For more detailed information with examples, please see the [jupyter notebook](https://github.com/gamma-opt/Gogeta.jl/blob/main/examples/neural_networks/example_1_neural_networks.ipynb).
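To recall what the big-M approach does, a single ReLU neuron with pre-activation ``w^\top x + b`` and precomputed bounds ``L \le w^\top x + b \le U`` can be encoded with one binary variable ``\sigma`` roughly as follows (the notation is ours and the exact constraints used in the package may differ slightly):

```math
\begin{aligned}
& z \ge w^\top x + b, \qquad z \ge 0, \\
& z \le w^\top x + b - L(1-\sigma), \qquad z \le U\sigma, \\
& \sigma \in \{0,1\}.
\end{aligned}
```

Setting ``\sigma = 1`` forces ``z = w^\top x + b`` and ``\sigma = 0`` forces ``z = 0``, so ``z = \max(0, w^\top x + b)`` in every feasible solution. The tighter the bounds ``L`` and ``U``, the stronger the formulation, which is why the bound-tightening options described below matter.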

Suppose you have a trained neural network `NN_model` with known boundaries for input variables (`init_U`, `init_L`), then a trained NN can be formulated as `JuMP` model:
Assuming you have a trained neural network `NN_model` with known bounds for the input variables (`init_U`, `init_L`), it can be formulated as a `JuMP` model:

```julia
using Flux
@@ -21,15 +21,15 @@ set_silent(jump_model) # set desired parameters
bounds_U, bounds_L = NN_formulate!(jump_model, NN_model, init_U, init_L; bound_tightening="standard", compress=true)
```

The function returns boundaries for each neuron and the `jump_model` is updated in the function. By default objective function of the `jump_model` is set to the dummy function *"Max 1"*.
The function returns bounds for each neuron, and the `jump_model` is updated in place. By default, the objective function of the `jump_model` is set to the dummy function *"Max 1"*.

This formulation enables compression by setting `compress=true`. Compression drops inactive neurons (or dead neurons) and decreases size of the MILP.
With this function, compression can be enabled by setting `compress=true`. Compression drops inactive (dead) neurons and thus decreases the size of the MILP.

Possible bound-tightening strategies include: `fast` (default), `standard`, `output`, and `precomputed`.

!!! note

When you use `precomputed` bound-tightening, you should also provide upper and loswer boundaries for the neurons (`U_bounds`, `L_bounds`) and nothing is returned.
When you use `precomputed` bound-tightening, you should also provide upper and lower bounds for the neurons (`U_bounds`, `L_bounds`), and nothing is returned.

```julia
NN_formulate!(jump_model, NN_model, init_U, init_L; bound_tightening="precomputed", U_bounds=bounds_U, L_bounds=bounds_L, compress=true)
@@ -42,26 +42,26 @@ Possible bound-tightening strategies include: `fast` (default), `standard`, `out
```julia
NN_formulate!(jump_model, NN_model, init_U, init_L; bound_tightening="output", U_out=U_out, L_out=L_out, compress=true)
```
## Compression of the NN using bounds
## Compression of the NN using precomputed bounds

Given lower and upper bounds (`bounds_U`, `bounds_L`) for neurons, the NN can be compressed. The function [`NN_compress`](@ref) will return modified compressed NN along with indexes of dropped neurons.
Given lower and upper bounds (`bounds_U`, `bounds_L`) for the neurons, the NN can be compressed. The function [`NN_compress`](@ref) returns the compressed NN along with the indices of the dropped neurons.

```julia
compressed, removed = NN_compress(NN_model, init_U, init_L, bounds_U, bounds_L)
```

## Calculation of the formulation output
## Calculation of the model output

When you have a ready formulation of the neural network, you can calculate the output of `JuMP` model with a function [`forward_pass!`](@ref)
When you have a ready formulation of the neural network, you can calculate the output of the `JuMP` model with the function [`forward_pass!`](@ref):

```julia
forward_pass!(jump_model, [-1.0, 0.0])
```
## Running the formulation in parallel
## Performing the formulation in parallel

!!! tip

If formulation with `standard` bound-tightening takes too slow, you can reduce computation time by running formulation in parallel. For this you need to innitialize 'workers' and set `parallel = true`. See next [jupyter notebook](https://github.com/gamma-opt/Gogeta.jl/tree/main/examples/neural_networks/example_2_neural_networks_parallel) for a more detailed explanation.
If formulation with `standard` bound-tightening is too slow, computation time can be reduced by running the formulation in parallel. For this, workers need to be initialized and the `parallel` argument set to `true`. See the [jupyter notebook](https://github.com/gamma-opt/Gogeta.jl/tree/main/examples/neural_networks/example_2_neural_networks_parallel) for a more detailed explanation.

```julia
# Create the workers
@@ -95,4 +95,5 @@ end
jump = NN_model()
@time U, L = NN_formulate!(jump_model, NN_model, init_U, init_L; bound_tightening="standard", silent=false, parallel=true);
```
In the next section, we will look at the Psplit formulation of NNs.

Here Gurobi is used. For other solvers this procedure might be simpler, since an environment doesn't have to be created for each of the workers.
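
As a point of comparison, a minimal sketch of the worker setup with an open-source solver such as HiGHS is shown below. This sketch is not taken from the package examples and assumes HiGHS is installed; adapt it to the solver you actually use.

```julia
using Distributed

# Spawn local worker processes before building the formulation.
addprocs(4)

# Load the required packages on every worker. With a solver like HiGHS,
# no solver environment has to be created per worker, which is the
# simplification referred to above.
@everywhere using Gogeta, JuMP, HiGHS
```

After this setup, the formulation can be run with `parallel=true` as in the example above.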
4 changes: 1 addition & 3 deletions docs/src/nns_in_larger.md
@@ -49,6 +49,4 @@ In addition to `Flux.Chain` neural networks, [`NN_incorporate!`](@ref) also acce
NN_incorporate!(jump_model, "folder/parameters.json", output, x, y; U_in=init_U, L_in=init_L)
```

Where "folder/parameters.json" is the relative path of the JSON file containing the neural network parameters.

In the next section, we look at the special case of the neural networks.
Where "folder/parameters.json" is the relative path of the JSON file containing the neural network parameters.