# Squared Non-monotonic Probabilistic Circuits

This repository contains the official implementation of _squared non-monotonic PCs_, as well as relevant information to reproduce the experiments of our paper "Subtractive Mixture Models via Squaring: Representation and Learning".

## How to Run Experiments?

### Download the Data

Each data set should be downloaded and placed in the ```./datasets``` directory, which is the default location.

#### UCI Datasets

The continuous UCI data sets commonly used in the _normalizing flows_ literature, i.e., Power, Gas, Hepmass, MiniBoone, as well as BSDS300, can be downloaded from [zenodo](https://zenodo.org/record/1161203#.Wmtf_XVl8eN).
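
For convenience, here is a minimal sketch for fetching and extracting the data; it assumes the Zenodo record provides a single archive named ```data.tar.gz``` (the actual file name may differ):
```shell
# Download the archive from the Zenodo record (file name assumed)
wget https://zenodo.org/record/1161203/files/data.tar.gz
# Extract the data sets into the default ./datasets directory
mkdir -p ./datasets
tar -xzf data.tar.gz -C ./datasets
```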

#### Sentences Sampled from GPT2

The sentences sampled from GPT2 for our experiments on model distillation can be downloaded from []().
After downloading the archive, you need to decompress it in the ```./datasets/language/gpt2_commongen``` directory.
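
For instance, assuming the downloaded archive is a tarball named ```gpt2_commongen.tar.gz``` (a hypothetical name, since the download link above is unspecified), the following sketch decompresses it in the expected location:
```shell
# Create the expected directory and extract the archive there (archive name assumed)
mkdir -p ./datasets/language/gpt2_commongen
tar -xzf gpt2_commongen.tar.gz -C ./datasets/language/gpt2_commongen
```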

### Run the Same Hyperparameter Grid Searches

The directory ```econfigs/``` contains the configuration files of the hyperparameter grid searches we performed for all our experiments.
See the section below about running grids of experiments for details.

### Run Simple Experiments

Simple experiments can be run by executing the Python module ```scripts.experiment```.
For a complete overview of the parameters it accepts, we suggest reading its code.

For example, to run an experiment on the synthetic data set ```cosine``` with a ```MonotonicPC``` whose input layers compute splines over 32 knots, you can execute
```shell
python -m scripts.experiment --dataset cosine --model MonotonicPC --num-components 8 --splines --spline-knots 32 \
  --optimizer Adam --learning-rate 1e-3 --batch-size 128 --verbose
```
The ```--num-components``` argument specifies the number of components of each sum unit in the tensorized circuit architecture that is built.

Note that the ```--verbose``` flag enables terminal logging (e.g., to show the loss).
All models are learned by minimizing the negative log-likelihood on the training data with gradient descent.

Similarly, to run an experiment with a squared non-monotonic PC -- a ```BornPC``` -- on the synthetic data set ```cosine```, you can execute
```shell
python -m scripts.experiment --dataset cosine --model BornPC --num-components 4 \
  --optimizer Adam --learning-rate 1e-3 \
  --batch-size 128 --verbose --num-samples 10000
```
The ```--num-samples``` argument specifies the number of samples to draw from the artificial distribution to construct the training split.
An additional 10%/20% of samples will be drawn to construct the validation/test split, respectively.

#### Logging Metrics and Models

To log metrics locally, such as the average test log-likelihood, or to observe training curves, one can use ```tensorboard```.
For ```tensorboard``` it is sufficient to specify ```--tboard-path /path/to/tboard-directory``` with an arbitrarily chosen path that will contain the Tensorboard files.
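
For instance, a plausible invocation that extends the first example above with Tensorboard logging (the path below is arbitrary) is:
```shell
python -m scripts.experiment --dataset cosine --model MonotonicPC --num-components 8 \
  --splines --spline-knots 32 --optimizer Adam --learning-rate 1e-3 \
  --batch-size 128 --verbose --tboard-path ./tboard-runs/cosine-monotonic
# Inspect the logged metrics and training curves
tensorboard --logdir ./tboard-runs/cosine-monotonic
```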

It is also possible to save the best checkpoint of the model, which is updated only upon an improvement of the loss on the validation data.
To enable this, you can specify ```--save-checkpoint``` and ```--checkpoint-path /path/to/checkpoints``` with a path that will contain the model's weights in the ```.pt``` PyTorch format.
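
For example, the following sketch saves the best ```BornPC``` checkpoint of the earlier example (the checkpoint path is arbitrary):
```shell
python -m scripts.experiment --dataset cosine --model BornPC --num-components 4 \
  --optimizer Adam --learning-rate 1e-3 --batch-size 128 --verbose \
  --save-checkpoint --checkpoint-path ./checkpoints/cosine-born
```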

Additional information will also be saved in the checkpoint path, e.g., the (quantized) probability density/mass functions estimated by the models on the artificial continuous/discrete 2D data sets (named ```pdf.npy``` or ```pmf.npy```, respectively).

### Run a Grid of Experiments

To run a batch of experiments, e.g., to perform a hyperparameter grid search, you can use the ```scripts.grid``` module and specify a grid configuration JSON file.
The directory ```./econfigs``` contains some examples of such configuration files.

The fields to specify are the following (see the minimal sketch after this list):

- ```common``` contains parameters to pass to ```scripts.experiment``` that are common to every experiment of the batch.
- ```datasets``` contains the list of data sets on which each experiment instance will be executed.
- ```grid.common``` contains a grid of hyperparameters. Each hyperparameter is a pair ```"name": value```, where the value can be either a single value or a list of values. A Cartesian product over the lists is computed so as to retrieve all possible hyperparameter configurations in the grid.
- ```grid.models``` contains additional hyperparameter configurations that are specific to some data set or model. Each entry in ```grid.models``` maps a single data set or a list of data sets to a set of maps from model names to hyperparameter configurations. The semantics is that the hyperparameters specified in ```grid.models``` overwrite the ones in ```grid.common``` for the given combinations of data sets and models.
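
A minimal configuration sketch illustrating this structure is shown below; the field names mirror the configuration files shipped in ```./econfigs```, while the values are purely illustrative:
```json
{
  "common": { "num-epochs": 100, "device": "cpu" },
  "datasets": ["cosine", "banana"],
  "grid": {
    "common": { "batch-size": 256, "learning-rate": [1e-3, 5e-3] },
    "models": {
      "cosine|banana": {
        "MonotonicPC": { "init-method": "uniform" }
      }
    }
  }
}
```
Here, both learning rates are tried for every combination of data set and model, while the ```init-method``` of ```MonotonicPC``` is fixed for both ```cosine``` and ```banana```.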

To run a batch of experiments, you can execute
```shell
python -m scripts.grid path/to/config.json
```
You can also use the ```--dry-run``` flag to just print the list of generated commands, without running them.
This is particularly useful in combination with job schedulers on clusters, e.g., Slurm.

Additionally, one can specify the number of experiments to dispatch in parallel with ```--num-jobs k```, where ```k``` is the maximum number of experiments that are "alive" at any time (by default, only one experiment is run at a time).

Instead of running parallel jobs on the same device, you can also specify multiple devices over which the experiments will be dispatched.
This can be done with ```--multi-devices```.
For instance, ```--multi-devices cuda:0 cuda:2 cpu``` will dispatch three experiments at a time, on the devices ```cuda:0```, ```cuda:2``` and ```cpu```, respectively.

Finally, you can specify independent repetitions of each experiment in the batch, which will append a different ```--seed``` argument to each experiment command to launch.
This can be done with, for instance, ```--num-repetitions 5```.
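
For instance, the following invocations illustrate these flags one at a time (the configuration path is a placeholder):
```shell
# Print the generated commands without running them
python -m scripts.grid path/to/config.json --dry-run
# Run up to four experiments in parallel
python -m scripts.grid path/to/config.json --num-jobs 4
# Repeat each experiment five times, each with a different seed
python -m scripts.grid path/to/config.json --num-repetitions 5
```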

Disclaimer: in the case of repeated runs, the saved checkpoints are not reliable, as they can be overwritten by subsequent repetitions.

#### Run a Grid of Experiments (on Slurm)

To run a grid of experiments on a Slurm cluster, we first need to configure some constants in the ```slurm/launch.sh``` utility script, such as the Slurm partition to use, the maximum number of parallel jobs, the required resources, and the path to a directory that is local to the nodes, where model checkpoints and Tensorboard logs will be saved.

Then, we need to generate the commands to dispatch and save them to a text file.
For this purpose, it is possible to use the ```scripts.grid``` module (see above) with the ```--dry-run``` argument.
For instance, to generate the commands for the experiments on the UCI data sets, it suffices to run
```shell
python -m scripts.grid econfigs/uci-data-splines.json --dry-run > exps-uci-data-splines.txt
```
Finally, the Bash script ```slurm/launch.sh``` will automatically dispatch an array of Slurm jobs that execute them.
```shell
EXPS_ID=uci-data bash slurm/launch.sh exps-uci-data-splines.txt
```
The Slurm jobs should now appear in the queue, which can be viewed by running ```squeue```.
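
For instance, to list only your own jobs in the queue, you can run:
```shell
squeue -u $USER
```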

{
  "common": {
    "tboard-path": "tboard-runs/artificial-continuous",
    "checkpoint-path": "checkpoints/artificial-continuous",
    "save-checkpoint": true,
    "num-epochs": 1200,
    "device": "cpu",
    "num-workers": 2,
    "patience-threshold": 1e-3,
    "early-stop-patience": 50
  },
  "datasets": ["mring", "cosine", "funnel", "banana"],
  "grid": {
    "common": {
      "num-components": [4, 8, 12],
      "optimizer": "Adam",
      "compute-layer": "cp",
      "batch-size": 256,
      "splines": true,
      "spline-order": 2,
      "spline-knots": 32,
      "verbose": true,
      "standardize": true,
      "init-scale": 1.0,
      "learning-rate": 1e-3
    },
    "models": {
      "banana|cosine|mring|funnel": {
        "MonotonicPC": {
          "init-method": "uniform"
        },
        "BornPC": {
          "non-monotonic": {
            "init-method": "uniform",
            "exp-reparam": false
          },
          "monotonic": {
            "init-method": "uniform",
            "exp-reparam": true
          }
        }
      }
    }
  }
}

{
  "common": {
    "tboard-path": "tboard-runs/artificial-discrete-binomials",
    "checkpoint-path": "checkpoints/artificial-discrete-binomials",
    "save-checkpoint": true,
    "num-epochs": 1200,
    "device": "cpu",
    "num-workers": 2,
    "patience-threshold": 1e-3,
    "early-stop-patience": 50
  },
  "datasets": ["mring", "banana", "funnel", "cosine"],
  "grid": {
    "common": {
      "num-components": [32, 64, 96],
      "optimizer": "Adam",
      "compute-layer": "cp",
      "batch-size": 256,
      "binomials": true,
      "verbose": true,
      "standardize": true,
      "init-scale": 1.0,
      "learning-rate": 1e-3,
      "discretize": true,
      "discretize-bins": 32
    },
    "models": {
      "banana|cosine|mring|funnel": {
        "MonotonicPC": {
          "init-method": "uniform"
        },
        "BornPC": {
          "non-monotonic": {
            "init-method": "uniform",
            "exp-reparam": false
          },
          "monotonic": {
            "init-method": "uniform",
            "exp-reparam": true
          }
        }
      }
    }
  }
}

{
  "common": {
    "tboard-path": "tboard-runs/artificial-discrete-categoricals",
    "checkpoint-path": "checkpoints/artificial-discrete-categoricals",
    "save-checkpoint": true,
    "num-epochs": 1200,
    "device": "cpu",
    "num-workers": 2,
    "patience-threshold": 1e-3,
    "early-stop-patience": 50
  },
  "datasets": ["mring", "banana", "funnel", "cosine"],
  "grid": {
    "common": {
      "num-components": [4, 8, 12],
      "optimizer": "Adam",
      "compute-layer": "cp",
      "batch-size": 256,
      "binomials": false,
      "verbose": true,
      "standardize": true,
      "init-scale": 1.0,
      "learning-rate": 1e-3,
      "discretize": true,
      "discretize-bins": 32
    },
    "models": {
      "banana|cosine|mring|funnel": {
        "MonotonicPC": {
          "init-method": "uniform"
        },
        "BornPC": {
          "non-monotonic": {
            "init-method": "uniform",
            "exp-reparam": false
          },
          "monotonic": {
            "init-method": "uniform",
            "exp-reparam": true
          }
        }
      }
    }
  }
}

{
  "common": {
    "tboard-path": "tboard-runs/gaussian-ring",
    "checkpoint-path": "checkpoints/gaussian-ring",
    "save-checkpoint": true,
    "num-epochs": 1200,
    "device": "cpu",
    "num-workers": 0,
    "early-stop-patience": 100
  },
  "datasets": ["ring"],
  "grid": {
    "common": {
      "optimizer": "Adam",
      "batch-size": 64,
      "learning-rate": 5e-3,
      "verbose": true,
      "standardize": true,
      "init-scale": 1.0
    },
    "models": {
      "ring": {
        "MonotonicPC": { "num-components": [2, 16], "init-method": "normal" },
        "BornPC": { "num-components": 2, "init-method": "normal" }
      }
    }
  }
}

{
  "common": {
    "tboard-path": "tboard-runs/gpt2-commongen",
    "checkpoint-path": "checkpoints/gpt2-commongen",
    "save-checkpoint": true,
    "num-epochs": 200,
    "device": "cpu",
    "num-workers": 2,
    "early-stop-patience": 3
  },
  "datasets": ["gpt2_commongen"],
  "grid": {
    "common": {
      "num-components": [32, 64, 128, 256, 512, 1024],
      "optimizer": "Adam",
      "region-graph": "linear-vtree",
      "batch-size": 4096,
      "verbose": true,
      "init-scale": 1.0,
      "dtype": "float32"
    },
    "models": {
      "gpt2_commongen": {
        "MonotonicPC": { "init-method": ["log-normal", "uniform", "dirichlet"], "learning-rate": [5e-3, 1e-2, 5e-2] },
        "BornPC": { "init-method": ["normal", "positive-skewed-normal", "uniform"], "learning-rate": [5e-3, 1e-2, 5e-2] }
      }
    }
  }
}

{
  "common": {
    "tboard-path": "tboard-runs/image-binomials",
    "checkpoint-path": "checkpoints/image-binomials",
    "save-checkpoint": false,
    "num-epochs": 250,
    "device": "cpu",
    "num-workers": 2,
    "patience-threshold": 1e-3,
    "early-stop-patience": 3
  },
  "datasets": ["MNIST"],
  "grid": {
    "common": {
      "num-components": [64, 128, 256],
      "optimizer": "Adam",
      "compute-layer": "cp",
      "batch-size": 256,
      "binomials": true,
      "verbose": true,
      "init-scale": 1.0,
      "learning-rate": [5e-3, 1e-2, 5e-2],
      "region-graph": "quad-tree",
      "input-mixture": true,
      "dtype": "float32"
    },
    "models": {
      "MNIST": {
        "MonotonicPC": {
          "init-method": "uniform"
        },
        "BornPC": {
          "non-monotonic": {
            "init-method": "uniform",
            "exp-reparam": false
          }
        }
      }
    }
  }
}